Document Intelligence

Document Chat (RAG)

Upload PDFs, ingest them into a local vector store, and chat with your documents using retrieval-augmented generation.

How RAG Works

Retrieval-Augmented Generation (RAG) combines semantic search with LLM reasoning. Documents are chunked, embedded, and stored in a vector database. When you ask a question, the system retrieves the most relevant chunks and uses them to ground the AI's response.
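
At query time the flow is, in outline: embed the question, retrieve the nearest chunks, and inline them into the prompt. A minimal Python sketch of that step follows; search_chunks is a hypothetical stand-in for the real similarity search.

rag_flow.py
def search_chunks(question: str, top_k: int = 5) -> list[str]:
    # Hypothetical placeholder: the real pipeline embeds the question and
    # runs a vector similarity search over the stored chunks.
    return ["example chunk one", "example chunk two"][:top_k]

def build_grounded_prompt(question: str, chunks: list[str]) -> str:
    # Retrieved chunks are inlined as context so the model answers from
    # the documents rather than from its own memory.
    context = "\n\n".join(chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_grounded_prompt(
    "What are the key findings?",
    search_chunks("What are the key findings?"),
)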

PDF Ingestion

Upload PDF documents and extract their text for downstream indexing.
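
The docs don't pin down a parser; one way to do the extraction step, assuming the pypdf library:

pdf_extract.py
from pypdf import PdfReader

# Extract text page by page, keeping page numbers so chunks can later
# cite the page they came from (as the query responses below do).
reader = PdfReader("research_paper.pdf")
pages = [(i + 1, page.extract_text() or "") for i, page in enumerate(reader.pages)]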

Semantic Chunking

Split extracted text into overlapping segments so content at chunk boundaries isn't lost.
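
A character-window sketch of the idea (real semantic chunking would also respect sentence and section boundaries; sizes here are illustrative):

chunking.py
def chunk_text(text: str, chunk_size: int = 800, overlap: int = 200) -> list[str]:
    # Slide a window across the text; each chunk repeats the tail of the
    # previous one, so content split at a boundary survives intact in at
    # least one chunk.
    assert chunk_size > overlap
    step = chunk_size - overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]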

Vector Storage

Embed chunks using OpenAI or local models and store the vectors in MongoDB.
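
A sketch of the indexing step, assuming the OpenAI embeddings API and a local MongoDB; the database and collection names here are made up:

embed_store.py
from openai import OpenAI
from pymongo import MongoClient

client = OpenAI()  # reads OPENAI_API_KEY from the environment
chunks_col = MongoClient("mongodb://localhost:27017")["docint"]["chunks"]  # hypothetical names

def index_chunks(document_id: str, chunks: list[str]) -> None:
    # One batched embeddings call; each chunk is stored alongside its
    # vector so similarity search can run against the collection.
    resp = client.embeddings.create(model="text-embedding-3-small", input=chunks)
    chunks_col.insert_many([
        {"document_id": document_id,
         "chunk_id": f"chunk_{i}",
         "text": text,
         "embedding": item.embedding}
        for i, (text, item) in enumerate(zip(chunks, resp.data))
    ])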

Hybrid Search

Combine keyword and semantic search so retrieval handles both exact terms and paraphrased queries.
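
One common way to combine the two signals is a weighted sum over normalized scores; a sketch, with an illustrative candidate shape:

hybrid_search.py
def hybrid_rank(candidates: dict[str, tuple[float, float]], alpha: float = 0.5) -> list[str]:
    # `candidates` maps chunk_id -> (keyword_score, vector_score), both
    # normalized to [0, 1]. `alpha` weights exact term matches against
    # semantic similarity; 0.5 treats them equally.
    scored = {cid: alpha * kw + (1 - alpha) * vec
              for cid, (kw, vec) in candidates.items()}
    return sorted(scored, key=scored.get, reverse=True)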

Document Ingestion API

Use the REST API to upload documents and trigger the ingestion pipeline.

curl
curl -X POST http://localhost:5000/api/v1/documents/upload \
  -F "file=@research_paper.pdf" \
  -F "metadata={\"title\":\"AI Research\",\"tags\":[\"ai\",\"ml\"]}"

# Response:
{
  "document_id": "doc_abc123",
  "chunks_created": 45,
  "status": "indexed"
}
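
The same upload from Python, if you'd rather script it than shell out to curl:

upload.py
import json
import requests

# Mirrors the curl request above: a multipart upload with a JSON metadata field.
with open("research_paper.pdf", "rb") as f:
    resp = requests.post(
        "http://localhost:5000/api/v1/documents/upload",
        files={"file": f},
        data={"metadata": json.dumps({"title": "AI Research", "tags": ["ai", "ml"]})},
    )
print(resp.json()["document_id"])  # e.g. "doc_abc123"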

Querying Documents

Ask natural language questions and get answers grounded in your uploaded documents with source citations.

document-query.json
{
  "query": "What are the key findings about neural networks?",
  "document_ids": ["doc_abc123"],
  "top_k": 5
}

# Response:
{
  "answer": "The key findings indicate that...",
  "sources": [
    { "chunk_id": "chunk_12", "page": 3, "relevance": 0.92 },
    { "chunk_id": "chunk_45", "page": 8, "relevance": 0.88 }
  ]
}
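
Sending that request body from Python; the /documents/query path is an assumption, since only the body is shown above:

query.py
import requests

resp = requests.post(
    "http://localhost:5000/api/v1/documents/query",  # hypothetical route
    json={
        "query": "What are the key findings about neural networks?",
        "document_ids": ["doc_abc123"],
        "top_k": 5,
    },
)
body = resp.json()
print(body["answer"])
for src in body["sources"]:
    # Each source carries enough detail to trace the answer back to a page.
    print(f'{src["chunk_id"]}: page {src["page"]} (relevance {src["relevance"]:.2f})')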

Privacy & Local Storage

All document embeddings and vector data are stored locally in MongoDB. No data is sent to external services unless you explicitly configure an external embedding API. You can use local embedding models for complete offline operation.
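
For the fully offline setup, a local model such as one from sentence-transformers can produce the embeddings; a sketch, with an illustrative model choice:

local_embed.py
from sentence_transformers import SentenceTransformer

# Runs entirely on-device once the weights are downloaded; no chunk or
# query text leaves the machine.
model = SentenceTransformer("all-MiniLM-L6-v2")
vectors = model.encode(["chunk one", "chunk two"])  # 384-dim vectors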