Overview
An investigator is working through hundreds of witness statements and forensic reports for a complex fraud case. Instead of reading every document, she types a plain-English question into the case search interface and receives a cited answer drawn directly from the evidence. The retrieval-augmented generation (RAG) domain makes this possible. It ingests documents, breaks them into semantically meaningful chunks, and combines vector similarity search with keyword matching to find the most relevant passages. An LLM then synthesises those passages into a grounded answer, complete with citations linking back to the exact source pages. Hallucination detection checks that each claim in the response is traceable to the retrieved evidence.
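The way two retrieval legs can be merged into one ranking is easiest to see in a small example. The sketch below uses reciprocal rank fusion (RRF), a common fusion technique; the source does not say which fusion method the RAG domain uses, so RRF, the function name `reciprocal_rank_fusion`, the constant `k=60`, and the chunk IDs are all illustrative assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk IDs into one list, best first.

    Assumption: RRF stands in for whatever fusion the real system applies.
    Each list contributes 1 / (k + rank) to a chunk's fused score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical result orders from the two retrieval legs.
vector_hits = ["c3", "c1", "c7"]   # vector similarity search
keyword_hits = ["c1", "c9", "c3"]  # keyword matching

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)  # ['c1', 'c3', 'c9', 'c7']
```

Chunks that appear in both lists (`c1`, `c3`) rise above chunks found by only one leg, which is the behaviour a hybrid search is after. RRF works on ranks alone, so the vector and keyword scores never need to be put on a common scale.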
Key Features
- Document ingestion with semantic chunking, token counting, and metadata preservation
- Hybrid search combining vector similarity and keyword matching with result fusion
- LLM-powered question answering with context building and optimised prompts
- Citation extraction linking answers to specific source documents, pages, and excerpts
- Hallucination detection with grounding verification for answer factuality
- User feedback collection with thumbs up/down ratings for quality improvement
- Case-scoped search to focus queries on specific investigation evidence
- Re-ranking with LLM-based relevance scoring for improved result accuracy
- Response caching for repeated queries with configurable cache settings
- Support for multiple document types including witness statements, reports, transcripts, and forensic results
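As a rough illustration of the ingestion feature, the sketch below groups paragraphs into chunks under a token budget and attaches source metadata to each chunk. It is a simplified stand-in, not the product's chunker: splitting on blank lines approximates "semantic" boundaries, whitespace-separated words approximate tokens, and the `Chunk` class, `chunk_document` function, and metadata keys are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    token_count: int
    metadata: dict


def chunk_document(pages, max_tokens=200, source="unknown"):
    """Group paragraphs into chunks that aim to stay within max_tokens.

    pages: list of (page_number, page_text) tuples.
    Assumption: tokens are approximated by whitespace-separated words.
    A single paragraph longer than max_tokens becomes its own oversized chunk.
    """
    chunks = []

    def flush(paragraphs, count, page_number):
        meta = {"source": source, "page": page_number}
        chunks.append(Chunk("\n\n".join(paragraphs), count, meta))

    for page_number, page_text in pages:
        current, count = [], 0
        for para in (p.strip() for p in page_text.split("\n\n")):
            if not para:
                continue
            n = len(para.split())
            # Close the current chunk before it would exceed the budget.
            if current and count + n > max_tokens:
                flush(current, count, page_number)
                current, count = [], 0
            current.append(para)
            count += n
        if current:
            flush(current, count, page_number)
    return chunks


# Hypothetical two-page document with a tiny budget to force several chunks.
pages = [(1, "a b c\n\nd e\n\nf g h i"), (2, "j k")]
chunks = chunk_document(pages, max_tokens=5, source="statement-17.pdf")
for c in chunks:
    print(c.token_count, c.metadata["page"], repr(c.text))
```

Chunks never span a page boundary in this sketch, so the page number stored in the metadata always refers to a single page, which keeps page-level citations simple.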
Use Cases
Retrieval-augmented question answering is valuable in any field where analysts must work through large volumes of documents quickly. Relevant industries include law enforcement, legal services, and financial intelligence.
- Querying case evidence in natural language to find relevant information with cited sources
- Ingesting investigation documents for semantic search and AI-powered analysis
- Verifying answer grounding to ensure AI responses are factually supported by evidence
- Collecting analyst feedback to improve search relevance and answer quality over time
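To make the idea of grounding verification concrete, here is a deliberately simple sketch that flags answer sentences whose words are poorly covered by every retrieved passage. The source does not describe how the RAG domain checks grounding; word overlap is only a crude proxy, and the function name `ungrounded_sentences`, the `0.6` threshold, and the sample texts are assumptions made for illustration.

```python
import re


def _tokens(text):
    """Lower-case alphanumeric word set for a piece of text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def ungrounded_sentences(answer, passages, threshold=0.6):
    """Return answer sentences whose word coverage is below threshold.

    Coverage is the fraction of a sentence's distinct words that occur in
    the single best-matching passage. Assumption: lexical overlap stands
    in for a real grounding check.
    """
    passage_tokens = [_tokens(p) for p in passages]
    flagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        words = _tokens(sentence)
        if not words:
            continue
        best = max(
            (len(words & pt) / len(words) for pt in passage_tokens),
            default=0.0,
        )
        if best < threshold:
            flagged.append(sentence)
    return flagged


# Hypothetical retrieved evidence and generated answer.
passages = [
    "The transfer of 40000 pounds was made on 3 March from the company account."
]
answer = "The transfer was made on 3 March. The suspect fled to Spain."

print(ungrounded_sentences(answer, passages))
# ['The suspect fled to Spain.']
```

The second sentence is flagged because almost none of its words appear in the evidence. A check of this kind misses paraphrases and negations, so it should be read as a picture of the concept only.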
Integration
The RAG domain connects with language model operations, evidence management, case management, analytical tools, and search infrastructure.
Open Standards
- GraphQL (June 2018 specification): all RAG operations (document ingestion, hybrid search, citation retrieval, and feedback submission) are exposed as typed GraphQL queries and mutations, with structured error extensions and a JSON scalar for arbitrary metadata.
- JSON / NDJSON (RFC 8259 / IETF): document chunk metadata, query filters, and API responses are encoded as JSON; vector upsert payloads to the Cloudflare Vectorize REST interface use Newline-Delimited JSON (NDJSON) with the application/x-ndjson content type.
- OAuth 2.0 Bearer Token (RFC 6750): all outbound calls to the Cloudflare Vectorize and Workers AI APIs present credentials as Authorization: Bearer tokens, conforming to the OAuth 2.0 token usage specification.
- ISO/IEC 9075 SQL, Full-Text Search: the keyword retrieval leg is expressed in SQL and uses PostgreSQL's to_tsvector / to_tsquery / ts_rank functions. These are PostgreSQL-specific full-text search extensions rather than part of the SQL standard itself; ts_rank scores matches by term frequency and supplies the lexical (BM25-style) component of the hybrid search.
- UUID (RFC 4122): every document chunk, Vectorize index record, and feedback entry is assigned a version-4 universally unique identifier, keeping key spaces effectively collision-free across tenants.
- ISO 8601 / RFC 3339 datetime: ingestion timestamps, cache expiry values, and answer feedback creation times are stored and serialised using ISO 8601 extended format (Python .isoformat()), ensuring unambiguous datetime interchange.
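Several of the standards above meet in a vector upsert payload, so a short standard-library sketch may help. It builds NDJSON (one JSON object per line) whose records carry a version-4 UUID and an ISO 8601 timestamp, alongside the headers such a request would use. The record field names (`id`, `values`, `metadata`, `case_id`, `ingested_at`), the helper names, and the token placeholder are assumptions for illustration; no request is sent and no endpoint URL is implied.

```python
import json
import uuid
from datetime import datetime, timezone


def make_record(values, case_id):
    """Build one vector record (field names are illustrative assumptions)."""
    return {
        "id": str(uuid.uuid4()),  # version-4 UUID (RFC 4122)
        "values": values,
        "metadata": {
            "case_id": case_id,
            # Timezone-aware UTC timestamp in ISO 8601 extended format.
            "ingested_at": datetime.now(timezone.utc).isoformat(),
        },
    }


def to_ndjson(records):
    """Serialise records as NDJSON: one compact JSON object per line."""
    return "\n".join(json.dumps(r, separators=(",", ":")) for r in records)


records = [
    make_record([0.1, 0.2, 0.3], "case-001"),
    make_record([0.4, 0.5, 0.6], "case-001"),
]
body = to_ndjson(records)

# Headers for the upload; the token value is a placeholder.
headers = {
    "Authorization": "Bearer <API_TOKEN>",
    "Content-Type": "application/x-ndjson",
}

print(body)
```

Each line of `body` parses as JSON on its own, which lets a receiver process a large upload record by record without loading the whole payload first.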
Last Reviewed: 2026-02-05 Last Updated: 2026-04-14