System Architecture
The application combines dense vector search, keyword retrieval, rank fusion, reranking, and a grounded explanation layer. The LLM explains retrieved cases; it does not decide what cases exist.
2,288 unique decisions indexed
25.9k retrievable chunks
997 gold eval queries
Overview Flow
Query path (online request path)
1. User query: Plain-language legal facts
2. Next.js UI: Search form and results view
3. POST /api/search: Validates request with Zod
4. Workers AI embedding: bge-base-en-v1.5 query vector
5. Hybrid retrieval: Vectorize semantic + D1 FTS5 keyword
6. RRF fusion: Rank-level score fusion
7. Case aggregation: Chunk hits collapse to cases
8. Workers AI reranker: Cross-encoder relevance scores
9. Grounded explanation: Gemini or Workers AI over retrieved cases
10. JSON response: Citations, excerpts, scores
Ingestion path (offline corpus path)
1. Local corpus parquet: Downloaded judgments and metadata
2. sample/eval-set scripts: Select source cases and gold queries
3. JSONL ingest files: Batch payloads for upload
4. POST /api/internal/ingest: Secret-protected ingestion route
5. D1 storage: cases, chunks, FTS5 keyword index
6. Vectorize index: Chunk vectors with metadata
RAG Design
Workers AI bge-base-en-v1.5 turns the plain-language legal situation into a 768-dimensional query vector.
@cf/baai/bge-base-en-v1.5
Vectorize returns semantic chunk matches while D1 FTS5 returns keyword/BM25 matches over the same chunk table.
Vectorize top-50 + D1 FTS5 top-50
Reciprocal Rank Fusion combines semantic and keyword lists by rank instead of mixing incompatible raw scores.
RRF k=60
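As a rough illustration of the fusion step, the sketch below applies the standard Reciprocal Rank Fusion formula, score(d) = sum of 1 / (k + rank) over every list that contains d, with k = 60 as stated above. The function name `rrfFuse` and the use of plain chunk ID strings are assumptions made for this example, not the application's actual code.

```typescript
// Minimal RRF sketch. Each input list holds chunk IDs ordered best-first
// and is assumed to contain no duplicate IDs.
type RankedList = string[];

interface FusedHit {
  id: string;
  score: number;
}

function rrfFuse(lists: RankedList[], k = 60): FusedHit[] {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    // Highest fused score first; ties broken by ID for a stable order.
    .sort((a, b) => b.score - a.score || a.id.localeCompare(b.id));
}

// Hypothetical semantic and keyword result lists.
const semanticIds = ["c1", "c2", "c3"];
const keywordIds = ["c3", "c1", "c4"];
const fusedHits = rrfFuse([semanticIds, keywordIds]);
// fusedHits order: c1, c3, c2, c4
```

Only ranks enter the formula, so a cosine similarity from the vector index and a BM25 score from the keyword index never need to be normalized against each other.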
The system searches at chunk level for recall, then keeps the best chunk for each case so lawyers see cases, not fragments.
20 case candidates
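The collapse from chunks to cases could look something like the sketch below: keep the highest-scoring chunk per case, then cap the list at 20 candidates. The `ChunkHit` shape and the `collapseToCases` name are hypothetical; the source only states the behaviour, not the data model.

```typescript
// Hypothetical shape of a fused chunk-level hit.
interface ChunkHit {
  chunkId: string;
  caseId: string;
  score: number; // fused score, higher is better
}

// Keep the best chunk for each case, then return at most maxCases cases.
function collapseToCases(hits: ChunkHit[], maxCases = 20): ChunkHit[] {
  const bestByCase = new Map<string, ChunkHit>();
  for (const hit of hits) {
    const current = bestByCase.get(hit.caseId);
    if (current === undefined || hit.score > current.score) {
      bestByCase.set(hit.caseId, hit);
    }
  }
  return [...bestByCase.values()]
    .sort((a, b) => b.score - a.score)
    .slice(0, maxCases);
}
```

The surviving chunk doubles as the passage shown to the user and, in this sketch, as the text a later reranking step would score.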
Workers AI bge-reranker-base scores each candidate passage against the query and filters weak matches.
@cf/baai/bge-reranker-base
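One way to apply reranker output is sketched below. It assumes the reranker returns a list of `{ id, score }` entries where `id` is the index of the passage in the submitted candidate array; that shape, the `applyRerank` name, and the `minScore` threshold are all assumptions for illustration, since the source does not state the cutoff it uses.

```typescript
interface Candidate {
  caseId: string;
  passage: string;
}

// Assumed reranker output entry: id indexes into the candidate array.
interface RerankScore {
  id: number;
  score: number;
}

interface RerankedCandidate extends Candidate {
  rerankScore: number;
}

// Drop candidates scoring below minScore (hypothetical threshold) and
// order the rest by reranker score, best first.
function applyRerank(
  candidates: Candidate[],
  scores: RerankScore[],
  minScore: number
): RerankedCandidate[] {
  const kept: RerankedCandidate[] = [];
  for (const s of scores) {
    const candidate = candidates[s.id];
    if (candidate !== undefined && s.score >= minScore) {
      kept.push({ ...candidate, rerankScore: s.score });
    }
  }
  return kept.sort((a, b) => b.rerankScore - a.rerankScore);
}
```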
Gemini Flash or Workers AI receives the reranked candidate set and produces grounded JSON explanations.
No retrieval in the LLM layer
Vector Search
Each judgment is split into paragraph-aware windows of roughly 512 to 1024 tokens with overlap. Every chunk is embedded with Workers AI and stored in Vectorize with metadata filters for jurisdiction, court, year, and paragraph range.
Embedding model: @cf/baai/bge-base-en-v1.5
Dimensions: 768
First-stage topK: 50 semantic + 50 keyword
Candidate cap: 20 cases before rerank
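The paragraph-aware chunking described in this section might be approximated as follows. This is a simplified sketch under several assumptions: tokens are estimated by whitespace-separated words rather than the embedding model's tokenizer, overlap is measured in whole paragraphs, only the upper bound (1024) is enforced, and a single paragraph longer than the limit becomes its own oversized chunk. The names `Chunk` and `chunkParagraphs` are illustrative.

```typescript
interface Chunk {
  text: string;
  startParagraph: number; // 1-based, inclusive
  endParagraph: number; // 1-based, inclusive
}

// Crude token estimate: whitespace-separated words (assumption).
const countTokens = (s: string): number => s.split(/\s+/).filter(Boolean).length;

function chunkParagraphs(
  paragraphs: string[],
  maxTokens = 1024,
  overlapParagraphs = 1
): Chunk[] {
  const chunks: Chunk[] = [];
  let start = 0;
  while (start < paragraphs.length) {
    let end = start;
    let tokens = countTokens(paragraphs[start]);
    // Extend the window while the next whole paragraph still fits.
    while (
      end + 1 < paragraphs.length &&
      tokens + countTokens(paragraphs[end + 1]) <= maxTokens
    ) {
      end += 1;
      tokens += countTokens(paragraphs[end]);
    }
    chunks.push({
      text: paragraphs.slice(start, end + 1).join("\n\n"),
      startParagraph: start + 1,
      endParagraph: end + 1,
    });
    if (end + 1 >= paragraphs.length) break;
    // Step back by the overlap, but always move forward by at least one paragraph.
    start = Math.max(end + 1 - overlapParagraphs, start + 1);
  }
  return chunks;
}
```

Recording the paragraph range on each chunk is what allows it to be stored as metadata next to the vector.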
Storage
cases: Case-level metadata, including citation, title, court, jurisdiction, year, catchwords, source URL, and a future R2 key.
chunks: Paragraph-aware judgment chunks with passage text and paragraph ranges. This table also powers case detail pages.
FTS5 keyword index: External-content full-text index kept in sync with chunks by triggers. Used for BM25 keyword retrieval.
Vectorize index: Chunk vectors with indexed metadata for jurisdiction, court, year, and paragraph ranges. Used for semantic retrieval.
Grounding
Evaluation
The evaluation uses open-australian-legal-qa questions whose source cases are present in the deployed corpus. The internal eval endpoint skips the explanation LLM and scores retrieval rankings directly.
| Config | R@5 | R@10 | MRR | nDCG |
|---|---|---|---|---|
| Semantic only | 0.930 | 0.950 | 0.878 | 0.895 |
| Hybrid | 0.975 | 0.992 | 0.944 | 0.956 |
| Hybrid + rerank | 0.978 | 0.989 | 0.953 | 0.962 |
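For readers unfamiliar with the column headers, the sketch below shows one common way to compute recall@k, reciprocal rank, and nDCG@k for a single query. It assumes binary relevance (a case is either a gold source case or not) and unique IDs in the ranked list; the source does not specify the exact definitions or the nDCG cutoff used, so treat these as illustrative. MRR is the mean of the reciprocal rank over all queries.

```typescript
// Fraction of gold cases that appear in the top k results.
function recallAtK(ranked: string[], gold: Set<string>, k: number): number {
  if (gold.size === 0) return 0;
  const hits = ranked.slice(0, k).filter((id) => gold.has(id)).length;
  return hits / gold.size;
}

// 1 / rank of the first gold case, or 0 if none is retrieved.
function reciprocalRank(ranked: string[], gold: Set<string>): number {
  const index = ranked.findIndex((id) => gold.has(id));
  return index === -1 ? 0 : 1 / (index + 1);
}

// nDCG@k with binary gains: DCG divided by the DCG of an ideal ranking.
function ndcgAtK(ranked: string[], gold: Set<string>, k: number): number {
  let dcg = 0;
  ranked.slice(0, k).forEach((id, i) => {
    if (gold.has(id)) dcg += 1 / Math.log2(i + 2);
  });
  let idealDcg = 0;
  for (let i = 0; i < Math.min(gold.size, k); i++) {
    idealDcg += 1 / Math.log2(i + 2);
  }
  return idealDcg === 0 ? 0 : dcg / idealDcg;
}

// Mean of a per-query metric across the evaluation set.
function mean(values: number[]): number {
  return values.length === 0 ? 0 : values.reduce((a, b) => a + b, 0) / values.length;
}
```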