Lesson retrieval — chunking experiments

andy/embeddings · Tidepool Labs / research
I'll build a tiny eval: embed both chunkings, run the same 20 queries, score recall@5 against the labeled-relevant lessons. Reuse the existing embeddings client, or stub it?
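For reference, a minimal sketch of how recall@5 could be scored for such an eval. The transcript does not show scripts/chunk-eval.ts, so the `QueryResult` shape and both function names are hypothetical, and this assumes the common definition: the fraction of labeled-relevant lessons that appear among the top k retrieved chunks, averaged over queries.

```typescript
// Hypothetical shape; the real eval script's types are not shown.
interface QueryResult {
  // Lesson IDs of the retrieved chunks, best match first. A lesson may
  // repeat when several of its chunks are retrieved.
  rankedLessonIds: string[];
  // Lesson IDs labeled relevant for this query.
  relevantLessonIds: string[];
}

// Fraction of the relevant lessons found among the top-k retrieved chunks.
function recallAtK(result: QueryResult, k: number): number {
  const relevant = new Set(result.relevantLessonIds);
  if (relevant.size === 0) return 0; // no labels: count the query as a miss
  const topK = new Set(result.rankedLessonIds.slice(0, k));
  let hits = 0;
  for (const id of relevant) {
    if (topK.has(id)) hits++;
  }
  return hits / relevant.size;
}

// Mean recall@k over all queries (20 in the run described above).
function meanRecallAtK(results: QueryResult[], k: number): number {
  if (results.length === 0) return 0;
  const total = results.reduce((sum, r) => sum + recallAtK(r, k), 0);
  return total / results.length;
}
```

Mapping chunks back to lesson IDs before scoring keeps the two strategies comparable even though they produce different numbers of chunks.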
AJ
@andy
Reuse the client, but cache embeddings to disk so reruns are free.
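One way the disk cache could look, as a sketch only. The `EmbedFn` signature is a hypothetical stand-in for the existing embeddings client, and `cachedEmbedder`, the cache directory, and the model label are illustrative names, not taken from the project.

```typescript
import { createHash } from "node:crypto";
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { join } from "node:path";

// Hypothetical signature standing in for the existing embeddings client.
type EmbedFn = (text: string) => Promise<number[]>;

// Wrap an embed function so each (model, text) pair is computed once and
// then read back from a JSON file on later runs.
function cachedEmbedder(embed: EmbedFn, cacheDir: string, model: string): EmbedFn {
  mkdirSync(cacheDir, { recursive: true });
  return async (text: string): Promise<number[]> => {
    // Key on model + text so changing either one misses the cache.
    const key = createHash("sha256")
      .update(model)
      .update("\0")
      .update(text)
      .digest("hex");
    const file = join(cacheDir, `${key}.json`);
    if (existsSync(file)) {
      return JSON.parse(readFileSync(file, "utf8")) as number[];
    }
    const vector = await embed(text);
    writeFileSync(file, JSON.stringify(vector));
    return vector;
  };
}
```

Keying on the chunk text rather than a chunk index means both chunking strategies can share one cache directory, and identical chunks are only embedded once.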
Write scripts/chunk-eval.ts (+96) allowed
Bash tsx scripts/chunk-eval.ts --strategy fixed,semantic allowed
Results: fixed-512 recall@5 = 0.61, semantic = 0.74. Semantic wins mostly on multi-paragraph lessons where the fixed split cut a pattern in half. Cost is ~1.3× more chunks. Want the per-query breakdown?
AJ
@andy
Yes — show me where fixed lost.
Fixed missed 6/20 — all lessons over ~700 tokens where the key 'why' paragraph landed in a different chunk from the title. Semantic kept those together. I'd ship semantic with a 1,000-token cap.
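A rough sketch of what a capped split might look like. The transcript does not say how the semantic split is computed, so this substitutes a simpler stand-in: packing whole paragraphs into chunks up to the cap. The word-count token estimate and both function names are assumptions for illustration.

```typescript
// Rough token estimate: whitespace-separated words. A real tokenizer would
// give different counts; this is only a stand-in.
function estimateTokens(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Pack whole paragraphs into chunks without exceeding maxTokens.
// A single paragraph longer than the cap becomes its own oversized chunk.
function chunkByParagraph(lesson: string, maxTokens = 1000): string[] {
  const paragraphs = lesson
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p !== "");
  const chunks: string[] = [];
  let current: string[] = [];
  let currentTokens = 0;
  for (const paragraph of paragraphs) {
    const tokens = estimateTokens(paragraph);
    // Close the current chunk if adding this paragraph would exceed the cap.
    if (current.length > 0 && currentTokens + tokens > maxTokens) {
      chunks.push(current.join("\n\n"));
      current = [];
      currentTokens = 0;
    }
    current.push(paragraph);
    currentTokens += tokens;
  }
  if (current.length > 0) chunks.push(current.join("\n\n"));
  return chunks;
}
```

Because splits only happen at paragraph boundaries, a lesson's title and its 'why' paragraph stay in the same chunk whenever the two together fit under the cap.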
Session went offline — Relay daemon disconnected. Transcript preserved.
AJ
@andy
Testing chunking for lesson retrieval. Compare fixed 512-token chunks vs a semantic split on recall@5 over our lessons corpus.