Lesson retrieval — chunking experiments

◊

I'll build a tiny eval: embed both chunkings, run the same 20 queries, score recall@5 against the labeled-relevant lessons. Reuse the existing embeddings client, or stub it?
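For reference, recall@5 as described here could be computed roughly as below. This is a minimal sketch, not the contents of scripts/chunk-eval.ts: the names `ChunkHit`, `lessonsFromChunks`, `recallAtK`, and `meanRecallAtK` are hypothetical, and it assumes retrieval returns ranked chunks that each carry the id of their parent lesson.

```typescript
// Hypothetical shape of one retrieved chunk; not taken from the real script.
type ChunkHit = { lessonId: string; score: number };

// Collapse ranked chunk hits into ranked lesson ids, keeping first occurrence.
// Without this, a strategy that emits more chunks per lesson could fill the
// top k with duplicates of the same lesson.
function lessonsFromChunks(hits: ChunkHit[]): string[] {
  const seen = new Set<string>();
  const ordered: string[] = [];
  for (const hit of hits) {
    if (!seen.has(hit.lessonId)) {
      seen.add(hit.lessonId);
      ordered.push(hit.lessonId);
    }
  }
  return ordered;
}

// Fraction of the labeled-relevant lessons that appear in the top k results.
function recallAtK(rankedLessonIds: string[], relevant: Set<string>, k = 5): number {
  if (relevant.size === 0) return 0;
  const topK = new Set(rankedLessonIds.slice(0, k));
  let found = 0;
  for (const id of relevant) {
    if (topK.has(id)) found++;
  }
  return found / relevant.size;
}

// Mean recall@k across all queries in the eval set.
function meanRecallAtK(
  runs: { ranked: string[]; relevant: Set<string> }[],
  k = 5,
): number {
  if (runs.length === 0) return 0;
  const total = runs.reduce((sum, run) => sum + recallAtK(run.ranked, run.relevant, k), 0);
  return total / runs.length;
}
```

Scoring at the lesson level rather than the chunk level keeps the two strategies comparable even though they produce different numbers of chunks.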

@andy

Reuse the client, but cache embeddings to disk so reruns are free.
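One way to make reruns free is a content-addressed file cache wrapped around the client. The sketch below is an assumption about how that could look, not the actual script: `EmbedFn` stands in for the existing embeddings client, whose real signature is not shown in this transcript, and `cachedEmbedder` and the cache directory layout are hypothetical.

```typescript
import { createHash } from "node:crypto";
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { join } from "node:path";

// Hypothetical signature standing in for the existing embeddings client.
type EmbedFn = (text: string) => Promise<number[]>;

// Wrap an embed function so each (model, text) pair is embedded at most once.
// Vectors are stored as JSON files named by a SHA-256 of the model and text.
function cachedEmbedder(embed: EmbedFn, cacheDir: string, model: string): EmbedFn {
  mkdirSync(cacheDir, { recursive: true });
  return async (text: string): Promise<number[]> => {
    const key = createHash("sha256")
      .update(model)
      .update("\0") // separator so model and text cannot run together
      .update(text)
      .digest("hex");
    const file = join(cacheDir, `${key}.json`);
    if (existsSync(file)) {
      // Cache hit: skip the client call entirely.
      return JSON.parse(readFileSync(file, "utf8")) as number[];
    }
    const vector = await embed(text);
    writeFileSync(file, JSON.stringify(vector));
    return vector;
  };
}
```

Including the model name in the key means switching embedding models does not silently reuse stale vectors.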

Write scripts/chunk-eval.ts (+96) allowed

Bash tsx scripts/chunk-eval.ts --strategy fixed,semantic allowed

◊

Results: fixed-512 recall@5 = 0.61, semantic = 0.74. Semantic wins mostly on multi-paragraph lessons where the fixed split cut a pattern in half. Cost is ~1.3× more chunks. Want the per-query breakdown?

@andy

Yes — show me where fixed lost.

◊

Fixed missed 6/20 — all lessons over ~700 tokens where the key 'why' paragraph landed in a different chunk from the title. Semantic kept those together. I'd ship semantic with a 1,000-token cap.
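The transcript does not show how the semantic split works, so the following is only an illustration of a capped, boundary-respecting split. It assumes paragraph breaks as the split points and a whitespace word count in place of the real tokenizer; `countTokens` and `chunkByParagraph` are hypothetical names.

```typescript
// Assumption: whitespace-separated words approximate tokens. A real run would
// use the tokenizer that matches the embedding model.
const countTokens = (text: string): number =>
  text.split(/\s+/).filter(Boolean).length;

// Pack whole paragraphs into chunks without exceeding maxTokens per chunk.
// A lesson under the cap stays in one chunk, so its title and its 'why'
// paragraph are never separated.
function chunkByParagraph(lesson: string, maxTokens = 1000): string[] {
  const paragraphs = lesson
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter(Boolean);

  const chunks: string[] = [];
  let current: string[] = [];
  let currentTokens = 0;

  for (const paragraph of paragraphs) {
    const n = countTokens(paragraph);
    // Start a new chunk only at a paragraph boundary.
    if (current.length > 0 && currentTokens + n > maxTokens) {
      chunks.push(current.join("\n\n"));
      current = [];
      currentTokens = 0;
    }
    // Note: a single paragraph longer than maxTokens is kept whole here and
    // becomes its own oversized chunk; this sketch does not split it further.
    current.push(paragraph);
    currentTokens += n;
  }
  if (current.length > 0) chunks.push(current.join("\n\n"));
  return chunks;
}
```

With the default cap, a lesson of roughly 700 tokens comes back as a single chunk, which is the case where the fixed 512-token split lost.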

Session went offline — Relay daemon disconnected. Transcript preserved.

@andy

Testing chunking for lesson retrieval. Compare fixed 512-token chunks vs a semantic split on recall@5 over our lessons corpus.
