
176 points by marv1nnnnn | 1 comment
iandanforth ◴[] No.43995844[source]
I applaud this effort; however, the "Does it work?" section answers the wrong question. Anyone can write a trivial doc compressor and show a graph saying "The compressed version is smaller!"

For this to "work," you need a metric showing that AIs perform as well, or nearly as well, with the compressed documentation as with the uncompressed documentation across a wide range of tasks.

marv1nnnnn ◴[] No.43996061[source]
I totally agree with your critique. To be honest, it's hard even for me to evaluate. What I did was select several packages that current LLMs fail to handle (they're in the sample folder: `crawl4ai`, `google-genai`, and `svelte`) and try some tricky prompts to see if it works. But even that evaluation is hard, since the LLM can hallucinate. I would say it works most of the time, but there are always a few runs that fail to deliver. I actually prepared a comparison: Cursor vs. Cursor + internet vs. Cursor + context7 vs. Cursor + llm-min.txt. But the results were stochastic, so I didn't put it here. I'll consider adding it to the repo as well.
1. ricardobeat ◴[] No.43996846[source]
> But even that evaluation is hard, since the LLM can hallucinate. I would say it works most of the time, but there are always a few runs that fail to deliver

You can use success rate % over N runs for a set of problems, which is something you can compare to other systems. A separate model does the evaluation. There are existing frameworks like DeepEval that facilitate this.
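A minimal sketch of that setup, assuming a hypothetical `run_agent` hook for the system under test (e.g. Cursor with or without llm-min.txt) and a separate `judge` call backed by an evaluator model; both names are placeholders rather than an existing API, and frameworks like DeepEval package this judge-based loop with ready-made metrics:

```python
import statistics
from dataclasses import dataclass

@dataclass
class Task:
    prompt: str      # the tricky prompt given to the agent
    reference: str   # ground-truth notes the judge scores against

# Placeholder hooks: wire these to the agent under test and to a
# separate evaluator model (LLM-as-judge). Names are illustrative only.
def run_agent(prompt: str, context: str) -> str:
    raise NotImplementedError

def judge(prompt: str, answer: str, reference: str) -> bool:
    raise NotImplementedError

def success_rate(tasks: list[Task], context: str, n_runs: int = 5) -> float:
    """Mean pass rate per task over n_runs, so stochastic failures average out."""
    per_task = []
    for task in tasks:
        passes = sum(
            judge(task.prompt, run_agent(task.prompt, context), task.reference)
            for _ in range(n_runs)
        )
        per_task.append(passes / n_runs)
    return statistics.mean(per_task)

# Compare configurations on the same task set, e.g.:
# for name, ctx in {"full docs": full_docs, "llm-min.txt": minified}.items():
#     print(f"{name}: {success_rate(tasks, ctx):.0%}")
```

Running every configuration (full docs, llm-min.txt, context7, etc.) over the same task set gives percentages that are directly comparable, which also addresses the stochasticity concern upthread.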