634 points david927 | 1 comments

What are you working on? Any new ideas that you're thinking about?
1. iamwil No.41343571
I just published the first issue of our digital zine, Forest Friends. The first issue is on "LLM System Evals in the Wild".

Lots of AI engineers are doing vibes-based engineering, just eyeballing the LLM output and saying "LGTM!". This is a good place to start, as we all should look at our data more. But it's best to move on from vibes to system evals.

The first issue covers how to design and build system evals, a systematic way to gauge how well your LLM app is doing. That way, whether you're facing new models, new users, or new queries, you can be sure you're continuously improving rather than letting regressions slip in.
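
To make the idea concrete, here is a minimal sketch of what a system eval could look like in Python. It is not taken from the zine. Everything in it is an assumption for illustration: the `EvalCase` type, the substring-based pass check, the `is_regression` helper, and `fake_app`, which stands in for a real LLM call so the example runs without any API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    query: str         # input sent to the app
    must_contain: str  # hypothetical pass criterion: text the answer should include


def run_evals(app: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Return the fraction of cases whose output contains the expected text."""
    if not cases:
        raise ValueError("need at least one eval case")
    passed = sum(
        case.must_contain.lower() in app(case.query).lower() for case in cases
    )
    return passed / len(cases)


def is_regression(new_score: float, baseline: float, tolerance: float = 0.0) -> bool:
    """Flag a run whose score falls below the baseline by more than the tolerance."""
    return new_score < baseline - tolerance


def fake_app(query: str) -> str:
    """Stand-in for a real LLM app (assumption: canned answers only)."""
    canned = {
        "What is the capital of France?": "The capital of France is Paris.",
        "What is 2 + 2?": "2 + 2 equals 4.",
    }
    return canned.get(query, "I don't know.")


CASES = [
    EvalCase("What is the capital of France?", "Paris"),
    EvalCase("What is 2 + 2?", "4"),
    EvalCase("Who wrote Hamlet?", "Shakespeare"),
]

score = run_evals(fake_app, CASES)
print(f"pass rate: {score:.2f}")
```

In practice you would swap `fake_app` for your real pipeline, store the score from the last known-good run as the baseline, and rerun the same cases whenever the model, prompt, or retrieval changes. A substring check is about the crudest grader possible; it's here only to keep the sketch short.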

You can buy the first issue here:

https://issue1.forestfriends.tech/

And if you want to keep abreast of the next issue, you can subscribe here:

https://forestfriends.tech