
169 points by constantinum | 5 comments
1. alhaad No.40714792
Are there fine-tuned models that perform better for structured / parsable outputs?
replies(2): >>40714873, >>40714875
2. _flux No.40714873
This isn't an answer to that question, but llama.cpp has a feature that constrains output to a provided grammar; see e.g. https://github.com/ggerganov/llama.cpp/blob/master/grammars/...

Others should really implement that as well. You still need to guide the model toward e.g. JSON to get good results, but whatever it emits is 100% guaranteed to be valid per the grammar.
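
For illustration, here's a minimal sketch of what this looks like through the llama-cpp-python bindings. The model path, prompt, and tiny GBNF grammar are made-up examples (only the json.gbnf file in the grammars/ directory is real), so treat this as a sketch rather than anything canonical:

    # Sketch: grammar-constrained generation via llama-cpp-python.
    # The model path is a placeholder; point it at any local GGUF file.
    from llama_cpp import Llama, LlamaGrammar

    # A tiny GBNF grammar admitting only a flat JSON object of string
    # keys and string values (far simpler than the full json.gbnf).
    GRAMMAR = r'''
    root   ::= "{" ws pair (ws "," ws pair)* ws "}"
    pair   ::= string ws ":" ws string
    string ::= "\"" [a-zA-Z0-9 _]* "\""
    ws     ::= [ \t\n]*
    '''

    llm = Llama(model_path="./model.gguf")      # placeholder path
    grammar = LlamaGrammar.from_string(GRAMMAR)

    out = llm(
        "Return the user's name and city as JSON: Alice, from Oslo.",
        grammar=grammar,  # sampling is masked so the parse stays valid
        max_tokens=128,
    )
    print(out["choices"][0]["text"])  # always matches the grammar

The key point is that the constraint is enforced at sampling time: tokens that would break the grammar are never emitted, which is why validity is guaranteed even when the model itself is small or poorly prompted.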

replies(1): >>40715014
3. cpursley No.40714875
Fireworks.ai's Firefunction is pretty good. Not GPT-level, but it's an open model.
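
Since Fireworks exposes an OpenAI-compatible API, calling it looks roughly like the sketch below. The model id and the get_weather tool are assumptions from memory, so check the current Fireworks docs before relying on them:

    # Sketch: function calling via Fireworks' OpenAI-compatible API.
    # The base_url, model id, and tool schema are assumptions; verify
    # them against the current Fireworks docs.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://api.fireworks.ai/inference/v1",
        api_key="YOUR_FIREWORKS_KEY",  # placeholder
    )

    tools = [{
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }]

    resp = client.chat.completions.create(
        model="accounts/fireworks/models/firefunction-v1",  # assumed id
        messages=[{"role": "user", "content": "Weather in Oslo?"}],
        tools=tools,
    )
    print(resp.choices[0].message.tool_calls)
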
4. alhaad No.40715014
Agreed that others should implement it as well, but coercing llama into producing output that matches the grammar still takes work.
replies(1): >>40715063
5. _flux No.40715063
What kind of work? I only gave it a short try before moving to Ollama, which doesn't have it, but it seemed to work there. (With Ollama I need to use a retry system instead.)

edit: I researched a bit, and apparently it can reduce inference performance, plus the streaming mode fails to report an invalid grammar. Overall these don't seem like deal-breakers.
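
For what it's worth, the retry system amounts to something like the sketch below. The model name is a placeholder, and the loop simply re-asks until the reply parses as JSON, since Ollama can't enforce a grammar at sampling time:

    # Sketch: retry-until-parseable loop against Ollama's /api/generate.
    # Ollama has no grammar constraint, so we validate after the fact.
    import json
    import requests

    def generate_json(prompt, model="llama3", retries=5):
        for _ in range(retries):
            resp = requests.post(
                "http://localhost:11434/api/generate",
                json={
                    "model": model,    # placeholder model name
                    "prompt": prompt,
                    "format": "json",  # nudges Ollama toward JSON output
                    "stream": False,
                },
                timeout=120,
            )
            resp.raise_for_status()
            try:
                return json.loads(resp.json()["response"])  # valid: done
            except json.JSONDecodeError:
                continue  # invalid JSON: ask again
        raise RuntimeError(f"no valid JSON after {retries} attempts")

    print(generate_json("Return the user's name and city as JSON: Alice, Oslo."))

Note the contrast with the llama.cpp approach above: here validity is checked after generation, so you pay for wasted completions on failure instead of a per-token sampling cost.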