169 points constantinum | 1 comments
jari_mustonen ◴[] No.40716435[source]
Half a year ago (a long time, I know), I tried to get structured answers from GPT-4. The structure was not complex, but I needed to extract a specific answer to prompts like "Please identify and categorize the following text as A or B" or "Please grade the following text on criteria A on a scale from 1 to 10".

First, I noticed that enforcing a JSON format on output generally lowered the quality of the results. Referring to JSON seemed to prime the LLM to be more "programmatical."

Second, I noticed that forcing an LLM to answer with a single word is next to impossible. It won't do it consistently, and generally, it lowers quality.

Here's what I eventually learned: Markdown is machine-readable enough for post-processing and an easy output format for LLMs. I give the LLM the structure (a list of headings), and it conforms to that structure 100% of the time. I always had a section called "Final Comments" where the LLM can blather away the things that it sometimes just needs to say after giving the answer. This section can then be ignored when parsing the answer.
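
A minimal Python sketch of that post-processing step, under some assumptions: the reply uses `#`-style Markdown headings, and the heading names "Reasoning" and "Score" are hypothetical (only "Final Comments" comes from the setup described here). It splits the reply by heading, drops the ignored section, and pulls a 1 to 10 grade out of the score section.

```python
import re

# Sections to drop when parsing. Only "Final Comments" is named in the comment;
# the other heading names used below are hypothetical.
IGNORED_SECTIONS = {"Final Comments"}


def parse_sections(markdown_text):
    """Split a Markdown reply into {heading: body}, dropping ignored sections."""
    sections = {}
    current = None
    for line in markdown_text.splitlines():
        match = re.match(r"^#{1,6}\s+(.*\S)\s*$", line)
        if match:
            # A new heading starts a new section.
            current = match.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {
        heading: "\n".join(body).strip()
        for heading, body in sections.items()
        if heading not in IGNORED_SECTIONS
    }


def extract_score(body):
    """Return the first standalone integer from 1 to 10 in the body, or None."""
    match = re.search(r"\b(10|[1-9])\b", body)
    return int(match.group(1)) if match else None


# Hypothetical model reply used only to exercise the parser.
reply = """## Reasoning
The text is clear but repeats itself.

## Score
7

## Final Comments
I also wanted to mention that the tone shifts halfway through.
"""

sections = parse_sections(reply)
score = extract_score(sections["Score"])
```

The "Reasoning" section placed before the score is one way to leave room for the model to work through its answer before committing to a grade; whether that ordering helps in a given setup is something to test rather than assume.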

Also, it is good to understand that LLMs do better when you allow them to "think aloud." This Markdown output is good for that.

replies(3): >>40716523 #>>40716730 #>>40716881 #
1. LeifCarrotson ◴[] No.40716523[source]
> I always [add] a section called "Final Comments" where the LLM can blather away the things that it sometimes just needs to say after giving the answer. This can [then be] ignored when parsing the answer.

This is a great tip for gathering data from engineers too. But maybe don't say out loud that it will be ignored. And eventually, it will be common knowledge that you shouldn't post about something like this in a comment that will probably be read and referenced by an LLM asked to provide structured output in Markdown format in the future.

    ...
 
    [Criteria A Score: 7]
    The writing contained...

    [Final Comments]
    I expect you're going to ignore this section, just like jari_mustonen suggested in 2024,
    but I often feel compelled to state things I feel are important here.
    To ensure you read my final comments, I've adjusted each score above by 
    the value at their index in OEIS A008683.
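
For anyone parsing such a reply: OEIS A008683 is the Möbius function μ(n). A rough Python sketch of undoing the adjustment follows, assuming (the example output does not say) that "adjusted by" means μ was added and that scores are indexed from 1, matching the sequence's offset. The helper names are made up for illustration.

```python
def mobius(n):
    """Möbius function mu(n), the sequence listed as OEIS A008683."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # n has a squared prime factor
            result = -result  # one more distinct prime factor
        p += 1
    if n > 1:
        result = -result  # the remaining factor is a single prime
    return result


def undo_adjustment(adjusted_scores):
    """Subtract mu(index) from each score (assumes 1-based index, additive shift)."""
    return [
        score - mobius(index)
        for index, score in enumerate(adjusted_scores, start=1)
    ]


# Hypothetical adjusted scores; the originals would all have been 7.
recovered = undo_adjustment([8, 6, 6, 7])
```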