    397 points Anon84 | 20 comments
    barrell ◴[] No.45126116[source]
    I recently upgraded a large portion of my pipeline from gpt-4.1-mini to gpt-5-mini. The performance was horrible - after some research I decided to move everything to mistral-medium-0525.

    Same price, but dramatically better results, way more reliable, and 10x faster. The only downside is that when it does fail, it seems to fail much harder. Where gpt-5-mini would disregard the formatting in the prompt 70% of the time, mistral-medium follows it 99% of the time, but the other 1% of the time it inserts random characters (for whatever reason, normally backticks... which then cause their own formatting issues).

    Still, very happy with Mistral so far!

    replies(11): >>45126199 #>>45126266 #>>45126479 #>>45126528 #>>45126707 #>>45126741 #>>45126840 #>>45127790 #>>45129028 #>>45130298 #>>45136002 #
    1. mark_l_watson ◴[] No.45126266[source]
    It is such a common pattern for LLMs to surround generated JSON with ```json … ``` that I check for this at the application level and fix it. Ten years ago I would do the same sort of sanity checks on formatting when I used LSTMs to generate synthetic data.
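
    A minimal sketch of that application-level fix in Python (assuming the fence, when it appears, wraps the whole response):

    ```python
    import json
    import re

    def parse_llm_json(text: str):
        """Strip a leading code fence (with or without a "json" tag) and a trailing fence, then parse."""
        cleaned = text.strip()
        cleaned = re.sub(r"^```(?:json)?\s*", "", cleaned)   # drop an opening fence
        cleaned = re.sub(r"\s*```$", "", cleaned)            # drop a closing fence
        return json.loads(cleaned)

    print(parse_llm_json('```json\n{"ok": true}\n```'))  # {'ok': True}
    ```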
    replies(9): >>45126463 #>>45126482 #>>45126489 #>>45126578 #>>45127374 #>>45127884 #>>45127900 #>>45128015 #>>45128042 #
    2. Alifatisk ◴[] No.45126463[source]
    I think this is the first time I stumbled upon someone who actually mentions LSTMs in a practical way instead of just theory. Cool!

    Would you like to elaborate further on how the experience was with it? What was your approach for using it? How did you generate synthetic data? How did it perform?

    replies(1): >>45127590 #
    3. barrell ◴[] No.45126482[source]
    Yeah, that’s infuriating. They’re getting better now with structured data, but it’s going to be a never-ending battle getting reliable data structures from an LLM.

    This is maybe more, maybe less insidious. It will literally just insert a random character into the middle of a word.

    I work with an app that supports 120+ languages though. I give the LLM translations, transliterations, grammar features, etc. and ask it to explain them in plain English. So it’s constantly switching between multiple real languages, and sometimes fake ones (transliterations). I don’t think most users would experience this.

    4. Alifatisk ◴[] No.45126489[source]
    I do use backticks a lot when sharing examples in different formats with LLMs, and I have instructed them to do likewise; I also upvote whenever they respond in that manner.

    I got this format from writing markdown files, it’s a nice way to share examples and also specify which format it is.

    5. viridian ◴[] No.45126578[source]
    I'm sure the reason is the plethora of markdown data it was trained on. I personally use ``` stuff.txt ``` extremely frequently, in a variety of places.

    In Slack/Teams I do it with anything someone might copy and paste, to ensure that the chat client doesn't do something horrendous like replace my ASCII double quotes with the fancy Unicode ones that cause syntax errors.

    In readme files any example path, code, yaml, or json is wrapped in code quotes.

    In my personal (text file) notes I also use ``` {} ``` to denote a code block I'd like to remember, just out of habit from the other two above.

    replies(1): >>45127290 #
    6. accrual ◴[] No.45127290[source]
    Same. For me it's almost like a symbiotic thing. After using LLMs for a couple of years I noticed I use code blocks/backticks a lot more often. It's helpful for me as an inline signal like "this is a function name or hostname or special keyword", but it's also helpful for other people in Teams/Slack and for LLMs alike.
    replies(1): >>45128349 #
    7. mejutoco ◴[] No.45127374[source]
    Funny, I do the same. Additionally, one can define a JSON schema for the output and try to load the response as JSON, retrying a number of times. If it is not valid JSON or the schema is not followed, we discard it and retry.

    It also helps to have a field in the JSON for confidence, or a similar pattern, to act as a cutoff for which responses are accepted.
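
    A sketch of that validate-and-retry loop, assuming a hypothetical call_llm(prompt) helper and the jsonschema package (field names and thresholds are made up):

    ```python
    import json

    from jsonschema import ValidationError, validate  # pip install jsonschema

    SCHEMA = {
        "type": "object",
        "properties": {
            "answer": {"type": "string"},
            "confidence": {"type": "number", "minimum": 0, "maximum": 1},
        },
        "required": ["answer", "confidence"],
    }

    def ask_with_retries(call_llm, prompt, max_tries=3, min_confidence=0.7):
        """call_llm is a stand-in for whatever returns the raw model text."""
        for _ in range(max_tries):
            raw = call_llm(prompt)
            try:
                data = json.loads(raw)
                validate(instance=data, schema=SCHEMA)
            except (json.JSONDecodeError, ValidationError):
                continue  # not valid JSON or schema mismatch: discard and retry
            if data["confidence"] >= min_confidence:
                return data  # accept only sufficiently confident responses
        return None  # caller decides what to do after exhausting retries
    ```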

    8. p1esk ◴[] No.45127590[source]
    10 years ago I used LSTMs for music generation. Worked pretty well for short MIDI snippets (30-60 seconds).
    9. fumeux_fume ◴[] No.45127884[source]
    Very common struggle, but a great way to prevent that is prefilling the assistant response with "{" or with as much of the JSON output as you know ahead of time, like '{"response": ['.
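
    Roughly what that prefill looks like with an Anthropic-style messages API (the model name, prompt, and response access here are assumptions; other providers expose the same idea differently):

    ```python
    import anthropic  # pip install anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    prefill = '{"response": ['
    resp = client.messages.create(
        model="claude-sonnet-4-20250514",  # assumed model name; use whatever you run
        max_tokens=512,
        messages=[
            {"role": "user", "content": "Return the items as JSON with a 'response' array."},
            # Prefilling the assistant turn makes the model continue from here,
            # so it can't open a code fence or add preamble first.
            {"role": "assistant", "content": prefill},
        ],
    )
    full_json = prefill + resp.content[0].text  # stitch the prefill back onto the output
    ```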
    replies(2): >>45128284 #>>45128591 #
    10. freehorse ◴[] No.45127900[source]
    I had similar issues with local models, ended up actually requesting the backticks because it was easier that way, and parsed the output accordingly. I cached a prompt with explicit examples of how to structure the data, and reused this over and over. I have found that without examples in the prompt some LLMs are very unreliable, but with caching some example prompts this becomes a non-issue.
    11. tosh ◴[] No.45128015[source]
    I think most mainstream APIs by now have a way for you to conform the generated answer to a schema.
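
    For example, the OpenAI Python SDK accepts a JSON-schema response format on chat completions (parameter shapes vary by provider and SDK version; the model name is just a placeholder):

    ```python
    from openai import OpenAI  # pip install openai

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    schema = {
        "type": "object",
        "properties": {"label": {"type": "string"}, "score": {"type": "number"}},
        "required": ["label", "score"],
        "additionalProperties": False,
    }

    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; any model with structured-output support
        messages=[{"role": "user", "content": "Classify: 'great product, slow shipping'"}],
        response_format={
            "type": "json_schema",
            "json_schema": {"name": "classification", "schema": schema, "strict": True},
        },
    )
    print(resp.choices[0].message.content)  # parses against the schema
    ```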
    12. mpartel ◴[] No.45128042[source]
    Some LLM APIs let you give a schema or regex for the answer. I think it works because LLMs give a probability for every possible next token, and you can filter that list by what the schema/regex allows next.
    replies(1): >>45128098 #
    13. hansvm ◴[] No.45128098[source]
    Interestingly, that gives a different response distribution from simply regenerating while the output doesn't match the schema.
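
    A toy calculation (made-up numbers) of why the two differ: say the first token is A with probability 0.9 or B with probability 0.1, continuations after A are schema-valid 10% of the time, and continuations after B are always valid.

    ```python
    # Rejection sampling: generate freely, throw away outputs that fail the schema,
    # which renormalizes over whole sequences.
    p_A, p_B = 0.9, 0.1
    p_valid_given_A, p_valid_given_B = 0.1, 1.0

    p_A_rejection = (p_A * p_valid_given_A) / (p_A * p_valid_given_A + p_B * p_valid_given_B)

    # Token-level masking: at step 1 neither A nor B is masked (both can still lead
    # to a valid output), so A keeps its original probability; invalid continuations
    # are only pruned at step 2.
    p_A_masked = p_A

    print(f"P(starts with A) under rejection sampling: {p_A_rejection:.2f}")  # ~0.47
    print(f"P(starts with A) under token masking:      {p_A_masked:.2f}")     # 0.90
    ```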
    replies(2): >>45130238 #>>45131168 #
    14. psadri ◴[] No.45128284[source]
    Haven’t tried this. Does it mix well with tool calls? Or does it force a response where you might have expected a tool call?
    replies(1): >>45129068 #
    15. OJFord ◴[] No.45128349{3}[source]
    I'm the opposite, always been pretty good about doing that in Slack etc. (or even here where it doesn't affect the rendering) but I sometimes don't bother in LLM chat.
    16. XenophileJKO ◴[] No.45128591[source]
    Just to be clear for anyone reading this, the optimal way to do this is schema-enforced inference. You can only get a parsable response. There are failure modes, but you don't have to mess with parsing at all.
    17. fumeux_fume ◴[] No.45129068{3}[source]
    It'll force a response that begins with an open bracket. So if you might need a response with a tool call that doesn't start with "{", then it might not fit your workflow.
    18. joshred ◴[] No.45130238{3}[source]
    It sounds like they are describing a regex filter being applied to the model's beam search. LLMs generate the most probable words, but they are frequently tracking several candidate phrases at a time and revising their combined probability. That lets them self-correct if a high-probability word leads to a low-probability phrase.

    I think they are saying that if the highest-probability phrase fails the regex, the LLM is able to substitute the next most likely candidate.

    replies(1): >>45132697 #
    19. Rudybega ◴[] No.45131168{3}[source]
    This is true, but there are methods that greatly reduce this effect and generate results that match or even improve overall output accuracy:

    e.g. DOMINO https://arxiv.org/html/2403.06988v1

    20. stavros ◴[] No.45132697{4}[source]
    You're actually applying a grammar to the token. If you're outputting, for example, JSON, you know what characters are valid next (because of the grammar), so you just filter out the tokens that don't fit the grammar.
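
    A toy illustration of that filtering step (made-up vocabulary and probabilities; real implementations track a grammar state machine rather than a startswith check):

    ```python
    # The model's candidate next tokens with their probabilities (invented numbers).
    next_token_probs = {'Sure': 0.60, '```': 0.25, '{': 0.10, '{"': 0.05}

    def legal_json_start(token: str) -> bool:
        # Stand-in for a real grammar check: at the start of a JSON object,
        # only tokens beginning with '{' are allowed.
        return token.startswith('{')

    allowed = {t: p for t, p in next_token_probs.items() if legal_json_start(t)}
    total = sum(allowed.values())
    renormalized = {t: p / total for t, p in allowed.items()}

    print(renormalized)  # {'{': 0.666..., '{"': 0.333...}
    ```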