DeepSeek-v3.1

(api-docs.deepseek.com)
776 points | wertyk | 4 comments
seunosewa ◴[] No.44977398[source]
It's a hybrid reasoning model. It's good with tool calls and doesn't overthink everything, but it regularly falls back to outdated tool-call formats at random instead of the standard JSON format. I guess the V3 training set has a lot of those.
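For what it's worth, the "standard JSON format" in question is presumably the OpenAI-style tool_calls structure that OpenAI-compatible chat endpoints return. A minimal sketch of that flow; the execute_shell tool and its schema are just an illustration, not anything from DeepSeek's docs:

```python
# Minimal sketch of the standard OpenAI-style tool-call flow against an
# OpenAI-compatible endpoint. The execute_shell tool and its schema are
# illustrative assumptions, not taken from DeepSeek's documentation.
from openai import OpenAI

client = OpenAI(base_url="https://api.deepseek.com", api_key="...")

tools = [{
    "type": "function",
    "function": {
        "name": "execute_shell",
        "description": "Run a shell command and return its output.",
        "parameters": {
            "type": "object",
            "properties": {"command": {"type": "string"}},
            "required": ["command"],
        },
    },
}]

resp = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "List the files in the repo."}],
    tools=tools,
)

# A well-behaved reply carries structured tool calls rather than ad-hoc
# markup embedded in the message body:
for call in resp.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)  # arguments is a JSON string
```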
replies(2): >>44977985 #>>44978742 #
ivape ◴[] No.44977985[source]
What formats? I thought a JSON schema is exactly what lets these LLMs enforce structured outputs at the decoder level. I guess you could do it with any format, but why stray from JSON?
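The decoder-level trick isn't specific to JSON: any format you can express as a grammar can be enforced by masking out tokens that would violate it before sampling. A toy sketch of the idea; the tiny vocabulary and "grammar" here are made up purely for illustration:

```python
# Toy sketch of grammar-constrained decoding: at each step, logits for
# tokens that would break the format are masked out before sampling.
# The vocabulary and allowed_next() rules are illustrative, not DeepSeek's
# actual tokenizer or grammar machinery.
import math, random

vocab = ['{', '}', '"cmd"', ':', '"ls"', '<tool_call>']

def allowed_next(prefix):
    """A trivial 'grammar': a JSON-ish object, never the XML-ish tag."""
    if not prefix:
        return {'{'}
    if prefix[-1] == '{':
        return {'"cmd"'}
    if prefix[-1] == '"cmd"':
        return {':'}
    if prefix[-1] == ':':
        return {'"ls"'}
    return {'}'}

def sample(logits, prefix):
    mask = allowed_next(prefix)
    # Disallowed tokens get -inf, so they can never be sampled.
    masked = [l if tok in mask else -math.inf for tok, l in zip(vocab, logits)]
    probs = [math.exp(l) for l in masked]
    total = sum(probs)
    return random.choices(vocab, [p / total for p in probs])[0]

prefix = []
for _ in range(5):
    fake_logits = [random.random() for _ in vocab]  # stand-in for the model
    prefix.append(sample(fake_logits, prefix))
print(''.join(prefix))  # e.g. {"cmd":"ls"}
```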
replies(2): >>44978153 #>>44979158 #
1. seunosewa ◴[] No.44978153[source]
Sometimes it will randomly generate something like this in the body of the text:

```
<tool_call>executeshell
<arg_key>command</arg_key>
<arg_value>echo "" >> novels/AI_Voodoo_Romance/chapter-1-a-new-dawn.txt</arg_value>
</tool_call>
```

or this:

```
<|toolcallsbegin|><|toolcallbegin|>executeshell<|toolsep|>{"command": "pwd && ls -la"}<|toolcallend|><|toolcallsend|>
```

Prompting it to use the right format doesn't seem to work. Claude, Gemini, GPT-5, and GLM 4.5 don't do that. To accommodate DeepSeek, the tiny agent that I'm building will have to support all of these weird formats, roughly as sketched below.
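A hedged sketch of the kind of fallback parsing such an agent might do, normalizing the two stray formats above into a (tool_name, args) pair; the regexes only cover the exact shapes quoted in this comment:

```python
# Fallback parsing sketch: recover a (tool_name, args_dict) pair from the
# two stray formats quoted above. Patterns are tailored to those examples
# only, not a general DeepSeek output parser.
import json
import re

XMLISH = re.compile(
    r"<tool_call>(\w+)\s*<arg_key>(\w+)</arg_key>\s*<arg_value>(.*?)</arg_value>\s*</tool_call>",
    re.S,
)
TOKENISH = re.compile(
    r"<\|toolcallbegin\|>(\w+)<\|toolsep\|>(\{.*?\})<\|toolcallend\|>",
    re.S,
)

def parse_stray_tool_call(text: str):
    m = XMLISH.search(text)
    if m:
        name, key, value = m.groups()
        return name, {key: value}
    m = TOKENISH.search(text)
    if m:
        name, args = m.groups()
        return name, json.loads(args)
    return None

print(parse_stray_tool_call(
    '<|toolcallsbegin|><|toolcallbegin|>executeshell<|toolsep|>'
    '{"command": "pwd && ls -la"}<|toolcallend|><|toolcallsend|>'
))  # ('executeshell', {'command': 'pwd && ls -la'})
```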

replies(3): >>44978260 #>>44983225 #>>44983486 #
2. ◴[] No.44978260[source]
3. irthomasthomas ◴[] No.44983225[source]
Can't you use logit bias to help with this? Might depend on how they're tokenized.
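A rough sketch of the logit-bias idea against an OpenAI-compatible API. The token IDs below are hypothetical placeholders; whether this helps depends on whether the endpoint honors logit_bias at all and on how the stray markup is tokenized (a multi-token sequence can't be suppressed with a single bias entry):

```python
# Rough sketch: bias down the token(s) that start the stray markup.
# The IDs here are made-up placeholders; in practice you would look them
# up with the model's tokenizer, and the endpoint must support logit_bias.
from openai import OpenAI

client = OpenAI(base_url="https://api.deepseek.com", api_key="...")

discouraged = {"12345": -100, "67890": -100}  # hypothetical token IDs

resp = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "List the files in the repo."}],
    logit_bias=discouraged,
)
print(resp.choices[0].message.content)
```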
4. ilaksh ◴[] No.44983486[source]
Maybe you have your temperature turned up too high.