DeepSeek-v3.1

(api-docs.deepseek.com)
776 points | wertyk | 4 comments
seunosewa ◴[] No.44977398[source]
It's a hybrid reasoning model. It's good with tool calls and doesn't overthink everything, but it regularly falls back to outdated tool-call formats at random instead of the standard JSON format. I guess the V3 training set has a lot of those.
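For what it's worth, the "standard JSON format" in question is presumably the OpenAI-style tool_calls structure that OpenAI-compatible chat endpoints return. A minimal sketch of that flow; the execute_shell tool and its schema are just an illustration, not anything from DeepSeek's docs:

```python
# Minimal sketch of the standard OpenAI-style tool-call flow against an
# OpenAI-compatible endpoint. The execute_shell tool and its schema are
# illustrative assumptions, not taken from DeepSeek's documentation.
from openai import OpenAI

client = OpenAI(base_url="https://api.deepseek.com", api_key="...")

tools = [{
    "type": "function",
    "function": {
        "name": "execute_shell",
        "description": "Run a shell command and return its output.",
        "parameters": {
            "type": "object",
            "properties": {"command": {"type": "string"}},
            "required": ["command"],
        },
    },
}]

resp = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "List the files in the repo."}],
    tools=tools,
)

# A well-behaved reply carries structured tool calls rather than ad-hoc
# markup embedded in the message body:
for call in resp.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)  # arguments is a JSON string
```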
replies(2): >>44977985 #>>44978742 #
ivape ◴[] No.44977985[source]
What formats? I thought a JSON schema is exactly what lets these LLMs enforce structured outputs at the decoder level. I guess you could do it with any format, but why stray from JSON?
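The decoder-level trick isn't specific to JSON: any format you can express as a grammar can be enforced by masking out tokens that would violate it before sampling. A toy sketch of the idea; the tiny vocabulary and "grammar" here are made up purely for illustration:

```python
# Toy sketch of grammar-constrained decoding: at each step, logits for
# tokens that would break the format are masked out before sampling.
# The vocabulary and allowed_next() rules are illustrative, not DeepSeek's
# actual tokenizer or grammar machinery.
import math, random

vocab = ['{', '}', '"cmd"', ':', '"ls"', '<tool_call>']

def allowed_next(prefix):
    """A trivial 'grammar': a JSON-ish object, never the XML-ish tag."""
    if not prefix:
        return {'{'}
    if prefix[-1] == '{':
        return {'"cmd"'}
    if prefix[-1] == '"cmd"':
        return {':'}
    if prefix[-1] == ':':
        return {'"ls"'}
    return {'}'}

def sample(logits, prefix):
    mask = allowed_next(prefix)
    # Disallowed tokens get -inf, so they can never be sampled.
    masked = [l if tok in mask else -math.inf for tok, l in zip(vocab, logits)]
    probs = [math.exp(l) for l in masked]
    total = sum(probs)
    return random.choices(vocab, [p / total for p in probs])[0]

prefix = []
for _ in range(5):
    fake_logits = [random.random() for _ in vocab]  # stand-in for the model
    prefix.append(sample(fake_logits, prefix))
print(''.join(prefix))  # e.g. {"cmd":"ls"}
```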
replies(2): >>44978153 #>>44979158 #
1. seunosewa ◴[] No.44978153[source]
Sometimes it will randomly generate something like this in the body of the text:

```
<tool_call>executeshell
<arg_key>command</arg_key>
<arg_value>echo "" >> novels/AI_Voodoo_Romance/chapter-1-a-new-dawn.txt</arg_value>
</tool_call>
```

or this:

```
<|toolcallsbegin|><|toolcallbegin|>executeshell<|toolsep|>{"command": "pwd && ls -la"}<|toolcallend|><|toolcallsend|>
```

Prompting it to use the right format doesn't seem to work. Claude, Gemini, GPT-5, and GLM 4.5 don't do that. To accommodate DeepSeek, the tiny agent that I'm building will have to support all of these weird formats, roughly as sketched below.
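A hedged sketch of the kind of fallback parsing such an agent might do, normalizing the two stray formats above into a (tool_name, args) pair; the regexes only cover the exact shapes quoted in this comment:

```python
# Fallback parsing sketch: recover a (tool_name, args_dict) pair from the
# two stray formats quoted above. Patterns are tailored to those examples
# only, not a general DeepSeek output parser.
import json
import re

XMLISH = re.compile(
    r"<tool_call>(\w+)\s*<arg_key>(\w+)</arg_key>\s*<arg_value>(.*?)</arg_value>\s*</tool_call>",
    re.S,
)
TOKENISH = re.compile(
    r"<\|toolcallbegin\|>(\w+)<\|toolsep\|>(\{.*?\})<\|toolcallend\|>",
    re.S,
)

def parse_stray_tool_call(text: str):
    m = XMLISH.search(text)
    if m:
        name, key, value = m.groups()
        return name, {key: value}
    m = TOKENISH.search(text)
    if m:
        name, args = m.groups()
        return name, json.loads(args)
    return None

print(parse_stray_tool_call(
    '<|toolcallsbegin|><|toolcallbegin|>executeshell<|toolsep|>'
    '{"command": "pwd && ls -la"}<|toolcallend|><|toolcallsend|>'
))  # ('executeshell', {'command': 'pwd && ls -la'})
```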

replies(3): >>44978260 #>>44983225 #>>44983486 #
2. ◴[] No.44978260[source]
3. irthomasthomas ◴[] No.44983225[source]
Can't you use logit bias to help with this? Might depend on how they're tokenized.
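A rough sketch of the logit-bias idea against an OpenAI-compatible API. The token IDs below are hypothetical placeholders; whether this helps depends on whether the endpoint honors logit_bias at all and on how the stray markup is tokenized (a multi-token sequence can't be suppressed with a single bias entry):

```python
# Rough sketch: bias down the token(s) that start the stray markup.
# The IDs here are made-up placeholders; in practice you would look them
# up with the model's tokenizer, and the endpoint must support logit_bias.
from openai import OpenAI

client = OpenAI(base_url="https://api.deepseek.com", api_key="...")

discouraged = {"12345": -100, "67890": -100}  # hypothetical token IDs

resp = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "List the files in the repo."}],
    logit_bias=discouraged,
)
print(resp.choices[0].message.content)
```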
4. ilaksh ◴[] No.44983486[source]
Maybe you have your temperature turned up too high.