
Devstral

(mistral.ai)
701 points by mfiguiere | 1 comment | HN request time: 0.259s | source
simonw ◴[] No.44053886[source]
The first number I look at these days is the file size via Ollama, which for this model is 14GB https://ollama.com/library/devstral/tags

I find that on my M2 Mac that number is a rough approximation of how much memory the model needs (usually plus about 10%) - which matters because I want to know how much RAM I will have left for running other applications.

Anything below 20GB tends not to interfere too much with the other stuff I'm running. This model looks promising!
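As a back-of-envelope check, the rule of thumb above can be written out. The 10% overhead is this commenter's observation on an M2 Mac, not a documented figure, so treat it as a sketch:

```python
# Rule of thumb from the comment above: resident memory ~= model file size + ~10%.
# 14 GB is the Devstral download size listed on the Ollama tags page; the 10%
# overhead is an observed approximation, not a guarantee.
def estimated_ram_gb(file_size_gb: float, overhead: float = 0.10) -> float:
    """Estimate how much RAM a model will occupy once loaded."""
    return file_size_gb * (1 + overhead)

print(round(estimated_ram_gb(14), 1))  # roughly 15.4 GB resident for Devstral
```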

replies(4): >>44054806 #>>44056502 #>>44059216 #>>44059888 #
lis ◴[] No.44054806[source]
Yes, I agree. I've just run the model locally and it's making a good impression. I've tested it with some Ruby/RSpec gotchas, which it handled nicely.

I'll give it a try with aider to test the large context as well.

replies(1): >>44055628 #
ericb ◴[] No.44055628[source]
In Ollama, how do you set up the larger context, and how do you figure out what settings to use? I've yet to find a good guide. I'm also not quite sure how to work out what those settings should be for each model.

There's context length, but then, how does that relate to input length and output length? Should I just make the numbers match? 32k is 32k? Any pointers?
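For reference, one common way to raise the context window in Ollama is a Modelfile that overrides the relevant parameters; the specific values below are illustrative, not recommendations for Devstral:

```
FROM devstral
# num_ctx is the total context window: prompt and generated tokens share it,
# so "input length" and "output length" are not separate budgets.
PARAMETER num_ctx 32768
# num_predict caps only the generated output within that window.
PARAMETER num_predict 4096
```

Built with `ollama create devstral-32k -f Modelfile`, this gives a tag whose 32k window is split dynamically between prompt and completion, which is why the numbers don't need to "match".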

replies(2): >>44056025 #>>44058487 #
zackify ◴[] No.44058487[source]
Ollama breaks for me: if I manually set the context higher, the next API call from the client resets it back.

And Ollama keeps unloading it from memory every 4 minutes.

LM Studio with MLX on the Mac is performing perfectly, and I can keep the model in RAM indefinitely.

Ollama's keep-alive is broken too, since a new REST API call resets it afterwards. I'm surprised it's this glitchy with longer-running calls and custom context lengths.
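For context on the keep-alive complaint: Ollama accepts a `keep_alive` field per request and an `OLLAMA_KEEP_ALIVE` environment variable as a server-wide default. A sketch of both, assuming a stock local install on the default port:

```
# Per-request: -1 asks Ollama to keep the model resident indefinitely,
# but (as described above) a later request with a different value resets it.
curl http://localhost:11434/api/generate -d '{
  "model": "devstral",
  "prompt": "hello",
  "keep_alive": -1
}'

# Server-wide default, set before launching `ollama serve`:
export OLLAMA_KEEP_ALIVE=-1
```

The per-request field is the likely culprit in the behavior described: any client that omits it falls back to the default unload timer.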