
923 points by zh2408 | 2 comments
fforflo No.43744642
If you want to use Ollama to run local models, here’s a simple example:

from ollama import chat, ChatResponse

def call_llm(prompt, use_cache: bool = True, model="phi4") -> str:
    # use_cache is kept for signature compatibility but is not used here
    response: ChatResponse = chat(
        model=model,
        messages=[{'role': 'user', 'content': prompt}],
    )
    return response.message.content
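For reference, a minimal usage sketch, assuming the ollama Python package is installed, the Ollama server is running locally, and phi4 has already been pulled (the snippet and prompt are illustrative):

    # Ask the local model to explain a small piece of code (prompt is illustrative)
    snippet = "def add(a, b):\n    return a + b"
    print(call_llm(f"Explain what this function does:\n{snippet}"))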

1. mooreds No.43744660
Is the output as good?

I'd love the ability to run the LLM locally, as that would make it easier to use it on non-public code.

2. fforflo No.43744828
It's decent enough. But you'd probably have to use a model like llama2, which may set your GPU on fire.
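If GPU memory is the worry, a hedged sketch: the same call_llm helper can point at a smaller model tag served by Ollama (the tag below is illustrative, not from the thread, and assumes it has been pulled with `ollama pull`):

    # Smaller models are lighter on the GPU, at some cost in output quality;
    # assumes `ollama pull llama3.2:3b` has been run beforehand.
    summary = call_llm("Summarize the responsibilities of this module.", model="llama3.2:3b")
    print(summary)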