
Zamba2-7B

(www.zyphra.com)
282 points by dataminer | 10 comments
1. jwitthuhn No.41843985
For anyone else looking for the weights, which as far as I can tell are not linked in the article:

Base model: https://huggingface.co/Zyphra/Zamba2-7B

Instruct tuned: https://huggingface.co/Zyphra/Zamba2-7B-Instruct
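
If you'd rather grab them programmatically, here's a minimal sketch with huggingface_hub (just the standard download API; the repo IDs are the ones linked above):

```python
# Minimal sketch: fetch the weights locally with huggingface_hub
# (pip install huggingface-hub). Repo IDs are from the links above.
from huggingface_hub import snapshot_download

base_path = snapshot_download(repo_id="Zyphra/Zamba2-7B")
instruct_path = snapshot_download(repo_id="Zyphra/Zamba2-7B-Instruct")
print(base_path, instruct_path)
```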

replies(1): >>41844057 #
2. keyle No.41844057
I couldn't find any gguf files yet. Looking forward to trying it out when they're available.
replies(3): >>41844163 #>>41844635 #>>41847638 #
3. alchemist1e9 No.41844163
What can be used to run it? I had imagined Mamba-based models need different inference code/software than the other models.
replies(3): >>41844520 #>>41844782 #>>41847696 #
4. hidelooktropic No.41844520{3}
To run GGUF files? LM Studio, for one. I think Recurse on macOS as well, and probably some others.
replies(2): >>41847312 #>>41853147 #
5. kristianp No.41844635
It seems that Zamba2 isn't supported yet; the feature request for the previous model is here:

Feature Request: Support Zyphra/Zamba2-2.7B #8795

https://github.com/ggerganov/llama.cpp/issues/8795

6. gbickford No.41844782{3}
If you look in the `config.json`[1] it shows `Zamba2ForCausalLM`. You can do inference with a version of the transformers library that supports that architecture.

The model card states that you have to use their fork of transformers.[2]

1. https://huggingface.co/Zyphra/Zamba2-7B-Instruct/blob/main/c...

2. https://huggingface.co/Zyphra/Zamba2-7B-Instruct#prerequisit...
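
For a rough idea of what that looks like, here's a minimal generation sketch using the usual transformers API. It assumes their fork is installed per the prerequisites in [2]; the prompt and generation settings are just illustrative:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Requires Zyphra's transformers fork (see the model card prerequisites);
# stock transformers doesn't know the Zamba2ForCausalLM architecture.
tokenizer = AutoTokenizer.from_pretrained("Zyphra/Zamba2-7B-Instruct")
model = AutoModelForCausalLM.from_pretrained(
    "Zyphra/Zamba2-7B-Instruct",
    device_map="cuda",
    torch_dtype=torch.bfloat16,
)

# Illustrative prompt; for the instruct model you'd normally apply the
# chat template via tokenizer.apply_chat_template first.
inputs = tokenizer("What is a state space model?", return_tensors="pt").to("cuda")
output = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(output[0]))
```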

7. x_may No.41847312{4}
As another commenter said, this has no GGUF because it's partially Mamba-based, which is unsupported in llama.cpp.
8. Havoc No.41847638
Mamba-based stuff tends to take longer to become available.
9. wazoox No.41847696{3}
GPT4All is a good and easy way to run GGUF models.
10. xyc No.41853147{4}
Dev of https://recurse.chat/ here, thanks for mentioning! Right now we're focusing on features like shortcuts and a floating window, but we'll look into supporting this in some time. To add to the llama.cpp support discussion, it's also worth noting that llama.cpp does not yet support GPU inference for Mamba models: https://github.com/ggerganov/llama.cpp/issues/6758