(ggml.ai)

899 points georgehill | 2 comments | 06 Jun 23 16:50 UTC

world2vec ◴[06 Jun 23 17:25 UTC] No.36216161[source]▶

>>36215651 (OP) #

Might be a silly question but is GGML a similar/competing library to George Hotz's tinygrad [0]?

[0] https://github.com/geohot/tinygrad

replies(2): >>36216187 #>>36218539 #

qeternity ◴[06 Jun 23 17:27 UTC] No.36216187[source]▶

>>36216161 #

No, GGML is a CPU-optimized library and quantized weight format that is closely linked to its author's other project, llama.cpp.

replies(2): >>36216244 #>>36216266 #

stri8ed ◴[06 Jun 23 17:30 UTC] No.36216244[source]▶

>>36216187 #

How does the quantization happen? Are the weights preprocessed before loading the model?

replies(2): >>36216303 #>>36216321 #

1. sebzim4500 ◴[06 Jun 23 17:35 UTC] No.36216303[source]▶

>>36216244 #

Yes, but to my knowledge it doesn't do any of the complicated optimization stuff that SOTA quantisation methods use. It basically is just doing a bunch of rounding.

There are advantages to simplicity, after all.
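
For illustration, "a bunch of rounding" could look roughly like the sketch below: split the weights into fixed-size blocks, store one floating-point scale per block, and round each weight to a small signed integer. The block size of 32, the 4-bit range, and every function name here are assumptions made for this example, not GGML's actual file format or API.

```python
from typing import List, Tuple

BLOCK_SIZE = 32  # assumed block size, chosen only for illustration
QMAX = 7         # largest magnitude kept for a signed 4-bit value in this sketch


def quantize_block(block: List[float]) -> Tuple[float, List[int]]:
    """Round one block of weights to small integers that share a single scale."""
    amax = max((abs(w) for w in block), default=0.0)
    # An all-zero block gets a dummy scale of 1.0 so we never divide by zero.
    scale = amax / QMAX if amax > 0 else 1.0
    q = [max(-QMAX, min(QMAX, round(w / scale))) for w in block]
    return scale, q


def dequantize_block(scale: float, q: List[int]) -> List[float]:
    """Recover approximate weights by multiplying the integers by the scale."""
    return [scale * v for v in q]


def quantize(weights: List[float], block_size: int = BLOCK_SIZE):
    """Quantize a flat list of weights block by block (a preprocessing step)."""
    return [
        quantize_block(weights[i:i + block_size])
        for i in range(0, len(weights), block_size)
    ]


def dequantize(blocks) -> List[float]:
    """Concatenate the dequantized blocks back into one flat list."""
    out: List[float] = []
    for scale, q in blocks:
        out.extend(dequantize_block(scale, q))
    return out
```

With this kind of scheme, each restored weight differs from the original by at most half of its block's scale, which is the whole trade-off: no calibration data or optimization, just a bounded rounding error per block.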

replies(1): >>36216416 #

2. brucethemoose2 ◴[06 Jun 23 17:42 UTC] No.36216416[source]▶

>>36216303 (TP) #

It's not so simple anymore; see https://github.com/ggerganov/llama.cpp/pull/1684

GGML – AI at the Edge