
343 points sillysaurusx | 3 comments
linearalgebra45 No.35028638
It's been enough time since this leaked, so my question is: why aren't there already blog posts of people blowing their $300 of starter credit with ${cloud_provider} on a few hours' experimentation running inference on this 65B model?

Edit: I read the linked README.

> I was impatient and curious to try to run 65B on an 8xA100 cluster

Well?
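For scale, a quick back-of-the-envelope check (my own numbers, not from the thread) of why a 65B-parameter model wants a multi-GPU box like 8xA100 in the first place:

```python
# Rough memory estimate for serving a 65B-parameter model in fp16.
# Assumptions (mine, not from the thread): 2 bytes/param, weights
# sharded evenly across GPUs, ignoring KV cache / activation overhead,
# and the 40 GB A100 variant.
PARAMS = 65e9
BYTES_PER_PARAM = 2   # fp16 / bf16
NUM_GPUS = 8
GPU_MEM_GB = 40

weights_gb = PARAMS * BYTES_PER_PARAM / 1e9
per_gpu_gb = weights_gb / NUM_GPUS

print(f"weights: {weights_gb:.0f} GB total, {per_gpu_gb:.2f} GB per GPU")
print(f"fits in {NUM_GPUS}x{GPU_MEM_GB} GB A100s:", per_gpu_gb < GPU_MEM_GB)
```

So the bare fp16 weights alone are ~130 GB: far beyond any single consumer GPU, but a comfortable ~16 GB shard per GPU on an 8xA100 node, which is consistent with the README author's choice of cluster.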

1. ulnarkressty No.35030027
https://medium.com/@enryu9000/mini-post-first-look-at-llama-...

*later edit - not the 65B model, but the smaller ones. Performance seems mixed at first glance, not really competitive with ChatGPT fwiw.

2. linearalgebra45 No.35030082
> not the 65B model, but the smaller ones

Haha, that's right! I saw that one too

3. minxomat No.35031470
> not really competitive with ChatGPT

That's impossible to judge. LLaMA is a foundation model. It has received neither instruction fine-tuning (as in text-davinci-003) nor RLHF (as in ChatGPT). It can't fairly be compared to those fine-tuned models without, well, fine-tuning.
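To make that distinction concrete, here's a toy sketch (my own illustration, not from the thread; the instruction template is Alpaca-style and hypothetical) of how the same question has to be framed for a base model versus an instruction-tuned one:

```python
def base_prompt(question: str) -> str:
    # A base (foundation) model only does next-token prediction, so you
    # coax an answer out of it by framing the text as a continuation.
    return f"Q: {question}\nA:"

def instruct_prompt(question: str) -> str:
    # An instruction-tuned model was further trained on
    # (instruction, response) pairs, so it expects a template like this
    # (Alpaca-style; hypothetical example, not LLaMA's own format).
    return (
        "Below is an instruction. Write a response that completes it.\n\n"
        f"### Instruction:\n{question}\n\n"
        "### Response:\n"
    )

print(base_prompt("What is RLHF?"))
print(instruct_prompt("What is RLHF?"))
```

Asked a bare question, a base model may just continue with more question-like text; the instruction template is what the fine-tuned model was trained to answer, which is why comparing a raw foundation model against ChatGPT on chat prompts is apples to oranges.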