343 points sillysaurusx | 2 comments
linearalgebra45 ◴[] No.35028638[source]
Enough time has passed since this leaked, so why aren't there already blog posts from people blowing their $300 of starter credit with ${cloud_provider} on a few hours of experimentation running inference on this 65B model?

Edit: I read the linked README.

> I was impatient and curious to try to run 65B on an 8xA100 cluster

Well?

replies(2): >>35028936 #>>35030027 #
v64 ◴[] No.35028936[source]
The compute needed to run 65B naively was only available on AWS (and perhaps Azure; I don't work with them), and the required instance types have been unavailable to the public recently (it seems everyone had the same idea to hop on this and try to run it). As I noted in my other post here [1], other work has since lowered the memory requirements, and it should now be possible to run the 65B on a provider like CoreWeave.

[1] https://news.ycombinator.com/item?id=35028738
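
For a rough sense of the memory figures being discussed, here is a back-of-the-envelope sketch. It counts weights only and assumes 2 bytes per parameter for fp16, 1 for int8, and 0.5 for int4; activations, the KV cache, and framework overhead are ignored, so real requirements are higher. The function names are made up for illustration, and the 8x40 GB node is just an example configuration, not a claim about any particular provider's instance.

```python
def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Memory needed for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


def weights_fit(n_params: float, bytes_per_param: float,
                n_gpus: int, gb_per_gpu: float) -> bool:
    """True if the weights alone fit in the combined GPU memory.

    Ignores activations, KV cache, and runtime overhead (assumption).
    """
    return weight_memory_gb(n_params, bytes_per_param) <= n_gpus * gb_per_gpu


N_PARAMS = 65e9  # the 65B model from the thread

for label, width in [("fp16", 2.0), ("int8", 1.0), ("int4", 0.5)]:
    gb = weight_memory_gb(N_PARAMS, width)
    one_gpu = weights_fit(N_PARAMS, width, n_gpus=1, gb_per_gpu=80)
    node = weights_fit(N_PARAMS, width, n_gpus=8, gb_per_gpu=40)
    print(f"{label}: {gb:.1f} GB weights | 1x80GB: {one_gpu} | 8x40GB: {node}")
```

On these assumptions, fp16 weights come to 130 GB, which is too much for a single 80 GB card but well within an 8-GPU node, while int8 and int4 bring the weights down to 65 GB and 32.5 GB.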

replies(2): >>35029106 #>>35029766 #
linearalgebra45 ◴[] No.35029106[source]
Are you sure about that? I can't remember where I saw the table of memory requirements, but I'm fairly sure some of the larger instances here [1] will be able to cope (assuming they're available!)

Oracle gives you a $300 free trial, which equates to running BM.GPU4.8 for over 10 hours, enough for a focused day of prompting.

[1] https://www.oracle.com/cloud/compute/gpu/
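
The credit arithmetic above can be sketched as follows. The thread does not quote an hourly price, so the rate used here is a placeholder assumption, not an Oracle list price; the only thing taken from the comment is the $300 credit and the "over 10 hours" claim, which together imply a rate below $30 per hour. Function names are hypothetical.

```python
def hours_of_runtime(credit_usd: float, hourly_rate_usd: float) -> float:
    """How many instance-hours a fixed credit buys at a given hourly rate."""
    if hourly_rate_usd <= 0:
        raise ValueError("hourly rate must be positive")
    return credit_usd / hourly_rate_usd


def max_hourly_rate(credit_usd: float, target_hours: float) -> float:
    """Highest hourly rate at which the credit still covers target_hours."""
    if target_hours <= 0:
        raise ValueError("target hours must be positive")
    return credit_usd / target_hours


CREDIT_USD = 300.0        # free-trial credit mentioned in the comment
PLACEHOLDER_RATE = 25.0   # assumed USD/hour, NOT a quoted price

print(f"Break-even rate for 10 hours: ${max_hourly_rate(CREDIT_USD, 10):.2f}/h")
print(f"Hours at ${PLACEHOLDER_RATE:.2f}/h: "
      f"{hours_of_runtime(CREDIT_USD, PLACEHOLDER_RATE):.1f}")
```

Plug in the current price from the provider's pricing page in place of the placeholder before relying on the result.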

replies(3): >>35029110 #>>35030261 #>>35034167 #
smoldesu ◴[] No.35030261[source]
Thanks for sharing it! I'm using their "Always Free" tier to host an Ampere-accelerated GPT-J chatbot right now. Works like a charm, and best of all, it's free!
replies(2): >>35030667 #>>35031376 #
1. damascus ◴[] No.35031376[source]
Do you have any code from your Discord bot that you're willing to share? I'd be happy to share back any updates I make to it. I've been wanting to play with this idea for a bit.
replies(1): >>35032653 #