
343 points | sillysaurusx | 1 comment
v64 (No.35028738)
If anyone is interested in running this at home, please follow the llama-int8 project [1]. LLM.int8() is a recent development that lets LLMs run in half the memory without loss of performance [2]. Note that at the end of [2]'s abstract, the authors state: "This result makes such models much more accessible, for example making it possible to use OPT-175B/BLOOM on a single server with consumer GPUs. We open-source our software." I'm very thankful we have researchers like this further democratizing access to these models and prying them out of the hands of the gatekeepers who wish to monetize them.

[1] https://github.com/tloen/llama-int8

[2] https://arxiv.org/abs/2208.07339
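If you want to try the int8 path without the llama-int8 fork, here's a minimal sketch using the Hugging Face transformers integration of bitsandbytes (the library released with [2]). The model name and generation settings are placeholders, not anything the paper prescribes; pick whatever fits your VRAM:

    # pip install transformers accelerate bitsandbytes
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "facebook/opt-6.7b"  # placeholder model; swap for your own

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    # load_in_8bit swaps the linear layers for LLM.int8() kernels,
    # roughly halving memory versus fp16
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        load_in_8bit=True,
        device_map="auto",
    )

    inputs = tokenizer("The meaning of life is", return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=50)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))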

causality0 (No.35030868)
I feel like we're less than a decade away from being able to hook LLMs into gaming. How incredible would it be to have NPCs driven by LLMs?
pixl97 (No.35034447)
Honestly, I don't think it would be impossible to do this now, at least in a limited fashion.

Imagine playing a level and pulling off some particular feats in it. Those feats get presented to GPT in a prompt, and the resulting story gets sent to an AI voice model in game, where the NPC asks or tells the player character about it.
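Something like this rough sketch. The prompt template, event format, and speak() voice hook are all made up for illustration; only the OpenAI ChatCompletion call is a real API:

    import os
    import openai  # pip install openai

    openai.api_key = os.environ["OPENAI_API_KEY"]

    def speak(text):
        # stand-in for the AI voice model; a real game would stream
        # this to a TTS engine instead of printing it
        print(f"[NPC voice]: {text}")

    def npc_reacts_to(player_feats, npc_name="Guard Captain"):
        # summarize what the player just did into a prompt
        prompt = (
            f"You are {npc_name}, an NPC in a fantasy game. "
            f"The player just did the following: {', '.join(player_feats)}. "
            "React in character, in two sentences."
        )
        resp = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
        )
        speak(resp["choices"][0]["message"]["content"])

    npc_reacts_to(["cleared the bandit camp", "spared the leader"])

The latency of a round trip to an API is the obvious problem for real-time dialogue, which is where running a smaller model locally (as in the int8 discussion above) starts to look attractive.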