
1303 points serjester | 2 comments
Havoc ◴[] No.42953438[source]
Been toying with the flash model. Not the top model, but I think it'll see plenty of use thanks to the practical details. It wins on things other than topping the benchmark charts:

* Generous free tier

* Huge context window

* Lite version feels basically instant

However

* Lite model seems more prone to repeating itself / looping

* Very confusing naming, e.g. {model}-latest worked for 1.5, but now it's {model}-001? The lite model has a date appended, the non-lite does not. Then there are exp and thinking-exp variants... which have a date. wut?
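The naming inconsistency is easier to see side by side. A quick sketch classifying the suffix conventions (the model IDs below are illustrative examples of the patterns described above, not a verified catalog; check the official docs for current names):

```python
import re

# Illustrative Gemini model IDs showing the mixed suffix conventions
# (assumed examples, not an authoritative list).
model_ids = [
    "gemini-1.5-flash-latest",              # 1.5 era: "-latest" alias
    "gemini-2.0-flash-001",                 # 2.0: numeric revision suffix
    "gemini-2.0-flash-lite-preview-02-05",  # lite: date appended
    "gemini-2.0-flash-thinking-exp-01-21",  # thinking exp: also dated
]

def suffix_style(model_id: str) -> str:
    """Classify the versioning convention used by a model ID string."""
    if model_id.endswith("-latest"):
        return "alias"
    if re.search(r"-\d{3}$", model_id):   # e.g. "-001"
        return "revision"
    if re.search(r"-\d{2}-\d{2}$", model_id):  # e.g. "-02-05"
        return "dated"
    return "bare"

for m in model_ids:
    print(m, "->", suffix_style(m))
```

Four models, three different versioning schemes, which is exactly the confusion being complained about.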

replies(1): >>42953462 #
ai-christianson ◴[] No.42953462[source]
> * Huge context window

But how well does it actually handle that context window? E.g. a lot of models support 200K context, but the LLM can only really work with ~80K or so of it before it starts to get confused.
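A common way to probe effective context (as opposed to advertised context) is a needle-in-a-haystack test: bury a unique fact at varying depths in filler text and check whether the model retrieves it. A minimal sketch of the harness; the model call itself is stubbed out (`ask_model` is a placeholder, not a real API):

```python
def build_haystack(needle: str, depth: float, n_filler: int = 2000) -> str:
    """Place `needle` at fractional position `depth` (0.0 = start,
    1.0 = end) inside a long run of filler sentences."""
    filler = ["The sky was a uniform gray that afternoon."] * n_filler
    filler.insert(int(depth * len(filler)), needle)
    return " ".join(filler)

needle = "The secret passphrase is 'marmalade-7'."
prompt = (
    build_haystack(needle, depth=0.5)
    + "\n\nWhat is the secret passphrase? Answer with just the passphrase."
)

# ask_model(prompt) would call the LLM under test; sweep `depth` and
# `n_filler` to chart where recall starts to degrade.
```

Sweeping depth and haystack length produces the familiar recall heatmaps, and the point where retrieval falls off is the model's real working context.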

replies(5): >>42953514 #>>42953536 #>>42953554 #>>42953762 #>>42955202 #
1. asadm ◴[] No.42953554[source]
It works REALLY well. I have used it to dump lots of reference code into context and then had it help me write new modules, etc. I have gone up to 200k tokens, I think, with no problems in recall.
replies(1): >>42953640 #
2. ai-christianson ◴[] No.42953640[source]
Awesome. Models that can usefully leverage such large context windows are rare at this point.

Something like this opens up a lot of use cases.