145 points BUFU | 1 comments
Patrick_Devine ◴[] No.42070613[source]
This was a pretty heavy lift for us to get out, which is why it took a while. In addition to writing new image processing routines, a vision encoder, and cross attention, we also ended up re-architecting the way the scheduler runs models. We'll have a technical blog post soon about all the stuff that ended up changing.
replies(3): >>42070644 #>>42071917 #>>42072723 #
1. csomar ◴[] No.42072723[source]
Any info on when we will get the 11B and 90B models?