(medium.com)

83 points peakji | 1 comments | 22 Oct 24 16:07 UTC

Steiner is a series of reasoning models trained on synthetic data using reinforcement learning. These models can explore multiple reasoning paths in an autoregressive manner during inference and autonomously verify or backtrack when necessary, enabling a linear traversal of the implicit search tree.
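To make "linear traversal of the implicit search tree" a bit more concrete, here is a minimal toy sketch. It is not Steiner's training or inference code: the tree, the `linear_trace` helper, and the `step:` / `backtrack:` / `verify:` labels are all hypothetical, chosen only to show how a depth-first search with backtracking can be flattened into a single left-to-right sequence, which is the shape of output an autoregressive model can emit.

```python
from typing import Callable, Dict, List

# Hypothetical toy search tree: each node maps to its candidate next steps.
TREE: Dict[str, List[str]] = {
    "start": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1"],
    "a1": [],
    "a2": [],
    "b1": [],
}


def linear_trace(
    root: str,
    is_goal: Callable[[str], bool],
    tree: Dict[str, List[str]] = TREE,
) -> List[str]:
    """Flatten a depth-first search with backtracking into one linear trace."""
    trace: List[str] = []

    def visit(node: str) -> bool:
        trace.append(f"step:{node}")
        if is_goal(node):
            # A successful check ends the search.
            trace.append(f"verify:{node}:ok")
            return True
        for child in tree[node]:
            if visit(child):
                return True
            # The child was a dead end: record the return to this node.
            trace.append(f"backtrack:{node}")
        return False

    visit(root)
    return trace


trace = linear_trace("start", lambda node: node == "b1")
print(" -> ".join(trace))
```

In this toy, the dead ends under `a` stay in the output as explicit backtrack steps rather than being discarded, so the whole exploration reads as one sequence.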

Blog: https://medium.com/@peakji/a-small-step-towards-reproducing-...

Hugging Face: https://huggingface.co/collections/peakji/steiner-preview-67...

Mr_Bees69 ◴[22 Oct 24 16:15 UTC] No.41915821[source]▶

>>41915735 (OP) #

Really hope this goes somewhere. o1 without OpenAI's costs and restrictions would be sweet.

replies(2): >>41916023 #>>41916629 #

1. ActorNightly ◴[22 Oct 24 17:35 UTC] No.41916629[source]▶

>>41915821 #

OpenAI's o1 isn't really going that far, though. It's definitely better in some areas, but not better overall.

I'm wondering if we can push chain of thought further down into the computation levels to replace a lot of the matrix multiplication. Something like smaller transformers with fewer parameters, plus more use of search to select which transformer to run.

Show HN: Steiner – An open-source reasoning model inspired by OpenAI o1