Really hope this goes somewhere. o1 without OpenAI's costs and restrictions would be sweet.
replies(2):
Blog: https://medium.com/@peakji/a-small-step-towards-reproducing-...
Hugging Face: https://huggingface.co/collections/peakji/steiner-preview-67...
I'm wondering if we can push chain of thought further down into the computation itself to replace a lot of the matrix multiplication: smaller transformers with fewer parameters, plus a search step that selects which transformer to use.
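To make the idea a bit more concrete, here is a toy sketch, not anything from Steiner or o1. Everything in it is an assumption for illustration: the "small transformers" are stood in for by tiny random linear maps with a tanh (`make_expert`), the goal is a made-up target vector, and the selection is a plain beam search over which module to apply at each step.

```python
import math
import random


def make_expert(seed, dim):
    """Hypothetical stand-in for a small model: one dim x dim linear map + tanh."""
    rng = random.Random(seed)
    w = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(dim)]

    def apply(x):
        return [math.tanh(sum(w[i][j] * x[j] for j in range(dim))) for i in range(dim)]

    return apply


def score(x, target):
    """Higher is better: negative squared distance to the (made-up) target."""
    return -sum((a - b) ** 2 for a, b in zip(x, target))


def beam_search(experts, x0, target, depth, beam_width):
    """Pick a sequence of experts by search instead of running one big model.

    Returns (path, output), where path is a tuple of expert indices.
    """
    beam = [((), x0)]
    for _ in range(depth):
        # Expand every kept state with every expert, then keep the best few.
        candidates = [
            (path + (i,), expert(x))
            for path, x in beam
            for i, expert in enumerate(experts)
        ]
        candidates.sort(key=lambda c: score(c[1], target), reverse=True)
        beam = candidates[:beam_width]
    return beam[0]


if __name__ == "__main__":
    experts = [make_expert(seed, 3) for seed in range(4)]
    path, out = beam_search(experts, [0.5, -0.2, 0.1], [0.0, 0.3, -0.3], depth=3, beam_width=2)
    print("chosen experts:", path)
    print("output:", out)
```

The catch is visible even in the toy: each search step costs beam_width × number_of_experts small forward passes, so this only wins if the modules are much cheaper than the big model they replace.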