
152 points fzliu | 1 comment
bigdict No.43562732
Sure, you can get better model performance by throwing more compute at the problem in different places. Does it improve perf on an isoflop basis?
replies(4): >>43562773 #>>43563245 #>>43563544 #>>43564050 #
1. Reubend No.43563544
It's a valid criticism that this method would increase compute requirements, but sometimes an improvement in the end result justifies the compute needed. For things like code generation in large datasets, many people would be willing to "pay" with more compute if the results were better. And this doesn't seem to require more memory bandwidth, so it could be particularly good for local models.