Claude Sonnet will ship in Xcode

(developer.apple.com)
485 points zora_goron | 8 comments
throwawa14223 ◴[] No.45059251[source]
It's getting harder to find IDEs that properly boycott LLMs.
replies(12): >>45059290 #>>45059291 #>>45059362 #>>45059390 #>>45059429 #>>45059473 #>>45059478 #>>45059506 #>>45059650 #>>45060008 #>>45061090 #>>45061442 #
jama211 ◴[] No.45059650[source]
“Boycott” is a pretty strong term. I’m sensing a strong dislike of AI from you, which is fine, but if you dislike a feature most people like, it shouldn’t be surprising that you’ll find yourself mostly catered to by more niche editors.
replies(1): >>45059762 #
1. isodev ◴[] No.45059762[source]
I think it's a pretty good word, let's not forget how LLMs learned about code in the first place... by "stealing" all the snippets they can get their curl hands on.
replies(2): >>45060251 #>>45068003 #
2. astrange ◴[] No.45060251[source]
And by reading the docs, and by autogenerating code samples and testing them against verifiers, and by paying a lot of people to write sample code for sample questions.
replies(1): >>45060745 #
3. troupo ◴[] No.45060745[source]
Yeah, none of that happened with LLMs
replies(1): >>45060994 #
4. khafra ◴[] No.45060994{3}[source]
https://openai.com/index/prover-verifier-games-improve-legib... OpenAI has been doing verifier-guided training since last year. No SOTA model was trained without verified reward training for math and programming.
replies(1): >>45061047 #
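The "testing them against verifiers" idea under debate here can be sketched roughly as follows. This is a hypothetical illustration, not OpenAI's actual pipeline: the task, test cases, and candidate solutions are all made up, and real systems run sandboxed execution at much larger scale.

```python
# Minimal sketch of verified-reward scoring for model-generated code:
# a candidate solution earns reward by passing hardcoded input/output tests.

def run_tests(candidate_src: str, tests: list[tuple[tuple, int]]) -> float:
    """Execute a candidate solution and return the fraction of tests passed."""
    namespace: dict = {}
    try:
        exec(candidate_src, namespace)  # real pipelines sandbox this step
        solve = namespace["solve"]
    except Exception:
        return 0.0
    passed = 0
    for args, expected in tests:
        try:
            if solve(*args) == expected:
                passed += 1
        except Exception:
            pass  # a crashing candidate simply fails that test
    return passed / len(tests)

# Hardcoded verifier: input/output pairs play the role of unit tests.
tests = [((2, 3), 5), ((0, 0), 0), ((-1, 1), 0)]

correct = "def solve(a, b):\n    return a + b\n"
buggy = "def solve(a, b):\n    return a - b\n"

print(run_tests(correct, tests))  # 1.0 — all tests pass
print(run_tests(buggy, tests))    # partial credit only
```

The reward signal here is just the pass rate; in reinforcement-learning terms it replaces a learned reward model with an executable check.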
5. troupo ◴[] No.45061047{4}[source]
Your claim: "by reading the docs, and by autogenerating code samples and testing them against verifiers, and by paying a lot of people to write sample code for sample questions."

Your link: "Grade school math problems from a hardcoded dataset with hardcoded answers" [1]

It really is the same thing.

[1] https://openai.com/index/solving-math-word-problems/

--- start quote ---

GSM8K consists of 8.5K high quality grade school math word problems. Each problem takes between 2 and 8 steps to solve, and solutions primarily involve performing a sequence of elementary calculations using basic arithmetic operations (+ − × ÷) to reach the final answer.

--- end quote ---

replies(1): >>45061306 #
6. khafra ◴[] No.45061306{5}[source]
My two claims:

1. OpenAI has been doing verifier-guided training since last year.

2. No SOTA model was trained without verified reward training for math and programming.

I supported the first claim with a document describing what OpenAI was doing last year; the extrapolation should have been straightforward, but it's easy for people who aren't tracking AI progress to underestimate the rate at which it occurs. So, here's some support for my second claim:

https://arxiv.org/abs/2507.06920 https://arxiv.org/abs/2506.11425 https://arxiv.org/abs/2502.06807

replies(1): >>45063209 #
7. troupo ◴[] No.45063209{6}[source]
> the extrapolation should have been straightforward,

Indeed. "By late next month you'll have over four dozen husbands" https://xkcd.com/605/

> So, here's some support for my second claim:

I don't think any of these links support the claim that "No SOTA model was trained without verified reward training for math and programming"

https://arxiv.org/abs/2507.06920: "We hope this work contributes to building a scalable foundation for reliable LLM code evaluation"

https://arxiv.org/abs/2506.11425: A custom agent with a custom environment and a custom training dataset on ~800 predetermined problems. Also "Our work is limited to Python"

https://arxiv.org/abs/2502.06807: The only one that somewhat obliquely refers to your claim.

8. jama211 ◴[] No.45068003[source]
Ah, the classic “I don’t want to acknowledge how right that person is about their point, so instead I’ll ignore what they said and divert attention to another point entirely.”

You’re just angry, and adding no value to this conversation because of it.