Otherwise there's VSCodium, which is what I'm using until I can make the jump to CodeEdit.
Why wouldn't it?
In Neovim, the choice of language server and the choice of LLM are up to the user (possibly even the choice of this API, I believe, having only skimmed the PR), while both of those choices are baked into Xcode, so they're not the same thing.
They certainly do, and I can't really follow the analogy you are building.
> We're at a higher level of abstraction now.
To me, an abstraction higher than a programming language would be natural language or some DSL that approximates it.
At the moment, I don't think most people using LLMs are reading paragraphs to maintain code. And LLMs aren't producing code in natural language.
That isn't an abstraction over the language; it's an abstraction over your use of the computer to produce code in that language. If anything, you are abstracting yourself away.
Furthermore, if I am following you, you are basically saying that you have to make a call to a (free or paid) model to explain your code every time you want to alter it.
I don't know how insane that sounds to most people, but to me, it sounds bat-shit.
Your link: "Grade school math problems from a hardcoded dataset with hardcoded answers" [1]
It really is the same thing.
[1] https://openai.com/index/solving-math-word-problems/
--- start quote ---
GSM8K consists of 8.5K high quality grade school math word problems. Each problem takes between 2 and 8 steps to solve, and solutions primarily involve performing a sequence of elementary calculations using basic arithmetic operations (+ − × ÷) to reach the final answer.
--- end quote ---
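For a concrete sense of the format, here is a hypothetical problem in that style (my own illustration, not an actual dataset entry), worked through in Python:

    # "A baker makes 24 muffins and sells them in boxes of 4.
    #  Each box sells for $3. How much does the baker earn?"
    boxes = 24 // 4       # step 1: 6 boxes
    earnings = boxes * 3  # step 2: $18
    assert earnings == 18

Two steps of basic arithmetic, which is exactly what the dataset description says.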
If you don’t want to use LLM coding assistants – or if you can’t, or it’s not a technology suitable for your work – nobody cares. It’s totally fine. You don’t need to get performatively enraged about it.
1. OpenAI has been doing verifier-guided training since last year.
2. No SOTA model was trained without verified reward training for math and programming.
I supported the first claim with a document describing what OpenAI was doing last year; the extrapolation should have been straightforward, but it's easy for people who aren't tracking AI progress to underestimate the rate at which it occurs. So, here's some support for my second claim:
https://arxiv.org/abs/2507.06920
https://arxiv.org/abs/2506.11425
https://arxiv.org/abs/2502.06807
Since the landscape of potentially malicious inputs in plain English is practically infinite, and there is no particular enforced structure to the queries you make of it, those "guardrails" are, in effect, an expert system: an ever-growing pile of if-then statements. Didn't work then, won't work now.
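To make the "pile of if-then statements" point concrete, here is a minimal sketch (entirely my own illustration; the patterns and function are made up) of what keyword-style guardrails amount to:

    # A toy "guardrail": a growing blocklist, i.e. an expert system.
    BLOCKED_PATTERNS = [
        "ignore previous instructions",
        "pretend you are",
        "developer mode",
    ]

    def is_allowed(prompt: str) -> bool:
        lowered = prompt.lower()
        # Any paraphrase not on the list sails straight through.
        return not any(p in lowered for p in BLOCKED_PATTERNS)

Every newly discovered jailbreak adds another entry, and natural language always has one more paraphrase.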
I have used agentic coding tools to solve problems that have literally never been solved before, and it was the AI, not me, that came up with the answer.
If you look under the hood, the multi-layered perceptrons and attention heads of the LLM are able to encode quite complex world models, derived from compressing its training set in a way which is formally as powerful as reasoning. These compressed model representations are accessible when prompted correctly, and express as genuinely new and innovative thoughts NOT in the training set.
Would you show us? Genuinely asking
Indeed."By late next month you'll have over four dozen husbands" https://xkcd.com/605/
> So, here's some support for my second claim:
I don't think any of these links support the claim that "No SOTA model was trained without verified reward training for math and programming"
https://arxiv.org/abs/2507.06920: "We hope this work contributes to building a scalable foundation for reliable LLM code evaluation"
https://arxiv.org/abs/2506.11425: A custom agent with a custom environment and a custom training dataset on ~800 predetermined problems. Also "Our work is limited to Python"
https://arxiv.org/abs/2502.06807: The only one that even somewhat obliquely refers to your claim
Ask the best available models -- emphasis on models -- for help designing the text editor at a structural rather than functional level first, being specific about what you want and emphasizing component-level tests wherever possible. Only then follow up with actual code generation, and you'll get much better results.
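As a rough sketch of that two-phase workflow (assuming the OpenAI Python SDK; the model name and prompts here are placeholders, not a recipe):

    from openai import OpenAI

    client = OpenAI()

    # Phase 1: ask for structure, not code.
    design = client.chat.completions.create(
        model="gpt-4o",  # placeholder; use the best model available
        messages=[{
            "role": "user",
            "content": "Design a terminal text editor at the component level: "
                       "buffer model, undo stack, rendering loop. For each "
                       "component, list the tests it needs. No code yet.",
        }],
    ).choices[0].message.content

    # Phase 2: only now ask for code, one component at a time.
    code = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "user", "content": "Here is the agreed design:\n" + design},
            {"role": "user", "content": "Implement the buffer component and "
                                        "its unit tests, nothing else."},
        ],
    ).choices[0].message.content

The point is the sequencing: lock the structure down first, then generate code against it.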
You’re just angry and adding no value to this conversation because of it
Editors are incredibly complex and require domain knowledge to guide agents toward the correct architecture and implementation (and away from the usual naive pitfalls), but in my experience the latest models reason about and implement features/changes just fine.
Obviously no model is going to one-shot something like a full text editor, but there's an ocean of difference between defining vibe coding as prompting "Make me a text editor" versus spending days/weeks going back and forth on architecture and implementation with a model while it's implementing things bottom-up.
Both seem like common definitions of the term, but only one of them will _actually_ work here.
It has now happened a couple of times that it pops out novel results. In computational chemistry, machine-learned potentials trained with transformer models have already resulted in publishable new chemistry. Those papers aren't out yet, but expect them within a year.