What an understatement. It has me thinking „man, fuck this“ on the daily.
Just today it spontaneously lost an entire 20-30 minutes long thread and it was far from the first time. It basically does it any time you interrupt it in any way. It’s straight up data loss.
It’s kind of a typical Google product in that it feels more like a tech demo than a product.
It has theoretically great tech. I particularly like the idea of voice mode, but it’s noticeably glitchy, breaks spontaneously often and keeps asking annoying questions which you can’t make it stop.
And the UI lack of polish shows up freshly every time a new feature lands too - the "branch in new chat" feature is really finicky still, getting stuck in an unusable state if you twitch your eyebrows at wrong moment.
With Gemini, it will send as soon as I stop to think. No way to disable that.
To posit a scenario: I would expect General Motors to buy some Ford vehicles to test and play around with and use. There's always stuff to learn about what the competition has done (whether right, wrong, or indifferent).
But I also expect the parking lots used by employees at any GM design facility in the world to be mostly full of General Motors products, not Fords.
That's sometimes me with the CLI. I can't use the Gemini CLI right now on Windows (in the Terminal app), because trying to copy in multiple lines of text for some reason submits them separately and it just breaks the whole thing. OpenCode had the same issue but even worse, it quite after the first line or something and copied the text line by line into the shell, thank fuck I didn't have some text that mentions rm -rf or something.
More info: https://github.com/google-gemini/gemini-cli/issues/14735#iss...
At the same time, neither Codex CLI, nor Claude Code had that issue (and both even showed shortened representations of copied in text, instead of just dumping the whole thing into the input directly, so I could easily keep writing my prompt).
So right now if I want to use Gemini, I more or less have to use something like KiloCode/RooCode/Cline in VSC which are nice, but might miss out on some more specific tools. Which is a shame, because Gemini is a really nice model, especially when it comes to my language, Latvian, but also your run of the mill software dev tasks.
In comparison, Codex feels quite slow, whereas Claude Code is what I gravitate towards most of the time but even Sonnet 4.5 ends up being expensive when you shuffle around millions of tokens: https://news.ycombinator.com/item?id=46216192 Cerebras Code is nice for quick stuff and the sheer amount of tokens, but in KiloCode/... regularly messes up applying diff based edits.
I think you'd be surprised about the vehicle makeup at Big 3 design facilities.
it's like the client, not the server, is responsible for writing to my conversation history or something
Copilot Chat has been perfect in this respect. It's currently GPT 5.0, moving to 5.1 over the next month or so, but at least I've never lost an (even old) conversation since those reside in an Exchange mailbox.
works great for kicking off a request and closing tab or navigating away to another page in my app to do something.
i dont understand why model providers dont build this resilient token streaming into all of their APIs. would be a great feature
But voice is not a huge traffic funnel. Text is. And the verdict is more or less unanimous at this time. Gemini 3.0 has outdone ChatGPT. I unsubscribed from GPT plus today. I was a happy camper until the last month when I started noticing deplorable bugs.
1. The conversation contexts are getting intertwined.Two months ago, I could ask multiple random queries in a conversation and I would get correct responses but the last couple of weeks, it's been a harrowing experience having to start a new chat window for almost any change in thread topic. 2. I had asked ChatGPT to once treat me as a co-founder and hash out some ideas. Now for every query - I get a 'cofounder type' response. Nothing inherently wrong but annoying as hell. I can live with the other end of the spectrum in which Claude doesn't remember most of the context.
Now that Gemini pro is out, yes the UI lacks polish, you can lose conversations, but the benefits of low latency search and a one year near free subscription is a clincher. I am out of ChatGPT for now, 5.2 or otherwise. I wish them well.
https://www.caranddriver.com/news/a62694325/ford-ceo-jim-far...
Codex is decent and seemed to be improving (being written in rust helps). Claude code is still the king, but my god they have server and throttling issues.
Mixed bag wherever you go. As model progress slows / flatlines (already has?) I’m sure we’ll see a lot more focus and polish on the interfaces.
People who can’t understand that many people actually prefer iOS use this green/blue thing to explain the otherwise incomprehensible (to them) phenomenon of high iOS market share. “Nobody really likes iOS, they just get bullied at school if they don’t use it”.
It’s just “wake up sheeple” dressed up in fake morality.
Same way many professional airplane mechanics fly commercial rather than building their own plane. Just because your job is in tech doesn’t mean you have to be ultra-haxxor with every single device in your life.
I use a modeling software called Rhino on wine on Linux. In the past, there was an incident where I had to copy an obscure dll that couldn't be delivered by wine or winetricks from a working Windows installation to get something to work. I did so and it worked. (As I recall this was a temporary issue, and was patched in the next release of wine.)
I hate the wine standard file picker, it has always been a persistent issue with Rhino3d. So I keep banging my head on trying to get it to either perform better or make a replacement. Every few months I'll get fed up and have a minute to kill, so I'll see if some new approach works. This time, ChatGPT told me to copy two dll's from a working windows installation to the System folder. Having precedent that this can work, I did.
Anyway, it borked startup completely and it took like an hour to recover. What I didn't consider - and I really, really should have - was that these were dll's that were ALREADY IN the system directory, and I was overwriting the good ones with values already reflecting my system with completely foreign ones.
And that's the critical difference - the obscure dll that made the system work that one time was because of something missing. This time was overwriting extant good ones.
But the fact that the LLM even suggested (without special prompting) to do something that I should have realized was a stupid idea with a low chance of success made me very wary of the harm it could cause.
'Oh, that super annoying issue? Yeah, it's been there for years. We just don't do that.'
Fundamentally though, browsing the web on iOS, even with a custom "browser" with adblocking, feels like going back in time 15 years.
> ...that the LLM even suggested (without special prompting) to do something that I should have realized was a stupid idea with a low chance of success...
Since you're using other models instead, do you believe they cannot give similarly stupid ideas?
The MSRP of your phone does not matter.
Until you queried I had forgotten to mention that the same day I was trying to work out a Linux system display issue and it very confidently suggested to remove a package and all its dependencies, which would have removed all my video drivers. On reading the output of the autoremove command I pointed out that it had done this, and the model spat out an "apology" and owned up to ** the damage it would have wreaked.
** It can't "apologize" for or "own up" to anything, it can just output those words. So I hope you'll excuse the anthropomorphization.
And I've parked in the lot of shame at a Ford plant, as an outsider, in my GMC work truck -- way over there.
It wasn't so bad. A bit of a hike to go back and get a tool or something, but it was at least paved...unlike the non-union lot I'm familiar with at a P&G facility, which is a gravel lot that takes crossing a busy road to get to, lacks the active security and visibility from the plant that the union lot has, and which is full of tall weeds. At P&G, I half-expect to come back and find my tires slashed.
Anyway, it wasn't barren over there in the not-Ford lot, but it wasn't nearly so populous as the Ford lot was. The Ford-only lot is bigger, and always relatively packed.
It was very clear to me that the lots (all of the lots, in aggregate) were mostly full of Fords.
To bring this all back 'round: It is clear to me that Ford employees broadly (>50%) drive Fords to work at that plant.
---
It isn't clear to me at all that Google Pixel developers don't broadly drive iPhones. As far as I can tell, that status (which is meme-level in its age at this point) is true, and they aren't broadly making daily use of the systems they build.
(And I, for one, can't imagine spending 40 hours a week developing systems that I refuse to use. I have no appreciation for that level of apparent arrogance, and I hope to never be suaded to be that way. I'd like to think that I'd be better-motivated to improve the system than I would be to avoid using it and choose a competitor instead.
I don't shit where I sleep.)