
GPT-5.2

(openai.com)
1019 points atgctg | 5 comments
tenpoundhammer ◴[] No.46236826[source]
I have been using ChatGPT a ton over the last few months and paying for the subscription. I used it for coding, news, stock analysis, daily problems, and whatever else I could think of. I decided to give Gemini a go when version three came out to great reviews. Gemini handles every single one of my use cases much better and consistently gives better answers. This is especially true for situations where searching the web for current information is important; it makes sense that Google would be better at that. OCR is also phenomenal: ChatGPT can't read my bad handwriting, but Gemini can, easily. The only downsides are in the polish department: there are more app bugs, and I usually have to leave the app open or the session terminates. There are bugs with uploading photos. My biggest complaint is that all links get routed through Google Search and I then have to manipulate them, when they should go directly to the chosen website; this has to be some kind of internal org KPI nonsense. Overall, my conclusion is that ChatGPT has lost and won't catch up because of Google's search integration strength.
replies(36): >>46236861 #>>46236896 #>>46236956 #>>46236971 #>>46236980 #>>46237123 #>>46237253 #>>46237258 #>>46237321 #>>46237407 #>>46237452 #>>46237531 #>>46237626 #>>46237654 #>>46237786 #>>46237888 #>>46237927 #>>46238237 #>>46238324 #>>46238527 #>>46238546 #>>46238828 #>>46239189 #>>46239400 #>>46239512 #>>46239719 #>>46239767 #>>46239999 #>>46240382 #>>46240656 #>>46240742 #>>46240760 #>>46240763 #>>46241303 #>>46241326 #>>46241523 #
dmd ◴[] No.46237258[source]
I consistently have exactly the opposite experience. ChatGPT seems extremely willing to do a huge number of searches, think about them, and then kick off more searches after that thinking, think about those, etc., etc., whereas Gemini seems extremely reluctant to do more than a couple of searches. ChatGPT is also willing to open up PDFs, screenshot them, OCR them, and use that as input, whereas Gemini just ignores them.
replies(5): >>46237338 #>>46237556 #>>46237747 #>>46240627 #>>46241115 #
nullbound ◴[] No.46237338[source]
I will say that it is wild, if not somewhat problematic, that two users have such disparate views of seemingly the same product. I say that, but then I remember my own experience from just a few days ago. I don't pay for Gemini, but I have a paid ChatGPT sub. I tested both for the same product with seemingly the same prompt, and the paid ChatGPT subjectively beat Gemini in terms of scope, options, and links with current decent deals.

It seems (only seems, because I have not gotten around to testing it in any systematic way) that some variables, like context and what the model knows about you, may actually influence the quality (or lack thereof) of the response.

replies(10): >>46237530 #>>46237782 #>>46238005 #>>46238426 #>>46238540 #>>46238609 #>>46238817 #>>46238824 #>>46239808 #>>46240331 #
1. blks ◴[] No.46238540[source]
Because neither product has any consistency in its results, no predictive behaviour. One day it performs well, another it hallucinates nonexistent facts and libraries. Those are stochastic machines.
replies(1): >>46238992 #
2. sendes ◴[] No.46238992[source]
I see the hyperbole is the point, but surely what these machines do is literally predict? The entire prompt engineering endeavour is to get them to predict better and more precisely. Of course, these are not perfect solutions; they are stochastic after all, just not unpredictably so.
replies(1): >>46240345 #
3. coliveira ◴[] No.46240345[source]
Prompt engineering is voodoo. There's no sure way to determine how well these models will respond to a question. Of course, giving additional information may be helpful, but even that is not guaranteed.
replies(2): >>46241557 #>>46241667 #
4. lossyalgo ◴[] No.46241557{3}[source]
Also every model update changes how you have to prompt them to get the answers you want. Setting up pre-prompts can help, but with each new version, you have to figure out through trial and error how to get it to respond to your type of queries.

I can't wait to see how badly my finally sort-of-working ChatGPT 5.1 pre-prompts work with 5.2.

Edit: How to talk to these models is actually documented, but you have to read through huge documents: https://cdn.openai.com/gpt-5-system-card.pdf

5. baq ◴[] No.46241667{3}[source]
It definitely isn’t voodoo, it’s more like forecasting weather. Some forecasts are easier to make, some are harder (it’ll be cold when it’s winter vs the exact location and wind speed of a tornado for an extreme example). The difference is you can try to mix things up in the prompt to maximize the likelihood of getting what you want out and there are feasibility thresholds for use cases, e.g. if you get a good answer 95% of the time it’s qualitatively different than 55%.
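
To put rough numbers on why 95% and 55% feel qualitatively different, here is a minimal sketch. It assumes every model call succeeds independently with the same probability, which is a simplification and not something measured in this thread; `chain_success` and `expected_attempts` are hypothetical helper names used only for illustration.

```python
def chain_success(p: float, steps: int) -> float:
    """Probability that `steps` calls in a row all succeed.

    Assumes each call succeeds independently with probability p.
    """
    return p ** steps


def expected_attempts(p: float) -> float:
    """Mean number of tries until the first good answer (geometric distribution)."""
    return 1.0 / p


if __name__ == "__main__":
    # The two success rates mentioned in the comment above.
    for p in (0.95, 0.55):
        print(
            f"p={p:.2f}  "
            f"10-step chain succeeds: {chain_success(p, 10):.3f}  "
            f"expected tries per good answer: {expected_attempts(p):.2f}"
        )
```

Under that independence assumption, a ten-step workflow built on 95% answers still finishes cleanly about 60% of the time, while the same workflow on 55% answers finishes well under 1% of the time. For a single question the gap is small (about one retry versus about two), so the threshold matters most once answers get chained or automated.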