
GPT-5.2 (openai.com)
1019 points by atgctg | 1 comment
tenpoundhammer ◴[] No.46236826[source]
I have been using ChatGPT a ton over the last few months and paying for the subscription. I used it for coding, news, stock analysis, daily problems, and whatever else I could think of. I decided to give Gemini a go when version three came out to great reviews. Gemini handles every single one of my use cases much better and consistently gives better answers. This is especially true for situations where searching the web for current information is important; it makes sense that Google would be better there. The OCR is also phenomenal: ChatGPT can't read my bad handwriting, but Gemini can easily.

The only downsides are in the polish department: there are more app bugs, and I usually have to leave the app open or the session terminates. There are bugs with uploading photos. My biggest complaint is that all links get routed through Google Search and I then have to manipulate them when they should go directly to the chosen website; this has to be some kind of internal org KPI nonsense. Overall, my conclusion is that ChatGPT has lost and won't catch up, because of Google's strength in search integration.
replies(36): >>46236861 #>>46236896 #>>46236956 #>>46236971 #>>46236980 #>>46237123 #>>46237253 #>>46237258 #>>46237321 #>>46237407 #>>46237452 #>>46237531 #>>46237626 #>>46237654 #>>46237786 #>>46237888 #>>46237927 #>>46238237 #>>46238324 #>>46238527 #>>46238546 #>>46238828 #>>46239189 #>>46239400 #>>46239512 #>>46239719 #>>46239767 #>>46239999 #>>46240382 #>>46240656 #>>46240742 #>>46240760 #>>46240763 #>>46241303 #>>46241326 #>>46241523 #
dmd ◴[] No.46237258[source]
I consistently have exactly the opposite experience. ChatGPT seems extremely willing to do a huge number of searches, think about them, and then kick off more searches after that thinking, think about it, etc., etc. whereas it seems like Gemini is extremely reluctant to do more than a couple of searches. ChatGPT also is willing to open up PDFs, screenshot them, OCR them and use that as input, whereas Gemini just ignores them.
replies(5): >>46237338 #>>46237556 #>>46237747 #>>46240627 #>>46241115 #
nullbound ◴[] No.46237338[source]
I will say that it is wild, if not somewhat problematic, that two users have such disparate views of seemingly the same product. I say that, but then I remember my own experience from just a few days ago. I don't pay for Gemini, but I do have a paid ChatGPT sub. I tested both on the same product search with seemingly the same prompt, and the subscribed ChatGPT subjectively beat Gemini in terms of scope, options, and links with decent current deals.

It seems (only seems, because I have not gotten around to testing it in any systematic way) that some variables, like context and what the model knows about you, may actually influence the quality (or lack thereof) of the response.

replies(10): >>46237530 #>>46237782 #>>46238005 #>>46238426 #>>46238540 #>>46238609 #>>46238817 #>>46238824 #>>46239808 #>>46240331 #
jhancock ◴[] No.46238426[source]
I can use GPT one day and the next get a different experience with the same problem space. Same with Gemini.
replies(2): >>46238643 #>>46238793 #
4ndrewl ◴[] No.46238643[source]
This is by design, given a non-deterministic application?
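
[Editor's note: to illustrate the "non-deterministic by design" point, here is a minimal sketch, not from the thread, of temperature-scaled sampling, which is one reason the same prompt can yield different answers. The function name and the toy logits are made up for illustration; production systems add further sources of variance on top of this.]

    import math
    import random

    def sample_with_temperature(logits, temperature, rng):
        # Scale logits by temperature, then softmax into probabilities.
        scaled = [l / temperature for l in logits]
        m = max(scaled)
        exps = [math.exp(s - m) for s in scaled]
        total = sum(exps)
        probs = [e / total for e in exps]
        # Draw one index from the resulting categorical distribution.
        r = rng.random()
        cumulative = 0.0
        for i, p in enumerate(probs):
            cumulative += p
            if r <= cumulative:
                return i
        return len(probs) - 1

    # Toy logits for three hypothetical next tokens (illustrative values only).
    logits = [2.0, 1.5, 0.3]
    rng = random.Random()

    # Repeated runs over the *same* input pick different tokens whenever
    # temperature > 0, and those differences compound over a long generation.
    print([sample_with_temperature(logits, 1.0, rng) for _ in range(10)])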
replies(1): >>46238690 #
jhancock ◴[] No.46238690[source]
Sure. It may be more than that... possibly due to variable operating parameters on the servers and current load.

On the whole, if I compare my AI assistant to a human office worker, I get more variance from the AI than I would from the human.

replies(2): >>46238899 #>>46239210 #
astrange ◴[] No.46239210{3}[source]
They're not really capable of producing varying answers based on load.

But they are capable of producing different answers because they feel like behaving differently if the current date is a holiday, and things like that. They're basically just little guys.
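
[Editor's note: a minimal sketch of what this comment is gesturing at. Providers typically inject the current date into the system prompt; the exact wording below is an assumption, but the effect is that identical user prompts are conditioned on different contexts on different days, so the model can legitimately answer differently.]

    from datetime import date

    def build_system_prompt(today: date) -> str:
        # Hypothetical preamble; real products inject something along these lines.
        return f"You are a helpful assistant. Current date: {today.isoformat()}."

    # The same user question, conditioned on two different dates, gives the
    # model two different contexts, so its sampled answer can differ.
    print(build_system_prompt(date(2025, 12, 25)))  # a holiday
    print(build_system_prompt(date(2025, 12, 26)))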