423 points sohkamyung | 1 comments
iainctduncan No.45670881
I'm curious how many people have actually taken the time to compare AI summaries with the sources they summarize. I did for a few and ... it was really bad. In my experience, they don't summarize at all; they do a random condensation, which is not the same thing at all. In one instance I looked at, a key takeaway in the result was the opposite of what it should have been. I don't trust them at all now.
1. dcre No.45671541
In my experience there is a big difference between good models and weak ones. Quick test with this long article I read recently: https://www.lawfaremedia.org/article/anna--lindsey-halligan-...

The command I ran was `curl -s https://r.jina.ai/https://www.lawfaremedia.org/article/anna-... | cb | ai -m gpt-5-mini summarize this article in one paragraph`. r.jina.ai pulls the text as markdown, cb just wraps it in a ``` code fence, and ai is my own LLM CLI: https://github.com/david-crespo/llm-cli.
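
For anyone who wants to see what the first two stages of that pipeline amount to, here is a rough Python sketch. It is not how `cb` or `ai` are actually implemented. The function names, the whitespace stripping, and the order of fenced text and instruction in the prompt are all assumptions made for illustration; only the `https://r.jina.ai/` prefix and the fence-wrapping step come from the command above. The model call itself is left out.

```python
import urllib.request

# From the command above: prefixing a URL with r.jina.ai returns the page text as markdown.
READER_PREFIX = "https://r.jina.ai/"
# Built this way so the example does not contain a literal fence inside a fenced block.
FENCE = "`" * 3


def reader_url(article_url: str) -> str:
    """Build the reader URL the same way the curl command does: prefix + original URL."""
    return READER_PREFIX + article_url


def fence(text: str) -> str:
    """Rough stand-in for `cb`: wrap text in a code fence.
    Stripping surrounding whitespace is an assumption, not documented behavior."""
    return f"{FENCE}\n{text.strip()}\n{FENCE}"


def build_prompt(article_markdown: str, instruction: str) -> str:
    """Combine the fenced article with the instruction.
    Putting the article first is an assumption about how the input is assembled."""
    return f"{fence(article_markdown)}\n\n{instruction}"


def fetch_markdown(article_url: str, timeout: float = 30.0) -> str:
    """Fetch the markdown rendering of a page (network call; the service may
    rate-limit or reject some clients, so treat failures as expected)."""
    with urllib.request.urlopen(reader_url(article_url), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling `build_prompt(fetch_markdown(url), "summarize this article in one paragraph")` gives the text you would then hand to whatever model client you use.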

All of them seem pretty good to me, though at 6 cents, using Sonnet regularly for this purpose would be excessive. Note that reasoning was on the default setting in each case. I think that means the GPT-5 mini one did no reasoning but the other two did.

GPT-5 one paragraph: https://gist.github.com/david-crespo/f2df300ca519c336f9e1953...

GPT-5 three paragraphs: https://gist.github.com/david-crespo/d68f1afaeafdb68771f5103...

GPT-5 mini one paragraph: https://gist.github.com/david-crespo/32512515acc4832f47c3a90...

GPT-5 mini three paragraphs: https://gist.github.com/david-crespo/ed68f09cb70821cffccbf6c...

Sonnet 4.5 one paragraph: https://gist.github.com/david-crespo/e565a82d38699a5bdea4411...

Sonnet 4.5 three paragraphs: https://gist.github.com/david-crespo/2207d8efcc97d754b7d9bf4...