
858 points colesantiago | 5 comments
supernova87a No.45109304
By the way, a pet peeve of mine right now is that reporters covering court cases (and we have so many of public interest lately) never seem to simply paste the link to the online PDF decision/ruling for us all to read, right in the story. (and another user here kindly did that for us below: https://storage.courtlistener.com/recap/gov.uscourts.dcd.223... )

It seems such a simple step (they must have had the ruling PDF in front of them to write the story), yet why does it always seem such a hassle for them to link the original document? At this point I would rather read the ruling itself, probably dozens of pages with the full details, than hear it secondhand from a reporter. It feels like they want to be the gatekeepers of information, and poor ones at that.

In fact, I think it should be adopted as standard journalistic practice: reporting on a court ruling must come with a link to the ruling PDF.

Aside from that, it will be interesting to see on what grounds the judge decided that this particular data sharing remedy was the solution. Can anyone now simply claim they're a competitor and get access to Google's tons of data?

I am not too familiar with antitrust precedent, but to what extent does the judge rule on how specific the data sharing needs to be (what types of data, for what time span, how anonymized, etc.), or appoint a special master? Why is that up to the judge rather than the FTC or whoever to propose?

replies(34)
Workaccount2 No.45109532
"Never link outside your domain" has been rule #1 of the ad-driven business for years now.

Once users leave your page, they become exponentially less likely to load more ad-ridden pages from your website.

Ironically this is also why there is so much existential fear about AI in the media. LLMs will do to them what they do to primary sources (and more likely just cut them out of the loop). This Google story will get a lot of clicks. But it is easy to see a near future where an AI agent just retrieves and summarizes the case for you. And does a much better job too.

replies(8)
bc569a80a344f9c No.45110067
> But it is easy to see a near future where an AI agent just retrieves and summarizes the case for you. And does a much better job too.

I am significantly less confident that an LLM is going to be any good at putting a raw source like a court ruling PDF into context: adequately explaining to readers why the decision matters, which details matter, and what impact it will have. It can probably do an OK job summarizing the document, but not much more.

I do agree that given current trends there is going to be a significant impact on journalism, and I don't like that future at all. Particularly because we won't just have worse reporting; we won't have any investigative journalism, which is funded by the ads on relatively cheap "reporting only" stories. There's a reason we call the press the fourth estate, and we will be much poorer without it.

There's an argument to be made that the press has recently put itself into this position and hasn't done a great job, but I still think it's going to be a rather great loss.

replies(6)
1. halJordan No.45110912
LLMs are already great at contextualizing and explaining things. HN is so allergic to AI it's incredible. And it's leaving you behind.
replies(5)
2. bongodongobob No.45111058
I think it's a mix of shortsightedness and straight up denial. A lot of people on here were the smart nerdy kid. They are good at programming or electronics or whatever. It became their identity and they are fuckin scared that the one thing they can do well will be taken away rather than putting the new tool in their toolbox.
3. bc569a80a344f9c No.45111167
They are. I use LLMs. But they need to be given context, which is easy for things that are already on the Internet for them to pull from. When people stop writing news articles that connect events to one another, LLMs have nothing to pull into their context. They are not capable of connecting two random sources on their own.

Edit: also, the primary point is that if everyone uses LLMs for reporting, the loss of revenue will cause the disappearance of the investigative journalism that revenue funds, which LLMs sure as fuck aren't going to do.

replies(1)
4. ragequittah No.45111561
Is this article investigative? Summarizing the court case PDF is trivial for an LLM, and most will probably do a better job than the linked article, the main difference being that you won't be bombarded with ads and other nonsense (at least for now). Hell, I wouldn't be surprised if the reporter had an LLM summarize the case before they wrote the article.

Content that can't be easily made by an LLM will still be worth something. But go to most news sites and their content is mostly summarization of someone else's content. LLMs may make that a hard sell.
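To be concrete about what "summarize the PDF" reduces to, here is a minimal sketch. The prompt wording and the 12,000-character budget are my own illustrative assumptions, and the PDF-to-text extraction and the actual model call are deliberately omitted; the point is just that the model only "knows" whatever is pasted into its context.

```python
def build_summary_prompt(ruling_text: str, max_chars: int = 12_000) -> str:
    """Wrap extracted ruling text in a summarization instruction,
    truncated to a context budget. Anything cut off here is invisible
    to the model, so a "trivial" summary can silently drop details."""
    excerpt = ruling_text[:max_chars]
    return (
        "Summarize the holdings and remedies in the court ruling below. "
        "Rely only on the document; do not add outside information.\n\n"
        + excerpt
    )

# Usage with placeholder text standing in for the extracted ruling:
ruling = "MEMORANDUM OPINION AND ORDER. The Court finds... " * 1000
prompt = build_summary_prompt(ruling)
```

The truncation step is where the earlier objection bites: a ruling longer than the budget gets clipped, and the reader has no way to know which details never reached the model.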

5. grues-dinner No.45113026
The problem I may have with using an LLM for this is that I am not already familiar with the subject in detail and won't know when the thing has:

* Strayed from reality

* Strayed from the document and started freely admixing other information from its training data without saying so. Done properly, this is a powerful tool for synthesis, and LLMs are theoretically great at it, but done improperly it just muddles things

* Baked in some kind of bias: "in summary, this ruling is an example of judicial overreach by activist judges against a tech company which should morally be allowed to do what it wants". Not such a problem now, but I think we may see more of this once AI is firmly embedded into every information flow. Currently the AI companies' game is training people to trust the machine. Once they do, what a resource those people become!

Now, none of those points are unique to LLMs: inaccuracy, misunderstanding, wrong or confused synthesis and especially bias are all common in human journalism. Gell-Mann amnesia and institutional bias and all that.

Perhaps the problem is that I'm not sufficiently mistrustful of the status quo, even though I am already quite suspicious of journalistic analysis. Or maybe it's that AI, even though my brain screams "don't trust it, check everything, find the source", stays in the toolbox when I find problems, whereas for a journalist I'd roll my eyes, call them a hack, and leave the website.

Not that it's directly relevant to the immediate utility of AI today, but once AI is everything, or almost everything, then my next worry is what happens when you functionally only have published primary material and AI output to train on. Even without model collapse, what happens when AI journobots inherently don't "pick up the phone", so to speak, to dig up details? For the first year, the media runs almost for free. For the second year, there's no higher level synthesis for the past year to lean on and it all regresses to summarising press releases. Again, there are already many human publications that just repackage PRs, but when that's all there is? This problem isn't limited to journalism, but it's a good example.