PDF to Text, a challenging problem
(www.marginalia.nu)
357 points | ingve | 1 comment | 13 May 25 15:01 UTC
xnx | 13 May 25 15:44 UTC | No. 43974208 | >>43973721 (OP)
Weird that there's no mention of LLMs in this article even though the article is very recent. LLMs haven't solved every OCR/document data extraction problem, but they've dramatically improved the situation.
replies(5): >>43974229 >>43974325 >>43974337 >>43974562 >>43975686
constantinum | 13 May 25 17:50 UTC | No. 43975686 | >>43974208
True indeed, but there are a few problems: hallucinations and trusting the output (validation). More here:
https://unstract.com/blog/why-llms-struggle-with-unstructure...