
268 points lermontov | 3 comments
mmastrac ◴[] No.41906276[source]
I started a quick transcription here -- not enough time to complete more than half the first column, but some scans and very rough OCR are here if anyone is interested in contributing:

https://github.com/mmastrac/gibbet-hill

Top and bottom halves of the page in the repo here:

https://github.com/mmastrac/gibbet-hill/blob/main/scan-1.png

https://github.com/mmastrac/gibbet-hill/blob/main/scan-2.png

EDIT: If you have access to a multi-modal LLM, the rough transcription + the column scan and the instruction to "OCR this text, keep linebreaks" gives a _very good_ result.

EDIT 2: Rough draft, needs some proofreading and corrections:

https://github.com/mmastrac/gibbet-hill/blob/main/story.md

replies(5): >>41906561 #>>41907098 #>>41907235 #>>41908097 #>>41908454 #
1. simonw ◴[] No.41906561[source]
I tried extracting the content with Google Gemini 1.5 Pro 002 via https://aistudio.google.com/ - the first page (scan-2) worked fantastically well, the second page not so much. Here's what I got so far: https://gist.github.com/simonw/ba87f507ef5c11d3335959c055533...
replies(1): >>41906687 #
2. mmastrac ◴[] No.41906687[source]
I cropped the columns out into six files -- it might have an easier time with these:

https://github.com/mmastrac/gibbet-hill/blob/main/col-1-a.pn...

replies(2): >>41907087 #>>41907203 #
3. reaperducer ◴[] No.41907203[source]
…and my wife's Halloween present has been printed.

Tip: Load the pngs into Preview, hit "Auto Levels," and crank up "Sharpness" on each one. Looks pretty good!
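
For anyone without Preview, here is a rough sketch of what those two adjustments amount to on a grayscale scan. It is an assumption that Preview's "Auto Levels" behaves like a simple linear contrast stretch and that "Sharpness" behaves like an unsharp mask; the function names `auto_levels` and `sharpen` are made up for illustration, and the image is just a list of rows of 0-255 values. In practice you would likely reach for an imaging library (Pillow's `ImageOps.autocontrast` and `ImageEnhance.Sharpness` cover similar ground) rather than pure Python.

```python
def auto_levels(pixels, low=0, high=255):
    """Linear contrast stretch: darkest value maps to `low`, brightest to `high`.

    Assumption: this only approximates what an "Auto Levels" button does.
    """
    flat = [v for row in pixels for v in row]
    darkest, brightest = min(flat), max(flat)
    if brightest == darkest:
        # A flat image has no contrast to stretch; return an unchanged copy.
        return [row[:] for row in pixels]
    scale = (high - low) / (brightest - darkest)
    return [[round(low + (v - darkest) * scale) for v in row] for row in pixels]


def sharpen(pixels, amount=1.0):
    """Unsharp mask using a 3x3 box blur, with edge pixels clamped.

    Each pixel is pushed away from its neighbourhood mean by `amount`.
    """
    height, width = len(pixels), len(pixels[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect the 3x3 neighbourhood, clamping coordinates at the borders.
            neighbours = [
                pixels[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            blurred = sum(neighbours) / 9
            value = pixels[y][x] + amount * (pixels[y][x] - blurred)
            # Keep the result inside the valid 0-255 range.
            row.append(max(0, min(255, round(value))))
        out.append(row)
    return out


# A faded scan: values bunched between 100 and 200 instead of 0 and 255.
faded = [[100, 150], [200, 125]]
stretched = auto_levels(faded)
crisper = sharpen(stretched, amount=1.0)
```

Stretching first and sharpening second mirrors the order in the tip. Sharpening a low-contrast scan first would mostly amplify noise.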