There is also https://ds4sd.github.io/docling/ from IBM Research, which is MIT-licensed and tracks bounding boxes in a rich JSON format.
replies(1):
If it's superior (esp. for scans with text flowing around image boxes), and if you do end up packaging it up for brew, know that there's at least one developer who will benefit from your work (for a side-project, but that goes without saying).
Thanks in advance!