Interesting. Who is using this service, let alone paying for it, and why?
1. If you are on device, use on-device OCR (e.g. call Apple Vision directly).
2. If you are in the cloud, self-host an OCR model.
3. If you are in the browser, run a WASM build or another locally deployed OCR model.
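
For point 1, here is roughly what "use Apple Vision directly" looks like in Swift. This is a minimal sketch, not a production setup: the function name `recognizeText(in:)` is my own, it assumes you already have a `CGImage`, and it assumes a recent SDK where `VNRecognizeTextRequest.results` is typed as text observations.

```swift
import CoreGraphics
import Vision

/// Hypothetical helper: runs Apple's on-device text recognizer on an image
/// and returns the top candidate string for each detected line of text.
func recognizeText(in image: CGImage) throws -> [String] {
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate    // .fast trades accuracy for speed
    request.usesLanguageCorrection = true

    // perform(_:) is synchronous; in an app, call this off the main thread.
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])

    // Each observation is one detected text region; keep its best candidate.
    let observations = request.results ?? []
    return observations.compactMap { $0.topCandidates(1).first?.string }
}
```

Everything runs locally, so there is no per-request cost and the image never leaves the device.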