"GitHub Introduces Zerox, Advanced OCR Technology Leveraging GPT-4o-mini, Providing High-Quality Document Processing at a Competitive Price"

GitHub - getomni-ai/zerox: Zero shot pdf OCR with gpt-4o-mini

Zerox OCR

A dead simple way of OCR-ing a document for AI ingestion. Documents are meant to be a visual representation after all. With weird layouts, tables, charts, etc. The vision models just make sense!

The general logic:

- Pass in a PDF (URL or file buffer)
- Turn the PDF into a series of images
- Pass each image to GPT and ask nicely for Markdown
- Aggregate the responses and return Markdown

Sounds pretty basic! But with the gpt-4o-mini this method is price competitive with existing products, with m...
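
The general logic above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Zerox's actual code or API: `ocr_pdf_to_markdown`, `render_pages`, and `image_to_markdown` are hypothetical names, and the PDF renderer and the vision-model call are passed in as plain callables so the flow runs without any external service.

```python
from typing import Callable, Iterable, List


def ocr_pdf_to_markdown(
    pdf_bytes: bytes,
    render_pages: Callable[[bytes], Iterable[bytes]],
    image_to_markdown: Callable[[bytes], str],
    separator: str = "\n\n",
) -> str:
    """Hypothetical sketch: PDF -> page images -> Markdown per page -> one document."""
    pages: List[str] = []
    for image in render_pages(pdf_bytes):
        # Assumption: in a real setup this callable would send the image to a
        # vision model (for example gpt-4o-mini) with a prompt asking for Markdown.
        markdown = image_to_markdown(image).strip()
        if markdown:  # skip pages that came back empty
            pages.append(markdown)
    # Aggregate the per-page responses into a single Markdown string.
    return separator.join(pages)


# Stand-ins (not real renderers or models) so the sketch runs as written.
def fake_render(pdf_bytes: bytes) -> List[bytes]:
    return [b"page-1", b"page-2"]


def fake_model(image: bytes) -> str:
    return f"# {image.decode()}\n"


result = ocr_pdf_to_markdown(b"%PDF-", fake_render, fake_model)
print(result)
```

Passing the renderer and the model call in as arguments keeps the aggregation step independent of any particular PDF library or model provider, so either one can be swapped out later.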