GitHub - NVIDIA/nv-ingest: NVIDIA Ingest is an early access set of microservices for parsing hundreds of thousands of complex, messy unstructured PDFs and other enterprise documents into metadata and text to embed into retrieval systems.
NVIDIA-Ingest: Multi-modal data extraction
NVIDIA-Ingest is a scalable, performance-oriented microservice for extracting document content and metadata. It supports parsing PDFs, Word documents, and PowerPoint documents, and it uses specialized NVIDIA NIM microservices to find, contextualize, and extract text, tables, charts, and images for use in downstream generative applications.
NVIDIA Ingest enables parallelization of the process of splitting documents into pages where contents are classified (as tabl...
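The page-level parallelism described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names (`split_into_pages`, `classify_page`, `process_document`), the form-feed page delimiter, the keyword-based classifier, and the label set (borrowed from the content types named earlier) are hypothetical and are not part of the NVIDIA Ingest API.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical labels, borrowed from the content types named in the
# description above; this is not the service's actual schema.
CONTENT_TYPES = ("table", "chart", "image", "text")


def split_into_pages(document):
    """Toy splitter: treat form-feed characters as page breaks."""
    return document.split("\f")


def classify_page(page):
    """Toy classifier: a keyword match stands in for a real model call."""
    lowered = page.lower()
    for content_type in CONTENT_TYPES[:-1]:
        if content_type in lowered:
            return content_type
    return "text"  # fallback when no keyword matches


def process_document(document, max_workers=4):
    """Split a document into pages and classify the pages in parallel."""
    pages = split_into_pages(document)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in input order, so labels line up with pages.
        labels = list(pool.map(classify_page, pages))
    # Pair each label with its 1-based page number.
    return list(zip(range(1, len(pages) + 1), labels))


if __name__ == "__main__":
    sample = "Intro paragraph\fTable 1: revenue\fChart of growth"
    for page_number, label in process_document(sample):
        print(page_number, label)
```

A thread pool suits this sketch because a real classifier would most likely be a network call to a separate service, which is I/O-bound; a CPU-bound classifier would be better served by a process pool.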