GitHub - iterative/datachain: AI-data warehouse to enrich, transform and analyze data from cloud storages
DataChain
DataChain is a modern Pythonic data-frame library designed for artificial intelligence.
It is made to organize your unstructured data into datasets and wrangle it at scale on
your local machine. DataChain does not abstract or hide AI models and API calls; instead, it helps integrate them into the postmodern data stack.
Key Features
📂 Storage as a Source of Truth.
Process unstructured data without redundant copies from S3, GCP, Azure, and local
file systems.
Multimodal data support: ima...