GitHub - o40/seesay: Live image description solution using ESP32-CAM + Phone + Server
Introduction
I wanted to see if I could create a low-cost tool that gives blind users a live description of the scene in front of a camera.
The idea is to have images taken at a set interval, which are then described using an AI model, and read back to the user using voice synthesis.
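The capture, describe, and speak loop described above can be sketched roughly as follows. This is only an illustrative outline, not the project's actual code: `run_pipeline`, `capture`, `describe`, `speak`, `interval_s`, and `max_frames` are all hypothetical names, and the three callables stand in for the camera, the AI model, and the voice synthesis respectively.

```python
import time
from typing import Callable, Optional


def run_pipeline(
    capture: Callable[[], bytes],       # hypothetical: returns one image (e.g. JPEG bytes)
    describe: Callable[[bytes], str],   # hypothetical: asks an AI model for a description
    speak: Callable[[str], None],       # hypothetical: reads the text aloud
    interval_s: float = 5.0,            # assumed interval between captures
    max_frames: Optional[int] = None,   # None means run until interrupted
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Capture a frame, describe it, speak the description, then wait."""
    frames = 0
    while max_frames is None or frames < max_frames:
        image = capture()
        text = describe(image)
        speak(text)
        frames += 1
        # Only wait if another frame will follow.
        if max_frames is None or frames < max_frames:
            sleep(interval_s)
    return frames


if __name__ == "__main__":
    # Stub components so the loop can be tried without any hardware or API key.
    run_pipeline(
        capture=lambda: b"fake-jpeg",
        describe=lambda img: f"image of {len(img)} bytes",
        speak=print,
        interval_s=0.1,
        max_frames=2,
    )
```

Passing the three stages in as callables keeps the loop independent of where each stage actually runs (camera board, phone, or server), so each one can be swapped or tested on its own.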
Since I was aiming for low cost (under $30) and wanted to learn more about software development on Arduino, I bought an ESP32-CAM with built-in WiFi to capture the images.
To describe the image I selected the gpt-4o-mini mode...