I built a vulnerable app and spent $1,500 seeing if LLMs could hack it
As part of my work, I do security research on various apps and websites. I wanted to see whether LLMs could reproduce a common class of exploits I’ve found in multiple apps.
I built a fake React Native app with Expo and a backend in Python. It’s a book review app, and the goal is to find a flag in a user’s private reviews.
If you would like to try solving it yourself before I spoil it, here’s a ZIP of the APK and challenge description each LLM was fed.
It looks like this:
Full exploit details (spoilers...
Read more at kasra.blog