MapBench: New Dataset Challenges AI's Map-Reading Skills, Reveals Limitations in Spatial Reasoning

Can Large Vision Language Models Read Maps Like a Human?

Abstract: In this paper, we introduce MapBench, the first dataset specifically designed for human-readable, pixel-based, map-based outdoor navigation, curated from complex path finding scenarios. MapBench comprises over 1600 pixel space map path finding problems from 100 diverse maps. In MapBench, LVLMs generate language-based navigation instructions given a map image and a query with beginning and end landmarks. For each map, MapBench provides a Map Space Scene Graph (MS...