News Score: Score the News, Sort the News, Rewrite the Headlines

Can Large Vision Language Models Read Maps Like a Human?

Abstract: In this paper, we introduce MapBench, the first dataset specifically designed for human-readable, pixel-based map-based outdoor navigation, curated from complex path finding scenarios. MapBench comprises over 1600 pixel space map path finding problems from 100 diverse maps. In MapBench, LVLMs generate language-based navigation instructions given a map image and a query with beginning and end landmarks. For each map, MapBench provides Map Space Scene Graph (MS...

Read more at arxiv.org
