News Score: Score the News, Sort the News, Rewrite the Headlines

Grammars of Formal Uncertainty: When to Trust LLMs in Automated Reasoning Tasks

Abstract: Large language models (LLMs) show remarkable promise for democratizing automated reasoning by generating formal specifications. However, a fundamental tension exists: LLMs are probabilistic, while formal verification demands deterministic guarantees. This paper addresses this epistemological gap by comprehensively investigating failure modes and uncertainty quantification (UQ) in LLM-generated formal artifacts. Our systematic evaluation of five frontier LLMs...

Read more at arxiv.org
