Researchers Develop Framework to Quantify Uncertainty in LLM-Generated Formal Specifications, Reducing Errors by up to 100%

Grammars of Formal Uncertainty: When to Trust LLMs in Automated Reasoning Tasks

Abstract: Large language models (LLMs) show remarkable promise for democratizing automated reasoning by generating formal specifications. However, a fundamental tension exists: LLMs are probabilistic, while formal verification demands deterministic guarantees. This paper addresses this epistemological gap by comprehensively investigating failure modes and uncertainty quantification (UQ) in LLM-generated formal artifacts. Our systematic evaluation of five frontier LLMs...