Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning
Abstract: Large Language Models (LLMs) have demonstrated impressive capability in many natural language tasks. However, the auto-regressive generation process makes LLMs prone to producing errors, hallucinations, and inconsistent statements when performing multi-step reasoning. In this paper, we aim to alleviate this pathology by introducing Q*, a general, versatile, and agile framework for guiding the LLM decoding process with deliberative planning. By learning a plug-and-pla...
Read more at arxiv.org