How to inject knowledge efficiently? Knowledge Infusion Scaling Law for Pre-training Large Language Models
Abstract: Large language models (LLMs) have attracted significant attention due to their impressive general capabilities across diverse downstream tasks. However, without domain-specific optimization, they often underperform on specialized knowledge benchmarks and may even produce hallucinations. Recent studies show that strategically infusing domain knowledge during pretraining can substantially improve downstream performance. A critical challenge lies in balancing this i...