How to inject knowledge efficiently? Knowledge Infusion Scaling Law for Pre-training Large Language Models
Abstract: Large language models (LLMs) have attracted significant attention due to their impressive general capabilities across diverse downstream tasks. However, without domain-specific optimization, they often underperform on specialized knowledge benchmarks and may even produce hallucinations. Recent studies show that strategically infusing domain knowledge during pretraining can substantially improve downstream performance. A critical challenge lies in balancing this i...