Researchers Discover In-Context Learning in Masked Language Models, Challenging GPT Dominance

BERTs are Generative In-Context Learners

View PDF HTML (experimental) Abstract:While in-context learning is commonly associated with causal language models, such as GPT, we demonstrate that this capability also 'emerges' in masked language models. Through an embarrassingly simple inference technique, we enable an existing masked model, DeBERTa, to perform generative tasks without additional training or architectural changes. Our evaluation reveals that the masked and causal language models behave very differently, as they clearly outpe...