News Score: Score the News, Sort the News, Rewrite the Headlines

Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective

Abstract: Large language model (LLM)-based embedding models, benefiting from large-scale pre-training and post-training, have begun to surpass BERT and T5-based models on general-purpose text embedding tasks such as document retrieval. However, a fundamental limitation of LLM embeddings lies in the unidirectional attention used during autoregressive pre-training, which misaligns with the bidirectional nature of text embedding tasks. To this end, we propose adopting di...
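The contrast between unidirectional and bidirectional attention mentioned in the abstract can be illustrated with a minimal sketch. This is not code from the paper; the names `attention_mask` and `visible_context` are hypothetical helpers chosen for illustration, and the masks are plain 0/1 lists rather than tensors from any particular library.

```python
def attention_mask(seq_len, causal):
    """Build a seq_len x seq_len 0/1 matrix.

    mask[i][j] == 1 means the token at position i may attend to position j.
    With causal=True, position i sees only positions 0..i (unidirectional).
    With causal=False, every position sees every other (bidirectional).
    """
    return [
        [1 if (not causal or j <= i) else 0 for j in range(seq_len)]
        for i in range(seq_len)
    ]


def visible_context(mask, i):
    """Return the positions that token i is allowed to attend to."""
    return [j for j, allowed in enumerate(mask[i]) if allowed]


# Illustrative 4-token sequence (the length is an arbitrary assumption).
causal_mask = attention_mask(4, causal=True)
bidirectional_mask = attention_mask(4, causal=False)

# Under the causal mask, the first token sees only itself, so its
# representation cannot reflect anything that comes later in the text.
first_token_causal = visible_context(causal_mask, 0)

# Under the bidirectional mask, the first token sees the whole sequence.
first_token_bidirectional = visible_context(bidirectional_mask, 0)
```

In this sketch, only the last position of the causal mask has access to the full sequence, while every position of the bidirectional mask does. This is the kind of mismatch the abstract describes for embedding tasks that need a representation of the whole text.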

Read more at arxiv.org
