Turing Complete Transformers: Two Transformers Are More Powerful...
Keywords: transformers, computational complexity, computation, generalization, agents, multi-model
TL;DR: We prove transformers are not Turing complete, propose a new architecture that is Turing complete, and empirically demonstrate that the new architecture can generalize more effectively than transformers.
Abstract: This paper presents Find+Replace transformers, a family of multi-transformer architectures that can provably do things no single transformer can, and which outperforms GPT-4 on sever...
Read more at openreview.net