Performance of the Python 3.14 tail-call interpreter
About a month ago, the CPython project merged a new implementation strategy for their bytecode interpreter. The initial headline results were very impressive, showing a 10-15% performance improvement on average across a wide range of benchmarks across a variety of platforms.
Unfortunately, as I will document in this post, these impressive performance gains turned out to be primarily due to inadvertently working around a regression in LLVM 19. When benchmarked against a better baseline (such GCC,...
Read more at blog.nelhage.com