Summing ASCII-encoded integers on Haswell at almost the speed of memcpy
“Print the sum of 50 million ASCII-encoded integers uniformly sampled from [0, 2³¹−1], separated by a single new line and sent to standard input.”
On the surface, a trivial problem. But what if you wanted to go as fast as possible?
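To make the "trivial" version concrete, here is a minimal baseline sketch in Python. It is not the solution this post describes, only the obvious byte-at-a-time approach that a fast solution has to beat. The function name `sum_ascii_integers` is hypothetical, and the sketch assumes the input contains nothing but decimal digits and `\n` separators.

```python
def sum_ascii_integers(data: bytes) -> int:
    """Sum newline-separated ASCII decimal integers, one byte at a time.

    Assumption: `data` holds only the bytes '0'..'9' and '\n'.
    """
    total = 0
    current = 0
    for byte in data:
        if byte == 0x0A:  # '\n' terminates the number being built
            total += current
            current = 0
        else:
            # '0' is 0x30, so subtracting it yields the digit's value
            current = current * 10 + (byte - 0x30)
    # Add the last number in case the input lacks a trailing newline
    return total + current
```

Fed from standard input, this would be called as `sum_ascii_integers(sys.stdin.buffer.read())`. Every byte passes through a data-dependent branch and a multiply, which is exactly the kind of per-byte work that stands between this baseline and memcpy-like throughput.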
I’m currently one of the top-ranked competitors in exactly that kind of challenge, and in this post I’ll show you a sketch of my best-performing solution. I’ll leave out some of the µoptimizations and look-up table generation to keep this post short, easier to understa...
Read more at mattstuchlik.com