Optimization adventures: making a parallel Rust workload 10x faster with (or without) Rayon | Blog | Guillaume Endignoux
In a previous post, I’ve shown how to use the rayon framework in Rust to automatically parallelize a loop computation across multiple CPU cores.
Disappointingly, my benchmarks showed that this only provided a 2x speedup for my workload, on a computer with 8 CPU threads.
Worse, the total “user” and “system” times increased linearly with the number of threads, meaning potentially more wasted work.
Even Python was only twice slower than my Rust code, when Rust is typically 10x to 100x faster than P...
Read more at gendignoux.com