News Score: Score the News, Sort the News, Rewrite the Headlines

Bit-permuting 16 u32s at once with AVX-512

The basic trick to apply the same bit-permutation to each of the u32s is to view them as a matrix of 16 rows by 32 columns, transpose it into 32 u16s, permute those u16s in the same way that we wanted to permute the bits of the u32s [1], then transpose back to 16 u32s. Easy:

    __m512i permbits_16x32(__m512i data, __m512i indices)
    {
        __m512i x = data;
        x = transpose_16_dwords_to_32_words(x);
        x = _mm512_permutexvar_epi16(indices, x);
        x = transpose_32_words_to_16_dwords(x);
        return x;
    }

transpose_16_dwo...
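
To make the transpose / permute / transpose-back idea concrete, here is a minimal scalar sketch in portable C. It is only a reference model, not the post's AVX-512 implementation: the names `transpose_16x32_ref`, `transpose_32x16_ref` and `permbits_16x32_ref` are hypothetical, the transposes are naive bit loops, and the index array is assumed to follow the `_mm512_permutexvar_epi16` convention (result word j is taken from source word `indices[j] & 31`).

```c
#include <stdint.h>

/* Scalar reference model. All helper names here are hypothetical and are
   not taken from the original post. */

/* View 16 u32s as a 16x32 bit matrix and transpose it into 32 u16s:
   bit i of words[j] equals bit j of dwords[i]. */
static void transpose_16x32_ref(const uint32_t dwords[16], uint16_t words[32])
{
    for (int j = 0; j < 32; j++) {
        uint16_t w = 0;
        for (int i = 0; i < 16; i++)
            w |= (uint16_t)(((dwords[i] >> j) & 1u) << i);
        words[j] = w;
    }
}

/* Inverse transpose: bit j of dwords[i] equals bit i of words[j]. */
static void transpose_32x16_ref(const uint16_t words[32], uint32_t dwords[16])
{
    for (int i = 0; i < 16; i++) {
        uint32_t d = 0;
        for (int j = 0; j < 32; j++)
            d |= (uint32_t)((words[j] >> i) & 1u) << j;
        dwords[i] = d;
    }
}

/* Apply the same bit permutation to all 16 inputs:
   bit j of out[i] = bit (indices[j] & 31) of in[i]. */
void permbits_16x32_ref(const uint32_t in[16], const uint8_t indices[32],
                        uint32_t out[16])
{
    uint16_t words[32], permuted[32];

    transpose_16x32_ref(in, words);

    /* Word-level permute; assumed to mirror _mm512_permutexvar_epi16,
       where result word j comes from source word indices[j]. */
    for (int j = 0; j < 32; j++)
        permuted[j] = words[indices[j] & 31];

    transpose_32x16_ref(permuted, out);
}
```

With `indices[j] = (j + 8) & 31`, for example, every output is its input rotated right by 8 bits, which gives a quick sanity check for a vectorized version.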

Read more at bitmath.blogspot.com
