Intel Silently Merges New AVX-512 Quicksort Library Into NumPy, Up To 17x Improvement

Jason R. Wilson
Image source: NumPy via J.Wilson, Wccftech.

NumPy, short for Numerical Python, is a Python library for scientific computing. It has recently integrated Intel's C++ header file library for AVX-512 quicksort, and the integration makes SIMD-based sorting ten to seventeen times faster.

NumPy adopts Intel's AVX-512 quicksort to speed up SIMD-based sorting

NumPy, a Python library, is described as providing:

...a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more.

— according to the official NumPy project website.

Intel published x86-simd-sort on the company's GitHub as a C++ header file library for high-performance SIMD sorting. Raghuveer Devulapalli, an Intel engineer, was instrumental in integrating the x86-simd-sort code into NumPy. For now, however, the library only provides an AVX-512 implementation of quicksort.
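Because the library currently targets AVX-512 only, readers may want to check whether their own CPU reports AVX-512 support. The following is a minimal sketch, assuming a Linux system that exposes CPU feature flags on a "flags" line in /proc/cpuinfo. The helper names avx512_features and read_cpuinfo are hypothetical and are not part of NumPy or x86-simd-sort.

```python
def avx512_features(cpuinfo_text):
    """Return the sorted AVX-512 feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Assumption: the kernel lists CPU features on a line keyed "flags".
        if sep and key.strip() == "flags":
            return sorted(f for f in value.split() if f.startswith("avx512"))
    return []


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the cpuinfo file, returning an empty string where it is unavailable."""
    try:
        with open(path, encoding="utf-8") as fh:
            return fh.read()
    except OSError:
        return ""


if __name__ == "__main__":
    found = avx512_features(read_cpuinfo())
    print("AVX-512 flags:", ", ".join(found) if found else "none reported")
```

An empty result only means that no AVX-512 flags were reported through this file. On other operating systems a different detection method would be needed.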

[The new x86-simd-sort is a] C++ header file library for SIMD based 16-bit, 32-bit and 64-bit data type sorting on x86 processors. Source header files are available in src directory. We currently only have AVX-512 based implementation of quicksort. This repository also includes a test suite which can be built and run to test the sorting algorithms for correctness. It also has benchmarking code to compare its performance relative to std::sort.

Michael Larabel, Linux analyst and editor of the website Phoronix, reports that the results are highly favorable: sorting with AVX-512 improved the project's sorting performance by ten to seventeen times.

Larabel notes that PR 22315 was introduced into NumPy and "vectorized the quicksort for 16-bit and 64-bit data types" using AVX-512. He adds that on Tiger Lake-based systems, specifically those using the 11th Gen Tiger Lake i7-1165G7, 16-bit int sorting saw the largest speedup (seventeen times faster), while 64-bit float sorting saw the smallest (ten times faster). Sorting of 32-bit data types and random arrays improved by twelve to thirteen times. You can see the results of the benchmarks here.
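To get a feel for how such per-data-type numbers are measured, here is a rough timing sketch. It is not the benchmark used in the PR or by Phoronix. The helper names time_sort and run_benchmark, the array size and the repeat count are all assumptions chosen for illustration. Whether the AVX-512 code path is actually used depends on the installed NumPy build and the CPU, so reproducing the reported speedups would mean comparing NumPy builds from before and after the merge on the same AVX-512 machine.

```python
import timeit

import numpy as np


def time_sort(dtype, size=100_000, repeats=5, seed=0):
    """Return the best wall-clock time in seconds to sort one random array."""
    rng = np.random.default_rng(seed)
    if np.issubdtype(dtype, np.integer):
        info = np.iinfo(dtype)
        # Draw uniformly over the full range of the integer type.
        data = rng.integers(info.min, info.max, size=size,
                            dtype=dtype, endpoint=True)
    else:
        data = rng.random(size).astype(dtype)
    # np.sort returns a sorted copy, so each run sorts the same unsorted input.
    runs = timeit.repeat(lambda: np.sort(data, kind="quicksort"),
                         number=1, repeat=repeats)
    return min(runs)


def run_benchmark(dtypes=(np.int16, np.float32, np.float64), **kwargs):
    """Time np.sort for each dtype and return a {dtype name: seconds} dict."""
    return {np.dtype(dt).name: time_sort(dt, **kwargs) for dt in dtypes}


if __name__ == "__main__":
    print("NumPy", np.__version__)
    for name, seconds in run_benchmark().items():
        print(f"{name:>8}: {seconds * 1e3:.3f} ms")
```

Taking the minimum over several repeats is a common way to reduce noise from other processes. Absolute times will differ from machine to machine.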

News Sources: Phoronix, Intel GitHub 1, 2

About the author: Jason R. Wilson is a member of the Hardware news team at Wccftech. Equipped with a background in graphic design and writing, Jason works daily to improve his craft and continues to create new and innovative ideas every day.