Optimizing DGEMM on Intel CPUs with AVX512F

Stepwise optimizations of DGEMM on CPU, eventually outperforming Intel MKL, even under multithreading.

Alternatives to Optimizing DGEMM on Intel CPUs with AVX512F

John
Stars: 9,572; Most Recent Commit: 6 days ago; Open Issues: 509; Language: C
John the Ripper jumbo - advanced offline password cracker, which supports hundreds of hash and cipher types, and runs on many operating systems, CPUs, GPUs, and even some FPGAs

Laser
Stars: 262; Most Recent Commit: 6 months ago; Latest Release: December 13, 2023; Open Issues: 18; License: apache-2.0; Language: Nim
The HPC toolbox: fused matrix multiplication, convolution, data-parallel strided tensor primitives, OpenMP facilities, SIMD, JIT Assembler, CPU detection, state-of-the-art vectorized BLAS for floats and integers

Corrfunc
Stars: 156; Used By (downloads, repos, or packages): 12; Most Recent Commit: 6 months ago; Total Releases: 16; Latest Release: October 03, 2023; Open Issues: 41; License: mit; Language: C
⚡️⚡️⚡️ Blazing fast correlation functions on the CPU.

Cpp_parallelization_examples
Stars: 149; Most Recent Commit: 6 years ago; Open Issues: 1; License: mit; Language: C++
The study & production material for https://www.youtube.com/watch?v=Pc8DfEyAxzg

Grid
Stars: 142; Most Recent Commit: 6 months ago; Open Issues: 45; License: gpl-2.0; Language: C++
Data parallel C++ mathematical object library

Cuda_test
Stars: 91; Most Recent Commit: a year ago; Open Issues: 1; Language: C++
CUDA/SIMD/AssemblyLanguage/OpenMP/Eigen's usage

Rv
Stars: 89; Most Recent Commit: 10 months ago; Open Issues: 15; License: other; Language: C++
RV: A Unified Region Vectorizer for LLVM

John Packages
Stars: 69; Most Recent Commit: 5 months ago; Open Issues: 1; License: gpl-2.0; Language: Shell
Community packages of John the Ripper, the auditing tool and advanced offline password cracker (Docker images, Windows PortableApp, Mac OS, Flatpak, and Ubuntu SNAP packages)

Shortcut Comparison
Stars: 65; Most Recent Commit: 4 years ago; Open Issues: 1; License: mit; Language: Rust
Performance comparison of parallel Rust and C++

Parallelreductionsbenchmark
Stars: 58; Most Recent Commit: 7 months ago; Open Issues: 1; Language: C++
Thrust, CUB, TBB, AVX2, CUDA, OpenCL, OpenMP, SyCL - all it takes to sum a lot of numbers fast!