The Top 101 Avx Open Source Projects on Github
Simd
⭐
1,389
C++ image processing and machine learning library using SIMD: SSE, AVX, AVX-512 for x86/x64, VMX(Altivec) and VSX(Power7) for PowerPC, NEON for ARM.
Simde
⭐
1,244
Implementations of SIMD instruction sets for systems which don't natively support them.
Xsimd
⭐
1,131
C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, NEON, AVX512)
Vc
⭐
1,091
SIMD Vector Classes for C++
Cglm
⭐
1,032
📽 Highly Optimized Graphics Math (glm) for C
Kfr
⭐
1,013
Fast, modern C++ DSP framework, FFT, Sample Rate Conversion, FIR/IIR/Biquad Filters (SSE, AVX, AVX-512, ARM NEON)
Directxmath
⭐
965
DirectXMath is an all inline SIMD C++ linear algebra library for use in games and graphics apps
Wheels
⭐
885
Performance-optimized wheels for TensorFlow (SSE, AVX, FMA, XLA, MPI)
Sha256 Simd
⭐
668
Accelerate SHA256 computations in pure Go using AVX512, SHA Extensions for x86 and ARM64 for ARM. On AVX512 it provides an up to 8x improvement (over 3 GB/s per core). SHA Extensions give a performance boost of close to 4x over native.
Libxsmm
⭐
579
Library for specialized dense and sparse matrix operations, and deep learning primitives.
Sleef
⭐
379
SIMD Library for Evaluating Elementary Functions, vectorized libm and DFT
Std Simd
⭐
346
std::experimental::simd for GCC [ISO/IEC TS 19570:2018]
Mipp
⭐
298
MIPP is a portable wrapper for SIMD instructions written in C++11. It supports NEON, SSE, AVX and AVX-512.
Bitmagic
⭐
295
BitMagic Library
Boost.simd
⭐
239
Boost SIMD
Hlslpp
⭐
201
Math library using hlsl syntax with SSE/NEON support
Ctranslate2
⭐
189
Fast inference engine for Transformer models
Osaca
⭐
182
Open Source Architecture Code Analyzer
Nsimd
⭐
178
Agenium Scale vectorization library for CPUs and GPUs
Hybridizer Basic Samples
⭐
166
Examples of C# code compiled to GPU by hybridizer
Eve
⭐
155
Expressive Velocity Engine - SIMD in C++ Goes Brrrr
Corrfunc
⭐
127
⚡️⚡️⚡️Blazing fast correlation functions on the CPU.
Turbo Base64
⭐
108
Turbo Base64 - Fastest Base64 SIMD/Neon/Altivec
Packettracer
⭐
105
SIMD-accelerated ray tracing in C#, powered by the Intel hardware intrinsics of .NET Core.
Despacer
⭐
92
C library to remove white space from strings as fast as possible
Image Processing Algorithm Speed
⭐
83
opencv
Penguinv
⭐
76
Simple and fast C++ image processing library with focus on heterogeneous systems
Dilithium
⭐
70
Fcml Lib
⭐
70
General purpose machine code manipulation library for x86-32 (IA-32) and x86-64 (AMD64) architectures (Assembler, Disassembler, Library).
Unisimd Assembler
⭐
68
SIMD macro assembler unified for ARM, MIPS, PPC and x86
Engraver
⭐
65
PoCC Burstcoin Reference Plotter
Chromium_clang
⭐
61
Chromium browser compiled with the Clang/LLVM compiler.
Intrinsicsplayground
⭐
59
My toys to play with SSE/AVX in pure C# (.NET Core 2.1)
Umesimd
⭐
56
UME::SIMD A library for explicit simd vectorization.
Asmc
⭐
50
Asmc Macro Assembler
Avx Memmove
⭐
41
Highly optimized versions of memmove, memcpy, memset, and memcmp supporting SSE4.2, AVX, AVX2, and AVX512
Sse Avx Rasterization
⭐
39
Triangle rasterization routines accelerated by SSE and AVX
Hamming_weight
⭐
36
C library to compute the Hamming weight of arrays
Fast Hex
⭐
30
Fast, SIMD hex string encoder and decoder C++ lib and Node.js module
Fast Filters
⭐
30
Implementation of FIR and IIR filters optimized for SIMD processing
Cex
⭐
28
The CEX Cryptographic library in C++
Peakperf
⭐
25
Achieve peak performance on x86 CPUs and NVIDIA GPUs
Cpuwhat
⭐
23
Nim utilities for advanced CPU operations: CPU identification, ISA extension detection, bindings to assorted intrinsics
Hpc
⭐
22
Learning and practice of high performance computing (CUDA, Vulkan, OpenCL, OpenMP, TBB, SSE/AVX, NEON, MPI, coroutines, etc. )
Ternary Logic
⭐
19
Support for ternary logic in SSE, XOP, AVX2 and x86 programs
Corium
⭐
18
Corium is a modern scripting language which combines simple, safe and efficient programming.
Go Memset
⭐
18
An efficient memset implementation for Golang.
Quadray Engine
⭐
18
Realtime raytracer using SIMD on ARM, MIPS, PPC and x86
Opal
⭐
17
SIMD C/C++ library for massive optimal sequence alignment (local/SW, infix, overlap, global)
Sol
⭐
14
[C99] A fast vector library with Nim bindings.
Oversimple
⭐
13
A library for audio oversampling, which tries to offer a simple api while wrapping HIIR, by Laurent De Soras, for minimum phase antialiasing, and r8brain-free-src, by Aleksey Vaneev, for linear phase antialiasing.
Latte
⭐
12
Latte is a convolutional neural network (CNN) inference engine written in C++ and uses AVX to vectorize operations. The engine runs on Windows 10, Linux and macOS Sierra.
Fastcode
⭐
8
A list of fast libraries, primarily x86/64 C++ and Node.js C++ extensions
Stkfmm
⭐
8
A C++ library for various Laplace/Stokes kernels
Optiflop
⭐
7
Optiflop measures the optimally achievable FLOPs for mathematical operations on various platforms.
Hexhamming
⭐
7
➗ SIMD-accelerated bitwise hamming distance Python module for hexadecimal strings
Rakau
⭐
7
C++17 N-body Barnes-Hut on heterogeneous hardware architectures
Simd_utils
⭐
7
A header only library implementing common mathematical functions using SIMD intrinsics
Litesimd
⭐
7
Litesimd is a no overhead, header only, C++ library for SIMD processing, specialized on SIMD comparison and data shuffle.
Simd Sse Avx Neon
⭐
6
Introduction to SIMD instructions, mainly SSE and AVX.
Balisc
⭐
6
A fresh (experimental) look at Scilab 6.x
Memcpy_benchmark
⭐
6
Benchmark to show which is the fastest memcpy.
Hiir
⭐
6
A header only ready to include mirror of the HIIR library by Laurent De Soras, an oversampling and Hilbert transform library in C++, with additional support for double precision on ARM AArch64 using Neon.
Minijson
⭐
5
Minify JSON files fast! Supports Comments. Uses D, C, and AVX2 and SSE4_1 SIMD.
Ugemm
⭐
5
GEMM
Simdx
⭐
4
🎹 Unified implementation of SIMD intrinsic functions and a fallback on hardware which doesn't natively support them.
2d Image Convolution Mpi Simd
⭐
3
2D Image Convolution in C using MPI and SIMD extensions.
Fastconv
⭐
3
fast 2D convolution implementation benchmark
Fast Bernoulli
⭐
3
Fast generation of long sequences of Bernoulli-distributed random variables
Tensorflow No Avx
⭐
3
TensorFlow compiled on CPU without AVX
Simd_neuralnet
⭐
2
Feed-forward neural network implementation in C with SIMD instructions
Vector
⭐
2
📐 Defines the properties of space, displacements, euclidean vector, vector algebra and Vector(2|3|4) of known cardinality
Math
⭐
2
Vector math library
Fast Utf8 Methods
⭐
2
Fast UTF-8 utility methods
Bilinear_filter_simd
⭐
2
Bilinear image filtering implemented with SSE4, AVX2 and AVX512.
Rle8
⭐
2
The fastest decoding Run-Length-Encoding on the Planet (for x64)
Avec
⭐
2
A little library for using SIMD instructions for x86 and ARM, wrapping Agner Fog's vectorclass for x86 and filling some of its functionality for ARM, and providing containers for aligned memory with views and interleaving/deinterleaving.
Mokka
⭐
2
Mokka is a minimal Inference Engine for Dense and Convolutional 2D Layer Neural Networks. Written in a single C++ header, it uses AVX2.
Hpc On Matrix
⭐
2
Implementing High Performance Computation on General Mathematics Operation
Intel Simd
⭐
1
⚡ Leverage the Intel vectorization techniques MMX, SSE2 and AVX to accelerate the conversion of YUV420 images into RGB images.
Smoljson
⭐
1
Blazing fast and light SIMD JSON parser in a few hundred lines of C code
Determinante Com Avx
⭐
1
Computes the determinant of a 4x4 matrix using AVX instructions.
Fingera
⭐
1
Study Assembly X64
⭐
1
Projects and annotations used to learn x64 assembly
Mdb
⭐
1
Framework for making computation on CPU
Proxasm
⭐
1
A C/x86 assembly implementation of proximal operators with SSE3/AVX SIMD instructions
Lattice
⭐
1
Vectorized primitives on Intel AVX/AVX2 for some Ring-LWE problems
Yuvconvert
⭐
1
Library for optimized RGB to/from YUV conversions.
Pxart
⭐
1
pXart: Packed Extensions for Advanced Random Techniques: C++ Library and Applications for Random Number Generators
Pffft Double
⭐
1
A fork of Julien Pommier's PFFFT - a pretty fast FFT - that adds support for double precision floating-point numbers using AVX instructions.
Floating
⭐
1
Basic digital paint program using C, xcb and libtiff (to save the images) supporting a Wacom tablet and stylus in addition to a mouse
Cse260 Matmul Avx
⭐
1
Portable_simd
⭐
1
Testing a SIMD API from VecCore VecGeom, using backends of UMESIMD, VC for AVX, AVX2, AVX512, SSE, SSE2
Poly1305
⭐
1
An AVX/AVX2/x64 implementation of the Poly1305 MAC for Golang. [Deprecated].
Vectorized_types
⭐
1
A small C++ library for easy explicit compile time vectorization. WIP.
Matrix Matrix Multiply
⭐
1
Algorithms for matrix matrix multiplication, dgemm, AVX-256, AVX-512
Iris
⭐
1
Software implementation of ARM and x86 SIMD intrinsics
Landofinstructionsets
⭐
1
Here, I mainly work on a console frontend known as SDECF (SDE Console Frontend) for Intel's instruction set emulator, SDE, which is found up-to-date at Intel's dev/software main page at https://software.intel.com/content/www/us/en/develop/articles/intel-software-development-emulator.html if it's not currently up to date in my release builds. More projects or repos may come soon...
Matrix Multiplication Simd Intrinsics And Fpu
⭐
1
NxN Matrix Multiplication using SIMD with Intrinsics (MMX, SSE, SSE2, AVX, etc.) and FPU as inline ASM in C
Streamvbyte Simdgo
⭐
0
A Stream VByte implementation in Go leveraging SIMD techniques
1-100 of 101 projects
