Awesome Open Source

Search results for avx avx512

50 search results found

Simdjson ⭐ 18,377

Parsing gigabytes of JSON per second: used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks

Asm Dude ⭐ 4,081

Visual Studio extension for assembly syntax highlighting and code completion in assembly files and the disassembly window

Highway ⭐ 3,041

Performance-portable, length-agnostic SIMD with runtime dispatch

Simde ⭐ 2,054

Implementations of SIMD instruction sets for systems which don't natively support them.

Xsimd ⭐ 2,034

C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, AVX512, NEON, SVE)

C++ image processing and machine learning library using SIMD: SSE, AVX, AVX-512, AMX for x86/x64, VMX (AltiVec) and VSX (Power7) for PowerPC, NEON for ARM.

Fast, modern C++ DSP framework, FFT, Sample Rate Conversion, FIR/IIR/Biquad Filters (SSE, AVX, AVX-512, ARM NEON)

Croaring ⭐ 1,382

Roaring bitmaps in C (and C++), with SIMD (AVX2, AVX-512 and NEON) optimizations: used by Apache Doris, ClickHouse, and StarRocks

SIMD Vector Classes for C++

Libsimdpp ⭐ 1,064

Portable header-only C++ low level SIMD library

Simdutf ⭐ 868

Unicode routines (UTF8, UTF16, UTF32): billions of characters per second using SSE2, AVX2, NEON, AVX-512. Part of Node.js and Bun.

Sha256 Simd ⭐ 838

Accelerate SHA256 computations in pure Go using AVX512, SHA Extensions for x86 and ARM64 for ARM. On AVX512 it provides up to an 8x improvement (over 3 GB/s per core). SHA Extensions give a performance boost of close to 4x over native.

Libxsmm ⭐ 789

Library for specialized dense and sparse matrix operations, and deep learning primitives.

X86 Simd Sort ⭐ 731

C++ template library for high performance SIMD based sorting algorithms

SIMD Library for Evaluating Elementary Functions, vectorized libm and DFT

Simsimd ⭐ 514

Vector Similarity Functions 3x-200x Faster than SciPy and NumPy, for Python, JavaScript, and C11, supporting f64, f32, f16, i8, and binary vectors using SIMD for both x86 AVX2 & AVX-512 and Arm NEON & SVE 📐

Std Simd ⭐ 467

std::experimental::simd for GCC [ISO/IEC TS 19570:2018]

MIPP is a portable wrapper for SIMD instructions written in C++11. It supports NEON, SSE, AVX, AVX-512 and SVE (length specific).

A CPU tool for benchmarking peak floating-point performance

Sse Popcount ⭐ 297

SIMD (SSE) population count: http://0x80.pl/articles/sse-popcount.html

Open Source Architecture Code Analyzer

Turbo Base64 ⭐ 253

Turbo Base64 - Fastest Base64 SIMD:SSE/AVX2/AVX512/Neon/Altivec - Faster than memcpy!

Libpopcnt ⭐ 234

🚀 Fast C/C++ bit population count library

Hybridizer Basic Samples ⭐ 220

Examples of C# code compiled to GPU by hybridizer

Agenium Scale vectorization library for CPUs and GPUs

Corrfunc ⭐ 156

⚡️⚡️⚡️Blazing fast correlation functions on the CPU.

Base64 Avx512 ⭐ 139

Code for paper "Base64 encoding and decoding at almost the speed of a memory copy"

Sse4 Strstr ⭐ 130

SIMD (SWAR/SSE/SSE4/AVX2/AVX512F/ARM Neon) implementations of a modified Karp-Rabin algorithm

Md5 Simd ⭐ 123

Accelerate aggregated MD5 hashing performance up to 8x for AVX512 and 4x for AVX2. Useful for server applications that need to compute many MD5 sums in parallel.

YASK (Yet Another Stencil Kit): a domain-specific language and framework to create high-performance stencil code for implementing finite-difference methods and similar applications.

Base64simd ⭐ 95

Base64 coding and decoding with SIMD instructions (SSE/AVX2/AVX512F/AVX512BW/AVX512VBMI/ARM Neon)

RV: A Unified Region Vectorizer for LLVM

Unisimd Assembler ⭐ 83

SIMD macro assembler unified for ARM, MIPS, PPC and x86

Fcml Lib ⭐ 81

A general purpose machine code manipulation library for x86-32 (IA-32) and x86-64 (AMD64) architectures (Assembler, Disassembler, Library).

Awesome Simd ⭐ 71

A curated list of awesome SIMD frameworks, libraries and software

Libtwiddle ⭐ 57

Bit and sketch data structures

UME::SIMD: a library for explicit SIMD vectorization.

The CEX Cryptographic library in C++

Radar Electrooptical Simulation ⭐ 50

(REOS) Radar and Electro-Optical Simulation Framework written in C++.

Cryptogams ⭐ 49

CRYPTOGAMS distribution repository

Argminmax ⭐ 48

Efficient argmin & argmax

Object file converter. This utility can be used for converting object files between COFF/PE, OMF, ELF and Mach-O formats for all 32-bit and 64-bit x86 platforms. It can modify symbol names in object files; build, modify and convert function libraries across platforms; and dump object files and executable files. It also includes a very good disassembler supporting the SSE4, AVX, AVX2, AVX512, FMA3, FMA4, XOP and Knights Corner instruction sets.

Radar_electrooptical_simulation ⭐ 44

(REOS) Radar and ElectroOptical Simulation Framework written in Fortran.

Fast C++ function "is_utf8": checks if the input is valid UTF-8. Made of a single source file. Optimized for ARM NEON, x64 SSE, AVX2 and AVX-512.

Positional Popcount ⭐ 40

Fast C functions for computing the positional popcount (pospopcnt).

Instlatx64_demo ⭐ 39

InstLatX64_Demo

A collection of SIMD (AVX2 & AVX512) accelerated mathematical functions for .NET

Libalgebra ⭐ 28

Fast C header-only library for popcnt, pospopcnt, and set algebraic operations

Ckb Miner ⭐ 26

CKB miner for AVX2 CPUs, AVX-512 CPUs and GPUs

Vpu Count ⭐ 25

Information about AVX-512 support on recent Intel processors

Hypersonic Rle Kit ⭐ 24

The fastest Run-Length-Encoding on the Planet (for x64)

Paddedmatrices.jl ⭐ 24

This library provides arrays with columns padded to be a multiple of SIMD-vector width.

Quadray Engine ⭐ 23

Realtime raytracer using SIMD on ARM, MIPS, PPC and x86

Intel Sde Flops ⭐ 22

Computing FLOPs with Intel Software Development Emulator (Intel SDE)

DR3 enables users to write vectorised code using generic lambdas and filters. Switch instruction sets just by changing the enclosing namespace.

Ternary Logic ⭐ 19

Support for ternary logic in SSE, XOP, AVX2 and x86 programs

Std_find_simd ⭐ 19

SIMD version of std::find

Ultra Sort ⭐ 18

DSL for SIMD Sorting on AVX2 & AVX512

Fastimplementation Bilateralfilter ⭐ 16

Avx Hole ⭐ 15

AVX-Hole C++ SIMD Library

Simd Byte Lookup ⭐ 13

SIMDized check which bytes are in a set

Avx 512 Sort ⭐ 12

Fast AVX512 (AVX-512) quicksort + bitonic sort.

Stormbitmaps ⭐ 11

Fast algorithms for computing XX^T for binary matrices

Parsing Int Series ⭐ 11

Parse multiple decimal integers separated by arbitrary number of delimiters

Optiflop ⭐ 11

Optiflop measures the optimally achievable FLOPs for mathematical operations on various platforms.

Libflagstats ⭐ 11

Efficient C functions to compute the summary statistics (flagstats) for sequencing read sets.

A multi-arch library implementing the Argon2 password hashing algorithm.

Bilinear_filter_simd ⭐ 8

Bilinear image filtering implemented with SSE4, AVX2 and AVX512.

A list of fast libraries, primarily x86/64 C++ and Node.js C++ extensions

5g Simd Ldpc ⭐ 8

SIMD-LDPC based on 5G New Radio supporting SSE, AVX2 and AVX512.

Ternarylogiccli ⭐ 8

CLI utility to work out proper constants for the vpternlog instruction

C++17 N-body Barnes-Hut on heterogeneous hardware architectures

Avx512test ⭐ 7

Utility that was used to generate initial Go AVX-512 encoder test suite.

Intrinsics Viewer ⭐ 7

x86-64, ARM, and RVV intrinsics viewer

Force2vec ⭐ 6

Implementation of Force2Vec method for ICDM 2020 paper titled "Force2Vec: Parallel force-directed graph embedding"

SYCL accelerated BLAKE3 Hash Implementation

Memcpy_benchmark ⭐ 6

Benchmark to show which is the fastest memcpy.

Vectorizedkernel ⭐ 5

Running GPGPU-like kernels on CPU with auto-vectorization for SSE/AVX/AVX512 SIMD Architectures

Copyright 2018-2024 Awesome Open Source. All rights reserved.