Awesome Open Source

Search results for sse avx

130 search results found

Tensorflow Windows Wheel ⭐ 3,522

Tensorflow prebuilt binary for Windows

Embree ⭐ 2,201

Embree ray tracing kernels repository.

Simde ⭐ 2,054

Implementations of SIMD instruction sets for systems which don't natively support them.

Xsimd ⭐ 2,034

C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, AVX512, NEON, SVE)

📽 Highly Optimized Graphics Math (glm) for C

C++ image processing and machine learning library using SIMD: SSE, AVX, AVX-512, AMX for x86/x64, VMX (Altivec) and VSX (Power7) for PowerPC, NEON for ARM.

Fast, modern C++ DSP framework, FFT, Sample Rate Conversion, FIR/IIR/Biquad Filters (SSE, AVX, AVX-512, ARM NEON)

Directxmath ⭐ 1,482

DirectXMath is an all inline SIMD C++ linear algebra library for use in games and graphics apps

SIMD Vector Classes for C++

Libsimdpp ⭐ 1,064

Portable header-only C++ low level SIMD library

Include binary files in C/C++

Fast Base64 stream encoder/decoder in C99, with SIMD acceleration

Libxsmm ⭐ 789

Library for specialized dense and sparse matrix operations, and deep learning primitives.

Expressive Vector Engine - SIMD in C++ Goes Brrrr

Fastnoisesimd ⭐ 604

C++ SIMD Noise Library

BLAKE2 official implementations

FlyCV is a high-performance library for computer vision tasks.

Std Simd ⭐ 467

std::experimental::simd for GCC [ISO/IEC TS 19570:2018]

MIPP is a portable wrapper for SIMD instructions written in C++11. It supports NEON, SSE, AVX, AVX-512 and SVE (length specific).

Math library using hlsl syntax with SSE/NEON support

A CPU tool for benchmarking peak floating-point performance

Beaengine ⭐ 330

BeaEngine disasm project

Standard Raxml ⭐ 298

Sse Popcount ⭐ 297

SIMD (SSE) population count --- http://0x80.pl/articles/sse-popcount.html

Turbo Run Length Encoding ⭐ 275

TurboRLE - Fastest Run Length Encoding

Turbo Base64 ⭐ 253

Turbo Base64 - Fastest Base64 SIMD:SSE/AVX2/AVX512/Neon/Altivec - Faster than memcpy!

Libfmsynth ⭐ 251

A C library which implements an FM synthesizer

Blake2b Simd ⭐ 230

Fast hashing using pure Go implementation of BLAKE2b with SIMD instructions

Tensorflow Build ⭐ 225

TensorFlow binaries supporting AVX, FMA, SSE

Fast log and exp functions for x86/x64 SSE

Faster Utf8 Validator ⭐ 185

A very fast library for validating UTF-8 using AVX2/SSE4 instructions

Fastgltf ⭐ 182

A modern C++17 glTF 2.0 library focused on speed, correctness, and usability

Despacer ⭐ 141

C library to remove white space from strings as fast as possible

Sse4 Strstr ⭐ 130

SIMD (SWAR/SSE/SSE4/AVX2/AVX512F/ARM Neon) implementations of a modified Karp-Rabin algorithm

Chromium_clang ⭐ 129

Chromium browser compiled with the Clang/LLVM compiler.

Tensorflow Wheels ⭐ 123

Tensorflow Wheels

Penguinv ⭐ 117

Computer vision library with focus on heterogeneous systems

Tensorflow Optimized Wheels ⭐ 116

TensorFlow wheels built for latest CUDA/CuDNN and enabled performance flags: SSE, AVX, FMA; XLA

Xilinx Tiny Cnn ⭐ 115

Magnum Singles ⭐ 103

Single-header libraries from the Magnum engine

A C# SIMD math library for use with Unity only, substantially extending Unity.Mathematics with new types and functions, using Unity.Burst.

Base64simd ⭐ 95

Base64 coding and decoding with SIMD instructions (SSE/AVX2/AVX512F/AVX512BW/AVX512VBMI/ARM Neon)

Basicbitmap ⭐ 92

Simple, high-performance, platform-independent Bitmap class (34% faster than GDI/GDI+, 40% faster than DDraw)

Roaring Node ⭐ 89

Roaring for NodeJS

C++ library for detecting CPU capabilities

Go Cv Simd ⭐ 87

Low level image processing library in pure Go with SIMD assembly

Embree Renderer ⭐ 86

Embree Example Renderer

NTT-based Fast Lattice library

Unisimd Assembler ⭐ 83

SIMD macro assembler unified for ARM, MIPS, PPC and x86

Image Processing Algorithm Speed ⭐ 83

Fcml Lib ⭐ 81

A general purpose machine code manipulation library for x86-32 (IA-32) and x86-64 (AMD64) architectures (Assembler, Disassembler, Library).

Mandelbrotsse ⭐ 79

Real-time Mandelbrot zoom via SSE, AVX, OpenMP, CUDA, XaoS...

Normaldist Benchmark ⭐ 78

Normally Distributed Random Number Generator Benchmark

Cppspmd_fast ⭐ 77

Optimized CppSPMD test project: macro control flow, SSE4.1/AVX1/AVX2/AVX2 FMA support

Sliceslice Rs ⭐ 75

A fast implementation of single-pattern substring search using SIMD acceleration.

Triple_accel ⭐ 73

Rust edit distance routines accelerated using SIMD. Supports fast Hamming, Levenshtein, restricted Damerau-Levenshtein, etc. distance calculations and string search.

Optimized functions for Go using SIMD

Simd_utils ⭐ 65

A header only library implementing common mathematical functions using SIMD intrinsics

JIT Assembler Library for multiple ISAs

Fastest CPU (AVX/SSE) RGB to grayscale: 2-4x faster than OpenCV. For image processing/computer vision.

Computer Vision package in pure Go taking advantage of SIMD acceleration

Intrinsicsplayground ⭐ 59

My toys to play with SSE/AVX in pure C# (.NET Core 2.1)

Implementation of argon2 (i, d, id) algorithms with CPU dispatching

PeTar is a high-performance N-body code for modelling the evolution of star clusters and tidal streams, including the effects of the galactic potential, the dynamics of binary and hierarchical systems, and single and binary stellar evolution.

Cute Nucleotides ⭐ 53

Cute tricks for SIMD vectorized binary encoding and decoding of nucleotides, in Rust.

A C++ deep learning framework for beginners to read and learn

Pleasant Nim bindings for SIMD instruction sets.

Argminmax ⭐ 48

Efficient argmin & argmax

Base64 encoding / decoding with SIMD-support, also base64Url

Learning and practice of high performance computing (CUDA, Vulkan, OpenCL, OpenMP, TBB, SSE/AVX, NEON, MPI, coroutines, etc.)

The fastest way to xor bytes in Go

Chacha Opt ⭐ 42

Optimized block functions for the ChaCha stream cipher

Turbo Histogram ⭐ 42

Fastest Histogram Construction

Optimized Recursive Bilateral Filter

Intrinsic ⭐ 41

Provide Golang native SIMD intrinsics on x86/amd64 platform

Positional Popcount ⭐ 40

Fast C functions for computing the positional popcount (pospopcnt).

X86intrin ⭐ 40

x86 intrinsics for rust

Poly1305 Opt ⭐ 38

Optimized implementations of Poly1305, a fast message-authentication-code

Turbo Transpose ⭐ 38

Transpose: SIMD Integer+Floating Point Compression Filter

Hamming_weight ⭐ 36

C library to compute the Hamming weight of arrays

Lms Intrinsics ⭐ 32

A package that enables the use of SIMD x86 instructions in the Lightweight Modular Staging Framework (LMS).

Memory Bandwidth Demo ⭐ 32

An attempt at achieving the theoretical best memory bandwidth of my machine.

Fast Filters ⭐ 30

Implementation of FIR and IIR filters optimized for SIMD processing

Nbody6ppgpu ⭐ 30

FFT (Fast Fourier Transform): SSE, AVX, AVX2

Digitviewer ⭐ 28

y-cruncher's Digit Viewer

Improved Linux BURST plotter/optimizer/miner

High-performance hex encoding and decoding for .NET

SIMD C/C++ library for massive optimal sequence alignment (local/SW, infix, overlap, global)

Nim utilities for advanced CPU operations: CPU identification, ISA extension detection, bindings to assorted intrinsics

Quadray Engine ⭐ 23

Realtime raytracer using SIMD on ARM, MIPS, PPC and x86

Cpufeature ⭐ 21

Python module for detection of CPU features

An easy way to run BioNano genomic analysis

JSONPath Streaming with Bit-Parallel Fast-Forwarding

Ternary Logic ⭐ 19

Support for ternary logic in SSE, XOP, AVX2 and x86 programs

Go Memset ⭐ 18

An efficient memset implementation for Golang.

Intelintrinsics ⭐ 17

Burstsoftware ⭐ 17

Oversimple ⭐ 16

A library for audio oversampling, which tries to offer a simple api while wrapping HIIR, by Laurent De Soras, for minimum phase antialiasing, and r8brain-free-src, by Aleksey Vaneev, for linear phase antialiasing.

1-100 of 130 search results
