Awesome Open Source

Search results for python quantization

239 search results found

Chinese Llama Alpaca ⭐ 15,877

Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment

Llama Factory ⭐ 10,715

Easy-to-use LLM fine-tuning framework (LLaMA, BLOOM, Mistral, Baichuan, Qwen, ChatGLM)

Pinto_model_zoo ⭐ 3,121

A repository for storing models that have been inter-converted between various frameworks. Supported frameworks are TensorFlow, PyTorch, ONNX, OpenVINO, TFJS, TFTRT, TensorFlowLite (Float32/16/INT8), EdgeTPU, CoreML.

Pretrained Language Model ⭐ 2,912

Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.

Repvgg ⭐ 2,882

RepVGG: Making VGG-style ConvNets Great Again

Deepsparse ⭐ 2,729

Sparsity-aware deep learning inference runtime for CPUs

Pocketflow ⭐ 2,553

An Automatic Model Compression (AutoMC) framework for developing smaller and faster AI applications.

Pytorch Playground ⭐ 2,366

Base pretrained models and datasets in pytorch (MNIST, SVHN, CIFAR10, CIFAR100, STL10, AlexNet, VGG16, VGG19, ResNet, Inception, SqueezeNet)

Mixtral Offloading ⭐ 1,943

Run Mixtral-8x7B models in Colab or consumer desktops

Optimum ⭐ 1,908

🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy to use hardware optimization tools

Neural Compressor ⭐ 1,773

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

Vector Quantize Pytorch ⭐ 1,627

Vector Quantization, in Pytorch

Paddleslim ⭐ 1,486

PaddleSlim is an open-source library for deep model compression and architecture search.

Model Optimization ⭐ 1,445

A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.

Mmrazor ⭐ 1,231

OpenMMLab Model Compression Toolbox and Benchmark.

Intel Extension For Pytorch ⭐ 1,161

A Python package that extends the official PyTorch to easily obtain better performance on Intel platforms

Brevitas ⭐ 1,015

Brevitas: neural network quantization in PyTorch

PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.

Neural Network Compression Framework for enhanced OpenVINO™ inference

Deepvac ⭐ 618

PyTorch Project Specification.

Deep Compression Alexnet ⭐ 599

Deep Compression on AlexNet

Kill The Bits ⭐ 582

Code for: "And the bit goes down: Revisiting the quantization of neural networks"

QKeras: a quantization deep learning library for Tensorflow Keras

Training of Locally Optimized Product Quantization (LOPQ) models for approximate nearest neighbor search of high dimensional data in Python and Spark.

Complete Life Cycle Of A Data Science Project ⭐ 499

Complete-Life-Cycle-of-a-Data-Science-Project

Squeezellm ⭐ 486

SqueezeLLM: Dense-and-Sparse Quantization

A Pytorch Knowledge Distillation library for benchmarking and extending works in the domains of Knowledge Distillation, Pruning, and Quantization.

Awesome Deep Neural Network Compression ⭐ 475

Summary, Code for Deep Neural Network Quantization

Omniquant ⭐ 464

[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.

Onnx2tf ⭐ 461

Self-Created Tools to convert ONNX files (NCHW) to TensorFlow/TFLite/Keras format (NHWC). The purpose of this tool is to solve the massive Transpose extrapolation problem in onnx-tensorflow (onnx-tf). I don't need a Star, but give me a pull request.

Caffe Int8 Convert Tools ⭐ 455

Generate a quantization parameter file for ncnn framework int8 inference

Onnx2tflite ⭐ 422

Tool for onnx->keras or onnx->tflite. If the tool is useful to you, please star it.

Sparsezoo ⭐ 347

Neural network model repository for highly sparse and sparse-quantized models with matching sparsification recipes

Quantization library for PyTorch. Support low-precision and mixed-precision quantization, with hardware implementation through TVM.

Ai Research Code ⭐ 319

Sparsify ⭐ 310

ML model optimization product to accelerate inference.

Brocolli ⭐ 303

Everything in Torch Fx

Easyquant ⭐ 295

EasyQuant (EQ) is an efficient and simple post-training quantization method that optimizes the scales of weights and activations.

Pure python implementation of product quantization for nearest neighbor search

Sparsebit ⭐ 291

A model compression and acceleration toolbox based on pytorch.

⚡ boost inference speed of T5 models by 5x & reduce the model size by 3x.

Optimum Intel ⭐ 268

🤗 Optimum Intel: Accelerate inference with Intel optimization tools

Quantized_distillation ⭐ 266

Implements quantized distillation. Code for our paper "Model compression via distillation and quantization"

Bevformer_tensorrt ⭐ 260

BEVFormer inference on TensorRT, including INT8 Quantization and Custom TensorRT Plugins (float/half/half2/int8).

[CVPR 2019, Oral] HAQ: Hardware-Aware Automated Quantization with Mixed Precision

Model_optimization ⭐ 245

Model Compression Toolkit (MCT) is an open source project for neural network model optimization under efficient, constrained hardware. This project provides researchers, developers, and engineers advanced quantization and compression tools for deploying state-of-the-art neural networks.

Blueoil ⭐ 243

Bring Deep Learning to small devices

PyTorch implementation of Data Free Quantization Through Weight Equalization and Bias Correction.

A C++ toolbox for locality-sensitive hashing (LSH) that provides several popular LSH algorithms and also supports Python and MATLAB.

Yolo Multi Backbones Attention ⭐ 223

Model compression: YOLOv3 with multiple lightweight backbones (ShuffleNetV2, Huawei GhostNet), attention, pruning, and quantization

Nnef Tools ⭐ 216

The NNEF Tools repository contains tools to generate and consume NNEF documents

Model_compression ⭐ 203

PyTorch Model Compression

Awesome Edge Machine Learning ⭐ 200

A curated list of awesome edge machine learning resources, including research papers, inference engines, challenges, books, meetups and others.

Hadamard Matrix For Hashing ⭐ 197

CVPR2020: Central Similarity Quantization/Hashing for Efficient Image and Video Retrieval

Llama.onnx ⭐ 196

LLaMa/RWKV onnx models, quantization and testcase

Tensorflowlite Bin ⭐ 185

Prebuilt binary for TensorFlowLite's standalone installer. For RaspberryPi. A very lightweight installer. I provide a FlexDelegate, MediaPipe Custom OP and XNNPACK enabled binary.

Deep Compression Pytorch ⭐ 182

PyTorch implementation of 'Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding' by Song Han, Huizi Mao, William J. Dally

Pytorch Tools ⭐ 173

Useful PyTorch functions and modules that are not implemented in PyTorch by default

FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batchsizes of 16-32 tokens.

Compress Fasttext ⭐ 153

Tools for shrinking fastText models (in gensim format)

Terngrad ⭐ 152

Ternary Gradients to Reduce Communication in Distributed Deep Learning (TensorFlow)

[IJCAI 2022] FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer

Official implementation of Half-Quadratic Quantization (HQQ)

Powerful, automated analysis and design of quantum microwave chips & devices [Energy-Participation Ratio and more]

Compact high quality word embeddings for Russian language

Torch Model Compression ⭐ 137

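A toolkit for automated analysis and modification of PyTorch model structures, including a library of model compression algorithms that analyze model structure automatically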

Inq Pytorch ⭐ 131

A PyTorch implementation of "Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights"

Easy Translate ⭐ 128

Easy-Translate is a script for translating large text files with a SINGLE COMMAND. Easy-Translate is designed to be as easy as possible for beginners and as seamless and customizable as possible for advanced users.

Cnn Quantization ⭐ 126

Quantization of Convolutional Neural networks.

Q Diffusion ⭐ 125

[ICCV 2023] Q-Diffusion: Quantizing Diffusion Models.

Nn Compression ⭐ 122

A Pytorch implementation of Neural Network Compression (pruning, deep compression, channel pruning)

Model Quantization ⭐ 116

Collections of model quantization algorithms

Apot_quantization ⭐ 115

PyTorch implementation for the APoT quantization (ICLR 2020)

Takeoff Community ⭐ 109

TitanML Takeoff Server is an optimization, compression and deployment platform that makes state of the art machine learning models accessible to everyone.

Tf2deepfloorplan ⭐ 107

TF2 Deep FloorPlan Recognition using a Multi-task Network with Room-boundary-Guided Attention. Enable tensorboard, quantization, flask, tflite, docker, github actions and google colab.

Ternarynet ⭐ 103

Implementation for Trained Ternary Network.

[CVPR'20] ZeroQ: A Novel Zero Shot Quantization Framework

PB-LLM: Partially Binarized Large Language Models

Samplernn ⭐ 99

Tensorflow implementation of SampleRNN

Graffitist ⭐ 99

Graph Transforms to Quantize and Retrain Deep Neural Nets in TensorFlow.

Nnieqat Pytorch ⭐ 93

An NNIE quantization-aware training tool for PyTorch.

Official implementation of FQ-GAN

A pytorch Quantization Toolkit

Structural Analogy ⭐ 82

Pytorch implementation for the paper "Structural-analogy from a Single Image Pair"

Hailo_model_zoo ⭐ 82

The Hailo Model Zoo includes pre-trained models and a full building and evaluation environment

QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX

Discrete Key Value Bottleneck Pytorch ⭐ 81

Implementation of Discrete Key / Value Bottleneck, in Pytorch

Reference PyTorch code for named entity tagging

Permute Quantize Finetune ⭐ 79

Using ideas from product quantization for state-of-the-art neural network compression.

Bert Squeeze ⭐ 77

🛠️ Tools for Transformers compression using PyTorch Lightning ⚡

An Open Source Deep Learning Inference Engine Based on FPGA

Model Compression ⭐ 74

This is my final-year project for the Bachelor of Engineering. It's still incomplete, though. I am trying to replicate the research paper "Deep Compression" by Song Han et al. This paper received the best paper award at ICLR 2016.

Facial Landmark Detection Hrnet ⭐ 72

A TensorFlow implementation of HRNet for facial landmark detection.

Alibabacloud Quantization Networks ⭐ 72

alibabacloud-quantization-networks

This project is the official implementation of 'Basic Binary Convolution Unit for Binarized Image Restoration Network', ICLR2023

Neuralcompressor ⭐ 66

Embedding Quantization (Compress Word Embeddings)

Ssql Eccv2022 ⭐ 64

PyTorch implementation of SSQL (Accepted to ECCV2022 oral presentation)

Sota Backbones ⭐ 64

A collection of SOTA Image Classification Models in PyTorch

Cvpr17 Dvsq ⭐ 62

The implementation of the CVPR-17 paper "Deep Visual-Semantic Quantization for Efficient Image Retrieval"

1-100 of 239 search results

Copyright 2018-2024 Awesome Open Source. All rights reserved.