Llama.onnx

LLaMa/RWKV ONNX models, quantization, and test cases
Alternatives To Llama.onnx
Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language
Distiller | 4,252 | a year ago | 65 | apache-2.0 | Jupyter Notebook
Neural Network Distiller by Intel AI Lab: a Python package for neural network compression research. https://intellabs.github.io/distiller
Pinto_model_zoo | 3,121 | 3 months ago | 11 | mit | Python
A repository for storing models that have been inter-converted between various frameworks. Supported frameworks are TensorFlow, PyTorch, ONNX, OpenVINO, TFJS, TFTRT, TensorFlowLite (Float32/16/INT8), EdgeTPU, CoreML.
Deepsparse | 2,729 | 3 | 3 months ago | 141 | December 07, 2023 | 28 | other | Python
Sparsity-aware deep learning inference runtime for CPUs
Micronet | 2,177 | 3 years ago | 46 | October 06, 2021 | 70 | mit | Python
micronet, a model compression and deployment library. Compression: (1) quantization: quantization-aware training (QAT), high-bit (>2b) (DoReFa / Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference), low-bit (≤2b) / ternary and binary (TWN/BNN/XNOR-Net); post-training quantization (PTQ), 8-bit (TensorRT); (2) pruning: normal, regular, and group convolutional channel pruning; (3) group convolution structure; (4) batch-normalization fusion for quantization. Deployment: TensorRT, fp32/fp16/int8 (PTQ calibration), op-adapt (upsample), dynamic_shape.
Optimum | 1,908 | 53 | 3 months ago | 53 | December 06, 2023 | 295 | apache-2.0 | Python
🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy to use hardware optimization tools
Ppq | 957 | 10 months ago | 9 | apache-2.0 | Python
PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.
Nncf | 725 | 6 | 3 months ago | 16 | November 16, 2023 | 46 | apache-2.0 | Python
Neural Network Compression Framework for enhanced OpenVINO™ inference
Deepvac | 618 | 3 years ago | 59 | June 28, 2021 | 12 | gpl-3.0 | Python
PyTorch Project Specification.
Onnx2tf | 461 | 2 | 3 months ago | 438 | December 10, 2023 | 14 | mit | Python
Self-Created Tools to convert ONNX files (NCHW) to TensorFlow/TFLite/Keras format (NHWC). The purpose of this tool is to solve the massive Transpose extrapolation problem in onnx-tensorflow (onnx-tf). I don't need a Star, but give me a pull request.
Onnx2tflite | 422 | 6 months ago | 10 | apache-2.0 | Python
Tool for onnx->keras or onnx->tflite. If tool is useful for you, please star it.