Awesome Open Source
Search results for speech recognition asr
209 search results found
Kaldi
⭐
13,453
kaldi-asr/kaldi is the official location of the Kaldi project.
Paddlespeech
⭐
10,011
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
Nemo
⭐
9,041
NeMo: a toolkit for conversational AI
Espnet
⭐
7,563
End-to-End Speech Processing Toolkit
Whisperx
⭐
7,510
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Speechbrain
⭐
7,166
A PyTorch-based Speech Toolkit
Vosk Api
⭐
6,633
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Silero Models
⭐
4,088
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
Wenet
⭐
3,694
Production First and Production Ready End-to-End Speech Recognition Toolkit
Lingvo
⭐
2,776
Lingvo
Pytorch Kaldi
⭐
2,138
pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.
Rhasspy
⭐
2,036
Offline private voice assistant for many human languages
Stt
⭐
1,988
🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
Wer_are_we
⭐
1,734
Attempt at tracking states of the arts and recent results (bibliography) on speech recognition.
Delta
⭐
1,584
DELTA is a deep learning based natural language and speech processing platform.
Whisper Diarization
⭐
1,538
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
Whisper Asr Webservice
⭐
1,317
OpenAI Whisper ASR Webservice API
Whisper Timestamped
⭐
1,217
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
Pykaldi
⭐
954
A Python wrapper for Kaldi
Espresso
⭐
930
Espresso: A Fast End-to-End Neural Speech Recognition Toolkit
Athena
⭐
821
An open-source implementation of a sequence-to-sequence based speech processing engine
Conformer
⭐
809
[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)
Vosk Server
⭐
802
WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries
Sincnet
⭐
764
SincNet is a neural architecture for efficiently processing raw audio samples.
Ppasr
⭐
701
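End-to-end Chinese speech recognition built on PaddlePaddle, from beginner level to hands-on practice, with very simple introductory examples and highly practical enterprise projects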
Eesen
⭐
673
The official repository of the Eesen project
Openspeech
⭐
653
Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.
Libreasr
⭐
647
💬 An On-Premises, Streaming Speech Recognition System
Cn2an
⭐
589
📦 Quickly convert between Chinese numerals and Arabic numerals (latest features: conversion of fractions, dates, temperatures, and more)
Chinese_text_normalization
⭐
578
Chinese text normalization for speech processing
Kospeech
⭐
572
Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition leveraging PyTorch and Hydra.
Cheetah
⭐
537
On-device streaming speech-to-text engine powered by deep learning
Paddlepaddle Deepspeech
⭐
536
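Speech recognition implemented with PaddlePaddle, for Chinese speech recognition. The project is complete and recognition quality is good. Supports inference on Windows and Jetson development boards.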
Interspeech 2023 Papers
⭐
513
INTERSPEECH 2023 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!
Whisper Finetune
⭐
502
Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deployment, Windows desktop deployment, and Android deployment
Whisper Standalone Win
⭐
488
Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.
Neural_sp
⭐
466
End-to-end ASR/LM implementation with PyTorch
Masr
⭐
462
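A streaming and non-streaming automatic speech recognition framework implemented in PyTorch, compatible with both online and offline recognition; currently supports Conforme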
Zamia Speech
⭐
413
Open tools and data for cloudless automatic speech recognition
Nmtpytorch
⭐
395
Sequence-to-Sequence Framework in PyTorch
Leopard
⭐
390
On-device speech-to-text engine powered by deep learning
Sherpa
⭐
374
Speech-to-text server framework with next-gen Kaldi
Huggingsound
⭐
357
HuggingSound: A toolkit for speech-related tasks based on Hugging Face's tools
Leaderboard
⭐
351
SpeechIO Leaderboard: a large, robust, comprehensive, benchmarking platform for Automatic Speech Recognition.
Parrots
⭐
318
Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) engine for Chinese, built on a speech library and easy to extend.
Langhelper
⭐
292
Striving to create a full-featured language-learning application powered by ChatGPT, TTS, STT and other AI models; supports conversation, speaking assessment, memorizing words in context, listening tests, and so on.
Tensorflow_end2end_speech_recognition
⭐
275
End-to-End speech recognition implementation based on TensorFlow (CTC, Attention, and MTL training)
Bigcidian
⭐
248
Pronunciation lexicon covering both English and Chinese languages for Automatic Speech Recognition.
Kerasdeepspeech
⭐
244
A Keras CTC implementation of Baidu's DeepSpeech for model experimentation
End2end Asr Pytorch
⭐
239
End-to-End Automatic Speech Recognition on PyTorch
Vosk Browser
⭐
238
A speech recognition library running in the browser thanks to a WebAssembly build of Vosk
Speech_dataset
⭐
229
The dataset of Speech Recognition
Edgedict
⭐
229
Working online speech recognition based on RNN Transducer. (Trained model available in the releases.)
Dsnote
⭐
225
Speech Note Linux app. Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine translation.
Whisper.unity
⭐
218
Running speech to text model (whisper.cpp) in Unity3d on your local machine.
Wav2vec2 Live
⭐
218
Live speech recognition using Facebook's wav2vec 2.0 model.
Zeroth
⭐
211
Kaldi-based Korean ASR (한국어 음성인식) open-source project
Asr Evaluation
⭐
191
Python module for evaluating ASR hypotheses (e.g. word error rate, word recognition rate).
Asr Audio Data Links
⭐
187
A list of publicly available audio data that anyone can download for ASR or other speech activities
Interspeech2019 Tutorial
⭐
160
INTERSPEECH 2019 Tutorial Materials
Chinese Automatic Speech Recognition
⭐
157
Chinese speech recognition
Py Kaldi Asr
⭐
154
Some simple wrappers around kaldi-asr intended to make using kaldi's (online) decoders as convenient as possible.
Sova Asr
⭐
149
SOVA ASR (Automatic Speech Recognition)
Icassp 2023 Papers
⭐
139
ICASSP 2023 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023 conference. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!
Speech To Text Russian
⭐
138
A project for Russian speech recognition based on pykaldi.
Rustfst
⭐
134
Rust re-implementation of OpenFST - library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). A Python binding is also available.
Keras Kaldi
⭐
124
Keras Interface for Kaldi ASR
Spokestack Python
⭐
124
Spokestack is a library that allows a user to easily incorporate a voice interface into any Python application with a focus on embedded systems.
At16k
⭐
123
Trained models for automatic speech recognition (ASR). A library to quickly build applications that require speech to text conversion.
Elevateaijavasdk
⭐
121
Java SDK for ElevateAI
Cv Dataset
⭐
120
Metadata and versioning details for the Common Voice dataset
Elevateaidotnetsdk
⭐
115
.Net core 6 SDK for ElevateAI
Elevateaipythonsdk
⭐
111
ElevateAI - Speech-to-text API Python SDK
Deepgram Python Sdk
⭐
110
Official Python SDK for Deepgram's automated speech recognition APIs.
Obsidian Transcription
⭐
107
Obsidian plugin to create high-quality transcriptions from markdown linked audio files
Sepia Stt Server
⭐
105
SEPIA server to support open-source speech recognition via WebSocket connection.
Las_mandarin_pytorch
⭐
104
Listen, Attend and Spell model and a pretrained Mandarin Chinese ASR model
Rnn Transducer
⭐
100
MXNet implementation of RNN Transducer (Graves 2012): Sequence Transduction with Recurrent Neural Networks
Pytorch Asr
⭐
100
ASR with PyTorch
Awesome Russian Speech
⭐
97
Russian speech technology links
Ctc Asr
⭐
92
End-to-end trained speech recognition system, based on RNNs and the connectionist temporal classification (CTC) cost function.
Whisper Auto Transcribe
⭐
91
Auto transcribe tool based on whisper
Mongolian Speech Recognition
⭐
86
Mongolian speech recognition with PyTorch
Deepgram Js Sdk
⭐
81
Official JavaScript SDK for Deepgram's automated speech recognition APIs.
E2e Asr
⭐
79
PyTorch Implementations for End-to-End Automatic Speech Recognition
Kaldi Serve
⭐
79
Server framework for Kaldi ASR Toolkit
Asr Wav2vec Finetune
⭐
76
⚡ Finetune Wav2vec 2.0 For Speech Recognition
Pansori
⭐
74
Tools for ASR Corpus Generation from Online Video
Indian Accent Speech Recognition
⭐
73
Traditional ASR (Signal & Cepstral Analysis, DTW, HMM) & DNNs (Custom Models + DeepSpeech) on Indian Accent Speech
Ktspeechcrawler
⭐
73
Automatically constructing corpus for automatic speech recognition from YouTube videos
Wav2letter
⭐
70
Speech recognition model based on a FAIR research paper, built using PyTorch.
Tdnn
⭐
70
Time delay neural network (TDNN) implementation in PyTorch using the unfold method
Vakyansh Wav2vec2 Experimentation
⭐
67
Repository containing an experimentation platform for training and running inference with wav2vec2 models.
Viet Asr
⭐
65
VietASR - Vietnamese Automatic Speech Recognition
Syn Speech
⭐
62
Syn.Speech is a flexible speaker independent continuous speech recognition engine for Mono and .NET framework
Aaltoasr
⭐
61
Aalto Automatic Speech Recognition tools
Asr_benchmark
⭐
60
Program to benchmark various speech recognition APIs
Squeezeformer
⭐
60
PyTorch implementation of "Squeezeformer: An Efficient Transformer for Automatic Speech Recognition" (NeurIPS 2022)
Avsr Tf1
⭐
59
Audio-Visual Speech Recognition using Sequence to Sequence Models
Transfusion Asr
⭐
59
Transcribing Speech with Multinomial Diffusion, training code and models.
1-100 of 209 search results
Copyright 2018-2024 Awesome Open Source. All rights reserved.