Awesome Open Source

Search results for speech recognition asr

209 search results found

Kaldi ⭐ 13,453

kaldi-asr/kaldi is the official location of the Kaldi project.

Paddlespeech ⭐ 10,011

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

NeMo: a toolkit for conversational AI

Espnet ⭐ 7,563

End-to-End Speech Processing Toolkit

Whisperx ⭐ 7,510

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Speechbrain ⭐ 7,166

A PyTorch-based Speech Toolkit

Vosk Api ⭐ 6,633

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

Silero Models ⭐ 4,088

Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

Wenet ⭐ 3,694

Production First and Production Ready End-to-End Speech Recognition Toolkit

Lingvo ⭐ 2,776

Pytorch Kaldi ⭐ 2,138

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.

Rhasspy ⭐ 2,036

Offline private voice assistant for many human languages

🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.

Wer_are_we ⭐ 1,734

Attempt at tracking the state of the art and recent results (bibliography) on speech recognition.

Delta ⭐ 1,584

DELTA is a deep learning based natural language and speech processing platform.

Whisper Diarization ⭐ 1,538

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

Whisper Asr Webservice ⭐ 1,317

OpenAI Whisper ASR Webservice API

Whisper Timestamped ⭐ 1,217

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

Pykaldi ⭐ 954

A Python wrapper for Kaldi

Espresso ⭐ 930

Espresso: A Fast End-to-End Neural Speech Recognition Toolkit

An open-source implementation of a sequence-to-sequence based speech processing engine

Conformer ⭐ 809

[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)

Vosk Server ⭐ 802

WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries

Sincnet ⭐ 764

SincNet is a neural architecture for efficiently processing raw audio samples.

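End-to-end Chinese speech recognition implemented with PaddlePaddle, from introduction to practice: very simple starter examples and highly practical enterprise projects.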

The official repository of the Eesen project

Openspeech ⭐ 653

Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.

Libreasr ⭐ 647

💬 An On-Premises, Streaming Speech Recognition System

📦 Quickly convert between Chinese numerals and Arabic numerals (latest features: conversion of fractions, dates, temperatures, and more)

Chinese_text_normalization ⭐ 578

Chinese text normalization for speech processing

Kospeech ⭐ 572

Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition leveraging PyTorch and Hydra.

Cheetah ⭐ 537

On-device streaming speech-to-text engine powered by deep learning

Paddlepaddle Deepspeech ⭐ 536

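Speech recognition implemented with PaddlePaddle, targeting Chinese speech recognition. The project is mature and recognition quality is good. Supports inference on Windows and Jetson development boards.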

Interspeech 2023 Papers ⭐ 513

INTERSPEECH 2023 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!

Whisper Finetune ⭐ 502

Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deployment, Windows desktop deployment, and Android deployment

Whisper Standalone Win ⭐ 488

Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.

Neural_sp ⭐ 466

End-to-end ASR/LM implementation with PyTorch

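A streaming and non-streaming automatic speech recognition framework implemented in PyTorch, compatible with both online and offline recognition; currently supports Conforme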

Zamia Speech ⭐ 413

Open tools and data for cloudless automatic speech recognition

Nmtpytorch ⭐ 395

Sequence-to-Sequence Framework in PyTorch

Leopard ⭐ 390

On-device speech-to-text engine powered by deep learning

Speech-to-text server framework with next-gen Kaldi

Huggingsound ⭐ 357

HuggingSound: A toolkit for speech-related tasks based on Hugging Face's tools

Leaderboard ⭐ 351

SpeechIO Leaderboard: a large, robust, comprehensive, benchmarking platform for Automatic Speech Recognition.

Parrots ⭐ 318

Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) engine for Chinese, built on a speech library and easy to extend.

Langhelper ⭐ 292

Striving to create a full-featured language-learning application built on ChatGPT, TTS, STT, and other AI models; supports conversation, speaking assessment, memorizing words in context, listening tests, and so on.

Tensorflow_end2end_speech_recognition ⭐ 275

End-to-End speech recognition implementation based on TensorFlow (CTC, Attention, and MTL training)

Bigcidian ⭐ 248

Pronunciation lexicon covering both English and Chinese languages for Automatic Speech Recognition.

Kerasdeepspeech ⭐ 244

A Keras CTC implementation of Baidu's DeepSpeech for model experimentation

End2end Asr Pytorch ⭐ 239

End-to-End Automatic Speech Recognition on PyTorch

Vosk Browser ⭐ 238

A speech recognition library running in the browser thanks to a WebAssembly build of Vosk

Speech_dataset ⭐ 229

The dataset of Speech Recognition

Edgedict ⭐ 229

Working online speech recognition based on RNN Transducer. (A trained model is available under releases.)

Speech Note Linux app. Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine translation.

Whisper.unity ⭐ 218

Running speech to text model (whisper.cpp) in Unity3d on your local machine.

Wav2vec2 Live ⭐ 218

Live speech recognition using Facebook's wav2vec 2.0 model.

Kaldi-based Korean ASR (Korean speech recognition) open-source project

Asr Evaluation ⭐ 191

Python module for evaluating ASR hypotheses (e.g. word error rate, word recognition rate).

Asr Audio Data Links ⭐ 187

A list of publicly available audio data that anyone can download for ASR or other speech activities

Interspeech2019 Tutorial ⭐ 160

INTERSPEECH 2019 Tutorial Materials

Chinese Automatic Speech Recognition ⭐ 157

Chinese speech recognition

Py Kaldi Asr ⭐ 154

Some simple wrappers around kaldi-asr intended to make using kaldi's (online) decoders as convenient as possible.

Sova Asr ⭐ 149

SOVA ASR (Automatic Speech Recognition)

Icassp 2023 Papers ⭐ 139

ICASSP 2023 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023 conference. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!

Speech To Text Russian ⭐ 138

A project for Russian speech recognition based on pykaldi.

Rustfst ⭐ 134

Rust re-implementation of OpenFST - library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). A Python binding is also available.

Keras Kaldi ⭐ 124

Keras Interface for Kaldi ASR

Spokestack Python ⭐ 124

Spokestack is a library that allows a user to easily incorporate a voice interface into any Python application with a focus on embedded systems.

Trained models for automatic speech recognition (ASR). A library to quickly build applications that require speech to text conversion.

Elevateaijavasdk ⭐ 121

Java SDK for ElevateAI

Cv Dataset ⭐ 120

Metadata and versioning details for the Common Voice dataset

Elevateaidotnetsdk ⭐ 115

.NET Core 6 SDK for ElevateAI

Elevateaipythonsdk ⭐ 111

ElevateAI - Speech-to-text API Python SDK

Deepgram Python Sdk ⭐ 110

Official Python SDK for Deepgram's automated speech recognition APIs.

Obsidian Transcription ⭐ 107

Obsidian plugin to create high-quality transcriptions from markdown linked audio files

Sepia Stt Server ⭐ 105

SEPIA server to support open-source speech recognition via WebSocket connection.

Las_mandarin_pytorch ⭐ 104

Listen, Attend and Spell model and a pretrained Chinese Mandarin model (Chinese Mandarin ASR model)

Rnn Transducer ⭐ 100

MXNet implementation of RNN Transducer (Graves 2012): Sequence Transduction with Recurrent Neural Networks

Pytorch Asr ⭐ 100

ASR with PyTorch

Awesome Russian Speech ⭐ 97

Russian speech technology links

End-to-end trained speech recognition system, based on RNNs and the connectionist temporal classification (CTC) cost function.

Whisper Auto Transcribe ⭐ 91

Auto transcribe tool based on whisper

Mongolian Speech Recognition ⭐ 86

Mongolian speech recognition with PyTorch

Deepgram Js Sdk ⭐ 81

Official JavaScript SDK for Deepgram's automated speech recognition APIs.

PyTorch Implementations for End-to-End Automatic Speech Recognition

Kaldi Serve ⭐ 79

Server framework for Kaldi ASR Toolkit

Asr Wav2vec Finetune ⭐ 76

⚡ Finetune Wav2vec 2.0 For Speech Recognition

Tools for ASR Corpus Generation from Online Video

Indian Accent Speech Recognition ⭐ 73

Traditional ASR (Signal & Cepstral Analysis, DTW, HMM) & DNNs (Custom Models + DeepSpeech) on Indian Accent Speech

Ktspeechcrawler ⭐ 73

Automatically constructing corpus for automatic speech recognition from YouTube videos

Wav2letter ⭐ 70

Speech recognition model based on a FAIR research paper, built using PyTorch.

Time delay neural network (TDNN) implementation in Pytorch using unfold method

Vakyansh Wav2vec2 Experimentation ⭐ 67

Repository containing an experimentation platform for training wav2vec2 models and running inference with them.

Viet Asr ⭐ 65

VietASR - Vietnamese Automatic Speech Recognition

Syn Speech ⭐ 62

Syn.Speech is a flexible speaker independent continuous speech recognition engine for Mono and .NET framework

Aaltoasr ⭐ 61

Aalto Automatic Speech Recognition tools

Asr_benchmark ⭐ 60

Program to benchmark various speech recognition APIs

Squeezeformer ⭐ 60

PyTorch implementation of "Squeezeformer: An Efficient Transformer for Automatic Speech Recognition" (NeurIPS 2022)

Avsr Tf1 ⭐ 59

Audio-Visual Speech Recognition using Sequence to Sequence Models

Transfusion Asr ⭐ 59

Transcribing Speech with Multinomial Diffusion, training code and models.

1-100 of 209 search results

Copyright 2018-2024 Awesome Open Source. All rights reserved.