Awesome Open Source
Search results for asr
542 search results found
Kaldi
⭐
13,453
kaldi-asr/kaldi is the official location of the Kaldi project.
Paddlespeech
⭐
10,011
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
Nemo
⭐
9,041
NeMo: a toolkit for conversational AI
Espnet
⭐
7,563
End-to-End Speech Processing Toolkit
Whisperx
⭐
7,510
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Speechbrain
⭐
7,166
A PyTorch-based Speech Toolkit
Vosk Api
⭐
6,633
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Wukong Robot
⭐
5,386
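🤖 wukong-robot is a simple, flexible, and elegant Chinese voice-dialogue robot / smart speaker project that supports ChatGPT multi-turn conversation, and may also be the first...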
Recorder
⭐
4,159
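HTML5 JS audio recording in mp3, wav, ogg, webm, amr, g711a, and g711u formats. Supports PC and some Android and iOS browsers, Hybrid Apps (Android and iOS app source code provided), and WeChat. Provides ASR speech-to-text, an H5 voice call and chat example, and DTMF encoding and decoding.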
Silero Models
⭐
4,088
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
Wenet
⭐
3,699
Production First and Production Ready End-to-End Speech Recognition Toolkit
Lingvo
⭐
2,776
Lingvo
Pytorch Kaldi
⭐
2,138
pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.
Rhasspy
⭐
2,036
Offline private voice assistant for many human languages
Stt
⭐
1,988
🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
Youtube Transcript Api
⭐
1,955
This is a python API which allows you to get the transcript/subtitles for a given YouTube video. It also works for automatically generated subtitles and it does not require an API key nor a headless browser, like other selenium based solutions do!
Wer_are_we
⭐
1,734
An attempt at tracking the state of the art and recent results (bibliography) in speech recognition.
Delta
⭐
1,584
DELTA is a deep learning based natural language and speech processing platform.
Whisper Diarization
⭐
1,538
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
Whisper Asr Webservice
⭐
1,317
OpenAI Whisper ASR Webservice API
Whisper Timestamped
⭐
1,217
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
Live Transcribe Speech Engine
⭐
1,176
Live Transcribe is an Android application that provides real-time captioning for people who are deaf or hard of hearing. This repository contains the Android client libraries for communicating with Google's Cloud Speech API that are used in Live Transcribe.
Pykaldi
⭐
954
A Python wrapper for Kaldi
Espresso
⭐
930
Espresso: A Fast End-to-End Neural Speech Recognition Toolkit
Athena
⭐
821
An open-source implementation of a sequence-to-sequence based speech processing engine
Conformer
⭐
809
[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)
Vosk Server
⭐
802
WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries
Whisper.api
⭐
791
This project provides an API with user level access support to transcribe speech to text using a finetuned and processed Whisper ASR model.
Sincnet
⭐
764
SincNet is a neural architecture for efficiently processing raw audio samples.
Speech Transformer
⭐
714
A PyTorch implementation of Speech Transformer, an End-to-End ASR with Transformer network on Mandarin Chinese.
Ppasr
⭐
701
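End-to-end Chinese speech recognition implemented with PaddlePaddle, from beginner level to real-world practice, with very simple introductory examples and highly practical enterprise projects.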
Eesen
⭐
673
The official repository of the Eesen project
Open_stt
⭐
671
Open STT
Awesomekorean_data
⭐
664
Links to Korean datasets
Openspeech
⭐
653
Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.
Libreasr
⭐
647
💬 An On-Premises, Streaming Speech Recognition System
Cn2an
⭐
589
📦 Quickly convert between Chinese numerals and Arabic numerals (latest features: conversion of fractions, dates, temperatures, and more)
Chinese_text_normalization
⭐
582
Chinese text normalization for speech processing
Kospeech
⭐
572
Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition leveraging PyTorch and Hydra.
Cheetah
⭐
537
On-device streaming speech-to-text engine powered by deep learning
Paddlepaddle Deepspeech
⭐
536
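Chinese speech recognition implemented with PaddlePaddle. The project is complete and recognition quality is good. Supports inference on Windows and Jetson development boards.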
Autosub
⭐
525
A CLI script to generate subtitle files (SRT/VTT/TXT) for any video using either DeepSpeech or Coqui
Interspeech 2023 Papers
⭐
513
INTERSPEECH 2023 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!
Whisper Finetune
⭐
502
Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deployment, Windows desktop deployment, and Android deployment
Whisper Standalone Win
⭐
488
Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.
Neural_sp
⭐
466
End-to-end ASR/LM implementation with PyTorch
Masr
⭐
462
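A streaming and non-streaming automatic speech recognition framework implemented in PyTorch, compatible with both online and offline recognition; currently supports Conforme...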
Visionworkbench
⭐
459
The NASA Vision Workbench is a general purpose image processing and computer vision library developed by the Autonomous Systems and Robotics (ASR) Area in the Intelligent Systems Division at the NASA Ames Research Center.
Zamia Speech
⭐
413
Open tools and data for cloudless automatic speech recognition
Nmtpytorch
⭐
395
Sequence-to-Sequence Framework in PyTorch
Leopard
⭐
390
On-device speech-to-text engine powered by deep learning
Sherpa
⭐
374
Speech-to-text server framework with next-gen Kaldi
Deepxi
⭐
367
Deep Xi: A deep learning approach to a priori SNR estimation implemented in TensorFlow 2/Keras. For speech enhancement and robust ASR.
Huggingsound
⭐
357
HuggingSound: A toolkit for speech-related tasks based on Hugging Face's tools
Leaderboard
⭐
351
SpeechIO Leaderboard: a large, robust, comprehensive, benchmarking platform for Automatic Speech Recognition.
Hms Ml Demo
⭐
333
HMS ML Demo provides an example of integrating Huawei ML Kit service into applications. This example demonstrates how to integrate services provided by ML Kit, such as face detection, text recognition, image segmentation, asr, and tts.
Adhan Js
⭐
327
High precision Islamic prayer time library for JavaScript
Parrots
⭐
318
Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) engine for Chinese, built on a speech library and easy to extend.
Faster Whisper Gui
⭐
312
faster_whisper GUI with PySide6
Langhelper
⭐
292
An application that aims to cover the full range of language-learning functions using ChatGPT, TTS, STT, and other AI models. Supports conversation, speaking assessment, memorizing words in context, listening tests, and so on.
Pyannote Whisper
⭐
290
Rapidasr
⭐
289
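A commercial-grade open-source automatic speech recognition library that works out of the box, supports all platforms, and recognizes mixed Chinese and English. A cross-platform implementation of ASR inference based on ONNXRuntime and FunASR, with a set of easier APIs for calling ASR models.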
Cat
⭐
288
A CRF-based ASR Toolkit
Tensorflow_end2end_speech_recognition
⭐
275
End-to-end speech recognition implementation based on TensorFlow (CTC, Attention, and MTL training)
Maix Speech
⭐
261
Maix Speech AI lib, a fast and small speech lib running on embedded devices, including ASR, chat, TTS etc.
Klaam
⭐
253
Arabic speech recognition, classification and text-to-speech.
Bigcidian
⭐
248
Pronunciation lexicon covering both English and Chinese languages for Automatic Speech Recognition.
Docker Kaldi Gstreamer Server
⭐
246
Dockerfile for kaldi-gstreamer-server.
Kerasdeepspeech
⭐
244
A Keras CTC implementation of Baidu's DeepSpeech for model experimentation
End2end Asr Pytorch
⭐
239
End-to-End Automatic Speech Recognition on PyTorch
Vosk Browser
⭐
238
A speech recognition library running in the browser thanks to a WebAssembly build of Vosk
Edgedict
⭐
229
Working online speech recognition based on RNN Transducer. (A trained model is available in the releases.)
Speech_dataset
⭐
229
The dataset of Speech Recognition
Dsnote
⭐
225
Speech Note Linux app. Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine translation.
Asr_theory
⭐
221
Speech recognition theory, papers, and slides (PPT)
Restor
⭐
219
Restor is a user-friendly application to (mass) image macOS computers from a single source
Wav2vec2 Live
⭐
218
Live speech recognition using Facebook's wav2vec 2.0 model.
Whisper.unity
⭐
218
Running speech to text model (whisper.cpp) in Unity3d on your local machine.
Zeroth
⭐
211
Kaldi-based Korean ASR (한국어 음성인식) open-source project
Dacidian
⭐
198
DaCiDian is an open-source Mandarin Chinese lexicon for automatic speech recognition (ASR)
Ctc Segmentation
⭐
192
Segment an audio file and obtain utterance alignments. (Python package)
Asr Evaluation
⭐
191
Python module for evaluating ASR hypotheses (e.g. word error rate, word recognition rate).
Asr Audio Data Links
⭐
187
A list of publicly available audio data that anyone can download for ASR or other speech activities
Asr33
⭐
184
Using a Teletype Model 33 electromechanical terminal (https://twitch.tv/33asr)
Pychain
⭐
180
PyTorch implementation of LF-MMI for End-to-end ASR
End To End Slu
⭐
175
PyTorch code for end-to-end spoken language understanding (SLU) with ASR-based transfer learning
Interspeech2019 Tutorial
⭐
160
INTERSPEECH 2019 Tutorial Materials
Whisperhallu
⭐
157
Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts
Chinese Automatic Speech Recognition
⭐
157
Chinese speech recognition
Speecht
⭐
156
Open-source speech-to-text software written in TensorFlow
Py Kaldi Asr
⭐
154
Some simple wrappers around kaldi-asr intended to make using kaldi's (online) decoders as convenient as possible.
Sova Asr
⭐
149
SOVA ASR (Automatic Speech Recognition)
Aidget
⭐
146
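AI edge toolbox: an AI model deployment toolchain aimed specifically at edge devices, especially embedded RTOS platforms, including a model inference engine and model...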
Freeswitch Asr
⭐
144
FreeSWITCH ASR APP
Mitzuli
⭐
143
The open, easy-to-use and powerful translator app for Android
Pray Times
⭐
143
Welcome to Pray Times, an Islamic project aimed at providing an open-source library for calculating Muslim prayer times.
Icassp 2023 Papers
⭐
139
ICASSP 2023 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023 conference. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!
Speech To Text Russian
⭐
138
A project for Russian speech recognition based on pykaldi.
Mrcp Plugin With Freeswitch
⭐
135
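Uses FreeSWITCH to accept calls from users' phones, integrates the iFLYTEK Open Platform (xfyun) plugin through UniMRCP Server to run speech recognition (ASR) on the user's speech, and, according to custom business logic...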
Reazonspeech
⭐
134
Construct large-scale Japanese audio corpus at home
1-100 of 542 search results