Awesome Open Source

Search results for asr

542 search results found

Kaldi ⭐ 13,453

kaldi-asr/kaldi is the official location of the Kaldi project.

Paddlespeech ⭐ 10,011

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

NeMo: a toolkit for conversational AI

Espnet ⭐ 7,563

End-to-End Speech Processing Toolkit

Whisperx ⭐ 7,510

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Speechbrain ⭐ 7,166

A PyTorch-based Speech Toolkit

Vosk Api ⭐ 6,633

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

Wukong Robot ⭐ 5,386

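🤖 wukong-robot is a simple, flexible, and elegant Chinese voice-dialogue robot / smart speaker project that supports ChatGPT multi-turn conversation, and may also be the first…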

Recorder ⭐ 4,159

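HTML5/JS audio recording in mp3, wav, ogg, webm, amr, g711a, and g711u formats. Supports PC and some Android and iOS browsers, Hybrid Apps (Android and iOS app source code provided), and WeChat. Provides ASR speech-to-text, an H5 voice call/chat example, and DTMF encoding and decoding.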

Silero Models ⭐ 4,088

Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

Wenet ⭐ 3,699

Production First and Production Ready End-to-End Speech Recognition Toolkit

Lingvo ⭐ 2,776

Pytorch Kaldi ⭐ 2,138

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.

Rhasspy ⭐ 2,036

Offline private voice assistant for many human languages

🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.

Youtube Transcript Api ⭐ 1,955

This is a Python API which allows you to get the transcript/subtitles for a given YouTube video. It also works for automatically generated subtitles, and it requires neither an API key nor a headless browser, unlike Selenium-based solutions.

Wer_are_we ⭐ 1,734

Attempt at tracking the state of the art and recent results (bibliography) on speech recognition.

Delta ⭐ 1,584

DELTA is a deep learning based natural language and speech processing platform.

Whisper Diarization ⭐ 1,538

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

Whisper Asr Webservice ⭐ 1,317

OpenAI Whisper ASR Webservice API

Whisper Timestamped ⭐ 1,217

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

Live Transcribe Speech Engine ⭐ 1,176

Live Transcribe is an Android application that provides real-time captioning for people who are deaf or hard of hearing. This repository contains the Android client libraries for communicating with Google's Cloud Speech API that are used in Live Transcribe.

Pykaldi ⭐ 954

A Python wrapper for Kaldi

Espresso ⭐ 930

Espresso: A Fast End-to-End Neural Speech Recognition Toolkit

an open-source implementation of sequence-to-sequence based speech processing engine

Conformer ⭐ 809

[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)

Vosk Server ⭐ 802

WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries

Whisper.api ⭐ 791

This project provides an API with user level access support to transcribe speech to text using a finetuned and processed Whisper ASR model.

Sincnet ⭐ 764

SincNet is a neural architecture for efficiently processing raw audio samples.

Speech Transformer ⭐ 714

A PyTorch implementation of Speech Transformer, an End-to-End ASR with Transformer network on Mandarin Chinese.

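End-to-end Chinese speech recognition implemented with PaddlePaddle, from getting started to hands-on practice: very simple introductory examples and highly practical enterprise projects.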

The official repository of the Eesen project

Open_stt ⭐ 671

Awesomekorean_data ⭐ 664

Links to Korean datasets

Openspeech ⭐ 653

Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.

Libreasr ⭐ 647

💬 An On-Premises, Streaming Speech Recognition System

📦 Quickly convert between Chinese numerals and Arabic numerals (latest features: conversion of fractions, dates, temperatures, and more)

Chinese_text_normalization ⭐ 582

Chinese text normalization for speech processing

Kospeech ⭐ 572

Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition leveraging PyTorch and Hydra.

Cheetah ⭐ 537

On-device streaming speech-to-text engine powered by deep learning

Paddlepaddle Deepspeech ⭐ 536

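Chinese speech recognition implemented with PaddlePaddle. The project is complete and recognition quality is good. Supports inference on Windows and Jetson development boards.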

Autosub ⭐ 525

A CLI script to generate subtitle files (SRT/VTT/TXT) for any video using either DeepSpeech or Coqui

Interspeech 2023 Papers ⭐ 513

INTERSPEECH 2023 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!

Whisper Finetune ⭐ 502

Fine-tune the Whisper speech recognition model, with support for training without timestamp data, training with timestamp data, and training without speech data. Accelerates inference and supports web, Windows desktop, and Android deployment.

Whisper Standalone Win ⭐ 488

Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.

Neural_sp ⭐ 466

End-to-end ASR/LM implementation with PyTorch

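A streaming and non-streaming automatic speech recognition framework implemented in PyTorch, compatible with both online and offline recognition; currently supports Conforme…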

Visionworkbench ⭐ 459

The NASA Vision Workbench is a general purpose image processing and computer vision library developed by the Autonomous Systems and Robotics (ASR) Area in the Intelligent Systems Division at the NASA Ames Research Center.

Zamia Speech ⭐ 413

Open tools and data for cloudless automatic speech recognition

Nmtpytorch ⭐ 395

Sequence-to-Sequence Framework in PyTorch

Leopard ⭐ 390

On-device speech-to-text engine powered by deep learning

Speech-to-text server framework with next-gen Kaldi

Deep Xi: A deep learning approach to a priori SNR estimation implemented in TensorFlow 2/Keras. For speech enhancement and robust ASR.

Huggingsound ⭐ 357

HuggingSound: A toolkit for speech-related tasks based on Hugging Face's tools

Leaderboard ⭐ 351

SpeechIO Leaderboard: a large, robust, comprehensive, benchmarking platform for Automatic Speech Recognition.

Hms Ml Demo ⭐ 333

HMS ML Demo provides an example of integrating the Huawei ML Kit service into applications. It demonstrates how to integrate services provided by ML Kit, such as face detection, text recognition, image segmentation, ASR, and TTS.

Adhan Js ⭐ 327

High precision Islamic prayer time library for JavaScript

Parrots ⭐ 318

Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) engine for Chinese, built on a speech library and easy to extend.

Faster Whisper Gui ⭐ 312

faster_whisper GUI with PySide6

Langhelper ⭐ 292

Aims to be a full-featured language-learning application built on ChatGPT, TTS, STT, and other AI models. Supports conversation, speaking assessment, memorizing words in context, listening tests, and so on.

Pyannote Whisper ⭐ 290

Rapidasr ⭐ 289

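Commercial-grade open-source automatic speech recognition library: works out of the box, supports all platforms, and recognizes mixed Chinese and English. A cross-platform implementation of ASR inference based on ONNXRuntime and FunASR, providing a set of easier APIs for calling ASR models.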

A CRF-based ASR Toolkit

Tensorflow_end2end_speech_recognition ⭐ 275

End-to-End speech recognition implementation based on TensorFlow (CTC, Attention, and MTL training)

Maix Speech ⭐ 261

Maix Speech AI lib, a fast and small speech lib running on embedded devices, including ASR, chat, TTS etc.

Arabic speech recognition, classification and text-to-speech.

Bigcidian ⭐ 248

Pronunciation lexicon covering both English and Chinese languages for Automatic Speech Recognition.

Docker Kaldi Gstreamer Server ⭐ 246

Dockerfile for kaldi-gstreamer-server.

Kerasdeepspeech ⭐ 244

A Keras CTC implementation of Baidu's DeepSpeech for model experimentation

End2end Asr Pytorch ⭐ 239

End-to-End Automatic Speech Recognition on PyTorch

Vosk Browser ⭐ 238

A speech recognition library running in the browser thanks to a WebAssembly build of Vosk

Edgedict ⭐ 229

Working online speech recognition based on RNN Transducer. (Trained model available in the releases.)

Speech_dataset ⭐ 229

The dataset of Speech Recognition

Speech Note Linux app. Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine translation.

Asr_theory ⭐ 221

Speech recognition theory, papers, and slides (PPT)

Restor is a user-friendly application to (mass) image macOS computers from a single source

Wav2vec2 Live ⭐ 218

Live speech recognition using Facebook's wav2vec 2.0 model.

Whisper.unity ⭐ 218

Running speech to text model (whisper.cpp) in Unity3d on your local machine.

Kaldi-based Korean ASR (Korean speech recognition) open-source project

Dacidian ⭐ 198

DaCiDian is an open-source Mandarin Chinese lexicon for automatic speech recognition (ASR)

Ctc Segmentation ⭐ 192

Segment an audio file and obtain utterance alignments. (Python package)

Asr Evaluation ⭐ 191

Python module for evaluating ASR hypotheses (e.g. word error rate, word recognition rate).

Asr Audio Data Links ⭐ 187

A list of publicly available audio data that anyone can download for ASR or other speech activities

Using a Teletype Model 33 electromechanical terminal (https://twitch.tv/33asr)

Pychain ⭐ 180

PyTorch implementation of LF-MMI for End-to-end ASR

End To End Slu ⭐ 175

PyTorch code for end-to-end spoken language understanding (SLU) with ASR-based transfer learning

Interspeech2019 Tutorial ⭐ 160

INTERSPEECH 2019 Tutorial Materials

Whisperhallu ⭐ 157

Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts

Chinese Automatic Speech Recognition ⭐ 157

Chinese speech recognition

Speecht ⭐ 156

Open-source speech-to-text software written in TensorFlow

Py Kaldi Asr ⭐ 154

Some simple wrappers around kaldi-asr intended to make using kaldi's (online) decoders as convenient as possible.

Sova Asr ⭐ 149

SOVA ASR (Automatic Speech Recognition)

AI edge toolbox: an AI model deployment toolchain aimed at edge devices, especially embedded RTOS platforms, including a model inference engine and model…

Freeswitch Asr ⭐ 144

FreeSWITCH ASR APP

Mitzuli ⭐ 143

The open, easy-to-use and powerful translator app for Android

Pray Times ⭐ 143

Welcome to Pray Times, an Islamic project aimed at providing an open-source library for calculating Muslim prayers times.

Icassp 2023 Papers ⭐ 139

ICASSP 2023 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023 conference. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!

Speech To Text Russian ⭐ 138

A project for Russian speech recognition based on pykaldi.

Mrcp Plugin With Freeswitch ⭐ 135

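Uses FreeSWITCH to accept calls from users' phones, runs speech recognition (ASR) on the caller's speech through a UniMRCP Server plugin that integrates the iFLYTEK open platform (xfyun), and, according to custom business logic…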

Reazonspeech ⭐ 134

Construct large-scale Japanese audio corpus at home
