Awesome Open Source

Search results for python asr

248 search results found

NeMo: a toolkit for conversational AI

Espnet ⭐ 7,563

End-to-End Speech Processing Toolkit

Whisperx ⭐ 7,510

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Speechbrain ⭐ 7,166

A PyTorch-based Speech Toolkit

Vosk Api ⭐ 6,633

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

Wukong Robot ⭐ 5,386

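🤖 wukong-robot is a simple, flexible, and elegant Chinese voice-dialogue robot/smart-speaker project that supports ChatGPT multi-turn conversation, and may also be the first…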

Wenet ⭐ 3,512

Production First and Production Ready End-to-End Speech Recognition Toolkit

Lingvo ⭐ 2,776

Pytorch Kaldi ⭐ 2,138

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.

Youtube Transcript Api ⭐ 1,955

This is a Python API which allows you to get the transcript/subtitles for a given YouTube video. It also works for automatically generated subtitles and requires neither an API key nor a headless browser, unlike Selenium-based solutions.

Whisper Asr Webservice ⭐ 1,317

OpenAI Whisper ASR Webservice API

Whisper Timestamped ⭐ 1,217

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

Pykaldi ⭐ 954

A Python wrapper for Kaldi

Espresso ⭐ 930

Espresso: A Fast End-to-End Neural Speech Recognition Toolkit

Conformer ⭐ 809

[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)

Vosk Server ⭐ 802

WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries

Whisper.api ⭐ 791

This project provides an API with user level access support to transcribe speech to text using a finetuned and processed Whisper ASR model.

Sincnet ⭐ 764

SincNet is a neural architecture for efficiently processing raw audio samples.

Speech Transformer ⭐ 714

A PyTorch implementation of Speech Transformer, an End-to-End ASR with Transformer network on Mandarin Chinese.

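End-to-end Chinese speech recognition implemented with PaddlePaddle, from getting started to hands-on practice: very simple introductory examples and highly practical enterprise projects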

Open_stt ⭐ 671

Libreasr ⭐ 647

💬 An On-Premises, Streaming Speech Recognition System

📦 Quickly convert between Chinese numerals and Arabic numerals (latest features: conversion of fractions, dates, temperatures, and more)

Cheetah ⭐ 537

On-device streaming speech-to-text engine powered by deep learning

Paddlepaddle Deepspeech ⭐ 536

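Chinese speech recognition implemented with PaddlePaddle. The project is complete and recognition quality is good. Supports inference on Windows and Jetson development boards.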

Autosub ⭐ 525

A CLI script to generate subtitle files (SRT/VTT/TXT) for any video using either DeepSpeech or Coqui

Neural_sp ⭐ 466

End-to-end ASR/LM implementation with PyTorch

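A streaming and non-streaming automatic speech recognition framework implemented in PyTorch, compatible with both online and offline recognition; currently supports Conforme…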

Leopard ⭐ 390

On-device speech-to-text engine powered by deep learning

Speech-to-text server framework with next-gen Kaldi

Huggingsound ⭐ 357

HuggingSound: A toolkit for speech-related tasks based on Hugging Face's tools

Leaderboard ⭐ 351

SpeechIO Leaderboard: a large, robust, comprehensive, benchmarking platform for Automatic Speech Recognition.

Parrots ⭐ 318

Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) engine for Chinese. Chinese speech recognition and text-to-speech, built on a voice library and easy to extend.

Faster Whisper Gui ⭐ 312

faster_whisper GUI with PySide6

Pyannote Whisper ⭐ 290

A CRF-based ASR Toolkit

Tensorflow_end2end_speech_recognition ⭐ 275

End-to-End speech recognition implementation based on TensorFlow (CTC, Attention, and MTL training)

Maix Speech ⭐ 261

Maix Speech AI lib, a fast and small speech lib running on embedded devices, including ASR, chat, TTS etc.

End2end Asr Pytorch ⭐ 239

End-to-End Automatic Speech Recognition on PyTorch

Edgedict ⭐ 229

Working online speech recognition based on RNN Transducer. (Trained model available in releases.)

Wav2vec2 Live ⭐ 218

Live speech recognition using Facebook's wav2vec 2.0 model.

Dacidian ⭐ 198

DaCiDian is an open-source Chinese Mandarin lexicon for automatic speech recognition (ASR)

Ctc Segmentation ⭐ 192

Segment an audio file and obtain utterance alignments. (Python package)

Asr Evaluation ⭐ 191

Python module for evaluating ASR hypotheses (e.g. word error rate, word recognition rate).

Using a Teletype Model 33 electromechanical terminal (https://twitch.tv/33asr)

End To End Slu ⭐ 175

PyTorch code for end-to-end spoken language understanding (SLU) with ASR-based transfer learning

Whisperhallu ⭐ 157

Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts

Speecht ⭐ 156

Open-source speech-to-text software written in TensorFlow

Py Kaldi Asr ⭐ 154

Some simple wrappers around kaldi-asr intended to make using kaldi's (online) decoders as convenient as possible.

Sova Asr ⭐ 149

SOVA ASR (Automatic Speech Recognition)

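AI edge toolbox: an AI model deployment toolchain aimed specifically at edge devices, especially embedded RTOS platforms, including a model inference engine and model…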

Speech To Text Russian ⭐ 138

A project for Russian speech recognition based on pykaldi.

Reazonspeech ⭐ 134

Construct a large-scale Japanese audio corpus at home

Asr Study ⭐ 131

Implementation of all-neural speech recognition systems using Keras and Tensorflow

Research code for EMNLP 2020 paper "HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training"

Spokestack Python ⭐ 124

Spokestack is a library that allows a user to easily incorporate a voice interface into any Python application with a focus on embedded systems.

Keras Kaldi ⭐ 124

Keras Interface for Kaldi ASR

Trained models for automatic speech recognition (ASR). A library to quickly build applications that require speech to text conversion.

Asr_syllable ⭐ 112

Research on acoustic models for speech recognition based on convolutional neural networks

Code for end-to-end ASR with neural networks, built with TensorFlow

Elevateaipythonsdk ⭐ 111

ElevateAI - Speech-to-text API Python SDK

Deepgram Python Sdk ⭐ 110

Official Python SDK for Deepgram's automated speech recognition APIs.

Sepia Stt Server ⭐ 105

SEPIA server to support open-source speech recognition via WebSocket connection.

Las_mandarin_pytorch ⭐ 104

Listen, Attend and Spell model and a Chinese Mandarin pretrained model (Chinese Mandarin ASR model)

Rnn Transducer ⭐ 100

MXNet implementation of RNN Transducer (Graves 2012): Sequence Transduction with Recurrent Neural Networks

Pytorch Asr ⭐ 100

ASR with PyTorch

Listen Attend Spell ⭐ 98

A PyTorch implementation of Listen, Attend and Spell (LAS), an End-to-End ASR framework.

End-to-end trained speech recognition system, based on RNNs and the connectionist temporal classification (CTC) cost function.

Whisper Auto Transcribe ⭐ 91

Auto transcribe tool based on whisper

Mongolian Speech Recognition ⭐ 86

Mongolian speech recognition with PyTorch

Asr For Chinese Pipeline ⭐ 85

Google Summer of Code 2018 Project: Automatic Speech Recognition for Speech-to-Text on Chinese

Zasr_tensorflow ⭐ 85

Mandarin ASR system based on TensorFlow

SMS-WSJ: Spatialized Multi-Speaker Wall Street Journal database for multi-channel source separation and recognition

A release version for https://github.com/athena-team/athena

Zerospeech Tts Without T ⭐ 79

A PyTorch implementation for the ZeroSpeech 2019 challenge.

PyTorch Implementations for End-to-End Automatic Speech Recognition

Asr Wav2vec Finetune ⭐ 76

⚡ Finetune Wav2vec 2.0 for Speech Recognition

Tools for ASR Corpus Generation from Online Video

Ktspeechcrawler ⭐ 73

Automatically constructing corpus for automatic speech recognition from YouTube videos

Cgmm Mvdr ⭐ 71

Implementation of the CGMM-MVDR beamforming

Wav2letter ⭐ 70

Speech recognition model based on a FAIR research paper, built using PyTorch.

Time delay neural network (TDNN) implementation in PyTorch using the unfold method

Punctuationmodel ⭐ 69

A Chinese punctuation model that can add punctuation marks to text.

Vakyansh Wav2vec2 Experimentation ⭐ 67

Repository containing experimentation platform on how to train, infer on wav2vec2 models.

Viet Asr ⭐ 65

VietASR - Vietnamese Automatic Speech Recognition

Cloud Asr ⭐ 63

Cloud-based Automatic Speech Recognition (ASR) platform and a public ASR webservice.

Simple_diarizer ⭐ 63

Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code

Squeezeformer ⭐ 60

PyTorch implementation of "Squeezeformer: An Efficient Transformer for Automatic Speech Recognition" (NeurIPS 2022)

Asr_benchmark ⭐ 60

Program to benchmark various speech recognition APIs

Transfusion Asr ⭐ 59

Transcribing Speech with Multinomial Diffusion, training code and models.

Asr_word ⭐ 59

Builds an acoustic model with an end-to-end approach, using characters as the modeling unit and a DCNN-CTC network architecture.

Pb_chime5 ⭐ 58

Speech enhancement system for the CHiME-5 dinner party scenario

Alimeeting ⭐ 57

The project is associated with the recently launched ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge (M2MeT) to provide participants with baseline systems for speech recognition and speaker diarization in conference scenarios.

ADvISER is a flexible framework to encourage task-oriented dialog system research & development

Transcribing video or audio material into text

Kaldi Yesno Tutorial ⭐ 55

Tutorial on Kaldi for Brandeis ASR course

maracas is a library for corrupting audio files with additive and convolutive noise.

Yoruba Text ⭐ 51

Yorùbá language training text for NLP, ASR and TTS tasks

Speech Transformer Tf2.0 ⭐ 51

Transformer for an ASR system (via TensorFlow 2.0)

Alex Asr ⭐ 49

Online decoder for Kaldi NNET2 and GMM speech recognition models with Python bindings.

1-100 of 248 search results

Copyright 2018-2024 Awesome Open Source. All rights reserved.