Awesome Open Source

Search results for python speech recognition

531 search results found

Transformers ⭐ 127,491

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Leon ⭐ 13,937

🧠 Leon is your open-source personal assistant.

NeMo: a toolkit for conversational AI

Speech_recognition ⭐ 7,801

Speech recognition module for Python, supporting several engines and APIs, online and offline.

Espnet ⭐ 7,563

End-to-End Speech Processing Toolkit

Whisperx ⭐ 7,510

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Asrt_speechrecognition ⭐ 7,253

A Deep-Learning-Based Chinese Speech Recognition System

Speechbrain ⭐ 7,166

A PyTorch-based Speech Toolkit

Vosk Api ⭐ 6,633

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

Pocketsphinx ⭐ 3,620

A small speech recognizer

Speech To Text Wavenet ⭐ 3,586

Speech-to-Text-WaveNet : End-to-end sentence level English speech recognition based on DeepMind's WaveNet and tensorflow

Lingvo ⭐ 2,776

Distil Whisper ⭐ 2,760

Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.

Automatic_speech_recognition ⭐ 2,743

End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow

Ml Road ⭐ 2,742

Machine Learning Resources, Practice and Research

Funasr ⭐ 2,315

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models.

Tensorflow Speech Recognition ⭐ 2,150

🎙Speech recognition using the tensorflow deep learning framework, sequence-to-sequence neural networks

Pytorch Kaldi ⭐ 2,138

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.

Kalliope ⭐ 1,652

Kalliope is a framework that will help you to create your own personal assistant.

Lip Reading Deeplearning ⭐ 1,433

🔓 Lip Reading - Cross Audio-Visual Recognition using 3D Architectures

Project_alias ⭐ 1,421

Alias is a teachable “parasite” designed to give users more control over their smart assistants, in both customisation and privacy. Through a simple app, the user can train Alias to react to a custom wake-word/sound; once trained, Alias can take control of the home assistant by activating it for the user.

Openseq2seq ⭐ 1,393

Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP

Whisper Asr Webservice ⭐ 1,317

OpenAI Whisper ASR Webservice API

Dragonfire ⭐ 1,294

the open-source virtual assistant for Ubuntu based Linux distributions

Speech Emotion Analyzer ⭐ 1,244

The neural network model is capable of detecting five different male/female emotions from audio speeches. (Deep Learning, NLP, Python)

Whisper Timestamped ⭐ 1,217

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

Pykaldi ⭐ 954

A Python wrapper for Kaldi

Espresso ⭐ 930

Espresso: A Fast End-to-End Neural Speech Recognition Toolkit

Quillman ⭐ 880

A chat app that transcribes audio in real-time, streams back a response from a language model, and synthesizes this response as natural-sounding speech.

Kaldi Gstreamer Server ⭐ 865

Real-time full-duplex speech recognition server, based on the Kaldi toolkit and the GStreamer framework.

Speechpy ⭐ 828

💬 SpeechPy - A Library for Speech Processing and Recognition: http://speechpy.readthedocs.io/en/latest/

Descriptive Deep Learning

Conformer ⭐ 809

[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)

Vosk Server ⭐ 802

WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries

Tools for handling speech data in machine learning projects.

Stephanie Va ⭐ 769

Stephanie is an open-source platform built specifically for voice-controlled applications and for automating daily tasks, imitating much of a virtual assistant's work.

Sincnet ⭐ 764

SincNet is a neural architecture for efficiently processing raw audio samples.

Mycroft Precise ⭐ 749

A lightweight, simple-to-use, RNN wake word listener

Deepspeech Examples ⭐ 739

Examples of how to use or integrate DeepSpeech

J.a.r.v.i.s ⭐ 719

Personal Assistant built using python libraries. It does almost anything which includes sending emails, Optical Text Recognition, Dynamic News Reporting at any time with API integration, Todo list generator, Opens any website with just a voice command, Plays Music, Wikipedia searching, Dictionary with Intelligent Sensing i.e. auto spell checking, Weather Reporting i.e. temp, wind speed, humidity, YouTube searching, Google Map searching, Youtube Downloading, etc.

Salmonn ⭐ 710

SALMONN: Speech Audio Language Music Open Neural Network

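End-to-end Chinese speech recognition implemented with PaddlePaddle, from beginner level to real-world practice: very simple introductory examples and highly practical enterprise projects.

Voice Recognition to Text Tool: a local speech-to-text service that runs offline and outputs JSON, SRT subtitles with timestamps, and plain text.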

Sherpa Ncnn ⭐ 673

Real-time speech recognition using next-gen Kaldi with ncnn without Internet connection. Support iOS, Android, Raspberry Pi, VisionFive2, etc.

A PyTorch Implementation of End-to-End Models for Speech-to-Text

Libreasr ⭐ 647

💬 An On-Premises, Streaming Speech Recognition System

Irene Voice Assistant ⭐ 644

Irina: a Russian voice assistant that works offline. Supports skills through plugins.

Speecht5 ⭐ 638

Unified-Modal Speech-Text Pre-Training for Spoken Language Processing

Whisper Playground ⭐ 637

Build real time speech2text web apps using OpenAI's Whisper https://openai.com/blog/whisper/

Whisper Ctranslate2 ⭐ 628

Whisper command line client compatible with original OpenAI client based on CTranslate2.

📦 Quickly convert between Chinese numerals and Arabic numerals (latest features: conversion of fractions, dates, temperatures, and more)

Ctcdecoder ⭐ 577

Connectionist Temporal Classification (CTC) decoding algorithms: best path, beam search, lexicon search, prefix search, and token passing. Implemented in Python.
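As a rough illustration of the simplest algorithm named in this entry, best path decoding, here is a minimal sketch in plain Python. It is not taken from the Ctcdecoder repository: the function name `ctc_best_path`, its parameters, and the convention that the blank label is the last index are all assumptions made for this example.

```python
from itertools import groupby


def ctc_best_path(frames, alphabet, blank_index=None):
    """Greedy (best path) CTC decoding sketch.

    frames: list of per-time-step score lists, one score per label,
            where the labels are the characters of `alphabet` plus a blank.
    alphabet: string of output characters.
    blank_index: index of the CTC blank; assumed here to be the last label.
    """
    if blank_index is None:
        blank_index = len(alphabet)

    # 1. Pick the highest-scoring label at every time step.
    best = [max(range(len(frame)), key=frame.__getitem__) for frame in frames]

    # 2. Collapse runs of the same label into a single label.
    collapsed = [label for label, _ in groupby(best)]

    # 3. Drop blanks and map the remaining labels to characters.
    return "".join(alphabet[i] for i in collapsed if i != blank_index)


# Toy input: two characters ("a", "b") plus a blank at index 2.
frames = [
    [0.8, 0.1, 0.1],  # a
    [0.7, 0.2, 0.1],  # a (repeat, collapsed)
    [0.1, 0.1, 0.8],  # blank (separates the two "a"s)
    [0.6, 0.3, 0.1],  # a
    [0.1, 0.8, 0.1],  # b
    [0.1, 0.7, 0.2],  # b (repeat, collapsed)
]
print(ctc_best_path(frames, "ab"))  # → aab
```

Collapsing repeats before removing blanks is what lets the blank label separate genuinely repeated characters. The other algorithms listed (beam search, lexicon search, prefix search, token passing) keep multiple hypotheses per time step instead of a single argmax.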

On-device Speech-to-Intent engine powered by deep learning

Speech To Text Benchmark ⭐ 570

speech to text benchmark framework

Whisper_mic ⭐ 560

Project that allows one to use a microphone with OpenAI whisper.

Treasure Of Transformers ⭐ 541

💁 Awesome Treasure of Transformers Models for Natural Language processing contains papers, videos, blogs, official repo along with colab Notebooks. 🛫☑️

Cheetah ⭐ 537

On-device streaming speech-to-text engine powered by deep learning

Paddlepaddle Deepspeech ⭐ 536

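Chinese speech recognition implemented with PaddlePaddle. The project is mature and recognition quality is good. Supports Windows and inference on Jetson development boards.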

Free Spoken Digit Dataset ⭐ 518

A free audio dataset of spoken digits. Think MNIST for audio.

Storytoolkitai ⭐ 504

An editing tool that uses AI to transcribe, understand content and search for anything in your footage, integrated with ChatGPT and other AI models

Neural_sp ⭐ 466

End-to-end ASR/LM implementation with PyTorch

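A streaming and non-streaming automatic speech recognition framework implemented in PyTorch, compatible with both online and offline recognition; currently supports Conforme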

Ai Waifu Vtuber ⭐ 457

AI Vtuber for Streaming on Youtube/Twitch

Allosaurus ⭐ 411

Allosaurus is a pretrained universal phone recognizer for more than 2000 languages

Specaugment ⭐ 411

An implementation of SpecAugment with TensorFlow & PyTorch, introduced by Google Brain

Leopard ⭐ 390

On-device speech-to-text engine powered by deep learning

Speech-to-text server framework with next-gen Kaldi

Pocketsphinx Python ⭐ 367

Python interface to CMU Sphinxbase and Pocketsphinx libraries

Dragonfly ⭐ 366

Speech recognition framework allowing powerful Python-based scripting and extension of Dragon NaturallySpeaking (DNS), Windows Speech Recognition (WSR), Kaldi and CMU Pocket Sphinx

Dragonfly ⭐ 365

ARCHIVED! - Speech recognition framework allowing powerful Python-based scripting and extension of Dragon NaturallySpeaking (DNS) and Windows Speech Recognition (WSR)

Huggingsound ⭐ 357

HuggingSound: A toolkit for speech-related tasks based on Hugging Face's tools

Leaderboard ⭐ 351

SpeechIO Leaderboard: a large, robust, comprehensive, benchmarking platform for Automatic Speech Recognition.

Unispeech ⭐ 328

UniSpeech - Large Scale Self-Supervised Learning for Speech

Parrots ⭐ 318

Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) engine for Chinese, built on a speech library and easy to extend.

Edenai Apis ⭐ 313

Eden AI: simplify the use and deployment of AI technologies by providing a unique API that connects to the best possible AI engines

Opentransformer ⭐ 310

A No-Recurrence Sequence-to-Sequence Model for Speech Recognition

Kaldi Active Grammar ⭐ 305

Python Kaldi speech recognition with grammars that can be set active/inactive dynamically at decode-time

A List of Big Models

Libfaceid ⭐ 290

libfaceid is a research framework for prototyping of face recognition solutions. It seamlessly integrates multiple detection, recognition and liveness models w/ speech synthesis and speech recognition.

VOSK Speech Recognition Toolkit

Deepspeech German ⭐ 284

Automatic Speech Recognition (ASR) - German

Tensorflow_end2end_speech_recognition ⭐ 275

End-to-End speech recognition implementation based on TensorFlow (CTC, Attention, and MTL training)

Speech Recognition Uk ⭐ 262

Speech Recognition for Ukrainian

Livewhisper ⭐ 261

A nearly-live implementation of OpenAI's Whisper, using sounddevice. Requires existing Whisper install.

Attention Lvcsr ⭐ 259

End-to-End Attention-Based Large Vocabulary Speech Recognition

Jarvis Chatgpt ⭐ 242

A Conversational Assistant equipped with synthetic voices including J.A.R.V.I.S's. Powered by OpenAI and IBM Watson APIs and a Tacotron model for voice generation.

End2end Asr Pytorch ⭐ 239

End-to-End Automatic Speech Recognition on PyTorch

Kaldiio ⭐ 232

A pure python module for reading and writing kaldi ark files

Gpt Voice Conversation Chatbot ⭐ 232

Allows you to have an engaging and safely emotive spoken / CLI conversation with the AI ChatGPT / GPT-4 while giving you the option to let it remember things discussed.

Edgedict ⭐ 229

Working online speech recognition based on RNN Transducer. (Trained model available in releases.)

Aims to create a comprehensive voice toolkit for training, testing, and deploying speaker verification systems.

Stanford Ctc ⭐ 226

Neural net code for lexicon-free speech recognition with connectionist temporal classification

Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand".

Voicestreamai ⭐ 222

Near-Realtime audio transcription using self-hosted Whisper and WebSocket in Python/JS

Wav2vec2 Live ⭐ 218

Live speech recognition using Facebook's wav2vec 2.0 model.

Rnn_ctc ⭐ 216

Recurrent Neural Network and Long Short Term Memory (LSTM) with Connectionist Temporal Classification implemented in Theano. Includes a Toy training example.

Whisper At ⭐ 212

Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event Taggers"

Kaldi Offline Transcriber ⭐ 211

Offline transcription system for Estonian using Kaldi

Self Supervised Speech Recognition ⭐ 210

speech to text with self-supervised learning based on wav2vec 2.0 framework

Deepspeech Server ⭐ 198

A testing server for a speech to text service based on coqui.ai

1-100 of 531 search results

Copyright 2018-2024 Awesome Open Source. All rights reserved.