Awesome Open Source
Search results for python speech recognition
672 search results found
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
🧠 Leon is your open-source personal assistant.
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
NeMo: a toolkit for conversational AI
Speech recognition module for Python, supporting several engines and APIs, online and offline.
End-to-End Speech Processing Toolkit
A Deep-Learning-Based Chinese Speech Recognition System
A PyTorch-based Speech Toolkit
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Faster Whisper transcription with CTranslate2
Speech To Text Wavenet
Speech-to-Text-WaveNet: End-to-end sentence-level English speech recognition based on DeepMind's WaveNet and TensorFlow
A small speech recognizer
End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow
pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.
Tensorflow Speech Recognition
🎙Speech recognition using the tensorflow deep learning framework, sequence-to-sequence neural networks
Kalliope is a framework that will help you to create your own personal assistant.
DELTA is a deep learning based natural language and speech processing platform.
Machine Learning Resources, Practice and Research
Lip Reading Deeplearning
🔓 Lip Reading - Cross Audio-Visual Recognition using 3D Architectures
Alias is a teachable “parasite” that is designed to give users more control over their smart assistants, both when it comes to customisation and privacy. Through a simple app, the user can train Alias to react to a custom wake-word/sound, and once trained, Alias can take control of the user's home assistant by activating it for them.
Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
The open-source virtual assistant for Ubuntu-based Linux distributions
Speech Emotion Analyzer
The neural network model is capable of detecting five different male/female emotions from audio speeches. (Deep Learning, NLP, Python)
A Python wrapper for Kaldi
A Fundamental End-to-End Speech Recognition Toolkit
Whisper Asr Webservice
OpenAI Whisper ASR Webservice API
Espresso: A Fast End-to-End Neural Speech Recognition Toolkit
Kaldi Gstreamer Server
Real-time full-duplex speech recognition server, based on the Kaldi toolkit and the GStreamer framework.
💬 SpeechPy - A Library for Speech Processing and Recognition: http://speechpy.readthedocs.io/en/latest/
A chat app that transcribes audio in real-time, streams back a response from a language model, and synthesizes this response as natural-sounding speech.
Descriptive Deep Learning
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
Stephanie is an open-source platform built specifically for voice-controlled applications and for automating daily tasks, imitating much of a virtual assistant's work.
SincNet is a neural architecture for efficiently processing raw audio samples.
Examples of how to use or integrate DeepSpeech
WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries
Tools for handling speech data in machine learning projects.
PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)
A PyTorch Implementation of End-to-End Models for Speech-to-Text
💬 An On-Premises, Streaming Speech Recognition System
Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
Build real time speech2text web apps using OpenAI's Whisper https://openai.com/blog/whisper/
A lightweight, simple-to-use, RNN wake word listener
Personal assistant built using Python libraries. It handles a wide range of tasks, including sending emails, optical text recognition, dynamic news reporting at any time with API integration, to-do list generation, opening any website with just a voice command, playing music, Wikipedia searching, a dictionary with intelligent sensing (i.e. automatic spell checking), weather reporting (temperature, wind speed, humidity), YouTube searching, Google Maps searching, YouTube downloading, etc.
Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.
Connectionist Temporal Classification (CTC) decoding algorithms: best path, beam search, lexicon search, prefix search, and token passing. Implemented in Python.
Speech To Text Benchmark
speech to text benchmark framework
On-device Speech-to-Intent engine powered by deep learning
Irene Voice Assistant
Irina is a Russian voice assistant that works offline. Supports skills through plugins.
Real-time speech recognition using next-gen Kaldi with ncnn, without an Internet connection. Supports iOS, Android, Raspberry Pi, VisionFive2, etc.
Treasure Of Transformers
💁 Awesome Treasure of Transformers Models for Natural Language processing contains papers, videos, blogs, official repo along with colab Notebooks. 🛫☑️
Free Spoken Digit Dataset
A free audio dataset of spoken digits. Think MNIST for audio.
On-device streaming speech-to-text engine powered by deep learning
📦 Quickly convert between Chinese numerals and Arabic numerals (latest features: conversion of fractions, dates, temperatures, and more)
End-to-end ASR/LM implementation with PyTorch
Whisper command line client compatible with original OpenAI client based on CTranslate2.
An implementation of SpecAugment with TensorFlow & PyTorch, introduced by Google Brain
Allosaurus is a pretrained universal phone recognizer for more than 2000 languages
An editing tool that uses AI to transcribe, understand content and search for anything in your footage, integrated with ChatGPT and Davinci Resolve Studio.
On-device speech-to-text engine powered by deep learning
Python interface to CMU Sphinxbase and Pocketsphinx libraries
ARCHIVED! - Speech recognition framework allowing powerful Python-based scripting and extension of Dragon NaturallySpeaking (DNS) and Windows Speech Recognition (WSR)
HuggingSound: A toolkit for speech-related tasks based on Hugging Face's tools
Speech recognition framework allowing powerful Python-based scripting and extension of Dragon NaturallySpeaking (DNS), Windows Speech Recognition (WSR), Kaldi and CMU Pocket Sphinx
UniSpeech - Large Scale Self-Supervised Learning for Speech
SpeechIO Leaderboard: a large, robust, comprehensive, benchmarking platform for Automatic Speech Recognition.
A No-Recurrence Sequence-to-Sequence Model for Speech Recognition
Kaldi Active Grammar
Python Kaldi speech recognition with grammars that can be set active/inactive dynamically at decode-time
Speech-to-text server framework with next-gen Kaldi
A List of Big Models
libfaceid is a research framework for prototyping of face recognition solutions. It seamlessly integrates multiple detection, recognition and liveness models w/ speech synthesis and speech recognition.
VOSK Speech Recognition Toolkit
Automatic Speech Recognition (ASR) - German
End-to-End speech recognition implementation based on TensorFlow (CTC, Attention, and MTL training)
Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) engine for Chinese, built on a speech library and easy to extend.
End-to-End Attention-Based Large Vocabulary Speech Recognition
Eden AI: simplify the use and deployment of AI technologies by providing a unique API that connects to the best possible AI engines
Speech Recognition Uk
Speech Recognition for Ukrainian
A Conversational Assistant equipped with synthetic voices including J.A.R.V.I.S's. Powered by OpenAI and IBM Watson APIs and a Tacotron model for voice generation.
End2end Asr Pytorch
End-to-End Automatic Speech Recognition on PyTorch
A pure python module for reading and writing kaldi ark files
Working online speech recognition based on RNN Transducer. (Trained model available in the releases.)
Aims to create a comprehensive voice toolkit for training, testing, and deploying speaker verification systems.
Neural net code for lexicon-free speech recognition with connectionist temporal classification
Live speech recognition using Facebook's wav2vec 2.0 model.
Recurrent Neural Network and Long Short Term Memory (LSTM) with Connectionist Temporal Classification implemented in Theano. Includes a Toy training example.
Kaldi Offline Transcriber
Offline transcription system for Estonian using Kaldi
Self Supervised Speech Recognition
speech to text with self-supervised learning based on wav2vec 2.0 framework
Gpt Voice Conversation Chatbot
Allows you to have an engaging and safely emotive spoken / CLI conversation with the AI ChatGPT / GPT-4 while giving you the option to let it remember things discussed.
A testing server for a speech to text service based on coqui.ai
Python module for evaluating ASR hypotheses (e.g. word error rate, word recognition rate).
Deep Learning based Automatic Speech Recognition with attention for the Nvidia Jetson.
Ai Waifu Vtuber
AI Vtuber for Streaming on Youtube/Twitch
ClovaCall dataset and Pytorch LAS baseline code (Interspeech 2020)
Voice Activity Detection based on Deep Learning & TensorFlow
Copyright 2018-2023 Awesome Open Source. All rights reserved.