Awesome Open Source

Search results for speech recognition

1,151 search results found

Transformers ⭐ 124,049

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Whisper.cpp ⭐ 27,404

Port of OpenAI's Whisper model in C/C++

Deepspeech ⭐ 24,127

DeepSpeech is an open-source embedded (offline, on-device) speech-to-text engine that can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers.

Leon ⭐ 13,937

🧠 Leon is your open-source personal assistant.

Kaldi ⭐ 13,453

kaldi-asr/kaldi is the official location of the Kaldi project.

Deeplearningexamples ⭐ 12,073

State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.

Deep Learning Drizzle ⭐ 10,767

Drench yourself in Deep Learning, Reinforcement Learning, Machine Learning, Computer Vision, and NLP by learning from these exciting lectures!!

Paddlespeech ⭐ 10,225

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

NeMo: a toolkit for conversational AI

Faster Whisper ⭐ 8,711

Faster Whisper transcription with CTranslate2

Speech_recognition ⭐ 7,801

Speech recognition module for Python, supporting several engines and APIs, online and offline.

Espnet ⭐ 7,563

End-to-End Speech Processing Toolkit

Whisperx ⭐ 7,510

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Asrt_speechrecognition ⭐ 7,253

A Deep-Learning-Based Chinese Speech Recognition System

Speechbrain ⭐ 7,166

A PyTorch-based Speech Toolkit

Vosk Api ⭐ 6,633

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

Wav2letter ⭐ 6,326

Facebook AI Research's Automatic Speech Recognition Toolkit

Annyang ⭐ 6,310

💬 Speech recognition for your site

Openvino ⭐ 5,979

OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference

Silero Models ⭐ 4,088

Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

Whisper Jax ⭐ 3,824

JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.

Wenet ⭐ 3,717

Production First and Production Ready End-to-End Speech Recognition Toolkit

Pocketsphinx ⭐ 3,620

A small speech recognizer

Speech To Text Wavenet ⭐ 3,586

Speech-to-Text-WaveNet: End-to-end sentence-level English speech recognition based on DeepMind's WaveNet and TensorFlow

Porcupine ⭐ 3,423

On-device wake word detection powered by deep learning

Awesome Speech Recognition Speech Synthesis Papers ⭐ 2,869

Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)

Lingvo ⭐ 2,776

Distil Whisper ⭐ 2,760

Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.

Automatic_speech_recognition ⭐ 2,743

End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow

Ml Road ⭐ 2,742

Machine Learning Resources, Practice and Research

Alan Sdk Web ⭐ 2,377

Actionable AI SDK for Web to enable text and voice conversations with actions (JavaScript, React, Angular, Vue, Ember, Electron)

Funasr ⭐ 2,315

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models.

Willow ⭐ 2,223

Open-source, local, and self-hosted voice assistant alternative that is competitive with Amazon Echo and Google Home

Tensorflow Speech Recognition ⭐ 2,150

🎙 Speech recognition using the TensorFlow deep learning framework and sequence-to-sequence neural networks

Pytorch Kaldi ⭐ 2,138

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by PyTorch, while feature extraction, label computation, and decoding are performed with the Kaldi toolkit.

Rhasspy ⭐ 2,036

Offline private voice assistant for many human languages

🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.

Alan Sdk Ios ⭐ 1,909

Actionable AI SDK for iOS to enable text and voice conversations with actions (Swift, Objective-C)

Alan Sdk Flutter ⭐ 1,742

Conversational AI SDK for Flutter to build AI-powered voice assistants for Flutter applications (iOS and Android)

Wer_are_we ⭐ 1,734

An attempt at tracking the state of the art and recent results (bibliography) in speech recognition.

Alan Sdk Android ⭐ 1,732

Conversational AI SDK for Android to build AI-powered voice assistants for Android applications (Java, Kotlin)

Voice ⭐ 1,681

🎤 React Native Voice Recognition library for iOS and Android (Online and Offline Support)

Kalliope ⭐ 1,652

Kalliope is a framework that will help you to create your own personal assistant.

Delta ⭐ 1,584

DELTA is a deep learning based natural language and speech processing platform.

Julius ⭐ 1,558

Open-Source Large Vocabulary Continuous Speech Recognition Engine

Whisper Diarization ⭐ 1,538

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

Alan Sdk Ionic ⭐ 1,515

In-App assistant SDK to build a multimodal conversational UX for applications created with Ionic (React, Angular, Vue)

Lip Reading Deeplearning ⭐ 1,433

🔓 Lip Reading - Cross Audio-Visual Recognition using 3D Architectures

Project_alias ⭐ 1,421

Alias is a teachable “parasite” designed to give users more control over their smart assistants, in terms of both customisation and privacy. Through a simple app, the user can train Alias to react to a custom wake word or sound. Once trained, Alias can take control of your home assistant by activating it for you.

Ios_ml ⭐ 1,406

List of Machine Learning, AI, NLP solutions for iOS. The most recent version of this article can be found on my blog.

Openseq2seq ⭐ 1,393

Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP

Awesome Diarization ⭐ 1,384

A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.

Whisper Asr Webservice ⭐ 1,317

OpenAI Whisper ASR Webservice API

Whisper Turbo ⭐ 1,313

Cross-Platform, GPU Accelerated Whisper 🏎️

Dragonfire ⭐ 1,294

The open-source virtual assistant for Ubuntu-based Linux distributions

Whisper Timestamped ⭐ 1,217

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

Speech Emotion Analyzer ⭐ 1,155

The neural network model is capable of detecting five different male/female emotions from speech audio. (Deep Learning, NLP, Python)

Artyom.js ⭐ 1,125

A JavaScript library for voice control, voice commands, speech recognition, and speech synthesis. Create your own Siri, Google Now, or Cortana with Google Chrome within your website.

Alan Sdk Cordova ⭐ 1,070

In-App assistant SDK to build a multimodal conversational UX for Apache Cordova applications

Pykaldi ⭐ 954

A Python wrapper for Kaldi

Espresso ⭐ 930

Espresso: A Fast End-to-End Neural Speech Recognition Toolkit

Tensorflowasr ⭐ 890

⚡ TensorFlowASR: Almost State-of-the-art Automatic Speech Recognition in TensorFlow 2. Supports languages that can be written with characters or subwords

Quillman ⭐ 880

A chat app that transcribes audio in real-time, streams back a response from a language model, and synthesizes this response as natural-sounding speech.

Kaldi Gstreamer Server ⭐ 865

Real-time full-duplex speech recognition server, based on the Kaldi toolkit and the GStreamer framework.

Rhasspy ⭐ 832

Rhasspy voice assistant for offline home automation

Open Speech Corpora ⭐ 830

💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies

Speechpy ⭐ 828

💬 SpeechPy - A Library for Speech Processing and Recognition: http://speechpy.readthedocs.io/en/latest/

An open-source implementation of a sequence-to-sequence-based speech processing engine

Descriptive Deep Learning

Conformer ⭐ 809

[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)

Vosk Server ⭐ 802

WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries

Tools for handling speech data in machine learning projects.

Stephanie Va ⭐ 769

Stephanie is an open-source platform built specifically for voice-controlled applications and for automating daily tasks, imitating much of a virtual assistant's work.

Sincnet ⭐ 764

SincNet is a neural architecture for efficiently processing raw audio samples.

Mycroft Precise ⭐ 749

A lightweight, simple-to-use, RNN wake word listener

Deepspeech Examples ⭐ 739

Examples of how to use or integrate DeepSpeech

J.a.r.v.i.s ⭐ 719

Personal assistant built using Python libraries. Its features include sending emails, optical text recognition, dynamic news reporting at any time with API integration, to-do list generation, opening any website with a voice command, playing music, Wikipedia searching, a dictionary with intelligent sensing (automatic spell checking), weather reporting (temperature, wind speed, humidity), YouTube searching, Google Maps searching, and YouTube downloading.

Salmonn ⭐ 710

SALMONN: Speech Audio Language Music Open Neural Network

Adapt Intent Parser

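End-to-end Chinese speech recognition implemented with PaddlePaddle, from getting started to practical use, with very simple introductory examples and highly practical enterprise projects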

Voice Recognition to Text Tool: a local speech-to-text service that runs offline and outputs JSON, SRT subtitles with timestamps, and plain text

Sherpa Ncnn ⭐ 673

Real-time speech recognition using next-gen Kaldi with ncnn without Internet connection. Support iOS, Android, Raspberry Pi, VisionFive2, etc.

The official repository of the Eesen project

A PyTorch Implementation of End-to-End Models for Speech-to-Text

Openspeech ⭐ 653

Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.

Speech Demo ⭐ 652

Speech API examples

Libreasr ⭐ 647

💬 An On-Premises, Streaming Speech Recognition System

Irene Voice Assistant ⭐ 644

Irina: a Russian voice assistant that works offline. Supports skills through plugins.

Speecht5 ⭐ 638

Unified-Modal Speech-Text Pre-Training for Spoken Language Processing

Whisper Playground ⭐ 637

Build real time speech2text web apps using OpenAI's Whisper https://openai.com/blog/whisper/

Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.

Whisper Ctranslate2 ⭐ 628

Whisper command line client compatible with original OpenAI client based on CTranslate2.

📦 Quickly convert between Chinese numerals and Arabic numerals (latest features: conversion of fractions, dates, temperatures, and more)

Chinese_text_normalization ⭐ 583

Chinese text normalization for speech processing

Dialogflow Android Client ⭐ 578

Android SDK for Dialogflow

Ctcdecoder ⭐ 577

Connectionist Temporal Classification (CTC) decoding algorithms: best path, beam search, lexicon search, prefix search, and token passing. Implemented in Python.

On-device Speech-to-Intent engine powered by deep learning

Kospeech ⭐ 572

Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition leveraging PyTorch and Hydra.

Speech To Text Benchmark ⭐ 570

Speech-to-text benchmark framework

Alan Sdk Reactnative ⭐ 560

In-App assistant SDK to build a multimodal conversational UX for applications created with React Native (iOS, Android)

1-100 of 1,151 search results

Copyright 2018-2024 Awesome Open Source. All rights reserved.