Awesome Open Source

Search results for speech recognition

1,151 search results found

Whisper_mic ⭐ 560

Project that allows one to use a microphone with OpenAI whisper.

Treasure Of Transformers ⭐ 541

💁 Awesome Treasure of Transformers Models for Natural Language processing contains papers, videos, blogs, official repo along with colab Notebooks. 🛫☑️

Cheetah ⭐ 537

On-device streaming speech-to-text engine powered by deep learning

Paddlepaddle Deepspeech ⭐ 536

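Speech recognition implemented with PaddlePaddle, targeting Chinese speech recognition. The project is complete and recognition quality is good. Supports inference on Windows and Jetson development boards.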

Free Spoken Digit Dataset ⭐ 518

A free audio dataset of spoken digits. Think MNIST for audio.

Interspeech 2023 Papers ⭐ 513

INTERSPEECH 2023 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!

Storytoolkitai ⭐ 504

An editing tool that uses AI to transcribe, understand content and search for anything in your footage, integrated with ChatGPT and other AI models

Whisper Finetune ⭐ 502

Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deployment, Windows desktop deployment, and Android deployment

Whisperboard ⭐ 498

The open-source iOS app that's making quality voice transcription more accessible on mobile devices.

💬 /so.nus/ STT (speech to text) for Node with offline hotword detection

Whisper Standalone Win ⭐ 488

Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.

Voice Overlay Ios ⭐ 485

🗣 An overlay that gets your user’s voice permission and input as text in a customizable UI

Achieve your goals and keep your data private with Lotti. This life tracking app is designed to help you stay motivated and on track, all while keeping your personal information safe and secure. Now with on-device speech recognition.

Awesome Kaldi ⭐ 478

This is a list of features, scripts, blogs and resources for better using Kaldi ( http://kaldi-asr.org/ )

Java Speech Api ⭐ 468

The J.A.R.V.I.S. Speech API is designed to be simple and efficient, using the speech engines created by Google to provide functionality for parts of the API. Essentially, it is an API written in Java, including a recognizer, synthesizer, and a microphone capture utility. The project uses Google services for the synthesizer and recognizer. While this requires an Internet connection, it provides a complete, modern, and fully functional speech API in Java.

Tts Voice Wizard ⭐ 467

Speech to Text to Speech. Song now playing. Sends text as OSC messages to VRChat to display on avatar. (STTTS) (Speech to TTS) (VRC STT System) (VTuber TTS)

Neural_sp ⭐ 466

End-to-end ASR/LM implementation with PyTorch

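A streaming and non-streaming automatic speech recognition framework implemented in PyTorch, compatible with both online and offline recognition. Currently supports Conforme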

React Speech Recognition ⭐ 461

💬Speech recognition for your React app

Ai Waifu Vtuber ⭐ 457

AI Vtuber for Streaming on Youtube/Twitch

Ctcwordbeamsearch ⭐ 453

Connectionist Temporal Classification (CTC) decoder with dictionary and language model.

Uspeech ⭐ 452

Speech recognition toolkit for the arduino

Whishper ⭐ 443

Transcribe any audio to text, translate and edit subtitles 100% locally with a web UI. Powered by whisper models!

Speech Backbones ⭐ 429

This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.

Alan Sdk Pcf ⭐ 426

Build a voice assistant for any application created with Microsoft Power Apps

Deep learning for audio processing

Zamia Speech ⭐ 413

Open tools and data for cloudless automatic speech recognition

Allosaurus ⭐ 411

Allosaurus is a pretrained universal phone recognizer for more than 2000 languages

Specaugment ⭐ 411

An implementation of SpecAugment with TensorFlow & PyTorch, introduced by Google Brain

Speechkitt ⭐ 409

🗣 A flexible GUI for Speech Recognition

Swiftwhisper ⭐ 398

🎤 The easiest way to transcribe audio in Swift

Nmtpytorch ⭐ 395

Sequence-to-Sequence Framework in PyTorch

Leopard ⭐ 390

On-device speech-to-text engine powered by deep learning

Phonetisaurus ⭐ 379

Phonetisaurus G2P

Speech-to-text server framework with next-gen Kaldi

Spec_augment ⭐ 374

🔦 A Pytorch implementation of GoogleBrain's SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition

Pocketsphinx Python ⭐ 367

Python interface to CMU Sphinxbase and Pocketsphinx libraries

Dragonfly ⭐ 366

Speech recognition framework allowing powerful Python-based scripting and extension of Dragon NaturallySpeaking (DNS), Windows Speech Recognition (WSR), Kaldi and CMU Pocket Sphinx

Dragonfly ⭐ 365

ARCHIVED! - Speech recognition framework allowing powerful Python-based scripting and extension of Dragon NaturallySpeaking (DNS) and Windows Speech Recognition (WSR)

Huggingsound ⭐ 357

HuggingSound: A toolkit for speech-related tasks based on Hugging Face's tools

Whisper.net ⭐ 355

Whisper.net. Speech to text made simple using Whisper Models

Leaderboard ⭐ 351

SpeechIO Leaderboard: a large, robust, comprehensive, benchmarking platform for Automatic Speech Recognition.

Unispeech ⭐ 328

UniSpeech - Large Scale Self-Supervised Learning for Speech

Speech_recognition ⭐ 327

A Flutter plugin to use speech recognition on iOS & Android (Swift/Java)

Amica is an open source interface for interactive communication with 3D characters with voice synthesis and speech recognition.

Caffe Speech Recognition ⭐ 320

Speech Recognition with the Caffe deep learning framework, migrating to

Parrots ⭐ 318

Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) engine for Chinese. Built on a speech library and easy to extend.

Android Speech ⭐ 317

Android speech recognition and text to speech made easy

Edenai Apis ⭐ 313

Eden AI: simplify the use and deployment of AI technologies by providing a unique API that connects to the best possible AI engines

Opentransformer ⭐ 310

A No-Recurrence Sequence-to-Sequence Model for Speech Recognition

Kaldi Active Grammar ⭐ 305

Python Kaldi speech recognition with grammars that can be set active/inactive dynamically at decode-time

A List of Big Models

Langhelper ⭐ 292

Aims to be a full-featured language-learning application built on ChatGPT, TTS, STT, and other AI models. Supports conversation, speaking assessment, memorizing words in context, listening tests, and so on.

Nonautoreggenprogress ⭐ 290

Tracking the progress in non-autoregressive generation (translation, transcription, etc.)

Libfaceid ⭐ 290

libfaceid is a research framework for prototyping of face recognition solutions. It seamlessly integrates multiple detection, recognition and liveness models w/ speech synthesis and speech recognition.

VOSK Speech Recognition Toolkit

Pocketsphinx Go ⭐ 286

CMU PocketSphinx for Golang, a lightweight speech recognition engine.

Deepspeech German ⭐ 284

Automatic Speech Recognition (ASR) - German

Whisper Youtube ⭐ 282

🔉 Youtube Videos Transcription with OpenAI's Whisper

Tensorflow_end2end_speech_recognition ⭐ 275

End-to-End speech recognition implementation based on TensorFlow (CTC, Attention, and MTL training)

Speech Recognition Uk ⭐ 262

Speech Recognition for Ukrainian

Livewhisper ⭐ 261

A nearly-live implementation of OpenAI's Whisper, using sounddevice. Requires existing Whisper install.

Fastasr ⭐ 259

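A project that implements ASR inference in C++. It has few dependencies, is simple to install, and runs inference quickly, including on ARM platforms such as the Raspberry Pi 4B. The supported model is optimized from Google's Transformer model and trained on the open-source WenetSpeech dataset, so recognition quality is good and comparable to many commercial ASR products.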

Attention Lvcsr ⭐ 259

End-to-End Attention-Based Large Vocabulary Speech Recognition

An Android app that offers speech-to-text user interfaces to other apps

Bigcidian ⭐ 248

Pronunciation lexicon covering both English and Chinese languages for Automatic Speech Recognition.

Deep Learning Papers For Fish ⭐ 248

A list of papers in deep learning for newcomers.

Kerasdeepspeech ⭐ 244

A Keras CTC implementation of Baidu's DeepSpeech for model experimentation

Jarvis Chatgpt ⭐ 242

A Conversational Assistant equipped with synthetic voices including J.A.R.V.I.S's. Powered by OpenAI and IBM Watson APIs and a Tacotron model for voice generation.

End2end Asr Pytorch ⭐ 239

End-to-End Automatic Speech Recognition on PyTorch

Vosk Browser ⭐ 238

A speech recognition library running in the browser thanks to a WebAssembly build of Vosk

Kaldiio ⭐ 232

A pure python module for reading and writing kaldi ark files

Gpt Voice Conversation Chatbot ⭐ 232

Allows you to have an engaging and safely emotive spoken / CLI conversation with the AI ChatGPT / GPT-4 while giving you the option to let it remember things discussed.

Edgedict ⭐ 229

Working online speech recognition based on RNN Transducer. (Trained model available in the releases.)

Speech_dataset ⭐ 229

The dataset of Speech Recognition

Aims to create a comprehensive voice toolkit for training, testing, and deploying speaker verification systems.

Modular OSC program creator, toolkit, and router made for VRChat. Show your heartrate, time, hardware stats, speech to text, control Spotify, and more! Includes drag-and-drop prefabs for your avatar.

Stanford Ctc ⭐ 226

Neural net code for lexicon-free speech recognition with connectionist temporal classification

Speech Note Linux app. Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine translation.

Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand".

Voicestreamai ⭐ 222

Near-Realtime audio transcription using self-hosted Whisper and WebSocket in Python/JS

Whisper_dart ⭐ 220

Speech recognition in Dart. Supports all audio formats, server-side and client-side use, and all languages. CPU only.

Whisper.unity ⭐ 218

Running speech to text model (whisper.cpp) in Unity3d on your local machine.

Wav2vec2 Live ⭐ 218

Live speech recognition using Facebook's wav2vec 2.0 model.

Whisper.rn ⭐ 217

React Native binding of whisper.cpp.

Rnn_ctc ⭐ 216

Recurrent Neural Network and Long Short Term Memory (LSTM) with Connectionist Temporal Classification implemented in Theano. Includes a Toy training example.

Aimybox Android Assistant ⭐ 215

Embeddable custom voice assistant for Android applications

The Naomi Project is an open source, technology agnostic platform for developing always-on, voice-controlled applications!

Whisper At ⭐ 212

Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event Taggers"

Kaldi Offline Transcriber ⭐ 211

Offline transcription system for Estonian using Kaldi

Kaldi-based Korean ASR (Korean speech recognition) open-source project

Self Supervised Speech Recognition ⭐ 210

speech to text with self-supervised learning based on wav2vec 2.0 framework

Voice Overlay Android ⭐ 210

🗣 An overlay that gets your user’s voice permission and input as text in a customizable UI

Command line speech recognition and transcription for macOS

Ai Audio Datasets ⭐ 199

This is a list of datasets consisting of speech, music, and sound effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio applications. It is mainly used for speech recognition, speech synthesis, singing voice synthesis, music information retrieval, music generation, etc.

Deepspeech Server ⭐ 198

A testing server for a speech to text service based on coqui.ai

Vakyansh Models ⭐ 196

Open source speech to text models for Indic Languages

Sepia Docs ⭐ 193

Documentation and Wiki for SEPIA. Please post your questions and bug-reports here in the issues section! Thank you :-)

The CIDLib general purpose C++ development environment

Cordova Plugin Speechrecognition ⭐ 191

🎤 Cordova Plugin for Speech Recognition

101-200 of 1,151 search results

Copyright 2018-2024 Awesome Open Source. All rights reserved.