Awesome Open Source

Search results for speech to text

569 search results found

Whisper.cpp ⭐ 27,404

Port of OpenAI's Whisper model in C/C++

Deepspeech ⭐ 24,127

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

Leon ⭐ 13,937

🧠 Leon is your open-source personal assistant.

Kaldi ⭐ 13,453

kaldi-asr/kaldi is the official location of the Kaldi project.

NeMo: a toolkit for conversational AI

Faster Whisper ⭐ 8,711

Faster Whisper transcription with CTranslate2

Speech_recognition ⭐ 7,801

Speech recognition module for Python, supporting several engines and APIs, online and offline.

Whisperx ⭐ 7,510

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Asrt_speechrecognition ⭐ 7,253

A Deep-Learning-Based Chinese Speech Recognition System

Speechbrain ⭐ 7,166

A PyTorch-based Speech Toolkit

Vosk Api ⭐ 6,633

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

Annyang ⭐ 6,310

💬 Speech recognition for your site

Silero Models ⭐ 4,088

Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

Whisper Jax ⭐ 3,824

JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.

Pyvideotrans ⭐ 3,054

Translate a video from one language to another and add dubbing.

Lingvo ⭐ 2,776

Willow ⭐ 2,223

Open source, local, and self-hosted Amazon Echo/Google Home competitive Voice Assistant alternative

Tensorflow Speech Recognition ⭐ 2,150

🎙 Speech recognition using the TensorFlow deep learning framework and sequence-to-sequence neural networks

🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.

Kalliope ⭐ 1,652

Kalliope is a framework that will help you to create your own personal assistant.

Whisper Diarization ⭐ 1,538

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

Openseq2seq ⭐ 1,393

Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP

Nlp Models Tensorflow ⭐ 1,329

Gathers machine learning and Tensorflow deep learning models for NLP problems, 1.13 < Tensorflow < 2.0

Whisper Asr Webservice ⭐ 1,317

OpenAI Whisper ASR Webservice API

Dragonfire ⭐ 1,294

The open-source virtual assistant for Ubuntu-based Linux distributions

Whisper Timestamped ⭐ 1,217

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

Dc_tts ⭐ 1,148

A TensorFlow Implementation of DC-TTS: yet another text-to-speech model

Artyom.js ⭐ 1,125

A JavaScript library for voice control, voice commands, speech recognition, and speech synthesis. Create your own Siri, Google Now, or Cortana with Google Chrome within your website.

Botium Speech Processing ⭐ 938

Botium Speech Processing

Tensorflowasr ⭐ 890

⚡ TensorFlowASR: Almost State-of-the-art Automatic Speech Recognition in TensorFlow 2. Supports languages that can use characters or subwords

Quillman ⭐ 880

A chat app that transcribes audio in real-time, streams back a response from a language model, and synthesizes this response as natural-sounding speech.

@voicybot Telegram bot main repository

Nonocaptcha ⭐ 847

An asynchronous Python library to automate solving ReCAPTCHA v2 using audio

Open Speech Corpora ⭐ 830

💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies

Descriptive Deep Learning

Stephanie Va ⭐ 769

Stephanie is an open-source platform built specifically for voice-controlled applications and for automating daily tasks, imitating much of a virtual assistant's work.

Adapt Intent Parser

Frogbase ⭐ 704

Transform audio-visual content into navigable knowledge.

Awesome Whisper ⭐ 703

🔊 Awesome list for Whisper — an open-source AI-powered speech recognition system developed by OpenAI

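End-to-end Chinese speech recognition built on PaddlePaddle, from beginner basics to hands-on practice, with very simple introductory examples and highly practical enterprise projects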

Voice Recognition to Text Tool: a local speech-to-text service that runs offline and outputs JSON, timestamped SRT subtitles, and plain text

Nodejs Speech ⭐ 684

This repository is deprecated. All of its content and history has been moved to googleapis/google-cloud-node.

The official repository of the Eesen project

Open_stt ⭐ 671

Speech Demo ⭐ 652

Speech API examples

Whisper Playground ⭐ 637

Build real time speech2text web apps using OpenAI's Whisper https://openai.com/blog/whisper/

Whisper Ctranslate2 ⭐ 628

Whisper command line client compatible with original OpenAI client based on CTranslate2.

Speech To Text Benchmark ⭐ 570

Speech-to-text benchmark framework

Audapolis ⭐ 569

An editor for spoken-word audio with automatic transcription

Whisper_mic ⭐ 560

Project that allows one to use a microphone with OpenAI whisper.

Mauisamples ⭐ 540

.NET MAUI Samples

Cheetah ⭐ 537

On-device streaming speech-to-text engine powered by deep learning

Paddlepaddle Deepspeech ⭐ 536

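Speech recognition implemented with PaddlePaddle, targeting Chinese speech recognition. The project is complete and recognition quality is good. Supports inference on Windows and Jetson development boards.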

Autosub ⭐ 525

A CLI script to generate subtitle files (SRT/VTT/TXT) for any video using either DeepSpeech or Coqui

Whisperboard ⭐ 498

The open-source iOS app that's making quality voice transcription more accessible on mobile devices.

💬 /so.nus/ STT (speech to text) for Node with offline hotword detection

Whisper Standalone Win ⭐ 488

Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.

Voice Overlay Ios ⭐ 485

🗣 An overlay that gets your user’s voice permission and input as text in a customizable UI

Achieve your goals and keep your data private with Lotti. This life tracking app is designed to help you stay motivated and on track, all while keeping your personal information safe and secure. Now with on-device speech recognition.

Awesome Kaldi ⭐ 478

This is a list of features, scripts, blogs and resources for better using Kaldi ( http://kaldi-asr.org/ )

Java Speech Api ⭐ 468

The J.A.R.V.I.S. Speech API is designed to be simple and efficient, using the speech engines created by Google to provide functionality for parts of the API. Essentially, it is an API written in Java, including a recognizer, synthesizer, and a microphone capture utility. The project uses Google services for the synthesizer and recognizer. While this requires an Internet connection, it provides a complete, modern, and fully functional speech API in Java.

Tts Voice Wizard ⭐ 467

Speech to Text to Speech. Song now playing. Sends text as OSC messages to VRChat to display on avatar. (STTTS) (Speech to TTS) (VRC STT System) (VTuber TTS)

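A PyTorch implementation of a streaming and non-streaming automatic speech recognition framework, compatible with both online and offline recognition; currently supports Conforme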

React Speech Recognition ⭐ 461

💬Speech recognition for your React app

Ai Waifu Vtuber ⭐ 457

AI Vtuber for Streaming on Youtube/Twitch

Whishper ⭐ 443

Transcribe any audio to text, translate and edit subtitles 100% locally with a web UI. Powered by whisper models!

Evaluate your speech-to-text system with similarity measures such as word error rate (WER)

React Mic ⭐ 434

Record audio from a user's microphone and display a cool visualization.

Swiftwhisper ⭐ 398

🎤 The easiest way to transcribe audio in Swift

Proctoring Ai ⭐ 397

Software for automatic monitoring in online proctoring

Leopard ⭐ 390

On-device speech-to-text engine powered by deep learning

Phonetisaurus ⭐ 379

Phonetisaurus G2P

Realtimestt ⭐ 359

A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.

Huggingsound ⭐ 357

HuggingSound: A toolkit for speech-related tasks based on Hugging Face's tools

Whisper.net ⭐ 355

Whisper.net. Speech to text made simple using Whisper Models

Edenai Apis ⭐ 313

Eden AI: simplify the use and deployment of AI technologies by providing a unique API that connects to the best possible AI engines

Autoedit_2 ⭐ 311

Fast text-based video editing: a Node Electron OS X desktop app with a Backbone front end.

Kaldi Active Grammar ⭐ 305

Python Kaldi speech recognition with grammars that can be set active/inactive dynamically at decode-time

Gp.nvim ⭐ 302

Gp.nvim (GPT prompt) Neovim AI plugin: ChatGPT sessions & Instructable text/code operations & Speech to text [OpenAI]

Langhelper ⭐ 292

An application aiming to offer full language-learning functionality using ChatGPT, TTS, STT, and other AI models; supports conversation, speaking assessment, memorizing words in context, listening tests, and so on.

VOSK Speech Recognition Toolkit

Whisper Youtube ⭐ 282

🔉 Youtube Videos Transcription with OpenAI's Whisper

Tensorflow_end2end_speech_recognition ⭐ 275

End-to-End speech recognition implementation based on TensorFlow (CTC, Attention, and MTL training)

Speech Recognition Uk ⭐ 262

Speech Recognition for Ukrainian

Livewhisper ⭐ 261

A nearly-live implementation of OpenAI's Whisper, using sounddevice. Requires existing Whisper install.

An Android app that offers speech-to-text user interfaces to other apps

Openai Chat Api Workflow ⭐ 249

🎩 An Alfred 5 Workflow for using OpenAI Chat API to interact with GPT-3.5/GPT-4 🤖💬 It also allows image generation 🖼️, image understanding 👀, speech-to-text conversion 🎤, and text-to-speech synthesis 🔈

Openlrc ⭐ 247

Transcribe and translate voice into an LRC file using Whisper and GPT.

Kerasdeepspeech ⭐ 244

A Keras CTC implementation of Baidu's DeepSpeech for model experimentation

Voicefixer_main ⭐ 244

General Speech Restoration

CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages

Vosk Browser ⭐ 238

A speech recognition library running in the browser thanks to a WebAssembly build of Vosk

Gpt Voice Conversation Chatbot ⭐ 232

Allows you to have an engaging and safely emotive spoken / CLI conversation with the AI ChatGPT / GPT-4 while giving you the option to let it remember things discussed.

Edgedict ⭐ 229

Working online speech recognition based on RNN Transducer. (Trained model available in the releases.)

Speech_dataset ⭐ 229

A dataset for speech recognition

Modular OSC program creator, toolkit, and router made for VRChat. Show your heartrate, time, hardware stats, speech to text, control Spotify, and more! Includes drag-and-drop prefabs for your avatar.

Speech Note Linux app. Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine translation.

Voicestreamai ⭐ 222

Near-Realtime audio transcription using self-hosted Whisper and WebSocket in Python/JS

Whisper_dart ⭐ 220

Speech recognition in Dart that supports all audio formats, server side and client side, and all languages; CPU only

Wav2vec2 Live ⭐ 218

Live speech recognition using Facebook's wav2vec 2.0 model.

1-100 of 569 search results

Copyright 2018-2024 Awesome Open Source. All rights reserved.