Awesome Open Source

Search results for speech to text

569 search results found

Whisper.unity ⭐ 218

Run a speech-to-text model (whisper.cpp) in Unity3d on your local machine.

Rnn_ctc ⭐ 216

Recurrent Neural Network and Long Short-Term Memory (LSTM) with Connectionist Temporal Classification, implemented in Theano. Includes a toy training example.

The Naomi Project is an open source, technology agnostic platform for developing always-on, voice-controlled applications!

Msedgetts ⭐ 213

A simple Azure Speech Service module that uses the Microsoft Edge Read Aloud API

Go Astibob ⭐ 211

Golang framework to build an AI that can understand and speak back to you, and everything else you want

Self Supervised Speech Recognition ⭐ 210

Speech to text with self-supervised learning, based on the wav2vec 2.0 framework

Voice Overlay Android ⭐ 210

🗣 An overlay that gets your user’s voice permission and input as text in a customizable UI

Awesome Large Audio Models ⭐ 207

Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.

Deepspeech Server ⭐ 198

A testing server for a speech to text service based on coqui.ai

Willow Inference Server ⭐ 190

Open source, local, and self-hosted highly optimized language inference server supporting ASR/STT, TTS, and LLM across WebRTC, REST, and WS

Hey Jetson ⭐ 189

Deep Learning based Automatic Speech Recognition with attention for the Nvidia Jetson.

Asr Audio Data Links ⭐ 187

A list of publicly available audio data that anyone can download for ASR or other speech activities

Expressive_tacotron ⭐ 185

TensorFlow implementation of Expressive Tacotron

Voice Assistant ⭐ 184

Voice assistant for Visual Studio Code.

Lobe Tts ⭐ 182

🎤 Lobe TTS - A high-quality & reliable TTS/STT library for Server and Browser

Speaker_adapted_tts ⭐ 181

Making a TTS model with 1 minute of speech samples within 10 minutes

Dictate.js ⭐ 179

A small JavaScript library for browser-based real-time speech recognition. It uses Recorderjs for audio capture and a WebSocket connection to the Kaldi GStreamer server for speech recognition.

Sayboard ⭐ 178

An open-source on-device voice IME (keyboard) for Android using the Vosk library.

Speechtotext Websockets Javascript ⭐ 178

SDK & Sample to do speech recognition using websockets in Javascript

Whisper Website ⭐ 174

A simple web application that converts audio to subtitles using OpenAI's Whisper model

Speech To Text ⭐ 166

Real-time transcription using faster-whisper

Tacotron_asr ⭐ 165

Speech Recognition Using Tacotron

Ueazspeech ⭐ 162

This plugin integrates Azure Speech Cognitive Services in Unreal Engine.

React Speech Kit ⭐ 159

React hooks for Speech Recognition and Speech Synthesis

Chinese Automatic Speech Recognition ⭐ 157

Chinese speech recognition

Speech to Text and KB input captions for OBS, VRChat, Twitch chat and Discord

Alexa Rubykit ⭐ 156

Amazon Echo Alexa's App Kit Ruby Implementation

Speecht ⭐ 156

Open-source speech-to-text software written in TensorFlow

Runtimespeechrecognizer ⭐ 153

Cross-platform, real-time, offline speech recognition plugin for Unreal Engine. Based on OpenAI's Whisper via whisper.cpp.

Go Astideepspeech ⭐ 152

Golang bindings for Mozilla's DeepSpeech speech-to-text library

Sova Asr ⭐ 149

SOVA ASR (Automatic Speech Recognition)

Speech2text ⭐ 148

A Deep-Learning-Based Persian Speech Recognition System

Web Voice Processor ⭐ 147

A library for real-time voice processing in web browsers

Converse ⭐ 147

Conversational text Analysis using various NLP techniques

Zzz Retired__openstt ⭐ 146

RETIRED - OpenSTT is now retired. If you would like more information on Mycroft AI's open source STT projects, please visit:

Synthalingua ⭐ 144

Synthalingua - Real Time Translation

Bentochain ⭐ 144

A voice-enabled chatbot application built using 🦜️🔗 LangChain, text-to-speech and speech-to-text models from 🤗 Hugging Face, and 🍱 BentoML.

Playwright Recaptcha ⭐ 142

A Python library for solving reCAPTCHA v2 and v3 with Playwright

Gdansk Ai ⭐ 139

🦭 Full stack AI voice chatbot (speech-to-text, LLM, text-to-speech) with integrations to Auth0, OpenAI, Google Cloud and Stripe - Web App, Web API and AI API

Deep_avsr ⭐ 138

A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper.

Speech To Text Russian ⭐ 138

A project for Russian speech recognition based on pykaldi.

Obs Localvocal ⭐ 136

OBS plugin for local speech recognition and captioning using AI

Audiototext ⭐ 132

Transcribe and translate audio to text using Whisper and DeepL.

⦠ Angle: new speakable syntax for python 💡

Tevr Asr Tool ⭐ 132

State-of-the-art (ranked #1 Aug 2022) German Speech Recognition in 284 lines of C++. This is a 100% private 100% offline 100% free CLI tool.

Awesome Ai Services ⭐ 127

An overview of the AI-as-a-service landscape

Tensorflow Ctc Speech Recognition ⭐ 127

Application of Connectionist Temporal Classification (CTC) for Speech Recognition (Tensorflow 1.0 but compatible with 2.0).

Speechrecognizerbutton ⭐ 126

UIButton subclass with push to talk recording, speech recognition and Siri-style waveform view.

Awesome Korean Speech Recognition ⭐ 125

A list of Korean speech recognition (STT) APIs, with performance benchmarks for each.

Spokestack Python ⭐ 124

Spokestack is a library that allows a user to easily incorporate a voice interface into any Python application with a focus on embedded systems.

Noveldokusha ⭐ 124

Android web novel reader

Trained models for automatic speech recognition (ASR). A library to quickly build applications that require speech to text conversion.

Elevateaijavasdk ⭐ 121

Java SDK for ElevateAI

Whisper Writer ⭐ 118

💬📝 A small dictation app using OpenAI's Whisper speech recognition model.

Automatic Speech Recognition ⭐ 116

🎧 Automatic Speech Recognition: DeepSpeech & Seq2Seq (TensorFlow)

Elevateaidotnetsdk ⭐ 115

.Net core 6 SDK for ElevateAI

Sapphire ⭐ 113

A free and open source replacement for Google Assistant on Android devices, meant to integrate with the Sapphire Framework. It contains both speech-to-text and text-to-speech services. It does not require Google services or network connectivity

Chatgpt Voice ⭐ 112

Have a conversation with ChatGPT. Casually 🔈 🤖 ⚡️

Elevateaipythonsdk ⭐ 111

ElevateAI - Speech-to-text API Python SDK

Casr Demo ⭐ 109

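A Flask-based web demo system for Chinese automatic speech recognition, covering speech recognition, speech synthesis, and voiceprint-based speaker recognition.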

Obsidian Transcription ⭐ 107

Obsidian plugin to create high-quality transcriptions from markdown linked audio files

Las_mandarin_pytorch ⭐ 104

Listen, Attend and Spell model and a pretrained Mandarin Chinese ASR model

Machine Learning Training Utilities (for TensorFlow and PyTorch)

Awesome Russian Speech ⭐ 97

Russian speech technology links

Talk2gpt ⭐ 92

GPT-3 client for Windows and Unix with memories management that supports both text and speech in any language. Includes a free text2image

Whisper Auto Transcribe ⭐ 91

Auto transcribe tool based on whisper

Simple Obs Stt ⭐ 91

Speech-to-text and keyboard input captions for OBS.

Chrome Web Speech Api ⭐ 90

Chrome Web Speech API

Let your talking do the code

Talk with ChatGPT using your VOICE

GPT-3 client for Windows and Unix with memories management that supports both text and speech in any language.

Companion ⭐ 86

Generative-AI-Powered Foreign-Language Private Tutor

Goodbyecatpcha ⭐ 86

An asynchronous Python library to automate solving ReCAPTCHA v2 using audio and image recognition

Mongolian Speech Recognition ⭐ 86

Mongolian speech recognition with PyTorch

Vue Pwa Speech ⭐ 83

A Vue 2 Progressive Web App that performs synchronous speech recognition (speech to text) with Google Cloud Speech

Deepspeech.mxnet ⭐ 82

A MXNet implementation of Baidu's DeepSpeech architecture

Deepgram Js Sdk ⭐ 81

Official JavaScript SDK for Deepgram's automated speech recognition APIs.

Kaldi Serve ⭐ 79

Server framework for Kaldi ASR Toolkit

Transcribee ⭐ 79

open source audio and video transcription software

B.e.n.j.i. ⭐ 79

B.E.N.J.I.- The Impossible Missions Force's digital assistant

Voice Input Button2 ⭐ 77

New version of the voice input button: a Vue component built on the new iflytek voice dictation API (streaming version).

Asr Wav2vec Finetune ⭐ 76

⚡ Finetune wav2vec 2.0 for speech recognition

Kim Voice Assistant ⭐ 76

Kim, your personal voice kit for home intelligence.

Listen Attend And Spell ⭐ 75

TensorFlow implementation of "Listen, Attend and Spell" authored by William Chan. This project uses TensorFlow's input pipeline and Estimator API, which makes training and evaluation truly end-to-end.

Unsuperior Ai Waifu ⭐ 73

AI waifu that can run on your phone or PC

Nativescript Speech Recognition ⭐ 73

💬 Speech to text, using the awesome engines readily available on the device.

Awesome Openai Whisper ⭐ 72

A curated list of awesome OpenAI Whisper resources

Cognitive Services Voice Assistant ⭐ 72

Welcome to the Microsoft Voice Assistant samples repository! Here you will find samples to help you get started building client applications for your bot or Custom Command service. You will also be able to easily deploy a working Custom Command-based Voice Assistant to your own Azure subscription.

Keenasr Ios Poc ⭐ 71

Proof of concept app that demonstrates use of KeenASR SDK in ObjC. WE ARE HIRING: https://keenresearch.com/careers.html

Wav2letter ⭐ 70

Speech recognition model based on a FAIR research paper, built using PyTorch.

An Android ChatBot powered by IBM Watson Services (Assistant V1, Text-to-Speech, and Speech-to-Text with Speaker Recognition) on IBM Cloud.

Ossspeechkit ⭐ 67

OSSSpeechKit offers a native iOS Speech wrapper for AVFoundation and Apple's Speech.

Wav2letter.pytorch ⭐ 67

A fully convolutional network for speech-to-text, built on PyTorch.

Viet Asr ⭐ 65

VietASR - Vietnamese Automatic Speech Recognition

Openai Whisper Talk ⭐ 65

openai-whisper-talk is a sample voice conversation application powered by OpenAI technologies such as Whisper, Completions, Embeddings, and the latest Text-to-Speech. The application is built using Nuxt, a Javascript framework based on Vue.js.

Vue Speech Streaming ⭐ 64

Streaming speech recognition (speech to text) for Vue 2 with Google Cloud Speech

Simple_diarizer ⭐ 63

Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code

Inimesed ⭐ 63

An Android app that lets you search your contacts by voice. Internet not required. Based on Pocketsphinx. Uses Estonian acoustic models.

Russian_stt_text_normalization ⭐ 63

Russian text normalization pipeline for speech-to-text and other applications based on tagging s2s networks

Deepspeech Websocket Server ⭐ 62

Server & client for DeepSpeech using WebSockets for real-time speech recognition in separate environments

101-200 of 569 search results

Copyright 2018-2024 Awesome Open Source. All rights reserved.