Awesome Open Source

Search results for text to speech

609 search results found

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Leon ⭐ 13,937

🧠 Leon is your open-source personal assistant.

Openvoice ⭐ 13,002

Instant voice cloning by MyShell.

NeMo: a toolkit for conversational AI

🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)

Vall E X ⭐ 6,055

An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io

Emotivoice ⭐ 5,739

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

Gpt Sovits ⭐ 5,184

Just 1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)

Silero Models ⭐ 4,088

Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

Tensorflowtts ⭐ 3,698

😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)

Styletts2 ⭐ 3,464

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Amphion ⭐ 3,319

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Diffsinger ⭐ 3,123

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code

Pyvideotrans ⭐ 3,054

Translate a video from one language to another and add dubbing.

Awesome Prompt Engineering ⭐ 2,780

This repository contains hand-curated resources for Prompt Engineering, with a focus on Generative Pre-trained Transformer (GPT), ChatGPT, PaLM, etc.

Espeak Ng ⭐ 2,645

eSpeak NG is an open-source speech synthesizer that supports more than a hundred languages and accents.

Piper ⭐ 2,586

A fast, local neural text to speech system

Edge Tts ⭐ 2,532

Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

Vall E ⭐ 2,257

An unofficial PyTorch implementation of the audio LM VALL-E

Tacotron 2 ⭐ 2,167

DeepMind's Tacotron-2 Tensorflow implementation

Aeneas ⭐ 2,079

aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)

Python library and CLI tool to interface with Google Translate's text-to-speech API

Marytts ⭐ 1,850

MARY TTS: an open-source, multilingual text-to-speech synthesis system written in pure Java

Wavernn ⭐ 1,761

WaveRNN Vocoder + TTS

Pyttsx3 ⭐ 1,756

Offline text-to-speech synthesis for Python

Alan Sdk Flutter ⭐ 1,742

Conversational AI SDK for Flutter to build AI-powered voice assistants for Flutter applications (iOS and Android)

Alan Sdk Android ⭐ 1,732

Conversational AI SDK for Android to build AI-powered voice assistants for Android applications (Java, Kotlin)

Vall E ⭐ 1,560

PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html

Elevenlabs Python ⭐ 1,521

The official Python API for ElevenLabs Text to Speech.

Alan Sdk Ionic ⭐ 1,515

In-App assistant SDK to build a multimodal conversational UX for applications created with Ionic (React, Angular, Vue)

Parallelwavegan ⭐ 1,427

Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with PyTorch

Openseq2seq ⭐ 1,393

Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP

Hifi Gan ⭐ 1,376

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

Rhvoice ⭐ 1,364

A free and open-source speech synthesizer for Russian and other languages

Dragonfire ⭐ 1,294

The open-source virtual assistant for Ubuntu-based Linux distributions

Read Aloud ⭐ 1,202

An awesome browser extension that reads aloud webpage content with one click

Merlin ⭐ 1,189

This is now the official location of the Merlin project.

Alan Sdk Cordova ⭐ 1,070

In-App assistant SDK to build a multimodal conversational UX for Apache Cordova applications

Xzvoice ⭐ 1,031

Free and open source text-to-speech software

Transformertts ⭐ 977

🤖💬 Transformer TTS: Implementation of a non-autoregressive Transformer based neural network for text to speech.

Tts Generation Webui ⭐ 970

TTS Generation Web UI (Bark, MusicGen + AudioGen, Tortoise, RVC, Vocos, Demucs, SeamlessM4T, MAGNet)

Botium Speech Processing ⭐ 938

Botium Speech Processing

Vonage Php Sdk Core ⭐ 887

Vonage REST API client for PHP. API support for SMS, Voice, Text-to-Speech, Numbers, Verify (2FA) and more.

Voice Cloning App ⭐ 879

A Python/PyTorch app for easily synthesising human voices

Open Speech Corpora ⭐ 830

💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies

Cognitive Speech Tts ⭐ 775

Microsoft Text-to-Speech API sample code in several languages, part of Cognitive Services.

Multilingual_text_to_speech ⭐ 740

An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.

Eddiscovery ⭐ 731

Captain's log and 3D star map for Elite Dangerous

Realtimetts ⭐ 730

Converts text to speech in realtime

Audio Webui ⭐ 698

A web UI for different audio-related neural networks

Diffwave ⭐ 628

DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.

Augmentative and Alternative Communication (AAC) system with text-to-speech for the browser

Transformer Tts ⭐ 599

A PyTorch implementation of "Neural Speech Synthesis with Transformer Network"

Voice Builder ⭐ 591

An open-source text-to-speech (TTS) voice building tool

Alan Sdk Reactnative ⭐ 560

In-App assistant SDK to build a multimodal conversational UX for applications created with React Native (iOS, Android)

Flutter_tts ⭐ 529

Flutter Text to Speech package

Whisperlive ⭐ 505

A nearly-live implementation of OpenAI's Whisper.

Forwardtacotron ⭐ 487

⏩ Generating speech in a single forward pass without any attention!

Tts Voice Wizard ⭐ 467

Speech to Text to Speech. Song now playing. Sends text as OSC messages to VRChat to display on avatar. (STTTS) (Speech to TTS) (VRC STT System) (VTuber TTS)

A simple text-to-speech client for Azure TTS API.

Parakeet ⭐ 459

PAddle PARAllel text-to-speech toolKIT (supporting Tacotron2, Transformer TTS, FastSpeech2/FastPitch, SpeedySpeech, WaveFlow and Parallel WaveGAN)

Podcast Maker ⭐ 456

Fully automated video maker using motion graphics and text-to-speech synthesis to turn newsletters into daily YouTube videos.

Ims Toucan ⭐ 426

Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart. Objectives of the development are simplicity, modularity, controllability and multilinguality.

Alan Sdk Pcf ⭐ 426

Build a voice assistant for any application created with Microsoft Power Apps

Google Speech V2 ⭐ 424

💬 Reverse Engineering Google's Speech To Text API (v2)

Companion application for Elite Dangerous

Voicebox Pytorch ⭐ 401

Implementation of Voicebox, a new SOTA text-to-speech network from Meta AI, in PyTorch

Tiktok Voice ⭐ 394

Simple Python script to interact with the TikTok TTS API

Nnmnkwii ⭐ 375

Library to build speech synthesis systems designed for easy and fast prototyping.

eSpeak NG is an open source speech synthesizer that supports 101 languages and accents.

Vonage Node Sdk ⭐ 370

Vonage API client for Node.js. API support for SMS, Voice, Text-to-Speech, Numbers, Verify (2FA) and more.

Storyteller ⭐ 368

Multimodal AI Story Teller, built with Stable Diffusion, GPT, and neural text-to-speech

Bark Voice Cloning Hubert Quantizer ⭐ 365

The code for the bark-voicecloning model. Training and inference.

Prodiff ⭐ 352

PyTorch implementation of ProDiff (ACM-MM'22), with an extremely fast diffusion speech synthesis pipeline

Hms Ml Demo ⭐ 333

HMS ML Demo provides an example of integrating the Huawei ML Kit service into applications. This example demonstrates how to integrate services provided by ML Kit, such as face detection, text recognition, image segmentation, ASR, and TTS.

Bark.cpp ⭐ 315

Port of Suno AI's Bark in C/C++ for fast inference

Edenai Apis ⭐ 313

Eden AI: simplify the use and deployment of AI technologies by providing a unique API that connects to the best possible AI engines

Styletts ⭐ 310

Official Implementation of StyleTTS

Vits2_pytorch ⭐ 286

Unofficial VITS2 TTS implementation in PyTorch

Wavegrad ⭐ 283

Implementation of Google Brain's WaveGrad high-fidelity vocoder (paper: https://arxiv.org/pdf/2009.00713.pdf). First implementation on GitHub.

VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network

Matcha Tts ⭐ 276

[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching

Generspeech ⭐ 265

PyTorch Implementation of GenerSpeech (NeurIPS'22): a text-to-speech model towards zero-shot style transfer of OOD custom voice.

Speech Recognition Uk ⭐ 262

Speech Recognition for Ukrainian

Livewhisper ⭐ 261

A nearly-live implementation of OpenAI's Whisper, using sounddevice. Requires existing Whisper install.

Cross Lingual Voice Cloning ⭐ 253

Tacotron 2 - PyTorch implementation with faster-than-realtime inference modified to enable cross lingual voice cloning.

Openai Chat Api Workflow ⭐ 249

🎩 An Alfred 5 Workflow for using OpenAI Chat API to interact with GPT-3.5/GPT-4 🤖💬 It also allows image generation 🖼️, image understanding 👀, speech-to-text conversion 🎤, and text-to-speech synthesis 🔈

Wavegrad ⭐ 239

A fast, high-quality neural vocoder.

Google Tts ⭐ 235

Google TTS (Text-To-Speech) for node.js

Speech_dataset ⭐ 229

The dataset of Speech Recognition

Speech Note Linux app. Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine translation.

Sonos smart speaker controller API and command-line tools

Tts Cube ⭐ 216

End-2-end speech synthesis with recurrent neural networks

Vonage Ruby Sdk ⭐ 215

Vonage REST API client for Ruby. API support for SMS, Voice, Text-to-Speech, Numbers, Verify (2FA) and more.

The Naomi Project is an open source, technology agnostic platform for developing always-on, voice-controlled applications!

Msedgetts ⭐ 213

A simple Azure Speech Service module that uses the Microsoft Edge Read Aloud API

Portaspeech ⭐ 211

PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech

Go Astibob ⭐ 211

Golang framework to build an AI that can understand and speak back to you, and everything else you want

Waveglow ⭐ 202

A PyTorch implementation of WaveGlow: A Flow-based Generative Network for Speech Synthesis

1-100 of 609 search results

Copyright 2018-2024 Awesome Open Source. All rights reserved.