Awesome Open Source

Search results for speech synthesis

472 search results found

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Leon ⭐ 13,937

🧠 Leon is your open-source personal assistant.

Deeplearningexamples ⭐ 12,073

State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.

Paddlespeech ⭐ 10,011

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

NeMo: a toolkit for conversational AI

So Vits Svc Fork ⭐ 8,080

so-vits-svc fork with realtime support, improved interface and more features.

Espnet ⭐ 7,563

End-to-End Speech Processing Toolkit

Emotivoice ⭐ 5,739

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

Silero Models ⭐ 4,088

Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

Tensorflowtts ⭐ 3,698

😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)

Styletts2 ⭐ 3,464

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Amphion ⭐ 3,319

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Diffsinger ⭐ 3,123

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code

Awesome Speech Recognition Speech Synthesis Papers ⭐ 2,869

Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)

Tacotron ⭐ 2,845

A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial)

Lingvo ⭐ 2,776

Espeak Ng ⭐ 2,645

eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.

Piper ⭐ 2,586

A fast, local neural text to speech system

Edge Tts ⭐ 2,532

Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

Whisperspeech ⭐ 2,419

An Open Source text-to-speech system built by inverting Whisper.

Chat With Gpt ⭐ 2,202

An open-source ChatGPT app with a voice

Tacotron 2 ⭐ 2,167

DeepMind's Tacotron-2 Tensorflow implementation

Deepvoice3_pytorch ⭐ 1,906

PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models

Marytts ⭐ 1,850

MARY TTS -- an open-source, multilingual text-to-speech synthesis system written in pure Java

Wavernn ⭐ 1,761

WaveRNN Vocoder + TTS

Kalliope ⭐ 1,652

Kalliope is a framework that will help you to create your own personal assistant.

Openutau ⭐ 1,628

Open singing synthesis platform / Open source UTAU successor

Wavenet_vocoder ⭐ 1,617

WaveNet vocoder

Parallelwavegan ⭐ 1,427

Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with Pytorch

Openseq2seq ⭐ 1,393

Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP

Hifi Gan ⭐ 1,376

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

Rhvoice ⭐ 1,364

A free and open-source speech synthesizer for Russian and other languages

Merlin ⭐ 1,189

This is now the official location of the Merlin project.

Artyom.js ⭐ 1,125

A JavaScript library for voice control, voice commands, speech recognition, and speech synthesis. Create your own Siri, Google Now, or Cortana with Google Chrome within your website.

Pororo ⭐ 1,081

PORORO: Platform Of neuRal mOdels for natuRal language prOcessing

World ⭐ 1,027

A high-quality speech analysis, manipulation and synthesis system

Software Automatic Mouth - Tiny Speech Synthesizer

Naturalspeech2 Pytorch ⭐ 950

Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in Pytorch

Melgan Neurips ⭐ 872

GAN-based Mel-Spectrogram Inversion Network for Text-to-Speech Synthesis

Open Speech Corpora ⭐ 830

💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies

An open-source implementation of a sequence-to-sequence based speech processing engine

Natspeech ⭐ 814

A Non-Autoregressive Text-to-Speech (NAR-TTS) framework, including official PyTorch implementation of PortaSpeech (NeurIPS 2021) and DiffSpeech (AAAI 2022)

Flowtron ⭐ 789

Flowtron is an auto-regressive flow-based generative network for text to speech synthesis with control over speech variation and style transfer

Cognitive Speech Tts ⭐ 775

Microsoft Text-to-Speech API sample code in several languages, part of Cognitive Services.

Yourtts ⭐ 741

YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone

Multilingual_text_to_speech ⭐ 740

An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.

Voicefixer ⭐ 735

General Speech Restoration

Realtimetts ⭐ 730

Converts text to speech in realtime

Fastspeech ⭐ 723

An implementation of FastSpeech based on PyTorch.

AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss

Irene Voice Assistant ⭐ 644

Irina: a Russian voice assistant that works offline. Supports skills via plugins.

Speecht5 ⭐ 638

Unified-Modal Speech-Text Pre-Training for Spoken Language Processing

Diffwave ⭐ 628

DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.

Digital Speech Decoder

Transformer Tts ⭐ 599

A Pytorch Implementation of "Neural Speech Synthesis with Transformer Network"

Voice Builder ⭐ 591

An open-source text-to-speech (TTS) voice building tool

RNN-based generative models for speech.

Xva Synth ⭐ 555

Machine learning based speech synthesis Electron app, with voices from specific characters from video games

Interspeech 2023 Papers ⭐ 513

INTERSPEECH 2023 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!

Translations with speech synthesis in your terminal as a ruby gem

Thorsten Voice ⭐ 475

Thorsten-Voice: a free-to-use, offline, high-quality German TTS voice that should be available for every project without any licensing struggles.

Text to Speech engine based on the Tacotron architecture, initially implemented by Keith Ito.

Java Speech Api ⭐ 468

The J.A.R.V.I.S. Speech API is designed to be simple and efficient, using the speech engines created by Google to provide functionality for parts of the API. Essentially, it is an API written in Java, including a recognizer, synthesizer, and a microphone capture utility. The project uses Google services for the synthesizer and recognizer. While this requires an Internet connection, it provides a complete, modern, and fully functional speech API in Java.

A simple text-to-speech client for Azure TTS API.

Parakeet ⭐ 459

PAddle PARAllel text-to-speech toolKIT (supporting Tacotron2, Transformer TTS, FastSpeech2/FastPitch, SpeedySpeech, WaveFlow and Parallel WaveGAN)

Ai Waifu Vtuber ⭐ 457

AI Vtuber for Streaming on Youtube/Twitch

PyTorch implementation of GAN-based text-to-speech synthesis and voice conversion (VC)

Speech Backbones ⭐ 429

This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.

Ims Toucan ⭐ 426

Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart. Objectives of the development are simplicity, modularity, controllability and multilinguality.

A python wrapper for Speech Signal Processing Toolkit (SPTK).

Sprocket ⭐ 398

Voice Conversion Tool Kit

Tiktok Voice ⭐ 394

Simple Python script to interact with the TikTok TTS API

Cyclegan Vc2 ⭐ 391

Voice Conversion by CycleGAN (语音克隆/语音转换): CycleGAN-VC2

Kan Tts ⭐ 377

KAN-TTS is a speech-synthesis training framework. Please try the demos we have posted at https://modelscope.cn/models?page=1&tasks=text-to-

Nnmnkwii ⭐ 375

Library to build speech synthesis systems designed for easy and fast prototyping.

eSpeak NG is an open source speech synthesizer that supports 101 languages and accents.

Prodiff ⭐ 352

PyTorch implementation of ProDiff (ACM-MM'22) with an extremely fast diffusion speech synthesis pipeline

Dl For Emo Tts ⭐ 350

💻 🤖 A summary on our attempts at using Deep Learning approaches for Emotional Text to Speech 🔈

Bigvgan ⭐ 344

Official PyTorch implementation of BigVGAN (ICLR 2023)

Starganv2 Vc ⭐ 323

StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion

Voice Conversion With Just Nearest Neighbors

Styletts ⭐ 310

Official Implementation of StyleTTS

Text2video ⭐ 294

ICASSP 2022: "Text2Video: text-driven talking-head video synthesis with phonetic dictionary".

Libfaceid ⭐ 290

libfaceid is a research framework for prototyping face recognition solutions. It seamlessly integrates multiple detection, recognition and liveness models with speech synthesis and speech recognition.

Wavegrad ⭐ 283

Implementation of Google Brain's WaveGrad high-fidelity vocoder (paper: https://arxiv.org/pdf/2009.00713.pdf). First implementation on GitHub.

VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network

Gst Tacotron ⭐ 270

A TensorFlow implementation of "Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis"

Soft Vc ⭐ 269

Soft speech units for voice conversion

Gst Tacotron ⭐ 266

A PyTorch implementation of Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis

Generspeech ⭐ 265

PyTorch Implementation of GenerSpeech (NeurIPS'22): a text-to-speech model towards zero-shot style transfer of OOD custom voice.

Speech Recognition Uk ⭐ 262

Speech Recognition for Ukrainian

Pytorchwavenetvocoder ⭐ 256

WaveNet-Vocoder implementation with PyTorch.

Voicefixer_main ⭐ 244

General Speech Restoration

Wavegrad ⭐ 239

A fast, high-quality neural vocoder.

Esp8266sam ⭐ 234

Speech synthesis for ESP8266 using S.A.M. port

Speech_dataset ⭐ 229

A dataset for speech recognition

Speech Note Linux app. Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine translation.

Whisper_dart ⭐ 220

Speech recognition in Dart. Supports all audio formats and all languages, on both the server side and the client side; CPU only.

The Naomi Project is an open source, technology agnostic platform for developing always-on, voice-controlled applications!

1-100 of 472 search results

Copyright 2018-2024 Awesome Open Source. All rights reserved.