Awesome Open Source

Search results for deep learning speech synthesis

62 search results found

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Deeplearningexamples ⭐ 12,073

State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.

NeMo: a toolkit for conversational AI

So Vits Svc Fork ⭐ 8,080

so-vits-svc fork with realtime support, improved interface and more features.

Espnet ⭐ 7,563

End-to-End Speech Processing Toolkit

Emotivoice ⭐ 5,739

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

Styletts2 ⭐ 3,464

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Openseq2seq ⭐ 1,393

Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP

Hifi Gan ⭐ 1,376

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

Merlin ⭐ 1,189

This is now the official location of the Merlin project.

Pororo ⭐ 1,081

PORORO: Platform Of neuRal mOdels for natuRal language prOcessing

Naturalspeech2 Pytorch ⭐ 950

Implementation of NaturalSpeech 2, a zero-shot speech and singing synthesizer, in PyTorch

Melgan Neurips ⭐ 872

GAN-based Mel-Spectrogram Inversion Network for Text-to-Speech Synthesis

Fastspeech ⭐ 723

An implementation of FastSpeech based on PyTorch.

Diffwave ⭐ 628

DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.

Transformer Tts ⭐ 599

A PyTorch implementation of "Neural Speech Synthesis with Transformer Network"

RNN-based generative models for speech.

Text to Speech engine based on the Tacotron architecture, initially implemented by Keith Ito.

Ims Toucan ⭐ 426

Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart. Objectives of the development are simplicity, modularity, controllability and multilinguality.

Cyclegan Vc2 ⭐ 391

Voice Conversion by CycleGAN (voice cloning/voice conversion): CycleGAN-VC2

Dl For Emo Tts ⭐ 350

💻 🤖 A summary on our attempts at using Deep Learning approaches for Emotional Text to Speech 🔈

Starganv2 Vc ⭐ 323

StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion

Styletts ⭐ 310

Official Implementation of StyleTTS

Text2video ⭐ 294

ICASSP 2022: "Text2Video: text-driven talking-head video synthesis with phonetic dictionary".

Libfaceid ⭐ 290

libfaceid is a research framework for prototyping face recognition solutions. It integrates multiple detection, recognition and liveness models with speech synthesis and speech recognition.

Wavegrad ⭐ 239

A fast, high-quality neural vocoder.

Speech_dataset ⭐ 229

A dataset for speech recognition

Glow Tts ⭐ 199

A Generative Flow for Text-to-Speech via Monotonic Alignment Search

Ttslearn ⭐ 197

ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)

Assem Vc ⭐ 194

Official Code for Assem-VC @ICASSP2022

NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment

Audio Development Tools ⭐ 165

This is a list of sound, audio and music development tools which contains machine learning, audio generation, audio signal processing, sound synthesis, spatial audio, music information retrieval, music generation, speech recognition, speech synthesis, singing voice synthesis and more.

Meta Tts ⭐ 151

Official repository of https://doi.org/10.1109/TASLP.2022.3167258. More up-to-date code is in "refactor" branch.

Comprehensive Transformer Tts ⭐ 146

A non-autoregressive Transformer-based TTS, supporting a family of SOTA transformers with supervised and unsupervised duration modeling. This project grows with the research community, aiming to achieve the ultimate TTS.

Neural Hmm ⭐ 143

Neural HMMs are all you need (for high-quality attention-free TTS)

Spokestack Python ⭐ 124

Spokestack is a library that allows a user to easily incorporate a voice interface into any Python application with a focus on embedded systems.

Cross Modal Perceptionist ⭐ 116

CVPR 2022: Cross-Modal Perceptionist: Can Face Geometry be Gleaned from Voices?

Msmc Tts ⭐ 100

Official Implementation of Multi-Stage Multi-Codebook (MSMC) TTS

VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design

Tacotron Pytorch ⭐ 90

A PyTorch implementation of Tacotron: End-to-end Text-to-speech Deep-Learning Model

Unofficial PyTorch Implementation of UnivNet Vocoder (https://arxiv.org/abs/2106.07889)

Pytorch Dc Tts ⭐ 85

Text to Speech with PyTorch (English and Mongolian)

Styletts Vc ⭐ 76

Official Implementation of StyleTTS-VC

Emotionalconversionstargan ⭐ 75

This repository contains code to replicate results from the ICASSP 2020 paper "StarGAN for Emotional Speech Conversion: Validated by Data Augmentation of End-to-End Emotion Recognition".

HiFTNet: A Fast High-Quality Neural Vocoder with Harmonic-plus-Noise Filter and Inverse Short Time Fourier Transform

Comprehensive E2e Tts ⭐ 71

A non-autoregressive end-to-end text-to-speech (text-to-wav) system, supporting a family of SOTA unsupervised duration modeling approaches. This project grows with the research community, aiming to achieve the ultimate E2E-TTS.

Persian Tts Coqui ⭐ 65

Persian/Farsi text-to-speech (TTS) training using Coqui TTS

Wavegrad2 ⭐ 63

Unofficial PyTorch Implementation of WaveGrad2

Source Filter Vae ⭐ 42

Learning and controlling the source-filter representation of speech with a variational autoencoder

Deep CNN networks for Speech Synthesis

Tts Arabic Pytorch ⭐ 37

TTS models for Arabic (Tacotron2, FastPitch)

Turkicasr ⭐ 35

A multilingual ASR model that can recognize ten Turkic languages—Azerbaijani, Bashkir, Chuvash, Kazakh, Kyrgyz, Sakha, Tatar, Turkish, Uyghur, and Uzbek.

Comprehensive Tacotron2 ⭐ 23

PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supports both single- and multi-speaker TTS, along with several techniques to improve the robustness and efficiency of the model.

Text2speech ⭐ 21

Towards Building Text-To-Speech Systems for the Next Billion Users - Microsoft Research Intern Work - Accepted at ICASSP 2023

Speechnet ⭐ 18

Automatic Speech Recognition

Styletts2 ⭐ 18

🐍 🤖 Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning

Fast Seamlessm4t Onnx ⭐ 16

ONNX-compatible Fast SeamlessM4T—Massively Multilingual & Multimodal Machine Translation

Icrcyclegan Vc ⭐ 11

ICRCycleGAN-VC: non-parallel voice conversion based on CycleGAN and the Inception-ResNet module, by Afiuny

13.3 Hours Chinese Mandarin Synthesis Corpus Female Emotional ⭐ 11

Chinese Mandarin Synthesis Corpus-Female/Emotional

Deep Learning Tts Template ⭐ 8

This is a template for the Non-autoregressive Deep Learning-Based TTS model (in PyTorch).

Dljeju2018coderepoasr ⭐ 8

Details of my work on using GANs for speech synthesis to improve automatic speech recognition (ASR) accuracy

Deep_throat ⭐ 8

speech synthesis program

Copyright 2018-2024 Awesome Open Source. All rights reserved.