Awesome Open Source

Search results for speech synthesis

472 search results found

The Naomi Project is an open source, technology agnostic platform for developing always-on, voice-controlled applications!

Neural Voice Cloning With Few Samples ⭐ 211

Implementation of Neural Voice Cloning with Few Samples Research Paper by Baidu

Portaspeech ⭐ 211

PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech

Translations with speech synthesis in your terminal as a node package

Tacotron_pytorch ⭐ 204

PyTorch implementation of Tacotron speech synthesis model.

Waveglow ⭐ 202

A PyTorch implementation of WaveGlow: A Flow-based Generative Network for Speech Synthesis

Glow Tts ⭐ 199

A Generative Flow for Text-to-Speech via Monotonic Alignment Search

Ttslearn ⭐ 197

ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)

Fastdiff ⭐ 195

PyTorch Implementation of FastDiff (IJCAI'22)

Assem Vc ⭐ 194

Official Code for Assem-VC @ICASSP2022

Autopst ⭐ 190

Global Rhythm Style Transfer Without Text Transcriptions

Expressive_tacotron ⭐ 185

Tensorflow Implementation of Expressive Tacotron

Durian Pytorch ⭐ 181

Implementation of "Duration Informed Attention Network for Multimodal Synthesis" paper in PyTorch.

Intellinode ⭐ 177

Access the latest AI models like ChatGPT, LLaMA, Diffusion, Gemini, Hugging Face, and beyond through a unified prompt layer and performance evaluation

NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment

Refurbished Arduino version of the Talkie library from Peter Knight.

Phaseaug ⭐ 166

ICASSP 2023 Accepted

Universalvocoding ⭐ 165

A PyTorch implementation of "Robust Universal Neural Vocoding"

Audio Development Tools ⭐ 165

This is a list of sound, audio and music development tools which contains machine learning, audio generation, audio signal processing, sound synthesis, spatial audio, music information retrieval, music generation, speech recognition, speech synthesis, singing voice synthesis and more.

Cotatron ⭐ 163

Official code for Cotatron @ INTERSPEECH 2020

Ueazspeech ⭐ 162

This plugin integrates Azure Speech Cognitive Services in Unreal Engine.

PPG-Based Voice Conversion

Ukrainian Tts ⭐ 161

Ukrainian TTS (text-to-speech) using ESPNET

Tensorvox ⭐ 160

Desktop application for neural speech synthesis written in C++

React Speech Kit ⭐ 159

React hooks for Speech Recognition and Speech Synthesis

Meta Tts ⭐ 151

Official repository of https://doi.org/10.1109/TASLP.2022.3167258. More up-to-date code is in "refactor" branch.

Voiceflow Tts ⭐ 147

This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"

Comprehensive Transformer Tts ⭐ 146

A Non-Autoregressive Transformer based TTS, supporting a family of SOTA transformers with supervised and unsupervised duration modelings. This project grows with the research community, aiming to achieve the ultimate TTS.

Neural Hmm ⭐ 143

Neural HMMs are all you need (for high-quality attention-free TTS)

Nix Tts ⭐ 138

🐤 Nix-TTS: Lightweight and End-to-end Text-to-Speech via Module-wise Distillation

M.i.t.s.u.h.a. ⭐ 134

World's First Multilingual Inexpensive Therapeutic Sophisticated Ultra-responsive Holographic Agent. In simple terms, an AI you can talk to and it'll talk back with a body using VTube Studio.

Legacy_straight ⭐ 133

A vocoder framework that has been widely used in the research community since 1999.

Pink Trombone ⭐ 132

A programmable version of Neil Thapen's Pink Trombone

Summertts ⭐ 132

SummerTTS is a standalone C++ speech synthesis (TTS) project for Chinese and English that runs locally without a network connection, has almost no dependencies, and can be built in one step.

Emotional Speech Data ⭐ 130

This is the GitHub page for publicly available emotional speech data.

Whisper Vits Japanese ⭐ 128

VITS Japanese with Whisper as the data processor (you can train your VITS even if you only have audio)

This repo contains the official PyTorch implementation of "Audio Super Resolution in the Spectral Domain" (ICASSP 2023)

Awesome Ai Services ⭐ 127

An overview of the AI-as-a-service landscape

Easy Speech ⭐ 127

Cross browser Speech Synthesis also known as Text to speech or TTS; no dependencies; uses Web Speech API

Bigvsan ⭐ 126

Pytorch implementation of BigVSAN

Fac Via Ppg ⭐ 125

Foreign Accent Conversion by Synthesizing Speech from Phonetic Posteriorgrams (Interspeech'19)

Spokestack Python ⭐ 124

Spokestack is a library that allows a user to easily incorporate a voice interface into any Python application with a focus on embedded systems.

Catch A Waveform ⭐ 122

Official pytorch implementation of the paper: "Catch-A-Waveform: Learning to Generate Audio from a Single Short Example" (NeurIPS 2021)

Ml With Audio ⭐ 118

HF's ML for Audio study group

Cross Modal Perceptionist ⭐ 116

CVPR 2022: Cross-Modal Perceptionist: Can Face Geometry be Gleaned from Voices?

Waveflow ⭐ 115

A PyTorch implementation of "WaveFlow: A Compact Flow-based Model for Raw Audio" (ICML 2020)

Learning Blazor ⭐ 114

The application for the "Learning Blazor: Build Single Page Apps with WebAssembly and C#" O'Reilly Media book by David Pine.

Comospeech ⭐ 112

CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model

Articulate.js ⭐ 112

A jQuery plugin that lets the browser speak to you.

A toolkit for non-parallel voice conversion based on vector-quantized variational autoencoder

Prosody ⭐ 104

Helsinki Prosody Corpus and A System for Predicting Prosodic Prominence from Text

Cross Speaker Emotion Transfer ⭐ 104

PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech

Stylespeech ⭐ 103

Official implementation of Meta-StyleSpeech and StyleSpeech

Official implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech (INTERSPEECH 2022)

Msmc Tts ⭐ 100

Official implementation of Multi-Stage Multi-Codebook (MSMC) TTS

Stylespeech ⭐ 100

PyTorch Implementation of Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation

Awesome Speech Translation ⭐ 98

VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design

Awesome Russian Speech ⭐ 97

Russian speech technology links

Manim Voiceover ⭐ 96

Manim plugin for all things voiceover

Software Automatic Mouth - Tiny Speech Synthesizer

Awesome Singing Voice Synthesis And Singing Voice Conversion ⭐ 94

A paper and project list about cutting-edge Speech Synthesis, Text-to-Speech (TTS), Singing Voice Synthesis (SVS), Voice Conversion (VC), Singing Voice Conversion (SVC), and related interesting works (such as Music Synthesis, Automatic Music Transcription, Automatic MOS Prediction, SSL-based ASR, etc.).

Vits Mandarin Biaobei ⭐ 91

Application of VITS to Mandarin TTS

Tacotron Pytorch ⭐ 90

A Pytorch Implementation of Tacotron: End-to-end Text-to-speech Deep-Learning Model

Speechsynthesis ⭐ 89

A collection of speech synthesis resources

Voicesmith ⭐ 87

[WIP] VoiceSmith makes training text to speech models easy.

Idiolect ⭐ 87

🎙️ Handsfree Audio Development Interface

Unofficial PyTorch Implementation of UnivNet Vocoder (https://arxiv.org/abs/2106.07889)

Diffsinger ⭐ 85

PyTorch implementation of DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (focused on DiffSpeech)

Pytorch Dc Tts ⭐ 85

Text to Speech with PyTorch (English and Mongolian)

Chatterbot Voice ⭐ 84

An example of verbal communication using ChatterBot

Cross_vc ⭐ 83

Cross-lingual Voice Conversion

Parallel Tacotron2 ⭐ 80

PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling

Soundstorm Pytorch ⭐ 79

Google's SoundStorm: Efficient Parallel Audio Generation

Unicats Ctx Vec2wav ⭐ 79

[AAAI 2024] Code for CTX-vec2wav in UniCATS

Istftnet Pytorch ⭐ 76

iSTFTNet : Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform

Styletts Vc ⭐ 76

Official Implementation of StyleTTS-VC

Emotionalconversionstargan ⭐ 75

This repository contains code to replicate results from the ICASSP 2020 paper "StarGAN for Emotional Speech Conversion: Validated by Data Augmentation of End-to-End Emotion Recognition".

Ssl_speech_restoration ⭐ 75

SelfRemaster: SSL Speech Restoration

Rvc Tts Webui ⭐ 75

Text-to-Speech Gradio webui using RVC and edge-tts

Neural text-to-speech in Dart: fast, lightweight, works without an internet connection, and can run on a CPU

Unsuperior Ai Waifu ⭐ 73

AI waifu that can run on your phone or PC

HiFTNet: A Fast High-Quality Neural Vocoder with Harmonic-plus-Noise Filter and Inverse Short Time Fourier Transform

Tdmelodic ⭐ 71

A Japanese accent dictionary generator

Multi Singer ⭐ 71

PyTorch Implementation of Multi-Singer (ACM-MM'21)

Comprehensive E2e Tts ⭐ 71

A Non-Autoregressive End-to-End Text-to-Speech (text-to-wav), supporting a family of SOTA unsupervised duration modelings. This project grows with the research community, aiming to achieve the ultimate E2E-TTS

Cnn_vocoder ⭐ 69

A fast CNN-based vocoder

Speech_ai ⭐ 68

Speech to speech bot built with Python

Persian Tts Coqui ⭐ 65

Persian/Farsi text-to-speech (TTS) training using Coqui TTS

Pink Trombone ⭐ 65

Pink Trombone exhibit by Neil Thapen

Wavegrad2 ⭐ 63

Unofficial Pytorch Implementation of WaveGrad2

Adaspeech ⭐ 62

AdaSpeech: Adaptive Text to Speech for Custom Voice

Crystal - C++ implementation of a unified framework for multilingual TTS synthesis engine with SSML specification as interface.

Nlp Guide ⭐ 61

Natural Language Processing (NLP). Covering topics such as Tokenization, Part Of Speech tagging (POS), Machine translation, Named Entity Recognition (NER), Classification, and Sentiment analysis.

Intellijava ⭐ 61

Integrate with the latest language models, image generation, speech, and deep learning frameworks like ChatGPT, DALL·E, and Cohere using a few lines of Java.

Tf Wavenet_vocoder ⭐ 59

Wavenet and its applications with Tensorflow

Diffwave Sr ⭐ 57

Gst Tacotron ⭐ 57

Reproducing Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis (https://arxiv.org/pdf/1803.09017.pdf)

A 16kHz implementation of HiFi-GAN for soft-vc.

101-200 of 472 search results
