Awesome Open Source

Search results for python speech processing

98 search results found

Speechbrain ⭐ 7,166

A PyTorch-based Speech Toolkit

Torchscale ⭐ 2,804

Foundation Architecture for (M)LLMs

Deepvoice3_pytorch ⭐ 1,906

PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models

Wavenet_vocoder ⭐ 1,617

WaveNet vocoder

Whisper Timestamped ⭐ 1,217

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

Open source audio annotation tool for humans

Parselmouth ⭐ 961

Praat in Python, the Pythonic way

Speechpy ⭐ 828

💬 SpeechPy - A Library for Speech Processing and Recognition: http://speechpy.readthedocs.io/en/latest/

Sincnet ⭐ 764

SincNet is a neural architecture for efficiently processing raw audio samples.

Voicefixer ⭐ 735

General Speech Restoration

TensorFlow 2.x implementation of the DTLN real-time speech denoising model, with TF-Lite, ONNX, and real-time audio processing support.

Fullsubnet ⭐ 443

PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."

Resemble Enhance ⭐ 438

AI powered speech denoising and enhancement

Ims Toucan ⭐ 426

Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart. Objectives of the development are simplicity, modularity, controllability and multilinguality.

Speech Denoising Wavenet ⭐ 414

A neural network for end-to-end speech denoising

A Python wrapper for the Speech Signal Processing Toolkit (SPTK).

Nnmnkwii ⭐ 375

Library to build speech synthesis systems designed for easy and fast prototyping.

Surfboard ⭐ 369

Novoic's audio feature extraction library

🔉 spafe: Simplified Python Audio Features Extraction

Unispeech ⭐ 328

UniSpeech - Large Scale Self-Supervised Learning for Speech

Pyannote Video ⭐ 328

Face detection, tracking and clustering in videos

VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network

Problem Agnostic Speech Encoder

Voicefixer_main ⭐ 244

General Speech Restoration

Collection of EM algorithms for blind source separation of audio signals

Cleanunet ⭐ 231

Official PyTorch Implementation of CleanUNet (ICASSP 2022)

Neural Voice Cloning With Few Samples ⭐ 211

Implementation of Neural Voice Cloning with Few Samples Research Paper by Baidu

Speech_signal_processing_and_classification ⭐ 203

Front-end speech processing aims at extracting proper features from short-term segments of a speech utterance, known as frames. It is a prerequisite step toward any pattern recognition problem employing speech or audio (e.g., music). Here, we are interested in voice disorder classification, that is, in developing two-class classifiers that can discriminate between utterances of a subject suffering from, say, vocal fold paralysis and utterances of a healthy subject. The mathematical modeling of th

Ttslearn ⭐ 197

ttslearn: Library for Pythonで学ぶ音声合成 (Text-to-speech with Python)

Wave U Net For Speech Enhancement ⭐ 184

PyTorch implementation of Wave-U-Net, adapted to speech enhancement.

Gcc Nmf ⭐ 179

Real-time GCC-NMF Blind Speech Separation and Enhancement

Vq Vae Speech ⭐ 145

PyTorch implementation of VQ-VAE + WaveNet by [Chorowski et al., 2019] and VQ-VAE on speech signals by [van den Oord et al., 2017]

Soundsourceseparation ⭐ 134

The code for multi-channel source separation and dereverberation such as FastMNMF1, FastMNMF2, and AR-FastMNMF2.

Voicelab ⭐ 127

Automated Reproducible Acoustical Analysis

Elevateaipythonsdk ⭐ 111

ElevateAI - Speech-to-text API Python SDK

Tfg Voice Conversion ⭐ 109

Deep Learning-based Voice Conversion system

Whisper Auto Transcribe ⭐ 91

Auto transcribe tool based on whisper

Speechprompt ⭐ 80

Interspeech 2022: "SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks". Speech processing with the prompting paradigm

Speechclip ⭐ 80

SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model, Accepted to IEEE SLT 2022

Soundstorm Pytorch ⭐ 79

Google's SoundStorm: Efficient Parallel Audio Generation

A Convolutional Recurrent Neural Network For Real Time Speech Enhancement ⭐ 79

A minimal unofficial PyTorch implementation of "A Convolutional Recurrent Neural Network for Real-Time Speech Enhancement" (CRN).

Awesome Spoken Language Identification ⭐ 74

An awesome spoken LID repository. (Work in progress)

Discriminative Neural Clustering for Speaker Diarisation

Time delay neural network (TDNN) implementation in PyTorch using the unfold method

A neural network framework for researchers studying acoustic communication

Voice Activity Detector

Nlp Guide ⭐ 61

Natural Language Processing (NLP). Covering topics such as Tokenization, Part Of Speech tagging (POS), Machine translation, Named Entity Recognition (NER), Classification, and Sentiment analysis.

Speechprompt V2 ⭐ 59

"SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks". Speech processing with the prompting paradigm

maracas is a library for corrupting audio files with additive and convolutive noise.

Gcommandspytorch ⭐ 54

ConvNets for Audio Recognition using Google Commands Dataset

a Wide Shelf for AI and Data Science | Resources 🍔

Keras Sincnet ⭐ 49

Keras (TensorFlow) implementation of SincNet (Mirco Ravanelli, Yoshua Bengio - https://github.com/mravanelli/SincNet)

Voice Privacy Challenge 2022 ⭐ 45

Baseline Recipe for VoicePrivacy Challenge 2022: anonymization systems and evaluation software

Simpleder ⭐ 44

A lightweight library to compute Diarization Error Rate (DER).

Bob is a free signal-processing and machine learning toolbox originally developed by the Biometrics group at Idiap Research Institute, in Switzerland. - Mirrored from https://gitlab.idiap.ch/bob/bob

Clarity_cc ⭐ 38

Clarity Enhancement and Prediction Challenges

Wavencoder ⭐ 36

WavEncoder is a Python library for encoding audio signals, transforms for audio augmentation, and training audio classification models with PyTorch backend.

Speech2affective_gestures ⭐ 35

This is the official implementation of the paper "Speech2AffectiveGestures: Synthesizing Co-Speech Gestures with Generative Adversarial Affective Expression Learning".

Paderbox ⭐ 35

Paderbox: A collection of utilities for audio / speech processing

A framework for automatic speech recognition

An implementation of Power Normalized Cepstral Coefficients (PNCC)

Indic Num2words ⭐ 29

Python library for converting numbers to words for all Indian Languages.

Nested U Net Based Real Time Speech Enhancement Mobile App ⭐ 28

Real-time speech enhancement mobile app using Nested U-Net

Rte Speech Generator ⭐ 27

Natural Language Processing to generate new speeches for the President of Turkey.

Polyglotdb ⭐ 24

Language data store and linguistic query API

Keras-based Python framework to compute phonological posterior probabilities from audio files

Image And Speech Processing ⭐ 23

Face and speech recognition using PyQt5, face_recognition, and Baidu AI

Hifigan Denoiser ⭐ 22

HiFi-GAN: High Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks

Official PyTorch implementation of "RVAE-EM: Generative speech dereverberation based on recurrent variational auto-encoder and convolutive transfer function" [ICASSP2024]

Novoic's linguistic feature extraction library

Robustvc ⭐ 17

ICASSP 2022: "Toward Degradation-Robust Voice Conversion". Using speech enhancement and end-to-end denoising training to improve degradation / adversarial robustness of VC models.

Organized Digital Intelligent Network (O.D.I.N)

Speech Recognition ⭐ 14

I recorded 10 samples of my own voice saying the same words and compared them with 10 samples from another person. I was able to find a threshold level that recognizes my own voice.

Unsupervised speech enhancement using DVAEs

Python package implementing the TD-PSOLA algorithm for speech processing

Speech Emotion Recognition ⭐ 13

A program that uses neural networks to detect emotions from pre-recorded and real-time speech

Voice Activity Detection

Speech To Speech ⭐ 12

Code for the INTERSPEECH 2023 paper "Learning When to Speak: Latency and Quality Trade-offs for Simultaneous Speech-to-Speech Translation with Offline Models"

Multimodal Emotion eXpression Capture Amsterdam. Pipeline for capturing emotion expressions from multiple modalities (video, audio, text) in the wild.

Icrcyclegan Vc ⭐ 11

Non-parallel voice conversion method, ICRCycleGAN-VC, based on CycleGAN and an Inception-ResNet module, by Afiuny

Overlap Detection ⭐ 11

Overlapped Speech detection in Multi-party Conversations

Vb_diarization ⭐ 10

VB Diarization with Eigenvoice and HMM Priors, refactored

Simple LSTM language modelling toolkit

Speech Processing & Linguistic Analysis Tool

Deep Speechgen ⭐ 9

RNN for acoustic speech generation

Speech features in Python

Preprocessing Of Speech ⭐ 8

VAD + resampling | High resolution spectrogram

Bilatticernn Confidence ⭐ 7

Confidence Estimation for Black Box Automatic Speech Recognition Systems Using Lattice Recurrent Neural Networks https://arxiv.org/abs/1910.11933 or https://ieeexplore.ieee.org/document/9053264

Siamese network for unsupervised speech representation learning

Speechvgg ⭐ 6

The repository was moved! For the most recent version see:

Virtual Assistance For The Blind ⭐ 6

The proposed Voice-based Email System uses AI (voice commands) that will make the email system very easily accessible to visually challenged people and also help society. Accessibility is the most important feature that is considered while developing this system.

Streamcleaner ⭐ 6

Intrinsic Bayesian Algorithm for Shortwave Noise Reduction

Speech Adapters ⭐ 6

Codes and datasets for our ICASSP2023 paper, Evaluating parameter-efficient transfer learning approaches on SURE benchmark for speech understanding

Speakerdiarization ⭐ 6

Audio based speaker diarization

Mining MediaWiki dumps to create better TTS engines (using Machine Learning)

A Python library for voice activity detection (VAD) for speech/non-speech segmentation.

Pydrobert Speech ⭐ 5

Speech processing with Python

A PyPI package for fast word/character error rate (WER/CER) calculation
