Awesome Open Source
Search results for speech processing
192 search results found
Speechbrain
⭐
7,166
A PyTorch-based Speech Toolkit
Awesome Multimodal Ml
⭐
5,399
Reading list for research topics in multimodal machine learning
Pyannote Audio
⭐
4,460
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Torchscale
⭐
2,804
Foundation Architecture for (M)LLMs
Deepvoice3_pytorch
⭐
1,906
PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models
Wavenet_vocoder
⭐
1,617
WaveNet vocoder
Awesome Diarization
⭐
1,384
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
Whisper Timestamped
⭐
1,217
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
Audino
⭐
988
Open source audio annotation tool for humans
Parselmouth
⭐
961
Praat in Python, the Pythonic way
Open Speech Corpora
⭐
830
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
Speechpy
⭐
828
💬 SpeechPy - A Library for Speech Processing and Recognition: http://speechpy.readthedocs.io/en/latest/
Sincnet
⭐
764
SincNet is a neural architecture for efficiently processing raw audio samples.
Voicefixer
⭐
735
General Speech Restoration
Speechalgorithms
⭐
625
Speech Algorithms
Dtln
⭐
470
Tensorflow 2.x implementation of the DTLN real time speech denoising model. With TF-lite, ONNX and real-time audio processing support.
Uspeech
⭐
452
Speech recognition toolkit for the Arduino
Fullsubnet
⭐
443
PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
Resemble Enhance
⭐
438
AI powered speech denoising and enhancement
Speech Backbones
⭐
429
This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.
Ims Toucan
⭐
426
Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart. Objectives of the development are simplicity, modularity, controllability and multilinguality.
Speech Denoising Wavenet
⭐
414
A neural network for end-to-end speech denoising
Pysptk
⭐
410
A python wrapper for Speech Signal Processing Toolkit (SPTK).
Speech Resources
⭐
388
Speech-related labs, companies, resources, internships, and more; recommendations and self-nominations are welcome
Neural Voice Cloning With Few Samples
⭐
379
This repository has implementation for "Neural Voice Cloning With Few Samples"
Nnmnkwii
⭐
375
Library to build speech synthesis systems designed for easy and fast prototyping.
Surfboard
⭐
369
Novoic's audio feature extraction library
Multibench
⭐
356
[NeurIPS 2021] Multiscale Benchmarks for Multimodal Representation Learning
Spafe
⭐
338
🔉 spafe: Simplified Python Audio Features Extraction
Pyannote Video
⭐
328
Face detection, tracking and clustering in videos
Unispeech
⭐
328
UniSpeech - Large Scale Self-Supervised Learning for Speech
Nonautoreggenprogress
⭐
290
Tracking the progress in non-autoregressive generation (translation, transcription, etc.)
Vocgan
⭐
282
VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network
Pase
⭐
265
Problem Agnostic Speech Encoder
Voicefixer_main
⭐
244
General Speech Restoration
Pb_bss
⭐
234
Collection of EM algorithms for blind source separation of audio signals
Cleanunet
⭐
231
Official PyTorch Implementation of CleanUNet (ICASSP 2022)
Speechtransprogress
⭐
218
Tracking the progress in end-to-end speech translation
Neural Voice Cloning With Few Samples
⭐
211
Implementation of Neural Voice Cloning with Few Samples Research Paper by Baidu
Speech_signal_processing_and_classification
⭐
203
Front-end speech processing aims at extracting proper features from short-term segments of a speech utterance, known as frames. It is a prerequisite step toward any pattern recognition problem employing speech or audio (e.g., music). Here, we are interested in voice disorder classification, that is, developing two-class classifiers that can discriminate between utterances of a subject suffering from, say, vocal fold paralysis and utterances of a healthy subject. The mathematical modeling of th
Sptk
⭐
200
A suite of speech signal processing tools
Ttslearn
⭐
197
ttslearn: Library for the book Pythonで学ぶ音声合成 (Text-to-speech with Python)
Wave U Net For Speech Enhancement
⭐
184
Wave-U-Net implemented in PyTorch and adapted to speech enhancement.
Gcc Nmf
⭐
179
Real-time GCC-NMF Blind Speech Separation and Enhancement
Audio Development Tools
⭐
165
This is a list of sound, audio and music development tools which contains machine learning, audio generation, audio signal processing, sound synthesis, spatial audio, music information retrieval, music generation, speech recognition, speech synthesis, singing voice synthesis and more.
React Native Dialogflow
⭐
164
A React-Native Bridge for the Google Dialogflow (API.AI) SDK
Runtimespeechrecognizer
⭐
153
Cross-platform, real-time, offline speech recognition plugin for Unreal Engine. Based on Whisper OpenAI technology, whisper.cpp.
Awesome Speech Enhancement
⭐
151
A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.
Zzz Retired__openstt
⭐
146
RETIRED - OpenSTT is now retired. If you would like more information on Mycroft AI's open source STT projects, please visit:
Speech Enhancement
⭐
145
Deep neural network based speech enhancement toolkit
Vq Vae Speech
⭐
145
PyTorch implementation of VQ-VAE + WaveNet by [Chorowski et al., 2019] and VQ-VAE on speech signals by [van den Oord et al., 2017]
Soundsourceseparation
⭐
134
The code for multi-channel source separation and dereverberation such as FastMNMF1, FastMNMF2, and AR-FastMNMF2.
Voicelab
⭐
127
Automated Reproducible Acoustical Analysis
Elevateaijavasdk
⭐
121
Java SDK for ElevateAI
Tutorial_separation
⭐
117
This repo summarizes the tutorials, datasets, papers, code, and tools for the speech separation and speaker extraction tasks. Pull requests are welcome.
Elevateaidotnetsdk
⭐
115
.Net core 6 SDK for ElevateAI
Mevonai Speech Emotion Recognition
⭐
112
Identify the emotion of multiple speakers in an Audio Segment
Elevateaipythonsdk
⭐
111
ElevateAI - Speech-to-text API Python SDK
Tfg Voice Conversion
⭐
109
Deep Learning-based Voice Conversion system
Awesome Keyword Spotting
⭐
107
This repository is a curated list of awesome Speech Keyword Spotting (Wake-Up Word Detection).
Awesome Speech Translation
⭐
98
Whisper Auto Transcribe
⭐
91
Auto transcribe tool based on whisper
Uhv Ots Speech
⭐
90
A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.
Vokaturi Android
⭐
83
Emotion recognition from speech on Android.
Speechclip
⭐
80
SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model, Accepted to IEEE SLT 2022
Speechprompt
⭐
80
Interspeech 2022: "SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks". Speech processing with the prompting paradigm
Soundstorm Pytorch
⭐
79
Google's SoundStorm: Efficient Parallel Audio Generation
A Convolutional Recurrent Neural Network For Real Time Speech Enhancement
⭐
79
A minimal unofficial implementation of "A Convolutional Recurrent Neural Network for Real-Time Speech Enhancement" (CRN) using PyTorch.
Sptk
⭐
77
A modified version of Speech Signal Processing Toolkit (SPTK)
Quantumspeech Qcnn
⭐
75
IEEE ICASSP 21 - Quantum Convolution Neural Networks for Speech Processing and Automatic Speech Recognition
Awesome Spoken Language Identification
⭐
74
An awesome spoken LID repository. (Work in progress)
Dnc
⭐
72
Discriminative Neural Clustering for Speaker Diarisation
Tdnn
⭐
70
Time delay neural network (TDNN) implementation in PyTorch using the unfold method
Vak
⭐
67
A neural network framework for researchers studying acoustic communication
Vad
⭐
65
Voice Activity Detector
Nlp Guide
⭐
61
Natural Language Processing (NLP). Covering topics such as Tokenization, Part Of Speech tagging (POS), Machine translation, Named Entity Recognition (NER), Classification, and Sentiment analysis.
Speechprompt V2
⭐
59
"SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks". Speech processing with the prompting paradigm
Talkbox
⭐
58
Discordearsbot
⭐
56
A speech-to-text framework and bot for Discord. Take control of your Discord server using speech and voice commands. Can also be useful for hearing impaired and deaf people.
Voice2series Reprogramming
⭐
55
ICML 21 - Voice2Series: Adversarial Reprogramming Acoustic Models for Time Series Classification
Maracas
⭐
54
maracas is a library for corrupting audio files with additive and convolutive noise.
Gcommandspytorch
⭐
54
ConvNets for Audio Recognition using Google Commands Dataset
Shelf
⭐
52
a Wide Shelf for AI and Data Science | Resources 🍔
Torchsubband
⭐
51
PyTorch implementation of subband decomposition
Keras Sincnet
⭐
49
Keras (tensorflow) implementation of SincNet (Mirco Ravanelli, Yoshua Bengio - https://github.com/mravanelli/SincNet)
Voice Privacy Challenge 2020
⭐
47
Baseline Recipe for VoicePrivacy Challenge 2020: https://www.voiceprivacychallenge.org/vp2020/docs/
React Native Spokestack
⭐
46
Spokestack: give your React Native app a voice interface!
Voice Privacy Challenge 2022
⭐
45
Baseline Recipe for VoicePrivacy Challenge 2022: anonymization systems and evaluation software
Formant Analyzer
⭐
45
iOS application for finding formants in spoken sounds
Simpleder
⭐
44
A lightweight library to compute Diarization Error Rate (DER).
Awesome Asr Contextualization
⭐
42
A curated list of awesome papers on contextualizing E2E ASR outputs
Bob
⭐
40
Bob is a free signal-processing and machine learning toolbox originally developed by the Biometrics group at Idiap Research Institute, in Switzerland. - Mirrored from https://gitlab.idiap.ch/bob/bob
Clarity_cc
⭐
38
Clarity Enhancement and Prediction Challenges
Awesome Speech Emotion Recognition
⭐
36
😎 Awesome lists about Speech Emotion Recognition
Wavencoder
⭐
36
WavEncoder is a Python library for encoding audio signals, transforms for audio augmentation, and training audio classification models with PyTorch backend.
Paderbox
⭐
35
Paderbox: A collection of utilities for audio / speech processing
Speech2affective_gestures
⭐
35
This is the official implementation of the paper "Speech2AffectiveGestures: Synthesizing Co-Speech Gestures with Generative Adversarial Affective Expression Learning".
Asr_course
⭐
34
ASR course at Chula 2018
Speeq
⭐
33
A framework for automatic speech recognition
Pncc
⭐
32
An implementation of Power Normalized Cepstral Coefficients (PNCC)
1-100 of 192 search results
Copyright 2018-2024 Awesome Open Source. All rights reserved.