Convolutional Neural Networks to Enhance Coded Speech

In this work we propose two postprocessing approaches that apply convolutional neural networks (CNNs), either in the time domain or in the cepstral domain, to enhance coded speech without any modification of the codecs. The time domain approach works end-to-end, while the cepstral domain approach uses analysis-synthesis with cepstral domain features. The postprocessors in both domains are evaluated for various narrowband and wideband speech codecs in a wide range of conditions. Measured with PESQ, the proposed postprocessor improves speech quality by up to 0.25 MOS-LQO points for G.711, 0.30 points for G.726, 0.82 points for G.722, and 0.26 points for the adaptive multirate wideband codec (AMR-WB). In a subjective CCR listening test, the proposed postprocessor on G.711-coded speech exceeds the speech quality of an ITU-T-standardized postfilter by 0.36 CMOS points and obtains a clear preference of 1.77 CMOS points over plain G.711-coded speech, putting it on par with uncoded speech.
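The analysis-synthesis idea behind the cepstral domain approach can be illustrated with a minimal NumPy sketch. It is not the repository's implementation: the DFT-based real cepstrum, the reuse of the coded frame's phase, and the names `to_cepstrum`, `from_cepstrum`, and `identity_postprocessor` are assumptions made for illustration. The identity function only marks where a trained CNN would map coded-speech cepstral coefficients to enhanced ones.

```python
import numpy as np

EPS = 1e-12  # keeps the logarithm finite for empty spectral bins


def to_cepstrum(frame):
    """Analysis: return the real cepstrum and the phase of one time-domain frame."""
    spectrum = np.fft.fft(frame)
    log_magnitude = np.log(np.abs(spectrum) + EPS)
    # For a real frame the log-magnitude is real and even, so its inverse DFT is real.
    cepstrum = np.fft.ifft(log_magnitude).real
    phase = np.angle(spectrum)
    return cepstrum, phase


def from_cepstrum(cepstrum, phase):
    """Synthesis: rebuild a time-domain frame from a cepstrum and a phase."""
    log_magnitude = np.fft.fft(cepstrum).real
    magnitude = np.maximum(np.exp(log_magnitude) - EPS, 0.0)
    spectrum = magnitude * np.exp(1j * phase)
    return np.fft.ifft(spectrum).real


def identity_postprocessor(cepstrum):
    """Hypothetical stand-in for the CNN: returns the cepstrum unchanged."""
    return cepstrum


# Usage: analyse one frame, "enhance" it, and resynthesise it.
rng = np.random.default_rng(0)
coded_frame = rng.standard_normal(256)  # placeholder for a frame of decoded speech
cepstrum, phase = to_cepstrum(coded_frame)
enhanced_frame = from_cepstrum(identity_postprocessor(cepstrum), phase)
```

With the identity stand-in the chain reconstructs the input frame up to numerical precision, which is a useful sanity check before a learned model is put in its place. Windowing and overlap-add across frames are omitted here.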

License

bsd-3-clause

Most Recent Commit

4 years ago

Programming Language

Python

Categories

Machine Learning > Convolutional Neural Networks

Machine Learning > Keras

Machine Learning > Generative Adversarial Network

Machine Learning > Speech Processing
