Awesome Open Source

Learning_invariances_in_speech_recognition

In this work I investigate the speech command task by developing and analyzing deep learning models. State-of-the-art systems use convolutional neural networks (CNNs) because they are well suited to learning correlated representations, and speech is such a signal. In particular, I develop several CNNs trained on the Google Speech Command Dataset and test them in different scenarios. A main problem in speech recognition is the variation in how different people pronounce words: one way to build a model that is invariant to this variability is to augment the dataset by perturbing the input. I study two kinds of augmentation, Vocal Tract Length Perturbation (VTLP) and Synchronous Overlap and Add (SOLA), which locally perturb the input in frequency and time respectively. The models trained on augmented data outperform all the models trained on the unaugmented dataset in accuracy, precision and recall. The design of the CNN also affects how well invariances are learned: the inception CNN architecture helps the model learn features that are invariant to speech variability by using convolution kernels of different sizes. Intuitively, this is because the model can implicitly detect speech patterns of different lengths in the audio features.
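To make the frequency perturbation concrete, here is a minimal sketch of the piecewise-linear warp commonly used for VTLP. It is an illustration under stated assumptions, not the repository's own code: the function names `vtlp_warp` and `sample_alpha`, the 4800 Hz boundary frequency `f_hi`, the 16 kHz sample rate, and the warp-factor range 0.9 to 1.1 are all assumed defaults taken from the general VTLP literature rather than from this project.

```python
import random


def vtlp_warp(freq_hz, alpha, sample_rate=16000, f_hi=4800.0):
    """Map a frequency in Hz to its warped value for warp factor alpha.

    Assumed piecewise-linear warp: frequencies below a boundary are scaled
    by alpha; frequencies above it are mapped linearly so that the Nyquist
    frequency stays fixed.
    """
    nyquist = sample_rate / 2.0
    scaled_hi = f_hi * min(alpha, 1.0)
    boundary = scaled_hi / alpha  # input frequency where the two pieces meet
    if freq_hz <= boundary:
        return freq_hz * alpha
    # Upper piece: a straight line from (boundary, scaled_hi) to (nyquist, nyquist).
    slope = (nyquist - scaled_hi) / (nyquist - boundary)
    return nyquist - slope * (nyquist - freq_hz)


def sample_alpha(rng, low=0.9, high=1.1):
    """Draw one random warp factor per utterance (the range is an assumption)."""
    return rng.uniform(low, high)


if __name__ == "__main__":
    rng = random.Random(0)
    alpha = sample_alpha(rng)
    # Hypothetical filter-bank center frequencies, for illustration only.
    centers = [250.0, 1000.0, 4000.0, 7000.0]
    warped = [vtlp_warp(f, alpha) for f in centers]
    print(alpha, warped)
```

In a typical pipeline the warped values would replace the center frequencies of the filter bank used to compute the spectral features, so each training utterance is seen with a slightly stretched or compressed frequency axis.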

Stars: 13

License: No license specified

Most Recent Commit: 6 years ago

Programming Language: Python

Categories: Programming Languages > Python; Data Processing > Dataset; Machine Learning > Speech Recognition; Machine Learning > Augmentation

Alternatives To Learning_invariances_in_speech_recognition

Project Name	Stars	Downloads	Repos Using This	Packages Using This	Most Recent Commit	Total Releases	Latest Release	Open Issues	License	Language
Textattack	2,597			5	5 months ago	46	September 11, 2023	52	mit	Python
TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP https://textattack.readthedocs.io/en/master/
Keras Fcn	606				6 years ago			49	mit	Python
Keras-tensorflow implementation of Fully Convolutional Networks for Semantic Segmentation（Unfinished）
Fixmatch	550				4 years ago			6	apache-2.0	Python
A simple method to perform semi-supervised learning with limited data.
Toolbox	347				2 years ago					Jupyter Notebook
various cv tools, such as label tools, data augmentation, label conversion, etc.
Dagan	342				2 years ago			16	mit	Python
DAGAN: Data Augmentation Generative Adversarial Networks
Cutblur	335				2 years ago			10	mit	Jupyter Notebook
Rethinking Data Augmentation for Image Super-resolution (CVPR 2020)
Copy Paste Aug	306				3 years ago			6	mit	Jupyter Notebook
Copy-paste augmentation for segmentation and detection tasks
Horizonnet	301				5 months ago			28	mit	Python
Pytorch implementation of HorizonNet: Learning Room Layout with 1D Representation and Pano Stretch Data Augmentation.
Pytorch Deep Learning Template	206				3 years ago					Jupyter Notebook
A Pytorch Computer Vision template to quick start your next project! 🚀🚀
Tensorflow In Practise Specialization	193				3 years ago			4		Jupyter Notebook
Four Courses Specialization Tensorflow in practise Specialization

Popular Augmentation Projects

Imgaug ⭐ 13,682

Image augmentation for machine learning experiments.

Dependent packages: 141; total releases: 11; latest release: February 05, 2020; most recent commit: 10 months ago

Albumentations ⭐ 13,493

Fast image augmentation library and an easy-to-use wrapper around other libraries. Documentation: https://albumentations.ai/docs/ Paper about the library: https://www.mdpi.com/2078-2489/11/2/125

Dependent packages: 273; total releases: 53; latest release: June 10, 2023; most recent commit: 7 days ago

Augmentor ⭐ 4,973

Image augmentation library in Python for machine learning.

Dependent packages: 8; total releases: 24; latest release: April 27, 2022; most recent commit: 7 months ago

Sketch Code ⭐ 4,714

Keras model to generate HTML code from hand-drawn website mockups. Implements an image captioning architecture to drawn source images.

most recent commit 2 years ago

Nlpaug ⭐ 3,825

Data augmentation for NLP

Dependent packages: 29; total releases: 37; latest release: July 07, 2022; most recent commit: a year ago

Popular Dataset Projects

Public Apis ⭐ 276,890

A collective list of free APIs

most recent commit 4 months ago

Awesome Public Datasets ⭐ 58,670

A topic-centric list of HQ open datasets.

most recent commit a month ago

Tensorflow Examples ⭐ 43,109

TensorFlow Tutorial and Examples for Beginners (support TF v1 & v2)

most recent commit 4 months ago

Mask_rcnn ⭐ 23,745

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

Total releases: 5; latest release: March 05, 2019; most recent commit: 5 months ago

Pytorch Cyclegan And Pix2pix ⭐ 21,090

Image-to-Image Translation in PyTorch

most recent commit 6 months ago

Downloads, Dependent Repos, Dependent Packages, Total Releases, Latest Releases data powered by Libraries.io.

Copyright 2018-2024 Awesome Open Source. All rights reserved.