Tf Speech Recognition Challenge Solution

Source code of the model used in the TensorFlow Speech Recognition Challenge (https://www.kaggle.com/c/tensorflow-speech-recogn…). The solution ranked in the top 5% on the private leaderboard.

TF Speech Recognition Challenge

The TensorFlow Speech Recognition Challenge was a Kaggle competition organised by Google Brain in which participants used the Speech Commands Dataset to build an algorithm that understands simple spoken commands. Competition page: https://www.kaggle.com/c/tensorflow-speech-recognition-challenge

This solution ranked 63rd on the private leaderboard (top 5%).

Project Structure

  • data
    • raw
      • train (Training audio files)
      • test (Test audio files used for evaluation)
  • libs
    • classification (All scripts used for training and evaluation)
  • notebooks
  • scripts (Executable scripts)
  • models (Pretrained Models)

Requirements

  1. TensorFlow 1.4
  2. librosa
  3. scikit-learn
  4. Python 3.x

Running

Download the Speech Commands Dataset and extract it into the train folder. Test audio can be placed in the data/test/audio folder.
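Once the archive is extracted, the training clips can be indexed by label. The following is a minimal sketch, assuming each spoken word has its own sub-folder of .wav clips, as in the public Speech Commands Dataset. The function name index_wav_files and the example root path are hypothetical and are not part of this repository.

```python
from collections import defaultdict
from pathlib import Path


def index_wav_files(root):
    """Map each label (name of the containing folder) to its .wav files."""
    index = defaultdict(list)
    for wav_path in sorted(Path(root).rglob("*.wav")):
        # Assumption: the parent folder name is the spoken word (the label).
        index[wav_path.parent.name].append(wav_path)
    return dict(index)
```

For example, index_wav_files("data/raw/train") would return a dictionary keyed by folder name. Any folder that holds .wav files is treated as a label, so folders that do not contain command words would need to be filtered out by the caller.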

The notebooks can be run individually using Jupyter. To run the scripts from the command line, edit the notebooks using Jupyter and run:

./script/execute_notebook.py

and select the notebook to run. The results are stored in results/notebook_name.log

P0 Predict Test WAV.ipynb can be used to predict audio files using a trained graphdef model.

Architecture

Models used

  1. A variant of Convolutional LSTM (https://arxiv.org/pdf/1610.00277.pdf)
  2. LSTM-L (https://arxiv.org/pdf/1711.07128.pdf)
  3. C-RNN (https://arxiv.org/pdf/1711.07128.pdf)
  4. GRU-L (https://arxiv.org/pdf/1711.07128.pdf)
  5. Resnet

Training

The model was trained using a GCP instance with the following specifications:

  • NVIDIA Tesla P100 X 1
  • 16 GB RAM
  • 35 GB SSD

Most of the models converged within 30k steps. Pseudo-labelling on the test data was used to improve model performance.
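The README does not describe how pseudo-labelling was implemented. One common form of the technique is sketched below in plain Python: test clips whose top predicted probability reaches a confidence threshold are kept, together with the predicted class, so they can be added to the training set. The function name select_pseudo_labels and the 0.9 threshold are assumptions for illustration, not values taken from this project.

```python
def select_pseudo_labels(probabilities, threshold=0.9):
    """Return (clip_index, predicted_label) pairs for confident predictions.

    probabilities: one list of per-class probabilities per test clip.
    threshold: hypothetical confidence cut-off, not taken from the project.
    """
    selected = []
    for index, probs in enumerate(probabilities):
        # Predicted label is the class with the highest probability.
        label = max(range(len(probs)), key=probs.__getitem__)
        if probs[label] >= threshold:
            selected.append((index, label))
    return selected


# Toy predictions for three test clips over three classes.
test_probs = [
    [0.95, 0.03, 0.02],
    [0.40, 0.35, 0.25],
    [0.05, 0.92, 0.03],
]
pseudo_labelled = select_pseudo_labels(test_probs)
```

In this toy input only the first and third clips pass the threshold; the second is too uncertain and is left out of the extra training data.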

Prediction

The final model was an ensemble of 13 models. Weighted averaging and stacking were used to generate the final predictions.
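The weighted-averaging step can be illustrated with a small sketch. It assumes each model outputs a list of class probabilities for a clip and that the weights are chosen by the user (for instance from validation scores). The function names and the example weights are hypothetical, and the stacking step is not shown.

```python
def weighted_average(model_probs, weights):
    """Blend per-model class probabilities for a single clip."""
    if len(model_probs) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    n_classes = len(model_probs[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, model_probs)) / total
        for c in range(n_classes)
    ]


def predict_label(model_probs, weights):
    """Return the class index with the highest blended probability."""
    blended = weighted_average(model_probs, weights)
    return max(range(len(blended)), key=blended.__getitem__)


# Three toy models voting on a two-class clip; the third model counts double.
example_probs = [[0.6, 0.4], [0.3, 0.7], [0.2, 0.8]]
example_weights = [1.0, 1.0, 2.0]
blended = weighted_average(example_probs, example_weights)
label = predict_label(example_probs, example_weights)
```

Dividing by the sum of the weights keeps the blended values a valid probability distribution whenever each model's output is one.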

Acknowledgements

  1. ML-KWS-for-MCU (ARM-software/ML-KWS-for-MCU)
  2. Very Deep Convolutional Neural Network for Robust Speech Recognition (https://arxiv.org/pdf/1610.00277.pdf)
  3. Speech Commands Dataset (https://research.googleblog.com/2017/08/launching-speech-commands-dataset.html)

If you like this project or have any queries, don't hesitate to send an email to [email protected]
