Gtzan.keras

[REPO] Music Genre classification on GTZAN dataset using CNNs

gtzan.keras

Music genre classification using convolutional neural networks (CNNs), implemented in TensorFlow 2.0 using the Keras API.

Overview

tl;dr: Compare the classic approach of extracting features and feeding them to a classifier (e.g. an SVM) against the deep learning approach of using CNNs on a representation of the audio (a melspectrogram) to extract features and classify. Both approaches are available as Jupyter notebooks in the nbs folder.

Summary of the deep learning approach:

  1. Shuffle the input and split it into train and test sets (70%/30%)
  2. Read the audio files as melspectrograms, splitting them into 1.5 s windows with 50% overlap, resulting in a dataset with shape (samples x time x frequency x channels)
  3. Train the CNN and evaluate it on the test set using a majority voting approach
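
The three steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code from the notebooks: the function names (`split_train_test`, `split_windows`, `majority_vote`) and the fixed seed are hypothetical, the windowing works on any sliceable sequence (a list here, an array of audio samples or spectrogram frames in practice), and the per-window predictions are assumed to be already-decoded genre labels.

```python
import random
from collections import Counter


def split_train_test(items, test_fraction=0.3, seed=42):
    """Shuffle a copy of `items` and split it into (train, test)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility (assumption)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def split_windows(sequence, window_size, overlap=0.5):
    """Cut a sequence into fixed-size windows that overlap by `overlap`."""
    # With 50% overlap the hop between window starts is half a window.
    hop = max(1, int(window_size * (1 - overlap)))
    last_start = len(sequence) - window_size
    # A trailing remainder shorter than one window is dropped.
    return [sequence[start:start + window_size]
            for start in range(0, last_start + 1, hop)]


def majority_vote(window_labels):
    """Return the label predicted for the most windows of one song."""
    # Ties resolve to the label that appeared first in the input.
    return Counter(window_labels).most_common(1)[0][0]


# Toy example: 10 "frames", windows of 4 frames, 50% overlap.
windows = split_windows(list(range(10)), window_size=4)
print(len(windows))                            # 4
print(majority_vote(["pop", "rock", "pop"]))   # pop
```

In a real run, `window_size` would be the number of samples or frames that corresponds to 1.5 s at the chosen sampling rate, and the label list would come from the CNN's prediction for each window of a single song.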

Results

To compare results across multiple architectures, we took two approaches to this problem. The first is the classic approach of extracting features and then using a classifier. The second, which is implemented in the src folder, is a deep learning approach that feeds a melspectrogram to a CNN.

The notebooks in the nbs folder show how we extracted the features. The current results on the test set are:

Model                 Acc
Decision Tree         0.5160
Random Forest         0.6760
ElasticNet            0.6880
Logistic Regression   0.7640
SVM (RBF)             0.7880

For the deep learning approach, we tested a simple custom architecture that can be found in the nbs folder.

Model                 Acc
CNN 2D                0.832


Dataset

How to get the dataset:

  1. Download the GTZAN dataset here

Extract the file into the data folder of this project. The structure should look like this:

├── data/
   ├── genres
      ├── blues
      ├── classical
      ├── country
      .
      .
      .
      ├── rock
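
Before training, it can help to confirm that the extracted dataset matches this layout. The following is a small sketch, not part of the project: `list_genre_files` is a hypothetical helper, and the set of audio extensions is an assumption (adjust it to whatever the downloaded archive contains).

```python
from pathlib import Path

# Assumption: the audio files use one of these extensions.
AUDIO_SUFFIXES = {".wav", ".au"}


def list_genre_files(genres_dir):
    """Map each genre sub-folder name to its sorted list of audio files."""
    root = Path(genres_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"expected a genres folder at {root}")
    return {
        genre.name: sorted(p for p in genre.iterdir()
                           if p.suffix.lower() in AUDIO_SUFFIXES)
        for genre in sorted(root.iterdir())
        if genre.is_dir()
    }


if __name__ == "__main__":
    # Run from the project root; prints one line per genre folder.
    try:
        for genre, files in list_genre_files("data/genres").items():
            print(f"{genre}: {len(files)} files")
    except FileNotFoundError as err:
        print(err)
```

A genre folder that reports zero files usually means the archive was extracted one level too deep or too shallow.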

How to run

The models are provided as .joblib or .h5 files in the models folder. You just need to apply one of them to your own audio file as described below.

If you want to run the training process yourself, run the provided notebooks in the nbs folder.

To apply a model to a test file, run:

$ cd src/
$ python app.py -t MODEL_TYPE -m ../models/PATH_TO_MODEL -s PATH_TO_SONG

Where MODEL_TYPE is ml for the classical machine learning approach or dl for the deep learning approach.
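
For readers who want to see how such a command line is typically parsed, here is a hedged sketch using `argparse`. Only the short flags `-t`, `-m` and `-s` and the `ml`/`dl` values come from this README; the long option names, help texts and the `build_parser` function are assumptions and may differ from what app.py actually does.

```python
import argparse


def build_parser():
    """Build a parser for the -t / -m / -s flags shown above."""
    parser = argparse.ArgumentParser(description="Classify the genre of a song")
    # Long option names below are illustrative, not taken from app.py.
    parser.add_argument("-t", "--type", choices=["ml", "dl"], required=True,
                        help="ml = classical machine learning, dl = deep learning")
    parser.add_argument("-m", "--model", required=True,
                        help="path to a .joblib or .h5 model file")
    parser.add_argument("-s", "--song", required=True,
                        help="path to the audio file to classify")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(
    ["-t", "dl", "-m", "../models/custom_cnn_2d.h5",
     "-s", "../data/samples/iza_meu_talisma.mp3"]
)
print(args.type, args.model, args.song)
```

Restricting `-t` with `choices` makes the parser reject any model type other than `ml` or `dl` before a model is loaded.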

Usage example:

$ python app.py -t dl -m ../models/custom_cnn_2d.h5 -s ../data/samples/iza_meu_talisma.mp3

and the output will be:

../data/samples/iza_meu_talisma.mp3 is a pop song
most likely genres are: [('pop', 0.43), ('hiphop', 0.39), ('country', 0.08)]
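
A report in this shape can be produced by ranking per-genre scores and keeping the top three. The sketch below is illustrative only: `top_genres` is a hypothetical helper, the `rock` and `jazz` scores are made-up filler values, and how app.py actually derives its scores is not stated in this README.

```python
def top_genres(scores, k=3):
    """Return the k (genre, score) pairs with the highest scores."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(genre, round(score, 2)) for genre, score in ranked[:k]]


# The first three scores mirror the example output; the rest are invented.
scores = {"pop": 0.43, "hiphop": 0.39, "country": 0.08, "rock": 0.06, "jazz": 0.04}
best = top_genres(scores)
print(f"song.mp3 is a {best[0][0]} song")
print(f"most likely genres are: {best}")
```

The close scores for pop and hiphop in the example show why listing several candidates is more informative than reporting the winner alone.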