

A tool for automatically aligning lyrics to audio using a phonetic recognizer based on hidden Markov models. Viterbi decoding with explicit durations of reference syllables can be enabled with the parameter WITH_DURATIONS.
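As a rough illustration of the decoding idea, here is a minimal sketch of Viterbi forced alignment over a strict left-to-right state sequence. It is not the code in this repository: the function name `forced_align` and the input layout (a frames-by-states matrix of log-likelihoods) are assumptions made for the example, and it does not model explicit durations, which would add a per-state duration term to the scores.

```python
import numpy as np


def forced_align(log_lik):
    """Return the best left-to-right state index for every frame.

    log_lik[t, s] is the log-likelihood of frame t under state s
    (hypothetical input layout). The path must start in state 0, end in
    the last state, and at each frame either stay or advance by one state.
    """
    n_frames, n_states = log_lik.shape
    if n_frames < n_states:
        raise ValueError("need at least one frame per state")

    score = np.full((n_frames, n_states), -np.inf)
    back = np.zeros((n_frames, n_states), dtype=int)
    score[0, 0] = log_lik[0, 0]

    for t in range(1, n_frames):
        for s in range(n_states):
            stay = score[t - 1, s]
            advance = score[t - 1, s - 1] if s > 0 else -np.inf
            if advance > stay:
                score[t, s] = advance + log_lik[t, s]
                back[t, s] = s - 1
            else:
                score[t, s] = stay + log_lik[t, s]
                back[t, s] = s

    # Backtrace from the last state at the last frame.
    path = [n_states - 1]
    for t in range(n_frames - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```

Called on a toy matrix in which the first three frames favour state 0 and the last two favour state 1, the function places the state boundary between frames 2 and 3.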

The decoder is built from scratch. Alternatively, the tool can be used as a wrapper around HTK (which may be faster) by setting the parameter DECODE_WITH_HTK.

If you are using this work please cite

NOTE: A version building upon this research has been developed by Voice Magix. It features

  • latest deep-learning enabled acoustic model
  • English language lyrics parser and normalizer
  • runtime speed optimization
  • option to run on recordings with diverse types of background instruments
  • reduced external package dependencies

If you are interested in using it, write to info at voicemagix dot com.

Folder Structure

  • example: example/test sound and annotation files
  • scripts: helper scripts for running the code (including on an HPC cluster)
  • src: main source code
    • align: main alignment logic
    • hmm: hidden Markov model alignment
    • for_makam: Makam-specific logic (see music traditions below)
    • models_makam: acoustic model for Turkish
    • models_jingju: acoustic model for Jingju Mandarin
    • for_jingju: jingju-specific logic (see music traditions below)
    • onsets: logic for note-onset-aware alignment (ISMIR 2016)
    • parse: logic for parsing lyrics files
    • smstools: modifications to the
    • utilsLyrics: any utility scripts
  • test: test scripts (could be used in CI)
  • thrash: code that has to be reviewed and deleted, left for the sake of completeness


Copyright 2014-2017 Music Technology Group - Universitat Pompeu Fabra

AlignmentDuration is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation (FSF), either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see

For more details see COPYING.txt


NOTE: Python 3 is neither supported nor tested.

git clone; sudo apt-get install python-dev python-setuptools python-numpy

pip install -r requirements; python install

cd ..; git clone

Also install Theano.

git clone; cd htkModelParser; sudo pip install -r requirements; python install

git clone; sudo apt-get install python-scipy; python install

cd ..; git clone


Georgi Dzhambazov, Knowledge-based Probabilistic Modeling for Tracking Lyrics in Music Audio Signals, PhD thesis (thesis materials companion page)

USAGE on different music traditions

jingju (Beijing Opera) : Chinese

python AlignmentDuration/jingju/ 2 0 /JingjuSingingAnnotation-master/lyrics2audio/results/3folds/ 3 0

to test: python AlignmentDuration/test/

with method testLyricsAlign_mandarin_pop

Turkish Makam music: Turkish

You need to provide the MusicBrainz ID (MBID) of the recording. This requirement could be removed on request.

call as a method from an aggregator API:

install; python pycompmusic/compmusic/extractors/makam/

or locally:


to test: python AlignmentDuration/test/

with method testLyricsAlignMakam


Write to georgi.dzhambazov at upf dot edu or info at voicemagix dot com if you would like to use the English language model. It is not included here because of licensing issues.


Use the evalAccuracy script. A value of 100 means perfect alignment. Values above 80% are usually acceptable to human listeners.

The default evaluation level is set at word boundaries
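To make the percentage concrete, the sketch below computes one plausible accuracy measure at word level: the share of the total reference duration in which each detected word overlaps its reference word. This is an assumption for illustration and may differ from what the evalAccuracy script actually computes; the function name `alignment_accuracy` and the `(start_sec, end_sec, word)` tuple layout are hypothetical.

```python
def alignment_accuracy(reference, detected):
    """Return the percentage of reference duration that is correctly aligned.

    Both arguments are lists of (start_sec, end_sec, word) tuples holding
    the same words in the same order (hypothetical layout). Words are
    matched by position, so repeated words are not counted twice.
    """
    if len(reference) != len(detected):
        raise ValueError("reference and detected must contain the same words")

    total = sum(end - start for start, end, _ in reference)
    if total <= 0:
        raise ValueError("reference has no duration")

    correct = 0.0
    for (r_start, r_end, _), (d_start, d_end, _) in zip(reference, detected):
        overlap = min(r_end, d_end) - max(r_start, d_start)
        if overlap > 0:
            correct += overlap
    return 100.0 * correct / total
```

Under this definition, a word boundary detected 0.5 s late in a 2 s excerpt costs a quarter of the score, while identical boundaries give 100.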


git clone; git checkout for_pycompmusic

cd /homedtic/georgid/test2/AlignmentDuration; source /homedtic/georgid/env/bin/activate; python install

to test: python /homedtic/georgid/test2/AlignmentDuration/test/

on server: git pull /srv/dunya/env/src/pycompmusic/compmusic/extractors/makam/ with recording MB-ID: 727cff89-392f-4d15-926d-63b2697d7f3f
