OpenSeq2Seq: toolkit for distributed and mixed precision training of sequence-to-sequence models

The main goal of OpenSeq2Seq is to let researchers explore sequence-to-sequence models as efficiently as possible. This efficiency comes from full support for distributed and mixed-precision training. OpenSeq2Seq is built on TensorFlow and provides the building blocks needed to train encoder-decoder models for neural machine translation, automatic speech recognition, speech synthesis, and language modeling.

Documentation and installation instructions


  1. Models for:
    1. Neural Machine Translation
    2. Automatic Speech Recognition
    3. Speech Synthesis
    4. Language Modeling
    5. NLP tasks (sentiment analysis)
  2. Data-parallel distributed training
    1. Multi-GPU
    2. Multi-node
  3. Mixed precision training for NVIDIA Volta/Turing GPUs
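To illustrate why mixed precision training usually pairs float16 arithmetic with loss scaling, here is a minimal sketch in plain Python. It is not OpenSeq2Seq code: the helper names `to_fp16` and `fp16_grad` and the scale factor 1024 are assumptions chosen for illustration, and float16 storage is simulated with the standard-library `struct` module rather than a GPU.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def fp16_grad(grad: float, loss_scale: float = 1.0) -> float:
    """Hypothetical helper: hold a scaled gradient in float16, then unscale.

    Scaling the loss scales every gradient by the same factor, so the
    gradient is multiplied before it is rounded to float16 and divided
    again in higher precision before the weight update.
    """
    stored = to_fp16(grad * loss_scale)  # value as float16 would hold it
    return stored / loss_scale           # unscale in higher precision


tiny_grad = 1e-8  # smaller than the smallest positive float16 (about 6e-8)

unscaled = fp16_grad(tiny_grad)         # no loss scaling: the value is lost
scaled = fp16_grad(tiny_grad, 1024.0)   # loss scaling keeps it representable

print("without loss scaling:", unscaled)
print("with loss scaling:   ", scaled)
```

Without scaling, the small gradient rounds to zero in float16 and the corresponding weight would never update. With scaling, it survives with a small rounding error.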

Software Requirements

  1. Python >= 3.5
  2. TensorFlow >= 1.10
  3. CUDA >= 9.0, cuDNN >= 7.0
  4. Horovod >= 0.13 (using Horovod is not required, but is highly recommended for multi-GPU setup)
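Horovod is recommended here for data-parallel training, in which each worker computes gradients on its own shard of a batch and the workers then average those gradients. The sketch below shows the idea in plain Python under simplifying assumptions: a one-parameter model, equal-sized shards, and a hypothetical `allreduce_mean` function standing in for the collective operation. It does not use the Horovod or OpenSeq2Seq API.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (input x, target y)


def grad_mse(w: float, batch: List[Sample]) -> float:
    """Gradient of mean squared error for the toy model y_hat = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def allreduce_mean(values: List[float]) -> float:
    """Hypothetical stand-in for an allreduce: returns the average that
    every worker would hold after the collective operation."""
    return sum(values) / len(values)


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
num_workers = 2
shard_size = len(data) // num_workers

# Each simulated worker gets an equal, disjoint slice of the batch.
shards = [data[i * shard_size:(i + 1) * shard_size] for i in range(num_workers)]

w = 0.0
local_grads = [grad_mse(w, shard) for shard in shards]  # computed per worker
synced_grad = allreduce_mean(local_grads)               # shared by all workers
full_grad = grad_mse(w, data)                           # single-worker reference

print("averaged worker gradient:", synced_grad)
print("full-batch gradient:     ", full_grad)
```

With equal-sized shards, the averaged gradient matches the gradient computed on the full batch, which is why adding workers changes throughput rather than the update itself.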


The speech-to-text workflow uses some parts of the Mozilla DeepSpeech project.

The beam search decoder with language model re-scoring (in decoders) is based on Baidu DeepSpeech.

The text-to-text workflow uses some functions from Tensor2Tensor and the Neural Machine Translation (seq2seq) Tutorial.


This is a research project, not an official NVIDIA product.

Related resources


If you use OpenSeq2Seq, please cite this paper

    title={Mixed-Precision Training for NLP and Speech Recognition with OpenSeq2Seq},
    author={Oleksii Kuchaiev and Boris Ginsburg and Igor Gitman and Vitaly Lavrukhin and Jason Li and Huyen Nguyen and Carl Case and Paulius Micikevicius},
