


OpenSeq2Seq: toolkit for distributed and mixed precision training of sequence-to-sequence models

OpenSeq2Seq's main goal is to let researchers explore various sequence-to-sequence models as effectively as possible. It achieves this efficiency by fully supporting distributed and mixed-precision training. OpenSeq2Seq is built on TensorFlow and provides all the necessary building blocks for training encoder-decoder models for neural machine translation, automatic speech recognition, speech synthesis, and language modeling.

Documentation and installation instructions


  1. Models for:
    1. Neural Machine Translation
    2. Automatic Speech Recognition
    3. Speech Synthesis
    4. Language Modeling
    5. NLP tasks (sentiment analysis)
  2. Data-parallel distributed training
    1. Multi-GPU
    2. Multi-node
  3. Mixed precision training for NVIDIA Volta/Turing GPUs
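Why mixed-precision training needs extra care can be seen in a small, framework-free sketch. It does not show how OpenSeq2Seq implements the feature; it only illustrates loss scaling, a technique commonly paired with float16 training, using Python's `struct` module to round values to half precision. The names `to_fp16`, `fp16_gradient`, and `LOSS_SCALE`, and the value 1024, are hypothetical choices made for this illustration. The `"e"` format code requires Python 3.6 or newer.

```python
import struct


def to_fp16(value):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


# Hypothetical scale factor, chosen only for this illustration.
LOSS_SCALE = 1024.0


def fp16_gradient(true_gradient, loss_scale=1.0):
    """Simulate a gradient that is stored in float16 during backpropagation.

    The gradient is multiplied by loss_scale before being rounded to float16,
    then divided by the same factor in higher precision afterwards.
    """
    scaled = to_fp16(true_gradient * loss_scale)
    return scaled / loss_scale


tiny = 1e-8  # smaller than anything float16 can represent

# Without scaling, the value underflows to 0.0 and the update is lost.
unscaled = fp16_gradient(tiny)

# With scaling, the value survives the float16 round trip with a small error.
rescued = fp16_gradient(tiny, LOSS_SCALE)

print("without loss scaling:", unscaled)
print("with loss scaling:   ", rescued)
```

The point of the sketch is the trade-off: float16 halves memory traffic, but its narrow range silently drops small gradients unless they are shifted into the representable range first.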

Software Requirements

  1. Python >= 3.5
  2. TensorFlow >= 1.10
  3. CUDA >= 9.0, cuDNN >= 7.0
  4. Horovod >= 0.13 (Horovod is optional, but highly recommended for multi-GPU setups)
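The minimum versions listed above can be checked before installation with a short script. The following is a minimal sketch, not part of OpenSeq2Seq: the helper names `parse_version` and `meets_minimum` and the `MINIMUMS` table are hypothetical, and the version strings for TensorFlow and Horovod would have to be supplied by the caller (for example from `pip show`).

```python
import sys

# Minimum versions copied from the requirements list.
MINIMUMS = {
    "python": "3.5",
    "tensorflow": "1.10",
    "horovod": "0.13",
}


def parse_version(text):
    """Turn a string such as '1.10.0rc1' into a tuple of ints, here (1, 10, 0).

    Each dot-separated piece contributes its leading digits; parsing stops
    at the first piece that does not start with a digit.
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum version."""
    a = parse_version(installed)
    b = parse_version(minimum)
    # Pad the shorter tuple with zeros so that '3.5' equals '3.5.0'.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


python_version = "{}.{}.{}".format(*sys.version_info[:3])
print("Python", python_version, "ok:",
      meets_minimum(python_version, MINIMUMS["python"]))
```

Versions are compared as integer tuples rather than as strings because a plain string comparison would rank "1.9" above "1.10".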


The speech-to-text workflow uses some parts of the Mozilla DeepSpeech project.

The beam search decoder with language model re-scoring (in decoders) is based on Baidu DeepSpeech.

The text-to-text workflow uses some functions from Tensor2Tensor and the Neural Machine Translation (seq2seq) Tutorial.


This is a research project, not an official NVIDIA product.

Related resources


If you use OpenSeq2Seq, please cite this paper:

    title={Mixed-Precision Training for NLP and Speech Recognition with OpenSeq2Seq},
    author={Oleksii Kuchaiev and Boris Ginsburg and Igor Gitman and Vitaly Lavrukhin and Jason Li and Huyen Nguyen and Carl Case and Paulius Micikevicius},