NeurST

Neural end-to-end Speech Translation Toolkit

The primary motivation of NeurST is to help NLP researchers get started with end-to-end speech translation (ST) and to build advanced neural machine translation (NMT) models.

See the examples directory for the full list of NeurST examples. We also present recent progress in end-to-end ST technology at https://st-benchmark.github.io/.

NeurST is based on TensorFlow 2, and a PyTorch version is under development.

NeurST News

March 29, 2022: Release of the GigaST dataset, a large-scale speech translation corpus.

Aug 16, 2021: Release of models and results for the IWSLT 2021 offline ST and simultaneous translation tasks.

June 15, 2021: Integration of LightSeq for training speedup; see the experimental branch.

March 28, 2021: The v0.1.1 release adds instructions for weight pruning and quantization-aware training of Transformer models, along with several other features. See the release note for details.

Dec. 25, 2020: The v0.1.0 release includes the overall design of the code structure and recipes for training end-to-end ST models. See the release note for details.

Highlights

  • Production ready: Models trained with NeurST can be exported directly to the TensorFlow SavedModel format and served with TensorFlow Serving, so there is no gap between the research model and the production model. Additionally, one can use LightSeq to serve NeurST models with much lower latency (see the export sketch after this list).
  • Lightweight: NeurST is designed specifically for end-to-end ST and NMT models, with clean and simple code. It has no dependency on Kaldi, which simplifies installation and usage.
  • Extensibility and scalability: NeurST is carefully designed for extensibility and scalability. It allows users to customize the Model, Task, Dataset and other components and to combine them freely (see the registry sketch after this list).
  • High computation efficiency: NeurST is computationally efficient and can be further optimized by enabling mixed precision and XLA (see the mixed-precision sketch after this list). Fast distributed training with BytePS / Horovod is also supported for large-scale scenarios.
  • Reliable and reproducible benchmarks: NeurST reports strong baselines with well-designed hyper-parameters on several benchmark datasets (MT & ST) and provides a series of recipes to reproduce them.
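
As an illustration of the SavedModel workflow mentioned in the first highlight, here is a minimal TensorFlow 2 sketch. A generic Keras model stands in for a trained NeurST model, and the export path is hypothetical; NeurST's own export utilities are not shown here and may differ.

import tensorflow as tf

# A stand-in Keras model; in practice this would be a model trained with NeurST.
inputs = tf.keras.Input(shape=(16,))
hidden = tf.keras.layers.Dense(8, activation="relu")(inputs)
outputs = tf.keras.layers.Dense(4)(hidden)
model = tf.keras.Model(inputs, outputs)

# Export to the SavedModel format, which TensorFlow Serving can load directly.
# TF Serving expects a numeric version subdirectory (here "1"); the path is hypothetical.
export_dir = "/tmp/neurst_demo_savedmodel/1"
tf.saved_model.save(model, export_dir)

# Reload the exported model to confirm that a serving signature is present.
reloaded = tf.saved_model.load(export_dir)
print(list(reloaded.signatures.keys()))  # typically ['serving_default']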
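
To illustrate the plug-in style of customization described in the extensibility highlight, the following is a simplified, self-contained registry sketch. The decorator and class names are illustrative only and are not NeurST's actual API.

# Illustrative registry pattern: components register themselves under a string
# key and can then be selected and combined from a configuration.
MODEL_REGISTRY = {}

def register_model(name):
    """Record a model class under the given name (illustrative, not NeurST's API)."""
    def wrapper(cls):
        MODEL_REGISTRY[name] = cls
        return cls
    return wrapper

@register_model("toy_transformer")
class ToyTransformer:
    def __init__(self, hidden_size=256):
        self.hidden_size = hidden_size

# A configuration would typically name the components to combine.
config = {"model": "toy_transformer", "model_params": {"hidden_size": 512}}
model = MODEL_REGISTRY[config["model"]](**config["model_params"])
print(type(model).__name__, model.hidden_size)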
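
The mixed-precision and XLA switches mentioned in the efficiency highlight correspond to standard TensorFlow 2 settings. The sketch below uses plain Keras rather than NeurST's trainer, whose flags may differ; the mixed-precision API shown requires TensorFlow 2.4 or later.

import tensorflow as tf

# Compute in float16 while keeping variables in float32 (TF 2.4+ API).
tf.keras.mixed_precision.set_global_policy("mixed_float16")

# Enable XLA JIT compilation globally; it can also be enabled per-function
# via tf.function(jit_compile=True).
tf.config.optimizer.set_jit(True)

# A tiny stand-in model to show the settings taking effect.
inputs = tf.keras.Input(shape=(8,))
model = tf.keras.Model(inputs, tf.keras.layers.Dense(4)(inputs))
model.compile(optimizer="adam", loss="mse")
x = tf.random.normal([32, 8])
y = tf.random.normal([32, 4])
model.fit(x, y, epochs=1, verbose=0)
print("compute dtype:", model.layers[-1].compute_dtype)  # expected: float16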

Pretrained Models & Performance Benchmarks

NeurST provides reference implementations of various models and benchmarks. Please see the examples for model links and NeurST benchmarks on different datasets.

Requirements and Installation

  • Python version >= 3.6
  • TensorFlow >= 2.3.0

Install NeurST from source:

git clone https://github.com/bytedance/neurst.git
cd neurst/
pip3 install -e .

If an ImportError occurs at runtime, manually install the missing packages.
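
As a quick post-installation sanity check (a minimal sketch; no particular NeurST CLI entry point or __version__ attribute is assumed), the following should run without errors:

# Verify that NeurST and its TensorFlow backend import cleanly.
import tensorflow as tf
import neurst  # noqa: F401  (import check only)

print("TensorFlow version:", tf.__version__)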

Citation

@InProceedings{zhao2021neurst,
  author       = {Chengqi Zhao and Mingxuan Wang and Qianqian Dong and Rong Ye and Lei Li},
  booktitle    = {the 59th Annual Meeting of the Association for Computational Linguistics (ACL): System Demonstrations},
  title        = {{NeurST}: Neural Speech Translation Toolkit},
  year         = {2021},
  month        = aug,
}

Contact

For any questions or suggestions, please feel free to contact us: [email protected], [email protected].

Acknowledgement

We thank Bairen Yi, Zherui Liu, Yulu Jia, Yibo Zhu, Jiaze Chen, Jiangtao Feng, Zewei Sun for their kind help.
