Project Name | Description | Stars | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language
---|---|---|---|---|---|---|---|---
Pytorch Cyclegan And Pix2pix | Image-to-Image Translation in PyTorch | 19,434 | 9 days ago | | | 476 | other | Python
Deeplearningexamples | State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure. | 10,428 | 9 days ago | | | 222 | | Jupyter Notebook
Attention Is All You Need Pytorch | A PyTorch implementation of the Transformer model in "Attention is All You Need". | 7,000 | 4 months ago | | | 69 | mit | Python
Opennmt Py | Open Source Neural Machine Translation in PyTorch | 5,948 | a day ago | 20 | September 14, 2021 | 29 | mit | Python
Practical Pytorch | Go to https://github.com/pytorch/tutorials - this repo is deprecated and no longer maintained | 4,272 | 2 years ago | | | 91 | mit | Jupyter Notebook
Photo2cartoon | Photo-to-cartoon translation exploration project | 2,819 | a year ago | | | 6 | mit | Python
Contrastive Unpaired Translation | Contrastive unpaired image-to-image translation, faster and lighter training than CycleGAN (ECCV 2020, in PyTorch) | 1,729 | 4 months ago | | | 79 | other | Python
Sockeye | Sequence-to-sequence framework with a focus on Neural Machine Translation based on PyTorch | 1,158 | 20 days ago | 80 | May 05, 2022 | 2 | apache-2.0 | Python
Nlp Tutorial | A list of NLP (Natural Language Processing) tutorials | 836 | 3 years ago | | | 6 | mit | Jupyter Notebook
Attentiongan | AttentionGAN for Unpaired Image-to-Image Translation & Multi-Domain Image-to-Image Translation | 535 | a month ago | | | 15 | other | Python
This is a basic implementation of attentional neural machine translation (Bahdanau et al., 2015; Luong et al., 2015) in PyTorch. It implements the model described in Luong et al., 2015, and supports label smoothing, beam-search decoding, and random sampling. With a 256-dimensional LSTM hidden state, it achieves a BLEU score of 28.13 on the IWSLT 2014 German-English dataset (Ranzato et al., 2015).
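For reference, here is a minimal sketch of the global ("general" score) attention of Luong et al., 2015, as computed at each decoding step. The module name, tensor shapes, and masking convention are illustrative assumptions, not the actual code in `nmt.py`:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LuongGlobalAttention(nn.Module):
    """Global multiplicative attention (Luong et al., 2015), 'general' score."""
    def __init__(self, hidden_size):
        super().__init__()
        # W_a projects encoder states before the dot product with the decoder state.
        self.W_a = nn.Linear(hidden_size, hidden_size, bias=False)

    def forward(self, dec_hidden, enc_outputs, src_mask=None):
        # dec_hidden: (batch, hidden); enc_outputs: (batch, src_len, hidden)
        scores = torch.bmm(self.W_a(enc_outputs), dec_hidden.unsqueeze(2)).squeeze(2)
        if src_mask is not None:
            # src_mask: (batch, src_len) bool; True marks real tokens
            scores = scores.masked_fill(~src_mask, float('-inf'))
        alpha = F.softmax(scores, dim=-1)                          # (batch, src_len)
        context = torch.bmm(alpha.unsqueeze(1), enc_outputs).squeeze(1)
        return context, alpha
```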
This codebase is used for instructional purposes in Stanford CS224N (Natural Language Processing with Deep Learning) and CMU 11-731 (Machine Translation and Sequence-to-Sequence Models).
- `nmt.py`: contains the neural machine translation model and training/testing code.
- `vocab.py`: a script that extracts vocabulary from training data.
- `util.py`: contains utility/helper functions.

We provide a preprocessed version of the IWSLT 2014 German-English translation task used in (Ranzato et al., 2015) [script]. To download the dataset:
```bash
wget http://www.cs.cmu.edu/~pengchey/iwslt2014_ende.zip
unzip iwslt2014_ende.zip
```
Running the script will extract a `data/` folder which contains the IWSLT 2014 dataset.
The dataset has 150K German-English training sentences. The `data/` folder contains a copy of the public release of the dataset. Files with the suffix `*.wmixerprep` are pre-processed versions of the dataset from Ranzato et al., 2015, with long sentences chopped and rare words replaced by a special `<unk>` token. You can use the pre-processed training files for training/developing (or come up with your own pre-processing strategy), but for testing you have to use the original version of the testing files, i.e., `test.de-en.(de|en)`.
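For illustration, the rare-word replacement amounts to something like the sketch below. It is not the actual pipeline that produced the `*.wmixerprep` files (that was Ranzato et al.'s preprocessing), and `min_freq` is an assumed parameter:

```python
from collections import Counter

def unk_rare_words(sentences, min_freq=3, unk='<unk>'):
    """Replace tokens occurring fewer than `min_freq` times with `unk`.

    sentences: list of token lists. Illustrative sketch only; not the
    preprocessing that produced the *.wmixerprep files.
    """
    freq = Counter(tok for sent in sentences for tok in sent)
    return [[tok if freq[tok] >= min_freq else unk for tok in sent]
            for sent in sentences]
```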
The code is written in Python 3.6 using some supporting third-party libraries. We provide a conda environment to install Python 3.6 with the required libraries. Simply run

```bash
conda env create -f environment.yml
```
Each runnable script (`nmt.py`, `vocab.py`) is annotated using `docopt`. Please refer to the source files for complete usage.
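As a rough idea of what a `docopt` annotation looks like (the option names below are illustrative, not the exact flags defined in the scripts):

```python
"""In a docopt-annotated script, the usage string below *is* the CLI spec.

Usage:
    vocab.py --train-src=<file> --train-tgt=<file> VOCAB_FILE [options]

Options:
    --size=<int>         maximum vocabulary size [default: 50000]
    --freq-cutoff=<int>  minimum word frequency [default: 2]
"""
from docopt import docopt

if __name__ == '__main__':
    args = docopt(__doc__)  # parses sys.argv against the usage string above
    print(args['--train-src'], args['VOCAB_FILE'])
```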
First, we extract a vocabulary file from the training data using the command:
```bash
python vocab.py \
    --train-src=data/train.de-en.de.wmixerprep \
    --train-tgt=data/train.de-en.en.wmixerprep \
    data/vocab.json
```

This generates a vocabulary file `data/vocab.json`.
The script also has options to control the frequency cutoff and the size of the generated vocabulary, which you may experiment with.
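Conceptually, the extraction boils down to something like the following sketch; the special tokens and the exact layout of `vocab.json` are assumptions:

```python
from collections import Counter

def build_vocab(corpus, size=50000, freq_cutoff=2):
    # corpus: list of token lists; keep at most `size` frequent words
    # whose count is at least `freq_cutoff`
    freq = Counter(tok for sent in corpus for tok in sent)
    word2id = {'<pad>': 0, '<s>': 1, '</s>': 2, '<unk>': 3}  # assumed specials
    for word, count in freq.most_common(size):
        if count >= freq_cutoff:
            word2id.setdefault(word, len(word2id))
    return word2id
```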
To start training and evaluation, simply run `scripts/train.sh`.
After training and decoding, we call the official evaluation script `multi-bleu.perl` to compute the corpus-level BLEU score of the decoding results against the gold-standard references.
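A typical invocation looks like this (the hypothesis file name is an assumption; `multi-bleu.perl` ships with Moses and reads hypotheses from stdin):

```bash
# score the decoded hypotheses against the reference translations (paths assumed)
perl multi-bleu.perl data/test.de-en.en < decode.output
```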
This work is licensed under a Creative Commons Attribution 4.0 International License.