ByteNet TensorFlow

ByteNet for character-level language modelling

This is a TensorFlow implementation of the ByteNet model from DeepMind's paper Neural Machine Translation in Linear Time.

From the abstract:

The ByteNet decoder attains state-of-the-art performance on character-level language modeling and outperforms the previous best results obtained with recurrent neural networks. The ByteNet also achieves a performance on raw character-level machine translation that approaches that of the best neural translation models that run in quadratic time. The implicit structure learnt by the ByteNet mirrors the expected alignments between the sequences.

ByteNet Encoder-Decoder Model:

Model architecture (image source: Neural Machine Translation in Linear Time paper)

The model applies dilated 1D convolutions to the sequential data, layer by layer, to obtain the source encoding. The decoder then applies masked 1D convolutions to the target sequence (conditioned on the encoder output) to predict the next character in the target sequence. The character generation model is just the ByteNet decoder, while the machine translation model is the combined encoder and decoder.
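The masked (causal) dilated convolution described above can be sketched in plain NumPy. This is an illustrative toy, not the repository's TensorFlow op: output position t only sees inputs at t, t-dilation, t-2*dilation, ..., so no future characters leak into the prediction.

```python
import numpy as np

def causal_dilated_conv1d(x, w, dilation):
    """Masked (causal) dilated 1D convolution.

    Output at position t depends only on x[t], x[t-dilation],
    x[t-2*dilation], ... -- never on future positions.
    x: (seq_len,) input sequence; w: (kernel_size,) filter taps.
    """
    seq_len, k = len(x), len(w)
    y = np.zeros(seq_len)
    for t in range(seq_len):
        for i in range(k):
            src = t - i * dilation  # look back, never ahead
            if src >= 0:
                y[t] += w[i] * x[src]
    return y

x = np.array([1., 2., 3., 4., 5., 6., 7., 8.])
w = np.array([1., 1.])  # kernel size 2, simple sum filter
print(causal_dilated_conv1d(x, w, dilation=1))  # y[t] = x[t] + x[t-1]
print(causal_dilated_conv1d(x, w, dilation=4))  # y[t] = x[t] + x[t-4]
```

Stacking such layers with growing dilations (1, 2, 4, ...) is what lets the receptive field cover the whole sequence in linear time.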

Implementation Notes

  1. The character generation model and the machine translation model are defined in ByteNet/. ByteNet/ also contains the ByteNet residual block, the dilated 1D convolution, and layer normalization.
  2. The model can be configured by editing the model configuration file.
  3. The number of residual channels is 512 (configurable).
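A configuration along these lines captures the knobs listed above. The field names here are illustrative, not necessarily the keys used in this repository; only the residual-channel width of 512 comes from the notes above.

```python
# Hypothetical configuration sketch -- key names are illustrative,
# not necessarily those used in this repository.
model_config = {
    "residual_channels": 512,                    # width of residual blocks (note 3)
    "encoder_dilations": [1, 2, 4, 8, 16] * 3,   # assumed stacked dilation schedule
    "decoder_dilations": [1, 2, 4, 8, 16] * 3,
    "filter_width": 3,                           # assumed convolution kernel size
}

# Receptive field of a stack of dilated convolutions grows as
# 1 + (filter_width - 1) * sum(dilations):
receptive_field = 1 + (model_config["filter_width"] - 1) * sum(
    model_config["decoder_dilations"]
)
print(receptive_field)
```

Repeating the dilation schedule (rather than doubling indefinitely) keeps each block's filters small while still covering hundreds of characters.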


Requirements

  • Python 2.7.6
  • TensorFlow 1.2.0


Datasets

  • The character generation model has been trained on Shakespeare text. The text file is included in the repository at Data/generator_training_data/shakespeare.txt.
  • The machine translation model has been trained for German-to-English translation. You may download the news commentary dataset from here.


Training

Create the following directories: Data/tb_summaries/translator_model, Data/tb_summaries/generator_model, Data/Models/generation_model, and Data/Models/translation_model.
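The four directories above can be created in one command:

```shell
mkdir -p Data/tb_summaries/translator_model \
         Data/tb_summaries/generator_model \
         Data/Models/generation_model \
         Data/Models/translation_model
```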

  • Text Generation

    • Configure the model by editing
    • Save the text files to train on, in Data/generator_training_data. A sample shakespeare.txt is included in the repo.
    • Train the model with: python --text_dir="Data/generator_training_data"
    • Run python --help for more options.
  • Machine Translation

    • Configure the model by editing
    • Save the source and target sentences in separate files in Data/MachineTranslation. You may download the news commentary training corpus using this link.
    • The model is trained on buckets of sentence pairs whose length is a multiple of a configurable parameter bucket_quant. Sentences are padded with a special character beyond their actual length.
    • Train the translation model using:
      • python --source_file=<source sentences file> --target_file=<target sentences file> --bucket_quant=50
      • python --help for more options.
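The bucketing scheme described above can be sketched as follows. This is a hypothetical helper, not the repository's code; the PAD marker and token-id representation are chosen for illustration.

```python
import math
from collections import defaultdict

PAD = 0  # illustrative id for the special padding character

def bucket_pairs(pairs, bucket_quant=50):
    """Group (source, target) token-id pairs into buckets whose size is the
    next multiple of bucket_quant, padding both sequences to that size."""
    buckets = defaultdict(list)
    for src, tgt in pairs:
        longest = max(len(src), len(tgt))
        size = int(math.ceil(longest / bucket_quant)) * bucket_quant
        buckets[size].append((src + [PAD] * (size - len(src)),
                              tgt + [PAD] * (size - len(tgt))))
    return buckets

pairs = [([1, 2, 3], [4, 5]), ([6] * 60, [7] * 55)]
buckets = bucket_pairs(pairs, bucket_quant=50)
print(sorted(buckets))  # bucket sizes: a short pair and a longer pair
```

Bucketing keeps each training batch to a single padded length, so the convolutions waste little work on padding.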

Generating Samples

  • Generate new samples using:
    • python --seed="SOME_TEXT_TO_START_WITH" --sample_size=<SIZE OF GENERATED SEQUENCE>
  • You can test sample translations from the dataset using python
    • This will pick random source sentences from the dataset and translate them.

Sample Generations

What say you to this part of this to thee?

What say these faith, madam?

First Citizen:
The king of England, the will of the state,
That thou dost speak to me, and the thing that shall
In this the son of this devil to the storm,
That thou dost speak to thee to the world,
That thou dost see the bear that was the foot,

Translation Results to be updated


TODO

  • Evaluate the translation model.
  • Implement beam search (contributions welcome). Currently the model samples from the probability distribution restricted to the top k most probable predictions.

