Transformer Tensorflow

TensorFlow implementation of 'Attention Is All You Need (2017. 6)'

transformer (hb-research)



Project Structure

Project initialized with hb-base.

├── config                  # Config files (.yml, .json) used with hb-config
├── data                    # dataset path
├── notebooks               # Prototyping with numpy or tf.InteractiveSession
├── transformer             # transformer architecture graphs (from input to logits)
│   ├── __init__.py         # Graph logic
│   ├── attention.py        # Attention (multi-head, scaled_dot_product, etc.)
│   ├── encoder.py          # Encoder logic
│   ├── decoder.py          # Decoder logic
│   └── layer.py            # Layers (FFN)
├── data_loader.py          # raw_data -> processed_data -> generate_batch (using the Dataset API)
├── hook.py                 # training or test hook features (e.g. print_variables)
├── main.py                 # define experiment_fn
└── model.py                # define EstimatorSpec

Reference : hb-config, Dataset, experiments_fn, EstimatorSpec
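The attention module in the tree above implements multi-head, scaled dot-product attention. As a rough illustration of that computation (not the repo's actual code — the function names and random projections here are illustrative, with dimension names borrowed from the config keys model_dim and num_heads):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V -- the core attention operation."""
    d_k = Q.shape[-1]
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_k)        # (heads, t_q, t_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)          # softmax over keys
    return weights @ V

def multi_head_attention(x, num_heads=4, model_dim=32, seed=0):
    """Project x to Q/K/V, attend per head, then concatenate heads."""
    rng = np.random.default_rng(seed)
    t = x.shape[0]
    head_dim = model_dim // num_heads
    # Stand-ins for learned projection weights.
    Wq, Wk, Wv = (rng.normal(size=(model_dim, model_dim)) * 0.1 for _ in range(3))

    def split(h):  # (t, model_dim) -> (num_heads, t, head_dim)
        return h.reshape(t, num_heads, head_dim).transpose(1, 0, 2)

    heads = scaled_dot_product_attention(split(x @ Wq), split(x @ Wk), split(x @ Wv))
    return heads.transpose(1, 0, 2).reshape(t, model_dim)   # concat heads

x = np.random.default_rng(1).normal(size=(5, 32))  # (seq_len, model_dim)
out = multi_head_attention(x)
print(out.shape)  # (5, 32)
```

The output keeps the input's (seq_len, model_dim) shape, which is what lets encoder/decoder layers be stacked num_layers deep.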


  • Train and evaluate with 'WMT German-English (2016)' dataset


Config

You can control the whole experimental environment through config files.

example: check-tiny.yml

  base_path: 'data/'
  raw_data_path: 'tiny_kor_eng'
  processed_path: 'tiny_processed_data'
  word_threshold: 1

  PAD_ID: 0
  UNK_ID: 1
  EOS_ID: 3

  batch_size: 4
  num_layers: 2
  model_dim: 32
  num_heads: 4
  linear_key_dim: 20
  linear_value_dim: 24
  ffn_dim: 30
  dropout: 0.2

  learning_rate: 0.0001
  optimizer: 'Adam'  # one of: 'Adagrad', 'Adam', 'Ftrl', 'Momentum', 'RMSProp', 'SGD'
  train_steps: 15000
  model_dir: 'logs/check_tiny'
  save_checkpoints_steps: 1000
  check_hook_n_iter: 100
  min_eval_frequency: 100
  print_verbose: True
  debug: False
  webhook_url: ""  # notify you via a Slack webhook after training
  • debug mode : uses tfdbg
  • check-tiny is a dataset of about 30 sentences translated from Korean into English. (recommend reading it :) )
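If the key/value projections are split across heads, linear_key_dim and linear_value_dim need to divide evenly by num_heads (both do here: 20/4 and 24/4). A small sanity-check sketch over a plain dict mirroring check-tiny.yml — the validation logic is an assumption for illustration, not part of the repo:

```python
# Values mirror check-tiny.yml; the validate() helper is illustrative only.
config = {
    "num_layers": 2, "model_dim": 32, "num_heads": 4,
    "linear_key_dim": 20, "linear_value_dim": 24,
    "ffn_dim": 30, "dropout": 0.2, "batch_size": 4,
}

def validate(cfg):
    """Fail fast on head-dimension mismatches before building the graph."""
    for key in ("linear_key_dim", "linear_value_dim"):
        if cfg[key] % cfg["num_heads"] != 0:
            raise ValueError(
                f"{key}={cfg[key]} must be divisible by num_heads={cfg['num_heads']}")
    if not 0.0 <= cfg["dropout"] < 1.0:
        raise ValueError("dropout must be in [0, 1)")
    return True

print(validate(config))  # True
```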


Usage

Install requirements.

pip install -r requirements.txt

Then, pre-process raw data.

python data_loader.py --config check-tiny

Finally, start training and evaluating the model.

python main.py --config check-tiny --mode train_and_evaluate

Or, you can use IWSLT'15 English-Vietnamese dataset.

sh                                        # download dataset
python data_loader.py --config iwslt15-en-vi                    # preprocessing
python main.py --config iwslt15-en-vi --mode train_and_evaluate # start training


After training, you can test the model.

  • command
python predict.py --config {config} --src {src_sentence}
  • example
$ python predict.py --config check-tiny --src "안녕하세요. 반갑습니다."

Source: 안녕하세요. 반갑습니다.
 > Result: Hello . I'm glad to see you . <\s> vectors . <\s> Hello locations . <\s> will . <\s> . <\s> you . <\s>
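The trailing run of <\s> tokens suggests decoding ran to the maximum length rather than stopping at the first end-of-sequence token. A typical greedy-decoding loop stops at EOS_ID (3 in the config above); here is a framework-free sketch with a stand-in next_token function — everything below is an illustrative assumption, not the repo's prediction code:

```python
EOS_ID = 3  # end-of-sequence id from the config

def greedy_decode(next_token, start_ids, max_len=20):
    """Append the argmax token each step until EOS_ID or max_len."""
    ids = list(start_ids)
    for _ in range(max_len):
        tok = next_token(ids)   # stand-in for a decoder forward pass + argmax
        ids.append(tok)
        if tok == EOS_ID:
            break               # stop here instead of emitting trailing <\s> runs
    return ids

# Toy "model": emits 7 twice, then EOS (remaining tokens are never reached).
script = iter([7, 7, EOS_ID, 9, 9])
out = greedy_decode(lambda ids: next(script), start_ids=[0])
print(out)  # [0, 7, 7, 3]
```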

Experiments modes

✅ : Working
◽️ : Not tested yet.

  • ✅ evaluate : Evaluate on the evaluation data.
  • ◽️ extend_train_hooks : Extends the hooks for training.
  • ◽️ reset_export_strategies : Resets the export strategies with the new_export_strategies.
  • ◽️ run_std_server : Starts a TensorFlow server and joins the serving thread.
  • ◽️ test : Tests training, evaluating and exporting the estimator for a single step.
  • ✅ train : Fit the estimator using the training data.
  • ✅ train_and_evaluate : Interleaves training and evaluation.
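In the tf.estimator Experiment framework, each of these mode strings maps to a method call on an Experiment object. A framework-free sketch of that dispatch (the class and helper names here are illustrative, not the repo's code):

```python
def run_experiment(mode, experiment):
    """Look up the experiment method named by --mode and call it."""
    allowed = {"train", "evaluate", "train_and_evaluate", "test",
               "run_std_server", "extend_train_hooks", "reset_export_strategies"}
    if mode not in allowed:
        raise ValueError(f"unknown mode: {mode}")
    return getattr(experiment, mode)()

class DummyExperiment:
    """Stand-in for an Experiment built by experiment_fn in main.py."""
    def train(self): return "trained"
    def train_and_evaluate(self): return "train+eval"

print(run_experiment("train", DummyExperiment()))  # trained
```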


Tensorboard

tensorboard --logdir logs

  • check-tiny example




Author: Dongjun Lee
