HMTL

🌊HMTL: Hierarchical Multi-Task Learning - A State-of-the-Art neural network model for several NLP tasks based on PyTorch and AllenNLP

HMTL (Hierarchical Multi-Task Learning model)

***** New November 20th, 2018: Online web demo is available *****

We released an online demo (along with pre-trained weights) so that you can try the model yourself. The code for the web interface is also available in the demo folder.

To download the pre-trained models, install Git LFS and run git lfs pull. The model weights will be saved in the model_dumps folder.

A Hierarchical Multi-Task Approach for Learning Embeddings from Semantic Tasks
Victor SANH, Thomas WOLF, Sebastian RUDER
Accepted at AAAI 2019

[Figure: HMTL architecture]

About

HMTL is a Hierarchical Multi-Task Learning model that combines four carefully selected semantic tasks: Named Entity Recognition, Entity Mention Detection, Relation Extraction and Coreference Resolution. The model achieves state-of-the-art results on Named Entity Recognition, Entity Mention Detection and Relation Extraction. Using SentEval, we show that, moving from the bottom to the top layers of the model, the layers tend to learn more complex semantic representations.

For further details on the results, please refer to our paper.

We released the code for training, fine-tuning and evaluating HMTL. We hope that this code will be useful for building your own multi-task models (hierarchical or not). The code is written in Python and powered by PyTorch.

Dependencies and installation

The main dependencies are PyTorch and AllenNLP. SentEval is needed only for evaluating the embeddings learned by the model.

The code works with Python 3.6. A stable version of the dependencies is listed in requirements.txt.

You can quickly set up a working environment by running the script ./script/machine_setup.sh. It installs Python 3.6, creates a clean virtual environment, and installs all the required dependencies (listed in requirements.txt). Adapt the script to your needs.

Example usage

We based our implementation on the AllenNLP library. For an introduction to this library, see the AllenNLP tutorials.

An experiment is defined in a JSON configuration file (see configs/*.json for examples). The configuration file mainly describes the datasets to load and the model to create, along with all the hyper-parameters of the model.

Once you have set up your configuration file (and defined custom classes such as DatasetReaders if needed), you can launch a training run with the following command and arguments:
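Before launching a run, it can help to see what a configuration file declares. The following is a minimal sketch and not part of the repository: summarize_config is a hypothetical helper, and it assumes only that the file is plain JSON with an object at the top level. It makes no assumption about the key names used by the HMTL configs.

```python
import json
from pathlib import Path


def summarize_config(config_path):
    """Return {top-level key: short description of its value} for a JSON config.

    Hypothetical helper: it only assumes the file is plain JSON whose top
    level is an object, and knows nothing about HMTL-specific keys.
    """
    with Path(config_path).open(encoding="utf-8") as handle:
        config = json.load(handle)
    if not isinstance(config, dict):
        raise ValueError("expected a JSON object at the top level")

    summary = {}
    for key, value in config.items():
        if isinstance(value, dict):
            # Nested section: list its keys so the structure is visible.
            summary[key] = "object with keys: " + ", ".join(sorted(value))
        elif isinstance(value, list):
            summary[key] = f"list of {len(value)} item(s)"
        else:
            # Scalar setting: show the value itself.
            summary[key] = repr(value)
    return summary
```

For example, summarize_config("configs/hmtl_coref_conll.json") would return one entry per top-level section of that file, which can then be printed line by line.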

python train.py --config_file_path configs/hmtl_coref_conll.json --serialization_dir my_first_training
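To run several configurations one after another, the same command can be assembled from Python. This is a sketch under stated assumptions: build_train_command and run_experiments are hypothetical helpers, only the two flags shown in the command above are used, and the code is assumed to run from the repository root, where train.py lives.

```python
import subprocess
import sys


def build_train_command(config_file_path, serialization_dir):
    """Build the argument list for train.py using the two documented flags."""
    return [
        sys.executable,  # reuse the interpreter of the active environment
        "train.py",
        "--config_file_path", config_file_path,
        "--serialization_dir", serialization_dir,
    ]


def run_experiments(experiments):
    """Run each (config file, output directory) pair in order.

    check=True makes subprocess raise CalledProcessError on a non-zero exit
    code, so the loop stops at the first failed training run.
    """
    for config_file_path, serialization_dir in experiments:
        command = build_train_command(config_file_path, serialization_dir)
        subprocess.run(command, check=True)
```

A single run equivalent to the command above would be run_experiments([("configs/hmtl_coref_conll.json", "my_first_training")]).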

Once the training has started, you can follow its progress in the terminal or in TensorBoard (make sure you have installed TensorBoard and its TensorFlow dependency beforehand):

tensorboard --logdir my_first_training/log

Evaluating the embeddings with SentEval

We used SentEval to assess the linguistic properties learned by the model. hmtl_senteval.py gives an example of how to create an interface between SentEval and HMTL. It evaluates the linguistic properties learned by every layer of the hierarchy (shared base word embeddings and encoders).
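SentEval drives a model through two callbacks: prepare(params, samples), called once per task, and batcher(params, batch), called on each batch of tokenized sentences and expected to return a 2-D NumPy array with one row per sentence. The sketch below shows the shape of such an interface. It is not the code of hmtl_senteval.py: encode_sentence is a hypothetical placeholder (a hashed bag of words) standing in for the activations of one HMTL layer, and EMBEDDING_DIM is an arbitrary size chosen for illustration.

```python
import zlib

import numpy as np

EMBEDDING_DIM = 8  # arbitrary size for this illustration


def encode_sentence(tokens):
    """Hypothetical placeholder encoder: a hashed bag-of-words vector.

    In a real interface this would return the representation produced by
    the layer of the model being evaluated.
    """
    vector = np.zeros(EMBEDDING_DIM, dtype=np.float32)
    for token in tokens:
        index = zlib.crc32(token.encode("utf-8")) % EMBEDDING_DIM
        vector[index] += 1.0
    return vector / max(len(tokens), 1)


def prepare(params, samples):
    """Called by SentEval once per task; nothing to precompute in this sketch."""
    return None


def batcher(params, batch):
    """Turn a batch of tokenized sentences into an (n_sentences, dim) array."""
    # Empty sentences are replaced by a single dummy token so that every
    # sentence still yields a vector.
    batch = [sentence if sentence else ["."] for sentence in batch]
    return np.vstack([encode_sentence(sentence) for sentence in batch])
```

With SentEval installed, these two functions would be passed to senteval.engine.SE(params, batcher, prepare), where params contains at least task_path (the location of the SentEval data), and the evaluation would be started with the eval method and a list of task names.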

Data

To download the pre-trained embeddings we used in HMTL, you can simply launch the script ./script/data_setup.sh.

We did not include the datasets used to train HMTL for licensing reasons, but we invite you to obtain them yourself: OntoNotes 5.0, CoNLL2003, and ACE2005. The configuration files expect the datasets to be placed in the data/ folder.
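Since the datasets have to be collected by hand, a quick check of which files a configuration expects can save a failed run. The sketch below is not part of the repository: iter_strings and missing_data_paths are hypothetical helpers, and they assume that a config refers to its datasets through string values that start with data/ and are relative to the repository root. Configs that spell paths differently (for example ./data/ or absolute paths) would need a different prefix.

```python
import json
from pathlib import Path


def iter_strings(node):
    """Yield every string value found anywhere inside a parsed JSON structure."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)


def missing_data_paths(config_path, repo_root=".", prefix="data/"):
    """Return the paths under `prefix` that the config mentions but that do not exist.

    Assumption: dataset locations appear as string values starting with
    `prefix`, relative to `repo_root`.
    """
    with Path(config_path).open(encoding="utf-8") as handle:
        config = json.load(handle)
    referenced = sorted({s for s in iter_strings(config) if s.startswith(prefix)})
    return [path for path in referenced if not (Path(repo_root) / path).exists()]
```

An empty list from missing_data_paths("configs/hmtl_coref_conll.json") would mean that every data/ path mentioned in that config is present on disk.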

References

Please consider citing the following paper if you find this repository useful.

@article{sanh2018hmtl,
  title={A Hierarchical Multi-task Approach for Learning Embeddings from Semantic Tasks},
  author={Sanh, Victor and Wolf, Thomas and Ruder, Sebastian},
  journal={arXiv preprint arXiv:1811.06031},
  year={2018}
}