Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Flaml | 2,496 | 4 | a day ago | 68 | June 17, 2022 | 165 | mit | Jupyter Notebook | ||
A fast library for AutoML and tuning. Join our Discord: https://discord.gg/Cppx2vSPVP. | ||||||||||
Bert_language_understanding | 886 | 4 years ago | 9 | Python | ||||||
Pre-training of Deep Bidirectional Transformers for Language Understanding: pre-train TextCNN | ||||||||||
Bert Multi Label Text Classification | 761 | 2 months ago | 40 | mit | Python | |||||
This repo contains a PyTorch implementation of a pretrained BERT model for multi-label text classification. | ||||||||||
Transformers Tutorials | 678 | 4 months ago | 17 | mit | Jupyter Notebook | |||||
Github repo with tutorials to fine tune transformers for diff NLP tasks | ||||||||||
Azureml Bert | 374 | a year ago | 22 | mit | Jupyter Notebook | |||||
End-to-End recipes for pre-training and fine-tuning BERT using Azure Machine Learning Service | ||||||||||
Adaptnlp | 371 | 2 years ago | 23 | November 10, 2021 | 7 | apache-2.0 | Jupyter Notebook | |||
An easy to use Natural Language Processing library and framework for predicting, training, fine-tuning, and serving up state-of-the-art NLP models. | ||||||||||
Bert4doc Classification | 353 | 2 years ago | 12 | apache-2.0 | Python | |||||
Code and source for paper "How to Fine-Tune BERT for Text Classification?" | ||||||||||
Multifit | 269 | 3 years ago | 13 | mit | Jupyter Notebook | |||||
The code to reproduce results from paper "MultiFiT: Efficient Multi-lingual Language Model Fine-tuning" https://arxiv.org/abs/1909.04761 | ||||||||||
Sentimentanalysis | 260 | a year ago | 2 | mit | Python | |||||
Sentiment analysis neural network trained by fine-tuning BERT, ALBERT, or DistilBERT on the Stanford Sentiment Treebank. | ||||||||||
Kogpt2 Finetuning | 210 | 20 days ago | 3 | apache-2.0 | Python | |||||
🔥 Korean GPT-2, KoGPT2 FineTuning cased. Training on Korean lyrics data 🔥 |
The field of NLP was revolutionized in 2018 by the introduction of BERT and its Transformer friends (RoBERTa, XLM, etc.).
These novel Transformer-based architectures, together with new ways of training neural networks on natural language data, brought transfer learning to NLP problems. Transfer learning had already been producing state-of-the-art results in computer vision for a few years, and the introduction of Transformer models brought about the same paradigm shift in NLP.
Companies like Google and Facebook trained their neural networks on large swathes of natural language data to grasp the intricacies of language, thereby producing a language model. These models were then fine-tuned on domain-specific datasets to achieve state-of-the-art results on specific problem statements. The companies also published the trained models to the open source community, whose members could now fine-tune them for their own use cases.
Hugging Face made it easier for the community to access and fine-tune these models through their Python package, Transformers.
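As a rough illustration of what accessing one of these published models looks like, here is a minimal sketch using the Transformers `Auto*` classes. The helper name `load_for_finetuning`, the checkpoint `bert-base-uncased`, and the label counts are assumptions chosen for illustration; they are not taken from the notebooks in this repo.

```python
def load_for_finetuning(checkpoint: str = "bert-base-uncased", num_labels: int = 2):
    """Return a (tokenizer, model) pair ready to be fine-tuned for classification.

    `checkpoint` and `num_labels` are illustrative defaults, not values
    prescribed by the tutorials.
    """
    # Imported lazily so this file can be loaded even where the
    # `transformers` package is not installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    # The tokenizer converts raw text into the token ids the model expects.
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)

    # The pretrained encoder weights are reused; a new classification head
    # with `num_labels` outputs is added on top and trained during fine-tuning.
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint, num_labels=num_labels
    )
    return tokenizer, model


# Example usage (downloads the weights on first call, needs PyTorch):
# tokenizer, model = load_for_finetuning(num_labels=4)
# batch = tokenizer(["a toy sentence"], padding=True, truncation=True, return_tensors="pt")
# logits = model(**batch).logits  # one row per input, one column per label
```

The pretrained weights do most of the work here: only the small classification head starts from scratch, which is why fine-tuning usually needs far less data than pre-training.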
Despite these amazing technological advancements, applying these solutions to business problems is still a challenge, given the niche knowledge required to understand these methods and apply them to specific problem statements. Hence, in the following tutorials I will demonstrate how a user can leverage these technologies, along with some other Python tools, to fine-tune these language models for specific types of tasks.
Before I proceed, I would like to mention the following groups for the fantastic work they are doing and sharing, which has made these notebooks and tutorials possible:
Please review these amazing sources of information and subscribe to their channels/sources.
The problem statements that I will be working with are:
Notebook | Github Link | Colab Link | Kaggle Kernel |
---|---|---|---|
Text Classification: Multi-Class | Github | | Kaggle |
Text Classification: Multi-Label | Github | | Kaggle |
Sentiment Classification with Experiment Tracking in WandB! | Github | | |
Named Entity Recognition: with TPU processing! | Github | | Kaggle |
Question Answering | | | |
Summary Writing: with Experiment Tracking in WandB! | Github | | Kaggle |
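The first two rows differ mainly in how the model's raw outputs (logits) are turned into predictions. The following is a small, library-free sketch of that difference; the label names, logit values, and the 0.5 threshold are made up for illustration and do not come from the notebooks.

```python
import math


def softmax(logits):
    """Turn logits into probabilities that sum to 1 (multi-class)."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Squash a single logit into an independent probability (multi-label)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_multiclass(logits, labels):
    """Multi-class: exactly one label wins, the one with the highest probability."""
    probs = softmax(logits)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best]


def predict_multilabel(logits, labels, threshold=0.5):
    """Multi-label: every label whose own probability clears the threshold is kept."""
    return [label for label, z in zip(labels, logits) if sigmoid(z) >= threshold]


# Hypothetical labels and logits, as if produced by a fine-tuned classifier head.
labels = ["sports", "politics", "tech"]
logits = [2.0, -1.0, 0.5]

print(predict_multiclass(logits, labels))  # sports
print(predict_multilabel(logits, labels))  # ['sports', 'tech']
```

In training, this same distinction usually shows up as the choice of loss: a cross-entropy loss over the softmax for multi-class, and a per-label binary cross-entropy over the sigmoids for multi-label.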
- `data`: This folder contains all the toy data used for fine-tuning.
- `utils`: This folder will contain any miscellaneous scripts used to prepare for fine-tuning.
- `models`: Folder to save all the artifacts after fine-tuning.

I will try to cover the practical and implementation aspects of fine-tuning these language models on various NLP tasks. You can improve your knowledge of this topic by reading or watching the following resources.