Better Language Models and Their Implications
Our model, called GPT-2 (a successor to GPT), was trained simply to predict the next word in 40GB of Internet text. Due to our concerns about malicious applications of the technology, we are not releasing the trained model. As an experiment in responsible disclosure, we are instead releasing a much smaller model for researchers to experiment with, as well as a technical paper. (from the OpenAI Blog)
This repository is a simple PyTorch implementation of the GPT-2 text generator, written with compact code.
The original repository is openai/gpt-2. You can also read the GPT-2 paper, "Language Models are Unsupervised Multitask Learners". To understand the concepts in more detail, I recommend the papers on the Transformer model.
I referred to huggingface/pytorch-pretrained-BERT, a good PyTorch implementation of GPT-2. You can find a more detailed implementation in the huggingface repository.
Transformer (Self-Attention) paper: Attention Is All You Need (2017)
First OpenAI GPT paper: Improving Language Understanding by Generative Pre-Training (2018)
See the OpenAI blog post and paper about GPT-2
$ git clone https://github.com/graykode/gpt-2-Pytorch && cd gpt-2-Pytorch
# download huggingface's pytorch model
$ curl --output gpt2-pytorch_model.bin https://s3.amazonaws.com/models.huggingface.co/bert/gpt2-pytorch_model.bin
# set up requirements; if using macOS, also run the additional setup described below
$ pip install -r requirements.txt
$ python main.py --text "It was a bright cold day in April, and the clocks were striking thirteen. Winston Smith, his chin nuzzled into his breast in an effort to escape the vile wind, slipped quickly through the glass doors of Victory Mansions, though not quickly enough to prevent a swirl of gritty dust from entering along with him."
--text: the sentence to begin generation with.
--quiet: do not print extraneous output such as the "================" separators.
--nsamples: number of samples to generate, drawn in batches with the multinomial function.
--unconditional: if true, generate unconditionally (without a starting text).
--batch_size: batch size.
--length: length of the generated text (must be less than the model's context size).
--temperature: sampling temperature applied to the output distribution; lower values make the output more deterministic, higher values make it more random.
--top_k: sample only from the k most likely next tokens, i.e. the k largest elements of the logits tensor along the vocabulary dimension.
For more detail about the top_k option, see here.
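To make the effect of --temperature and --top_k concrete, here is a minimal sketch of one sampling step. It uses only the Python standard library rather than PyTorch, and the function names (top_k_filter, softmax, sample_next_token) are hypothetical helpers for illustration, not this repository's code. It assumes that a top_k of 0 means "no filtering".

```python
import math
import random


def top_k_filter(logits, k):
    """Keep the k largest logits and set the rest to -inf.

    Assumption: k <= 0 disables filtering. Ties at the threshold are all
    kept, so slightly more than k entries can survive.
    """
    if k <= 0 or k >= len(logits):
        return list(logits)
    threshold = sorted(logits, reverse=True)[k - 1]
    return [x if x >= threshold else float("-inf") for x in logits]


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next_token(logits, temperature=1.0, top_k=0, rng=random):
    """Draw one token index from the filtered, temperature-scaled distribution."""
    probs = softmax(top_k_filter(logits, top_k), temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for a 4-token vocabulary
    print(sample_next_token(toy_logits, temperature=0.7, top_k=2))
```

With top_k=2 only indices 0 and 1 can ever be drawn from the toy logits, and lowering the temperature shifts more probability onto index 0. A PyTorch version would typically do the same steps with torch.topk, a softmax, and torch.multinomial.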
$ python3 -m venv venv
$ source venv/bin/activate
$ pip install torch tqdm
$ brew install libomp
$ export LC_ALL=en_US.UTF-8
$ export LANG=en_US.UTF-8
$ pip install -r requirements.txt