| Project Name | Stars | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language | Description |
|---|---|---|---|---|---|---|---|---|
| Textgenrnn | 4,426 | 2 years ago | 14 | February 02, 2020 | 122 | other | Python | Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code. |
| Gpt 2 Simple | 3,067 | 9 months ago | 18 | October 18, 2021 | 170 | other | Python | Python package to easily retrain OpenAI's GPT-2 text-generating model on new texts. |
| Texar | 2,008 | 3 years ago | 5 | November 19, 2019 | 32 | apache-2.0 | Python | Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. |
| Gpt2 Ml | 1,674 | 4 months ago | | | 22 | apache-2.0 | Python | GPT2 for Multiple Languages, including pretrained models (multilingual support; 1.5B-parameter Chinese pretrained model). |
| Delta | 1,556 | 6 days ago | 3 | March 27, 2020 | 2 | apache-2.0 | Python | DELTA is a deep learning based natural language and speech processing platform. |
| Gpt2client | 341 | 2 years ago | 32 | November 13, 2019 | 7 | mit | Python | ✍🏻 gpt2-client: Easy-to-use TensorFlow Wrapper for GPT-2 117M, 345M, 774M, and 1.5B Transformer Models 🤖 📝 |
| Attention Mechanisms | 294 | 2 years ago | | | 2 | mit | Python | Implementations for a family of attention mechanisms, suitable for all kinds of natural language processing tasks and compatible with TensorFlow 2.0 and Keras. |
| Tensorflow_novelist | 227 | 6 years ago | | | 3 | | Python | Writes plays in the style of Shakespeare, and can even write Jin Yong-style wuxia novels! Star the repo to follow updates. |
| Gpt 2 Tensorflow2.0 | 218 | 9 months ago | | | 19 | mit | Python | OpenAI GPT2 pre-training and sequence prediction implementation in TensorFlow 2.0. |
| Char Rnn Tf | 147 | 6 years ago | | | 9 | | Python | Implement character-level language models for text generation based on LSTM, in Python/TensorFlow. |
Texar is a toolkit aiming to support a broad set of machine learning tasks, especially natural language processing and text generation. Texar provides a library of easy-to-use ML modules and functionalities for composing a wide variety of models and algorithms, and is designed for both researchers and practitioners for fast prototyping and experimentation.
Texar was originally developed by, and is actively contributed to by, Petuum and CMU in collaboration with other institutes. A mirror of this repository is maintained by Petuum Open Source.
Build an encoder-decoder model, trained with maximum likelihood learning:

```python
import tensorflow as tf
import texar.tf as tx

# Data
data = tx.data.PairedTextData(hparams=hparams_data)  # a dict of hyperparameters
iterator = tx.data.DataIterator(data)
batch = iterator.get_next()                           # get a data mini-batch

# Model architecture
embedder = tx.modules.WordEmbedder(data.target_vocab.size, hparams=hparams_emb)
encoder = tx.modules.TransformerEncoder(hparams=hparams_enc)
outputs_enc = encoder(inputs=embedder(batch['source_text_ids']),  # call as a function
                      sequence_length=batch['source_length'])

decoder = tx.modules.TransformerDecoder(
    output_layer=tf.transpose(embedder.embedding),  # tie input embedding w/ output layer
    hparams=hparams_decoder)
outputs, _, _ = decoder(memory=outputs_enc,
                        memory_sequence_length=batch['source_length'],
                        inputs=embedder(batch['target_text_ids']),
                        sequence_length=batch['target_length']-1,
                        decoding_strategy='train_greedy')  # teacher-forcing decoding

# Loss for maximum likelihood learning
loss = tx.losses.sequence_sparse_softmax_cross_entropy(
    labels=batch['target_text_ids'][:, 1:],
    logits=outputs.logits,
    sequence_length=batch['target_length']-1)  # automatic sequence masks

# Beam search decoding (num_samples: number of sequences to decode)
outputs_bs, _, _ = tx.modules.beam_search_decode(
    decoder,
    embedding=embedder,
    start_tokens=[data.target_vocab.bos_token_id]*num_samples,
    end_token=data.target_vocab.eos_token_id)
```
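The hparams dictionaries above (hparams_data, hparams_emb, etc.) are left undefined in the snippet. Each Texar module is configured through such a dictionary, and any key you omit falls back to the module's default value. As a minimal, illustrative sketch (not from the official docs; the file paths are placeholders and only a few keys are shown), the data and embedder configurations might look like:

```python
# Illustrative hyperparameter dicts for the snippet above; the file paths are
# placeholders, and any key not listed here falls back to Texar's defaults.
hparams_data = {
    'batch_size': 64,
    'source_dataset': {
        'files': 'data/train.src.txt',       # placeholder path
        'vocab_file': 'data/vocab.src.txt',  # placeholder path
    },
    'target_dataset': {
        'files': 'data/train.tgt.txt',       # placeholder path
        'vocab_file': 'data/vocab.tgt.txt',  # placeholder path
    },
}

hparams_emb = {'dim': 512}  # word-embedding dimension
```

The encoder and decoder configurations (hparams_enc, hparams_decoder) follow the same pattern, e.g. setting the number of Transformer blocks and the hidden dimension.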
The same model, but with adversarial learning:

```python
helper = tx.modules.GumbelSoftmaxTrainingHelper(  # Gumbel-softmax decoding
    start_tokens=[BOS]*batch_size, end_token=EOS, embedding=embedder)
outputs, _ = decoder(helper=helper)               # automatic re-use of the decoder variables

discriminator = tx.modules.BertClassifier(hparams=hparams_bert)  # pre-trained model

G_loss, D_loss = tx.losses.binary_adversarial_losses(
    real_data=batch['target_text_ids'][:, 1:],
    fake_data=outputs.sample_id,
    discriminator_fn=discriminator)
```
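The resulting G_loss and D_loss can then be minimized in alternation, as in standard GAN training. A minimal sketch under TF 1.x graph mode (the discriminator variable-scope name is an assumption, not taken from the Texar docs):

```python
import tensorflow as tf

# Split trainable variables into discriminator vs. generator sets.
# The scope name 'bert_classifier' is an assumption for illustration only.
d_vars = tf.trainable_variables(scope='bert_classifier')
d_var_names = {v.name for v in d_vars}
g_vars = [v for v in tf.trainable_variables() if v.name not in d_var_names]

# Separate optimizers; run train_d and train_g in alternation each step.
train_d = tf.train.AdamOptimizer(1e-4).minimize(D_loss, var_list=d_vars)
train_g = tf.train.AdamOptimizer(1e-4).minimize(G_loss, var_list=g_vars)
```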
The same model, but with RL policy gradient learning:

```python
agent = tx.agents.SeqPGAgent(samples=outputs.sample_id,
                             logits=outputs.logits,
                             sequence_length=batch['target_length']-1,
                             hparams=config_model.agent)
```
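The agent encapsulates sampling and the policy-gradient update, so training reduces to drawing samples, scoring them with a task-specific reward, and feeding the reward back to the agent. A sketch of that loop, assuming a BLEU-based reward (BLEU, ground_truth, and num_steps are placeholders):

```python
for _ in range(num_steps):                 # num_steps: placeholder for training steps
    samples = agent.get_samples()          # sample sequences from the current policy
    rewards = BLEU(samples, ground_truth)  # placeholder task reward
    agent.observe(reward=rewards)          # policy-gradient update from the reward
```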
Many more examples are available here.
(Note: Texar > 0.2.3 requires Python 3.6 or 3.7. To use with older Python versions, please use Texar <= 0.2.3.)
Texar requires:

* tensorflow >= 1.10.0 (but < 2.0.0). Follow the official TensorFlow instructions to install the appropriate version.
* tensorflow_probability >= 0.3.0 (but < 0.8.0). Follow the official tensorflow_probability instructions to install.

After tensorflow and tensorflow_probability are installed, install Texar from PyPI:

```
pip install texar
```
To use cutting-edge features or develop locally, install from source:

```
git clone https://github.com/asyml/texar.git
cd texar
pip install .
```
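As a quick sanity check after installation (a sketch, not an official step), importing Texar's TensorFlow API and referencing one of the modules used above should succeed without error:

```python
# Minimal smoke test: succeeds only if tensorflow, tensorflow_probability,
# and texar are installed with compatible versions.
import texar.tf as tx

print(tx.modules.TransformerEncoder)  # module class used in the examples above
```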
If you use Texar, please cite the tech report with the following BibTeX entry:
Texar: A Modularized, Versatile, and Extensible Toolkit for Text Generation
Zhiting Hu, Haoran Shi, Bowen Tan, Wentao Wang, Zichao Yang, Tiancheng Zhao, Junxian He, Lianhui Qin, Di Wang, Xuezhe Ma, Zhengzhong Liu, Xiaodan Liang, Wanrong Zhu, Devendra Sachan and Eric Xing
ACL 2019
```
@inproceedings{hu2019texar,
  title={Texar: A Modularized, Versatile, and Extensible Toolkit for Text Generation},
  author={Hu, Zhiting and Shi, Haoran and Tan, Bowen and Wang, Wentao and Yang, Zichao and Zhao, Tiancheng and He, Junxian and Qin, Lianhui and Wang, Di and others},
  booktitle={ACL 2019, System Demonstrations},
  year={2019}
}
```