Awesome Open Source
Search results for corpus language model
45 search results found
Nlp_chinese_corpus
⭐
8,344
Large Scale Chinese Corpus for NLP
Bert Pytorch
⭐
5,605
Google AI 2018 BERT pytorch implementation
Clue
⭐
3,345
Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard
Bertweet
⭐
542
BERTweet: A pre-trained language model for English Tweets (EMNLP-2020)
Chatbot_data
⭐
293
Chatbot_data_for_Korean
Lotclass
⭐
231
[EMNLP 2020] Text Classification Using Label Names Only: A Language Model Self-Training Approach
Parsbert
⭐
222
🤗 ParsBERT: Transformer-based Model for Persian Language Understanding
Mathpile
⭐
192
Generative AI for Math: MathPile
Robbert
⭐
180
A Dutch RoBERTa-based language model
Transformer Lm
⭐
155
Transformer language model (GPT-2) with sentencepiece tokenizer
Pyclue
⭐
128
Python toolkit for the Chinese Language Understanding Evaluation (CLUE) benchmark
Jlm
⭐
99
A fast LSTM language model for large-vocabulary languages like Japanese and Chinese
Arabic Bert
⭐
80
Arabic edition of BERT pretrained language models
Greek Bert
⭐
74
A Greek edition of BERT pre-trained language model
Kneser Ney
⭐
61
Kneser-Ney implementation in Python
Vietnamese Electra
⭐
59
ELECTRA model pre-trained on a Vietnamese corpus
Pretraining For Language Understanding
⭐
59
Pre-training of Language Models for Language Understanding
Insuranceqa_zh
⭐
53
InsuranceQA models for a Chinese corpus in the insurance field.
Deep Nlp Resources
⭐
41
Curated list of all NLP Resources
Autoencode
⭐
40
AutoenCODE is a deep learning infrastructure that encodes source code fragments into vector representations, which can be used to learn similarities.
Generating Text Small Corpus
⭐
29
Generating style-specific text from a small corpus of 2.5k sentences using a pre-trained language model. Code in PyTorch
Herbert
⭐
29
HerBERT is a BERT-based Language Model trained on Polish Corpora using only MLM objective with dynamic masking of whole words.
Cfgen
⭐
27
Parse a text corpus and generate sentences in the same style using context-free grammar combined with a Markov chain.
Russian Ulmfit
⭐
27
AWD-LSTM language model trained on newspaper corpora with fast.ai
Cryptokcodecracker
⭐
27
Running Key Cipher Decoder + other classic cipher decoders. Automatically discovers likely solutions using an NGram language model.
Relm_unmt
⭐
26
Python source code for EMNLP 2020 paper "Reusing a Pretrained Language Model on Languages with Limited Corpora for Unsupervised NMT".
Belgpt2
⭐
23
🇧🇪 BelGPT-2: a GPT-2 model pre-trained on French corpora.
Mytwitterbot
⭐
20
A Twitter bot powered by a Recurrent Neural Network (RNN)
En Az Parallel Corpus
⭐
18
English-Azerbaijani parallel language corpus
Languagetool Neural Network
⭐
16
Edit Unsup Ts
⭐
15
This repo contains the code for our paper "Iterative Edit-Based Unsupervised Sentence Simplification" accepted at ACL 2020.
Thailmcut
⭐
15
Opus Api
⭐
14
OPUS (opus.nlpl.eu) Python3 API
Lt1
⭐
13
Course on Language Technologies and NLP
German2vec
⭐
13
Language Model and Text Classification for German Language using Deep Learning
Arabert
⭐
12
Arabic Language Model based on Bert
Spanishtransformerxl
⭐
12
Language model trained on a wiki corpus (500M tokens) with fastai v1; accuracy > 42.3%, vocabulary size 60K
Albert Mongolian
⭐
11
ALBERT trained on Mongolian text corpus
Ngram Language Model
⭐
11
An implementation of an HMM n-gram language model.
Langdist
⭐
10
Multilingual Language Modeling Toolkit
Language Modeling
⭐
9
Language modeling on the Penn Treebank (PTB) corpus using a trigram model with linear interpolation, a neural probabilistic language model, and a regularized LSTM.
Token Rnn Tensorflow
⭐
8
Multi-layer recurrent neural networks (LSTM, RNN) for token-level language models in Python using TensorFlow
Deepitalian
⭐
7
Neural_language_model_bangla
⭐
7
A neural language model trained on the Bangla wiki corpus
Neural Probabilistic Language Model
⭐
6
Implemented using TensorFlow.
Trumptweets
⭐
5
Analyze Trump's nonsense: feed in a topic and generate a new tweet based on a custom corpus.
Cachemodelpackage
⭐
5
Instructions and Sample Corpora for applying Language Models to Code
Noisy Channel Spell Checker
⭐
5
A tool for correcting misspellings in textual input using the Noisy Channel Model.