HerBERT

HerBERT is a BERT-based language model trained on Polish corpora using only the MLM objective with dynamic whole-word masking.
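For reference, the snippet below is a minimal sketch of how a masked-language model like this can be queried through the Hugging Face `transformers` library. The checkpoint id `allegro/herbert-base-cased` and the Polish example sentence are assumptions for illustration, not part of the original description.

```python
# Minimal sketch: querying a HerBERT MLM head via Hugging Face transformers.
# The Hub id below is an assumption; substitute whichever HerBERT checkpoint you use.
from transformers import AutoTokenizer, AutoModelForMaskedLM, pipeline

model_name = "allegro/herbert-base-cased"  # assumed checkpoint id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# A fill-mask pipeline predicts the word hidden behind the tokenizer's mask token.
fill_mask = pipeline("fill-mask", model=model, tokenizer=tokenizer)
sentence = f"Warszawa to największe {tokenizer.mask_token} w Polsce."
for prediction in fill_mask(sentence, top_k=3):
    print(prediction["token_str"], round(prediction["score"], 3))
```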
Alternatives To HerBERT
| Project Name | Stars | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language | Description |
|---|---|---|---|---|---|---|---|---|
| Nlp_chinese_corpus | 8,344 | a year ago | | | 20 | MIT | | Large Scale Chinese Corpus for NLP: large-scale corpora for Chinese natural language processing |
| Bert Pytorch | 5,605 | 19 months ago | 5 | October 23, 2018 | 63 | Apache-2.0 | Python | Google AI 2018 BERT PyTorch implementation |
| Clue | 3,345 | a year ago | | | 73 | | Python | CLUE, the Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus, and leaderboard |
| Bertweet | 542 | 4 months ago | | | | MIT | Python | BERTweet: a pre-trained language model for English Tweets (EMNLP 2020) |
| Chatbot_data | 293 | a year ago | | | | MIT | | Chatbot data for Korean |
| Lotclass | 231 | 2 years ago | | | | Apache-2.0 | Python | [EMNLP 2020] Text Classification Using Label Names Only: A Language Model Self-Training Approach |
| Parsbert | 222 | a year ago | | | 6 | Apache-2.0 | Jupyter Notebook | 🤗 ParsBERT: Transformer-based model for Persian language understanding |
| Mathpile | 192 | 4 months ago | | | 2 | Apache-2.0 | JavaScript | Generative AI for Math: MathPile |
| Robbert | 180 | 5 months ago | | | 15 | MIT | Jupyter Notebook | A Dutch RoBERTa-based language model |
| Transformer Lm | 155 | 3 years ago | | | 8 | | Python | Transformer language model (GPT-2) with a SentencePiece tokenizer |