Bert_tokenization_for_java

This is a Java version of the Chinese tokenization described in BERT.
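As a rough illustration of what BERT's Chinese tokenization involves, the sketch below shows only the CJK-splitting step of BERT's basic tokenizer: every CJK ideograph is surrounded by spaces so that it becomes its own token, and the result is split on whitespace. The class and method names (`ChineseBasicTokenizerSketch`, `isCjkChar`, `tokenize`) are hypothetical and are not taken from this project. Lowercasing, punctuation splitting, and the WordPiece step are deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not this project's actual API.
public class ChineseBasicTokenizerSketch {

    // Code point ranges that BERT's reference tokenizer treats as CJK ideographs.
    public static boolean isCjkChar(int cp) {
        return (cp >= 0x4E00 && cp <= 0x9FFF)
            || (cp >= 0x3400 && cp <= 0x4DBF)
            || (cp >= 0x20000 && cp <= 0x2A6DF)
            || (cp >= 0x2A700 && cp <= 0x2B73F)
            || (cp >= 0x2B740 && cp <= 0x2B81F)
            || (cp >= 0x2B820 && cp <= 0x2CEAF)
            || (cp >= 0xF900 && cp <= 0xFAFF)
            || (cp >= 0x2F800 && cp <= 0x2FA1F);
    }

    // Surround each CJK ideograph with spaces, then split on whitespace.
    public static List<String> tokenize(String text) {
        StringBuilder spaced = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (isCjkChar(cp)) {
                spaced.append(' ').appendCodePoint(cp).append(' ');
            } else {
                spaced.appendCodePoint(cp);
            }
        });
        List<String> tokens = new ArrayList<>();
        for (String piece : spaced.toString().trim().split("\\s+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [BERT, 模, 型, is, good]
        System.out.println(tokenize("BERT模型 is good"));
    }
}
```

In a full pipeline, each token produced here would then be passed to a WordPiece step that looks it up in the model's vocabulary. Since a single ideograph is normally its own vocabulary entry, Chinese text ends up tokenized character by character.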
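Alternatives To Bert_tokenization_for_java

Chinese Xinhua | Stars: 10,425 | Most Recent Commit: 4 months ago | Open Issues: 30 | License: mit | Language: Python
Chinese Xinhua Dictionary database, including xiehouyu (two-part allegorical sayings), idioms, words, and Chinese characters.

Nlp_chinese_corpus | Stars: 8,344 | Most Recent Commit: a year ago | Open Issues: 20 | License: mit
Large Scale Chinese Corpus for NLP

Linly | Stars: 2,964 | Most Recent Commit: 3 months ago | Open Issues: 107 | Language: Python
Chinese-LLaMA 1&2 and Chinese-Falcon base models; ChatFlow Chinese dialogue model; Chinese OpenLLaMA model; NLP pre-training and instruction fine-tuning datasets

Mnbvc | Stars: 2,533 | Most Recent Commit: 3 months ago | Open Issues: 18 | License: mit
MNBVC (Massive Never-ending BT Vast Chinese corpus) is a very large-scale Chinese corpus, benchmarked against the 40T of data used to train chatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even Martian script (火星文). It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki pages, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.

Information Extraction Chinese | Stars: 2,086 | Most Recent Commit: a year ago | Open Issues: 118 | Language: Python
Chinese Named Entity Recognition with IDCNN/biLSTM+CRF, and Relation Extraction with biGRU+2ATT

Thulac Python | Stars: 1,341 | Downloads / Repos Using This / Packages Using This: 163 | Most Recent Commit: 4 years ago | Total Releases: 11 | Latest Release: November 07, 2022 | Open Issues: 70 | License: mit | Language: Python
An Efficient Lexical Analyzer for Chinese

Chinesenlp | Stars: 1,329 | Most Recent Commit: 3 years ago | Open Issues: 3 | Language: HTML
Datasets and SOTA results for every field of Chinese NLP

Zhparser | Stars: 627 | Most Recent Commit: 3 months ago | Open Issues: 12 | License: other | Language: C
zhparser is a PostgreSQL extension for full-text search of the Chinese language

Thulac | Stars: 611 | Most Recent Commit: 3 years ago | Open Issues: 27 | License: mit | Language: C++
An Efficient Lexical Analyzer for Chinese

Chinese_models_for_spacy | Stars: 498 | Most Recent Commit: 4 years ago | Open Issues: 8 | License: mit | Language: Jupyter Notebook
Models for SpaCy that support Chinese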