Project Name | Stars | Downloads / Repos Using This / Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language | Description |
---|---|---|---|---|---|---|---|---|---|
Nlp_chinese_corpus | 8,344 | | a year ago | | | 20 | mit | | Large Scale Chinese Corpus for NLP |
Awesome Chinese Nlp | 7,646 | | 9 months ago | | | 3 | apache-2.0 | | A curated list of resources for Chinese NLP |
Ltp | 4,787 | 3 | 19 days ago | 46 | January 02, 2023 | 52 | | Python | Language Technology Platform |
Linly | 2,964 | | 3 months ago | | | 107 | | Python | Chinese-LLaMA 1 & 2 and Chinese-Falcon base models; ChatFlow Chinese dialogue model; Chinese OpenLLaMA model; NLP pre-training and instruction fine-tuning datasets |
Fastnlp | 2,940 | 2 | a year ago | 24 | October 31, 2022 | 62 | apache-2.0 | Python | fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation. |
Mnbvc | 2,533 | | 3 months ago | | | 18 | mit | | MNBVC (Massive Never-ending BT Vast Chinese corpus): an ultra-large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures, and even "Martian script" (火星文) text. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, song lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more. |
Information Extraction Chinese | 2,086 | | a year ago | | | 118 | | Python | Chinese Named Entity Recognition with IDCNN/biLSTM+CRF, and Relation Extraction with biGRU+2ATT |
Chinesenlp | 1,329 | | 3 years ago | | | 3 | | HTML | Datasets and SOTA results for every field of Chinese NLP |
Jcseg | 886 | 33, 12 | 7 months ago | 13 | January 09, 2023 | 6 | apache-2.0 | Java | Jcseg is a lightweight NLP framework developed in Java. It provides CJK and English segmentation based on the MMSEG algorithm, along with keyword extraction, key sentence extraction, and summary extraction based on the TextRank algorithm. Jcseg has a built-in HTTP server and search modules for Lucene, Solr, Elasticsearch, and OpenSearch. |
Chinese_models_for_spacy | 498 | | 4 years ago | | | 8 | mit | Jupyter Notebook | Models for spaCy that support Chinese |