Project Name	Stars	Repos Using This	Packages Using This	Most Recent Commit	Total Releases	Latest Release	Open Issues	License	Language
D2l Zh	56,684	1	1	a month ago	51	August 18, 2023	65	apache-2.0	Python
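Dive into Deep Learning (《动手学深度学习》): written for Chinese readers, runnable, and open to discussion. The Chinese and English editions are used for teaching at more than 500 universities in over 70 countries.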
Chinese Bert Wwm	8,600			9 months ago			3	apache-2.0	Python
Pre-Training with Whole Word Masking for Chinese BERT (the Chinese BERT-wwm model series)
Qwen	8,482			3 months ago			139	apache-2.0	Python
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
Nlp_chinese_corpus	8,344			a year ago			20	mit
Large Scale Chinese Corpus for NLP
Text_classification	7,628			7 months ago			45	mit	Python
all kinds of text classification models and more with deep learning
Gpt2 Chinese	7,249			4 months ago			105	mit	Python
Chinese version of GPT2 training code, using BERT tokenizer.
Ansj_seg	6,390	402	17	5 months ago	10	February 15, 2018	50	apache-2.0	Java
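Ansj word segmentation: a true Java implementation of ict. Its segmentation accuracy and speed both exceed those of the open-source version of ict. Chinese word segmentation, person-name recognition, part-of-speech tagging, user-defined dictionaries.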
Baichuan 7b	5,493			7 months ago			80	apache-2.0	Python
A large-scale 7B pretraining language model developed by BaiChuan-Inc.
Awesome Chinese Llm	5,477			3 months ago
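A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and are cheaper to train, including base models, domain-specific fine-tuning and applications, datasets, and tutorials.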
Huatuo Llama Med Chinese	3,776			6 months ago			14	apache-2.0	Python
Repo for BenTsao [original name: HuaTuo (华驼)], instruction-tuning large language models with Chinese medical knowledge.

Alternatives To Thuctc

D2l Zh ⭐ 56,684

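Dive into Deep Learning (《动手学深度学习》): written for Chinese readers, runnable, and open to discussion. The Chinese and English editions are used for teaching at more than 500 universities in over 70 countries.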

dependent packages 1; total releases 51; most recent commit a month ago

Chinese Bert Wwm ⭐ 8,600

Pre-Training with Whole Word Masking for Chinese BERT (the Chinese BERT-wwm model series)

most recent commit 9 months ago

Qwen ⭐ 8,482

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

most recent commit 3 months ago

Nlp_chinese_corpus ⭐ 8,344

Large Scale Chinese Corpus for NLP

most recent commit a year ago

Text_classification ⭐ 7,628

all kinds of text classification models and more with deep learning

most recent commit 7 months ago

Gpt2 Chinese ⭐ 7,249

Chinese version of GPT2 training code, using BERT tokenizer.

most recent commit 4 months ago

Ansj_seg ⭐ 6,390

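Ansj word segmentation: a true Java implementation of ict. Its segmentation accuracy and speed both exceed those of the open-source version of ict. Chinese word segmentation, person-name recognition, part-of-speech tagging, user-defined dictionaries.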

dependent packages 17; total releases 10; most recent commit 5 months ago

Baichuan 7b ⭐ 5,493

A large-scale 7B pretraining language model developed by BaiChuan-Inc.

most recent commit 7 months ago

Awesome Chinese Llm ⭐ 5,477

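A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and are cheaper to train, including base models, domain-specific fine…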

most recent commit 3 months ago

Huatuo Llama Med Chinese ⭐ 3,776

Repo for BenTsao [original name: HuaTuo (华驼)], instruction-tuning large language models with Chinese medical knowledge.

most recent commit 6 months ago

Popular Chinese Projects

Iptv ⭐ 74,798

Collection of publicly available IPTV channels from all over the world

most recent commit 3 months ago

Howtocook ⭐ 57,819

Programmer's guide about how to cook at home (Chinese only).

dependent packages 1; total releases 4; latest release July 16, 2022; most recent commit 3 months ago

Element ⭐ 53,857

A Vue.js 2.0 UI Toolkit for Web

dependent packages 4; total releases 7; latest release September 22, 2020; most recent commit 3 months ago

Chinese Poetry ⭐ 45,313

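The most comprehensive database of Chinese poetry 🧶: nearly 14,000 poets from the Tang and Song dynasties, close to 55,000 Tang poems plus 260,000 Song poems, and 21,050 ci poems by 1,564 ci poets of the Northern and Southern Song periods.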

most recent commit 5 months ago

Paddleocr ⭐ 36,076

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)

dependent packages 30; total releases 40; latest release September 15, 2023; most recent commit 3 months ago

Popular Natural Language Processing Projects

Transformers ⭐ 124,049

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

dependent packages 2,484; total releases 125; latest release November 15, 2023; most recent commit 16 days ago

Ailearning ⭐ 37,934

AiLearning: data analysis + machine learning in practice + linear algebra + PyTorch + NLTK + TF2

dependent packages 2; total releases 8; latest release March 20, 2022; most recent commit 2 months ago

Bert ⭐ 36,099

TensorFlow code and pre-trained models for BERT

dependent packages 10; total releases 5; latest release August 11, 2020; most recent commit 6 months ago

Made With Ml ⭐ 35,496

Learn how to design, develop, deploy and iterate on production-grade ML applications.

total releases 5; latest release May 15, 2019; most recent commit 5 months ago

Hanlp ⭐ 32,059

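Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyword and phrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, natural language processing.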

dependent packages 22; total releases 43; latest release February 25, 2023; most recent commit a month ago
