Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Nlp_chinese_corpus | 8,344 | | | | a year ago | | | 20 | mit | |
Large-scale Chinese corpus for natural language processing (Large Scale Chinese Corpus for NLP) ||||||||||
Craft | 58 | | | | 2 years ago | | | 1 | other | Clojure |
Abacus | 42 | | | | 6 years ago | | May 24, 2021 | | | Go |
Counter data structure for Go using a Count-Min Sketch with a fixed amount of memory ||||||||||
Word2vec Chinese | 34 | | | | 4 years ago | | | 1 | | Python |
A tutorial for training Chinese word2vec embeddings on the Wikipedia corpus ||||||||||
Namu_wiki_db_preprocess | 22 | | | | 7 years ago | | | | apache-2.0 | Jupyter Notebook |
A Python script to convert the Namuwiki database into a large Korean-language corpus ||||||||||
Chinese Article Classification Based On Own Corpus Via Textcnn And Gbdt | 16 | | | | 6 years ago | | | 1 | | Python |
Chinese text classification, covering basic corpus processing and handling of the Wiki_zh corpus ||||||||||
Spanishtransformerxl | 12 | | | | 4 years ago | | | | | Jupyter Notebook |
Spanish language model trained on a wiki corpus (500M tokens) with fastai v1; accuracy > 42.3%, vocabulary size 60K ||||||||||
Opiec | 12 | | | | 5 years ago | | | | gpl-3.0 | Java |
Code for reading data from OPIEC, an Open Information Extraction corpus ||||||||||
Wiki Dump Reader | 10 | 1 | | | 5 years ago | 4 | February 01, 2019 | 2 | mit | Python |
Extract corpora from Wikipedia dumps ||||||||||
Wiki_zh_vec | 7 | | | | 7 years ago | | | | apache-2.0 | Python |
A Python tool for training word embeddings from the Chinese Wikipedia corpus using word2vec, GloVe, and LexVec ||||||||||
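Abacus in the table above counts items with a Count-Min Sketch, which trades exactness for a fixed memory footprint. A minimal Python sketch of the same idea follows; it is not Abacus's Go API, and the class name and default sizes are illustrative:

```python
import hashlib


class CountMinSketch:
    """Approximate frequency counter in fixed memory: depth rows x width counters."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _indexes(self, item):
        # One hash per row, derived by prefixing the input with the row number.
        for row in range(self.depth):
            digest = hashlib.blake2b(f"{row}:{item}".encode(), digest_size=8).digest()
            yield row, int.from_bytes(digest, "big") % self.width

    def add(self, item, count=1):
        for row, idx in self._indexes(item):
            self.table[row][idx] += count

    def estimate(self, item):
        # Minimum across rows: never underestimates, may overestimate on collisions.
        return min(self.table[row][idx] for row, idx in self._indexes(item))
```

Memory use is `width * depth` counters regardless of how many distinct items are added, which is why such counters suit large corpora.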
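Several projects in the table extract text from Wikipedia XML dumps. A stdlib-only sketch of the streaming approach follows; this is not the Wiki Dump Reader package's API, the function name is illustrative, and the namespace URL is an assumption that varies with the dump's export-schema version:

```python
import bz2
import xml.etree.ElementTree as ET

# Namespace of the MediaWiki export schema; the version suffix varies by dump.
NS = "{http://www.mediawiki.org/xml/export-0.10/}"


def iter_pages(path):
    """Stream (title, wikitext) pairs from a MediaWiki XML dump
    without loading the whole file into memory."""
    opener = bz2.open if path.endswith(".bz2") else open
    with opener(path, "rb") as f:
        for _, elem in ET.iterparse(f):
            if elem.tag == f"{NS}page":
                title = elem.findtext(f"{NS}title")
                text = elem.findtext(f"{NS}revision/{NS}text") or ""
                yield title, text
                elem.clear()  # release the finished page subtree
```

Because `iterparse` processes one `<page>` element at a time and clears it afterwards, even multi-gigabyte dumps can be walked with constant memory.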