Awesome Open Source
The Top 10 Corpus Open Source Projects
Open source projects categorized as Corpus
nltk/nltk
⭐ 12,699
NLTK Source
dependent packages: 0
total releases: 0
most recent commit: over 2 years ago
brightmart/nlp_chinese_corpus
⭐ 8,344
Large Scale Chinese Corpus for NLP
dependent packages: 0
total releases: 0
most recent commit: about 3 years ago
nl8590687/ASRT_SpeechRecognition
⭐ 7,253
A Deep-Learning-Based Chinese Speech Recognition System
dependent packages: 0
total releases: 0
most recent commit: over 2 years ago
stanfordnlp/GloVe
⭐ 6,480
Software in C and data files for the popular GloVe model for distributed word representations, a.k.a. word vectors or embeddings
dependent packages: 0
total releases: 0
most recent commit: over 2 years ago
codertimo/BERT-pytorch
⭐ 5,605
Google AI 2018 BERT pytorch implementation
dependent packages: 0
total releases: 0
most recent commit: almost 3 years ago
ibab/tensorflow-wavenet
⭐ 5,362
A TensorFlow implementation of DeepMind's WaveNet paper
dependent packages: 0
total releases: 0
most recent commit: almost 3 years ago
niderhoff/nlp-datasets
⭐ 5,235
Alphabetical list of free/public domain datasets with text data for use in Natural Language Processing (NLP)
dependent packages: 0
total releases: 0
most recent commit: over 3 years ago
vespa-engine/vespa
⭐ 5,115
AI + Data, online. https://vespa.ai
dependent packages: 0
total releases: 0
most recent commit: over 2 years ago
shibing624/pycorrector
⭐ 4,928
pycorrector is a toolkit for text error correction. It applies models such as KenLM, T5, MacBERT, ChatGLM3, and LLaMA to error-correction scenarios and works out of the box.
dependent packages: 0
total releases: 0
most recent commit: over 2 years ago
dariusk/corpora
⭐ 4,757
A collection of small corpuses of interesting data for the creation of bots and similar stuff.
dependent packages: 0
total releases: 0
most recent commit: over 2 years ago
Copyright 2018-2026 Awesome Open Source. All rights reserved.