Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language | Description |
---|---|---|---|---|---|---|---|---|---|---|---|
Fakenewscorpus | 184 | | | | 4 years ago | | | 2 | apache-2.0 | | A dataset of millions of news articles scraped from a curated list of data sources. |
Open Discourse | 64 | | | | a year ago | | | 14 | mit | Python | Open Discourse is the first fully comprehensive corpus of the plenary proceedings of the federal German Parliament (Bundestag). |
Corpus Db | 38 | | | | 4 years ago | | | 18 | gpl-3.0 | Jupyter Notebook | A textual corpus database for the digital humanities. |
Polyglotdb | 24 | | | | 2 years ago | 36 | July 28, 2017 | 27 | mit | Python | Language data store and linguistic query API. |
Namu_wiki_db_preprocess | 22 | | | | 7 years ago | | | | apache-2.0 | Jupyter Notebook | A Python script to convert the Namu Wiki database into a large Korean-language corpus. |
Hn Data Dumps | 18 | | | | a year ago | | | | mit | Python | |
Workbench | 17 | | | | 7 years ago | | | 11 | mit | Java | Java and Lucene based tools for BitFunnel corpus preparation. |
Glossa | 16 | | | | 8 years ago | | | 4 | mit | JavaScript | Ruby on Rails application that uses the Rails version of the Glossa system for corpus search and results management (https://github.com/textlab/rglossa). Includes a Dockerfile for constructing a Docker image containing the application (see https://docker.com). |
Dact | 13 | | | | 3 years ago | | | 13 | lgpl-2.1 | C++ | Decaffeinated Alpino Corpus Tool. |
Coquery | 13 | | | | 2 years ago | | | 27 | gpl-3.0 | Python | Coquery is a free corpus query tool for linguists, lexicographers, translators, and anybody who wishes to search and analyse a text corpus. |