| several27/FakeNewsCorpus | 184 | | 0 | 0 | over 6 years ago | 0 | | 2 | apache-2.0 | | A dataset of millions of news articles scraped from a curated list of data sources. |
| open-discourse/open-discourse | 64 | | 0 | 0 | over 3 years ago | 0 | | 14 | mit | Python | Open Discourse is the first fully comprehensive corpus of the plenary proceedings of the federal German Parliament (Bundestag). |
| JonathanReeve/corpus-db | 38 | | 0 | 0 | over 6 years ago | 0 | | 18 | gpl-3.0 | Jupyter Notebook | A textual corpus database for the digital humanities. |
| MontrealCorpusTools/PolyglotDB | 24 | | 0 | 0 | almost 4 years ago | 36 | July 28, 2017 | 27 | mit | Python | Language data store and linguistic query API |
| insikk/namu_wiki_db_preprocess | 22 | | 0 | 0 | about 9 years ago | 0 | | 0 | apache-2.0 | Jupyter Notebook | A Python script to convert the Namu Wiki database into a huge Korean-language corpus |
| ashish01/hn-data-dumps | 18 | | 0 | 0 | over 3 years ago | 0 | | 0 | mit | Python | |
| BitFunnel/Workbench | 17 | | 0 | 0 | over 9 years ago | 0 | | 11 | mit | Java | Java and Lucene based tools for BitFunnel corpus preparation |
| textlab/glossa | 16 | | 0 | 0 | over 10 years ago | 0 | | 4 | mit | JavaScript | Ruby on Rails application that uses the Rails version of the Glossa system for corpus search and results management (https://github.com/textlab/rglossa). Includes a Dockerfile for constructing a Docker image containing the application (see https://docker.com). |
| gkunter/coquery | 13 | | 0 | 0 | almost 4 years ago | 0 | | 27 | gpl-3.0 | Python | Coquery is a free corpus query tool for linguists, lexicographers, translators, and anybody who wishes to search and analyse a text corpus. |
| rug-compling/dact | 13 | | 0 | 0 | almost 5 years ago | 0 | | 13 | lgpl-2.1 | C++ | Decaffeinated Alpino Corpus Tool |