Corpusmaker

Clojure utilities to build training corpora for machine learning / NLP out of public Wikimedia dumps. Status: stalled. The project will probably be reworked as Cascalog scripts, and the pignlproc project is likely to replace it due to licensing constraints for future integration in Apache projects.
Alternatives To Corpusmaker
Csel Dev
Stars: 37 | Most recent commit: 7 months ago | Open issues: 11 | Language: Python
Corpus Scriptorum Ecclesiasticorum Latinorum: a machine-corrected version of the public domain volumes of the monumental collection of Latin Church Fathers.

Corpusmaker
Stars: 14 | Most recent commit: 13 years ago | Open issues: 2 | License: epl-1.0 | Language: Clojure
Clojure utilities to build training corpora for machine learning / NLP out of public Wikimedia dumps. Status: stalled. The project will probably be reworked as Cascalog scripts, and the pignlproc project is likely to replace it due to licensing constraints for future integration in Apache projects.

Cltk_docker
Stars: 6 | Most recent commit: 5 years ago | License: mit | Language: Python
Docker script for cltk.

Topexapp
Stars: 5 | Most recent commit: a year ago | Open issues: 22 | License: gpl-3.0 | Language: JavaScript
TopExApp is a graphical user interface for the TopEx Python package. TopEx allows the exploration of topics present in a group of text documents by clustering together sentences that relay common ideas or themes.