Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Trafilatura | 2,447 | 66 | 3 months ago | 39 | November 29, 2023 | 66 | gpl-3.0 | Python | ||
Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments | ||||||||||
Autophrase | 978 | | | | 2 years ago | 3 | November 19, 2020 | 6 | apache-2.0 | C++ |
AutoPhrase: Automated Phrase Mining from Massive Text Corpora | ||||||||||
German Nlp | 360 | | | | 6 months ago | | | | | |
Curated list of open-access/open-source/off-the-shelf resources and tools developed with a particular focus on German | ||||||||||
Khcoder | 295 | | | | 3 months ago | | | 10 | gpl-2.0 | Perl |
KH Coder: for Quantitative Content Analysis or Text Mining | ||||||||||
Malaysian Dataset | 263 | | | | 3 months ago | | | 9 | apache-2.0 | Jupyter Notebook |
We gather Malaysian dataset! https://malaysian-dataset.readthedocs.io/en/latest/ | ||||||||||
Awesome Hungarian Nlp | 192 | | | | 6 months ago | | | 1 | | |
A curated list of NLP resources for Hungarian | ||||||||||
Polminer | 45 | 1 | 2 | 5 months ago | 22 | October 29, 2023 | 45 | HTML | ||
R-package for text mining with the Corpus Workbench (CWB) as backend | ||||||||||
Autophrasex | 38 | | | | 3 years ago | 4 | May 23, 2021 | | apache-2.0 | Python |
Automated Phrase Mining from Massive Text Corpora in Python. | ||||||||||
Gomtch | 26 | | | | 3 years ago | 2 | August 11, 2021 | | bsd-3-clause | Go |
Find text even if it doesn't want to be found | ||||||||||
Josh | 20 | | | | 3 years ago | | | 2 | apache-2.0 | C |
[KDD 2020] Hierarchical Topic Mining via Joint Spherical Tree and Text Embedding |