Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Polyglot | 2,212 | 65 | 28 | 5 months ago | 9 | December 15, 2021 | 166 | other | Python | |
Multilingual text (NLP) processing toolkit | ||||||||||
Elmoformanylangs | 1,325 | 1 | 1 | 4 years ago | 4 | October 15, 2020 | null | mit | Python | |
Pre-trained ELMo Representations for Many Languages | ||||||||||
Contextualized Topic Models | 1,141 | 4 | 3 months ago | 30 | November 03, 2022 | 10 | mit | Python | ||
A Python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021. | ||||||||||
Bpemb | 1,068 | 15 | 86 | 2 years ago | 13 | September 23, 2022 | 4 | mit | Python | |
Pre-trained subword embeddings in 275 languages, based on Byte-Pair Encoding (BPE) | ||||||||||
Wit | 896 | | | | 5 months ago | | | 3 | other | |
WIT (Wikipedia-based Image Text) Dataset is a large multimodal multilingual dataset comprising 37M+ image-text sets with 11M+ unique images across 100+ languages. | ||||||||||
Detoxify | 774 | 10 | 4 months ago | 11 | December 19, 2022 | 41 | apache-2.0 | Python | ||
Trained models & code to predict toxic comments on all 3 Jigsaw Toxic Comment Challenges. Built using ⚡ PyTorch Lightning and 🤗 Transformers. For access to our API, please email us at [email protected]. | ||||||||||
Trankit | 693 | 2 | 5 months ago | 20 | March 26, 2022 | 24 | apache-2.0 | Python | ||
Trankit is a Light-Weight Transformer-based Python Toolkit for Multilingual Natural Language Processing | ||||||||||
Beto | 462 | | | | 8 months ago | | | 6 | cc-by-4.0 | |
BETO - Spanish version of the BERT model | ||||||||||
Autocorrect | 376 | 18 | 30 | 9 months ago | 27 | December 04, 2021 | 7 | lgpl-3.0 | Python | |
Spelling corrector in Python | ||||||||||
Text2text | 268 | | | | 3 months ago | 134 | October 21, 2023 | 27 | other | Python |
Text2Text: Crosslingual NLP/G toolkit |