Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Tokenizer | 5,084 | 42,659 | 11 | 5 months ago | 8 | November 20, 2023 | other | PHP | ||
A small library for converting tokenized PHP source code into XML (and potentially other formats) | ||||||||||
Html5gum | 140 | 2 | 4 months ago | 14 | July 26, 2023 | 13 | mit | Rust | ||
A WHATWG-compliant HTML5 tokenizer and tag soup parser | ||||||||||
Somajo | 128 | 2 | 6 | 4 months ago | 55 | September 23, 2023 | 3 | gpl-3.0 | Python | |
A tokenizer and sentence splitter for German and English web and social media texts. | ||||||||||
Splitstream | 41 | 1 | 2 | a year ago | 9 | October 05, 2022 | 3 | apache-2.0 | C | |
Continuous object splitter for C and Python | ||||||||||
Ciseau | 12 | 2 | 2 | 6 years ago | 27 | December 29, 2016 | 1 | mit | Python | |
🚀 Tokenize and clean strings in Python | ||||||||||
Xml | 8 | 3 years ago | mit | C | ||||||
🔋 In-place lightweight XML parser | ||||||||||
Llt Tokenizer | 7 | 9 years ago | 14 | mit | Ruby | |||||
Tokenizes Latin (and Greek) texts |