Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Mustard | 686 | | | | 6 years ago | | | | mit | Swift |
🌭 Mustard is a Swift library for tokenizing strings when splitting by whitespace doesn't cut it. | ||||||||||
Spyglass | 378 | 10 | 1 | 9 months ago | 18 | August 14, 2023 | 13 | apache-2.0 | Java | |
A library for mentions on Android | ||||||||||
Tokenizers | 170 | 28 | 21 | a year ago | 9 | December 22, 2022 | other | R | ||
Fast, Consistent Tokenization of Natural Language Text | ||||||||||
Kr Bert | 91 | 3 years ago | 1 | Python | ||||||
KoRean based BERT pre-trained models (KR-BERT) for TensorFlow and PyTorch | ||||||||||
Sqlitesubstringsearch | 76 | | | | 8 years ago | | | | | C |
An open source tokenizer which supports fast substring search with SQLite FTS (full-text search) | ||||||||||
Lex | 55 | 2 | 1 | 6 months ago | 1 | April 26, 2015 | 1 | mit | Ruby | |
Lex is an implementation of the lex tool in Ruby. | ||||||||||
Koreancharacterbert | 17 | | | | 3 years ago | | | | | Python |
Korean BERT model using character tokenizer | ||||||||||
Rftokenizer | 17 | | | | 3 years ago | 4 | August 19, 2021 | | other | Lex |
A character-wise tokenizer for morphologically rich languages | ||||||||||
Zhtml | 11 | a year ago | 4 | mit | Zig | |||||
HTML parser built in Zig | ||||||||||
Parsinghelper | 9 | 4 | 10 months ago | 37 | December 12, 2021 | mit | C# | |||
.NET text parsing helper class. |