Project Name | Stars | Downloads / Repos Using This / Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language | Description
---|---|---|---|---|---|---|---|---|---
Kagome | 769 | 27 | 4 months ago | 74 | September 27, 2023 | 4 | mit | Go | Self-contained Japanese Morphological Analyzer written in pure Go
Bert Japanese | 415 | | 3 years ago | | | | apache-2.0 | Jupyter Notebook | BERT with SentencePiece for Japanese text.
Nagisa | 365 | 1, 7 | 4 months ago | 22 | July 30, 2023 | 4 | mit | Python | A Japanese tokenizer based on recurrent neural networks
Fugashi | 339 | 39 | 4 months ago | 67 | August 25, 2023 | 5 | mit | C++ | A Cython MeCab wrapper for fast, pythonic Japanese tokenization and morphological analysis.
Jumanpp | 334 | | a year ago | | | 30 | apache-2.0 | C++ | Juman++ (a Morphological Analyzer Toolkit)
Sudachipy | 318 | | 2 years ago | | | 18 | apache-2.0 | Python | Python version of Sudachi, a Japanese tokenizer.
Vibrato | 275 | 1 | 4 months ago | 11 | May 12, 2023 | 3 | apache-2.0 | Rust | 🎤 vibrato: Viterbi-based accelerated tokenizer
Vaporetto | 206 | 3 | 6 months ago | 16 | April 01, 2023 | | apache-2.0 | Rust | 🛥 Vaporetto: Very accelerated pointwise prediction based tokenizer
Konoha | 200 | 1 | 4 months ago | 10 | August 03, 2022 | | mit | Python | 🌿 An easy-to-use Japanese Text Processing tool, which makes it possible to switch tokenizers with small changes of code.
Toiro | 110 | | 9 months ago | 8 | July 31, 2023 | 1 | apache-2.0 | Python | A comparison tool of Japanese tokenizers