Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Friso | 449 | | | | 7 months ago | | | 7 | apache-2.0 | C |
High-performance Chinese tokenizer supporting both GBK and UTF-8 charsets, based on the MMSEG algorithm and implemented in ANSI C. Fully modular and easy to embed in other programs such as MySQL, PostgreSQL, and PHP. | | | | | | | | | | |
Microtokenizer | 119 | | 3 | 1 | 3 years ago | 53 | September 28, 2021 | | mit | Python |
A micro Chinese word segmentation engine with comprehensive algorithm coverage (A micro tokenizer for Chinese) | | | | | | | | | | |
Berserker | 16 | | | | 5 years ago | | | 3 | mit | Python |
Berserker - BERt chineSE woRd toKenizER |