Awesome Open Source
Search results for rust tokenizer
27 search results found
Tokenizers
⭐
8,056
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
Lindera
⭐
326
A morphological analysis library.
Vibrato
⭐
275
🎤 vibrato: Viterbi-based accelerated tokenizer
Rust Tokenizers
⭐
232
Rust-tokenizer offers high-performance tokenizers for modern language models, including WordPiece, Byte-Pair Encoding (BPE) and Unigram (SentencePiece) models
Vaporetto
⭐
206
🛥 Vaporetto: Very accelerated pointwise prediction based tokenizer
Tiktoken Rs
⭐
163
Ready-made tokenizer library for working with GPT and tiktoken
Html5gum
⭐
140
A WHATWG-compliant HTML5 tokenizer and tag soup parser
Rustfst
⭐
134
Rust re-implementation of OpenFST - library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs). A Python binding is also available.
Cang Jie
⭐
65
Chinese tokenizer for tantivy, based on jieba-rs
Lindera Tantivy
⭐
43
Lindera tokenizer for Tantivy.
Parser Tutorial
⭐
35
Write your own JSON parser.
Sqlpop
⭐
32
SQL parser (as understood by SQLite)
Python Vibrato
⭐
25
Viterbi-based accelerated tokenizer (Python wrapper)
Maeel
⭐
21
The maeel programming language
Nlpo3
⭐
21
Thai Natural Language Processing library in Rust, with Python and Node bindings.
Python Vaporetto
⭐
17
🛥 Vaporetto is a fast and lightweight pointwise prediction based tokenizer. This is a Python wrapper for Vaporetto.
Simplecss
⭐
14
A simple CSS 2.1 parser and selector
Xxcalc Rs
⭐
13
Embeddable or standalone robust floating-point polynomial calculator written in Rust.
Sentencepiece
⭐
12
Rust binding for the sentencepiece library
Bytepiece Rs
⭐
12
The Bytepiece Tokenizer Implemented in Rust.
Limonite
⭐
10
[WIP] Compiler for the Limonite programming language.
Bytepiece Rs
⭐
9
A purer, higher-compression-ratio tokenizer in Rust
Nlp
⭐
9
NLP library written in Rust
Tokesies
⭐
7
A string tokenizer library for Rust
Blingfire Rs
⭐
7
Rust wrapper for the BlingFire tokenization library
Erl_tokenize
⭐
7
An Erlang source code tokenizer written in Rust.
Parsit
⭐
6
Parser-combinators library.
Copyright 2018-2024 Awesome Open Source. All rights reserved.