Awesome Open Source
Search results for c plus plus tokenizer
46 search results found
Sentencepiece
⭐
7,465
Unsupervised text tokenizer for Neural Network-based text generation.
Blingfire
⭐
1,685
A lightning fast Finite State machine and REgular expression manipulation library.
Text
⭐
1,093
Making text a first-class citizen in TensorFlow.
Autophrase
⭐
978
AutoPhrase: Automated Phrase Mining from Massive Text Corpora
Jumanpp
⭐
334
Juman++ (a Morphological Analyzer Toolkit)
Simple
⭐
315
A SQLite FTS5 full-text search extension (tokenizer) that supports Chinese and Pinyin
Coccoc Tokenizer
⭐
295
High-performance tokenizer for the Vietnamese language
Fugashi
⭐
268
A Cython MeCab wrapper for fast, pythonic Japanese tokenization and morphological analysis.
Tokenizer
⭐
207
Fast and customizable text tokenization library with BPE and SentencePiece support
Udpipe
⭐
198
R package for Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing Based on the UDPipe Natural Language Processing Toolkit
Lex
⭐
142
Replaced by foonathan/lexy
Strtk
⭐
112
C++ String Toolkit Library
Hunspell
⭐
98
High-Performance Stemmer, Tokenizer, and Spell Checker for R
Simhash Cpp
⭐
94
Simhashing in C++
Alm
⭐
47
Smart Language Model
Android Sqlite Fts5 Tokenizer
⭐
33
SQLite3 source code with an integrated FTS5 Chinese tokenizer
Python Mecab
⭐
27
A repository that binds MeCab for Python 3.5+, using neither SWIG nor pybind. (No longer maintained.)
Sentencepiece Jni
⭐
27
Java JNI wrapper for SentencePiece: unsupervised text tokenizer for Neural Network-based text generation.
Trainable Tokenizer
⭐
22
Fast and trainable tokenizer for natural languages relying on maximum entropy methods.
Tokenizer
⭐
22
Convert source code into numerical tokens
Cppassist
⭐
21
C++ sanctuary for small but powerful and frequently required standalone features.
Mecab Ios
⭐
20
MeCab Framework for iOS 10.3 - 12.x (Japanese Parser & Tokenizer)
Mini Json Parser
⭐
19
A Tiny Json Parser
Tokenizer
⭐
18
Boost.org tokenizer module
Sphinx Jieba
⭐
18
Sphinx search engine with the Jieba tokenizer
Tivars_lib_cpp
⭐
13
A C++ library to interact with TI-z80 (82/83/84 series) calculator files (programs, lists, matrices, etc.)
Fast Mosestokenizer
⭐
12
C++ mosestokenizer
Pog
⭐
10
C++ library for generating LALR(1) parsers
Sctokenizer
⭐
9
A Source Code Tokenizer
Boosting Tree Tokenizer
⭐
9
Uses a Gradient Boosting Decision Tree (LightGBM) for supervised learning and prediction of word segmentation and morpheme estimation in natural language.
Lolita
⭐
9
An experimental lexer and parser generator
Rtfreader
⭐
7
Text segmenter and tokeniser for Danish, English and other languages. Reads an RTF or flat text file and outputs the text, one line per sentence & optionally tokenized.
Jsmnpp
⭐
6
jsmn++ is a tiny json parser embedded in your C++ project for configuration.
Arduino Stringtokenizer Library
⭐
6
A very simple Arduino library that provides Java-like StringTokenizer functions for splitting a string on delimiters.
Cjk Tokenizer
⭐
5
Mvpl
⭐
4
The minimum viable programming language
Markargs
⭐
3
Tklgen
⭐
3
Ccpy
⭐
2
Chtholly's Compiler of PYthon
Compiler
⭐
2
Trying to make a little compiler, just for fun and learning.
String Comments Extract
⭐
2
Competitive Coding
⭐
2
Datalog Compiler
⭐
2
(2021) A compiler for Datalog code using finite state automata by Dallin Stewart
Bdcashprotocol Bdeco
⭐
2
Source code for the BDCashProtocol Ecosystem
Cjk Tokenzier
⭐
2
A unigram CJK tokenizer
Afparser Library
⭐
2
The AFP Library is a collection of C++11 header files that provides users with a flexible rapid prototyping tool to create general-purpose LL(k) parsers in C++.
Copyright 2018-2023 Awesome Open Source. All rights reserved.