Awesome Open Source
Search results for tokenizer text processing
15 search results found
Hazm
⭐
1,107
Persian NLP Toolkit
Ekphrasis
⭐
583
Ekphrasis is a text processing tool geared towards text from social networks such as Twitter or Facebook. It performs tokenization, word normalization, word segmentation (for splitting hashtags), and spell correction, using word statistics from two large corpora (English Wikipedia and a Twitter corpus of 330 million English tweets).
Open Korean Text
⭐
552
Open Korean Text Processor - An Open-source Korean Text Processor
Konoha
⭐
200
🌿 An easy-to-use Japanese text processing tool that makes it possible to switch tokenizers with only small code changes.
Prenlp
⭐
105
Preprocessing Library for Natural Language Processing
Textcluster
⭐
60
Short-text clustering preprocessing module
Tif
⭐
35
Text Interchange Formats
Text Classification Lstms Pytorch
⭐
31
This repository shows a baseline model for text classification, implemented as an LSTM-based model in PyTorch. To give a better understanding of the model, it uses a Tweets dataset provided by Kaggle.
Python Ucto
⭐
29
This is a Python binding to the tokenizer Ucto. Tokenisation is one of the first steps in almost any Natural Language Processing task, yet it is not always as trivial a task as it appears to be. This binding makes the power of the ucto tokeniser available to Python. Ucto itself is a regular-expression-based, extensible, and advanced tokeniser written in C++ (http://ilk.uvt.nl/ucto).
Python Mecab
⭐
27
A repository that binds mecab for Python 3.5+ without using SWIG or pybind. (No longer maintained.)
Pnlp
⭐
25
NLP pre-/post-processing tools.
Nlpo3
⭐
21
Thai Natural Language Processing library in Rust, with Python and Node bindings.
Arabicprocessingcog
⭐
15
A Python package that does stemming, tokenization, sentence breaking, segmentation, normalization, and POS tagging for the Arabic language.
Hunlp
⭐
10
Hungarian NLP tools API
Rmalt
⭐
7
The malt language, implemented by rbnf. https://github.com/malt-project/cmalt
Copyright 2018-2024 Awesome Open Source. All rights reserved.