Awesome Open Source
Search results for text processing
341 search results found
Command Line Text Processing
⭐
10,001
⚡ From finding text to search and replace, from sorting to beautifying text and more 🎨
Diff Match Patch
⭐
6,749
Diff Match Patch is a high-performance library in multiple languages that manipulates plain text.
Introduction_to_ml_with_python
⭐
6,626
Notebooks and code for the book "Introduction to Machine Learning with Python"
Sd
⭐
4,988
Intuitive find & replace CLI (sed alternative)
Pymupdf
⭐
3,908
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
Fastnlp
⭐
2,940
fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.
Pyparsing
⭐
2,033
Python library for creating PEG parsers
Text_classification
⭐
1,621
Text Classification Algorithms: A Survey
Frangipanni
⭐
1,190
Program to convert lines of text into a tree structure.
Hazm
⭐
1,112
Persian NLP Toolkit
Lingua Go
⭐
1,064
The most accurate natural language detection library for Go, suitable for short text and mixed-language text
Aho Corasick
⭐
865
A fast implementation of Aho-Corasick in Rust.
Hck
⭐
665
A sharp cut(1) clone.
Python Nameparser
⭐
605
A simple Python module for parsing human names into their individual components
Ekphrasis
⭐
583
Ekphrasis is a text processing tool, geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenization, word normalization, word segmentation (for splitting hashtags) and spell correction, using word statistics from 2 big corpora (English Wikipedia, Twitter - 330mil English tweets).
Whatlanggo
⭐
580
Natural language detection library for Go
Open Korean Text
⭐
552
Open Korean Text Processor - An Open-source Korean Text Processor
Nucleo
⭐
544
A fast and convenient fuzzy matcher library for Rust
Variational Text Tensorflow
⭐
537
TensorFlow implementation of Neural Variational Inference for Text Processing
Python_basics
⭐
496
🐍 Syntax, working with Shell commands, Files, Text Processing, and more...
Pynlpl
⭐
466
PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build simple language models. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP spec
Bsed
⭐
416
Simple SQL-like syntax on top of Perl text processing.
Pyarabic
⭐
407
pyarabic
Text Dedup
⭐
399
All-in-one text de-duplication
Regex Automata
⭐
349
A low level regular expression library that uses deterministic finite automata.
Pykospacing
⭐
348
Automatic Korean word spacing with Python
Wetextprocessing
⭐
338
Text Normalization & Inverse Text Normalization
Zhon
⭐
329
Constants used in Chinese text processing
Artificial Adversary
⭐
317
🗣️ Tool to generate adversarial text examples and test machine learning models against them
Stringi
⭐
292
Fast and portable character string processing in R (with the Unicode ICU)
Textpipe
⭐
290
Textpipe: clean and extract metadata from text
Jaconv
⭐
254
Pure-Python Japanese character interconverter for Hiragana, Katakana, Hankaku, and Zenkaku
Rust Unic
⭐
226
UNIC: Unicode and Internationalization Crates for Rust
Rosetta
⭐
206
Tools, wrappers, etc... for data science with a concentration on text processing
Unix4j
⭐
204
An implementation of Unix command line tools in Java.
Konoha
⭐
200
🌿 An easy-to-use Japanese Text Processing tool, which makes it possible to switch tokenizers with small changes of code.
Tmtoolkit
⭐
191
Text Mining and Topic Modeling Toolkit for Python with parallel processing power
Textvec
⭐
190
Text vectorization tool to outperform TFIDF for classification tasks
Daachorse
⭐
186
🐎 A fast implementation of the Aho-Corasick algorithm using the compact double-array data structure in Rust.
Matcheroni
⭐
182
A minimalist single-header library for building pattern-matchers, lexers, and parsers.
Text Detector
⭐
176
Tool which allows you to detect and translate text.
Libasciidoc
⭐
173
A Golang library for processing Asciidoc files.
Awesome Awk
⭐
170
A curated list of awesome AWK frameworks, libraries, software and resources
Pyopenjtalk
⭐
165
Python wrapper for OpenJTalk
Emoji Images
⭐
163
replace stuff like ❤️ with <img> tags of corresponding images per: http://www.emoji-cheat-sheet.com/
Dan Jurafsky Chris Manning Nlp
⭐
155
My solution to the Natural Language Processing course made by Dan Jurafsky, Chris Manning in Winter 2012.
Textrecipes
⭐
154
Extra recipes for Text Processing
Browsecloud
⭐
150
A web app to create and browse text visualizations for automated customer listening.
Japanese.js
⭐
144
Util collection for Japanese text processing. Hiraganize, Katakanize, and Romanize.
Stanza Old
⭐
141
Stanford NLP group's shared Python tools.
Xioc
⭐
140
Extract indicators of compromise from text, including "escaped" ones.
Bpl
⭐
137
Binary Processing Language
Nlpre
⭐
135
Python library for Natural Language Preprocessing (NLPre)
Padatious
⭐
132
A neural network intent parser
Support Tickets Classification
⭐
128
This case study shows how to create a model for text analysis and classification and deploy it as a web service in Azure cloud in order to automatically classify support tickets. This project is a proof of concept made by Microsoft (Commercial Software Engineering team) in collaboration with Endava http://endava.com/en
Vi Rs
⭐
125
Vietnamese Input Method library
Word_cloud_fa
⭐
124
A wrapper for wordcloud module for creating Persian word clouds.
Cli_text_processing_coreutils
⭐
123
Command line text processing with GNU Coreutils
Virastar
⭐
123
Cleaning-up Persian Texts!
Colibri Core
⭐
122
Colibri core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e. patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool ``colibri-patternmodeller`` which allows you to build, view, manipulate and query pattern models.
Textrude
⭐
113
Code generation from YAML/JSON/CSV models via SCRIBAN templates
Cogcomp Nlpy
⭐
108
CogComp's light-weight Python NLP annotators
Bumblebee
⭐
106
Abstract text processing and pattern matching engine in Swift. Converts text into NSAttributedStrings. Builtin markdown support.
Vim Stream
⭐
106
vims - use vim like sed
Prenlp
⭐
105
Preprocessing Library for Natural Language Processing
Learn_ruby_oneliners
⭐
101
Example based guide for text processing with ruby from the command line
Teanaps
⭐
92
An open-source Python library for natural language processing and text analysis.
Nostril
⭐
91
Nostril: Nonsense String Evaluator
Mtp
⭐
88
Multi-lingual Text Processing
Qp Trie Rs
⭐
86
An idiomatic and fast QP-trie implementation in pure Rust.
Goarabic
⭐
85
A Go Lang package for dealing with Arabic text.
Ios11 Visionframework
⭐
84
Vision Framework IOS WWDC 2017
Learn_perl_oneliners
⭐
83
Example based guide for text processing with Perl from the command line
Kefirbb
⭐
82
A flexible Java text processor. BB, BBCode, BB-code, HTML, Textile, Markdown, parser, translator, converter.
Tokenizers
⭐
82
Elixir bindings for 🤗 Tokenizers
Node Rake
⭐
79
A NodeJS implementation of the Rapid Automatic Keyword Extraction algorithm.
Unix Text Commands
⭐
78
Unix Text Processing Command Reference
Nlp
⭐
77
Free hands-on course with the implementation (in Python) and description of several Natural Language Processing (NLP) algorithms and techniques, on several modern platforms and libraries.
Soyspacing
⭐
76
A library for correcting word-spacing errors. It corrects spacing with an intuitive approach rather than a machine learning algorithm such as CRF.
Sliceslice Rs
⭐
75
A fast implementation of single-pattern substring search using SIMD acceleration.
Stripansi
⭐
75
A little Go package for removing ANSI color escape codes from strings.
Cso Classifier
⭐
74
Python library that classifies content from scientific papers with the topics of the Computer Science Ontology (CSO).
Go Search Replace
⭐
73
🚀 Search & replace URLs in WordPress SQL files.
Frog
⭐
73
Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl, the Tilburg memory-based learning software package.
Lingua Franca
⭐
72
Mycroft's multilingual text parsing and formatting library
Srch
⭐
72
Text search for humans
Summarization
⭐
70
A sequence to sequence model for abstractive text summarization
Talkwithyourfiles
⭐
70
An LLM GUI application; enables you to interact with your files, offering dynamic parameters that can modify response behavior during runtime.
Text.jl
⭐
68
Numerous tools for text processing
Pipeit
⭐
67
PipeIt is a text transformation, conversion, cleansing and extraction tool.
Perke
⭐
67
A keyphrase extractor for Persian
Hands On Python Natural Language Processing
⭐
65
Data Science
⭐
64
Data science tooling for Racket
Textcluster
⭐
60
Short text clustering preprocessing module
Daisydiff
⭐
59
Visual 💮 comparison of HTML in ☕ Java
Bytelines
⭐
59
Read input lines as byte slices for high efficiency
Wiki Table Scrape
⭐
59
Scrape tables from Wikipedia articles into CSVs
Tfkit
⭐
54
🤖📇 handling multiple NLP tasks in one pipeline
Fuzzychinese
⭐
52
A small package to fuzzy match Chinese words
Python Gatenlp
⭐
51
Python text processing, pattern matching, and NLP framework
1-100 of 341 search results
Copyright 2018-2024 Awesome Open Source. All rights reserved.