Awesome Open Source
Search results for natural language processing tokenization
natural-language-processing
tokenization
25 search results found
Spacy
⭐
28,628
💫 Industrial-strength Natural Language Processing (NLP) in Python
Youtokentome
⭐
943
Unsupervised text tokenizer focused on computational efficiency
Trankit
⭐
693
Trankit is a Light-Weight Transformer-based Python Toolkit for Multilingual Natural Language Processing
Spacy Streamlit
⭐
688
👑 spaCy building blocks and visualizers for Streamlit apps
Datacamp Python Data Science Track
⭐
655
All the slides, accompanying code, and exercises are stored in this repo. 🎈
Ekphrasis
⭐
583
Ekphrasis is a text processing tool geared towards text from social networks such as Twitter or Facebook. It performs tokenization, word normalization, word segmentation (for splitting hashtags), and spell correction, using word statistics from two large corpora (English Wikipedia and 330 million English tweets from Twitter).
Php Text Analysis
⭐
484
PHP Text Analysis is a library for performing Information Retrieval (IR) and Natural Language Processing (NLP) tasks using the PHP language
Vibrato
⭐
275
🎤 vibrato: Viterbi-based accelerated tokenizer
Razdel
⭐
226
Rule-based token and sentence segmentation for the Russian language
Tokenizer
⭐
224
Fast and customizable text tokenization library with BPE and SentencePiece support
Vaporetto
⭐
206
🛥 Vaporetto: Very accelerated pointwise prediction based tokenizer
Python_natural_language_processing
⭐
164
This repository is a complete guide to natural language processing (NLP) in Python. It covers various techniques for implementing NLP, including parsing and text processing, and shows how to use NLP for text feature engineering.
Vtext
⭐
110
Simple NLP in Rust with Python bindings
Simplemma
⭐
100
Simple multilingual lemmatizer for Python, especially useful for speed and efficiency
Nlp Cheat Sheet Python
⭐
98
NLP Cheat Sheet: Python, spaCy, LexNLP, NLTK, tokenization, stemming, sentence detection, named entity recognition
Tweebanknlp
⭐
94
[LREC 2022] An off-the-shelf pre-trained Tweet NLP Toolkit (NER, tokenization, lemmatization, POS tagging, dependency parsing) + Tweebank-NER dataset
Lima
⭐
92
The Libre Multilingual Analyzer, a Natural Language Processing (NLP) C++ toolkit.
Wordtokenizers.jl
⭐
63
High performance tokenizers for natural language processing and other related tasks
Nlpcloud Python
⭐
63
NLP Cloud serves high performance pre-trained or custom models for NER, sentiment-analysis, classification, summarization, paraphrasing, intent classification, product description and ad generation, chatbot, grammar and spelling correction, keywords and keyphrases extraction, text generation, image generation, code generation, and more...
Spacy Server
⭐
57
🦜 Containerized HTTP API for industrial-strength NLP via spaCy and sense2vec
Mbti Personality Classifier
⭐
51
A model that uses your social media posts to predict your MBTI personality type.
Tkseem
⭐
49
Arabic Tokenization Library. It provides many tokenization algorithms.
Attacut
⭐
47
A Fast and Accurate Neural Thai Word Segmenter
Wongnai Corpus
⭐
47
Collection of Wongnai's datasets
Nlpcloud Js
⭐
40
NLP Cloud serves high performance pre-trained or custom models for NER, sentiment-analysis, classification, summarization, paraphrasing, intent classification, product description and ad generation, chatbot, grammar and spelling correction, keywords and keyphrases extraction, text generation, image generation, code generation, and much more...
Ling
⭐
39
Natural Language Processing Toolkit in Golang
Python
⭐
38
Rosette API Client Library for Python
Uax29
⭐
35
A tokenizer based on Unicode text segmentation (UAX #29), for Go. Split words, sentences and graphemes.
Textoken
⭐
31
Simple and customizable text tokenization gem.
Nlp Js Tools French
⭐
29
POS tagger, lemmatizer, and stemmer for the French language in JavaScript
Spacy_russian_tokenizer
⭐
26
Custom Russian tokenizer for spaCy
Nlpcloud Php
⭐
20
NLP Cloud serves high performance pre-trained or custom models for NER, sentiment-analysis, classification, summarization, paraphrasing, intent classification, product description and ad generation, chatbot, grammar and spelling correction, keywords and keyphrases extraction, text generation, image generation, code generation, and much more...
Nlp Tool
⭐
19
Natural Language Processing Tool
Python Vaporetto
⭐
17
🛥 Vaporetto is a fast and lightweight pointwise prediction based tokenizer. This is a Python wrapper for Vaporetto.
Avocado
⭐
12
AVocaDo: Strategy for Adapting Vocabulary to Downstream Domain
Natural Language Processing Fundamentals
⭐
12
Use Python and NLTK to build out your own text classifiers and solve common NLP problems
Deeplearning.ai Tensorflow_developer Specialization
⭐
11
This repo contains my work and the code base for the TensorFlow Developer specialization offered by DeepLearning.AI
Models
⭐
11
Pre-trained models for tokenization, sentence segmentation and so on
Plane
⭐
11
A text processing tool that includes tag (HTML, URL, email) extraction and removal, punctuation normalization, simple segmentation, and so on.
Words N Numbers
⭐
11
Tokenizing strings of text. Regex extracting arrays of words and optionally numbers, emojis, tags, usernames and email addresses from strings. For Node.js and the browser. When you need more than just [a-z] regular expressions.
Nlp_resources
⭐
10
Resources related to NLP
Nlpashto
⭐
10
Pashto Natural Language Processing Toolkit
Java
⭐
10
Rosette API Client Library for Java
Fastberttokenizer
⭐
9
Fast and memory-efficient library for WordPiece tokenization as it is used by BERT.
Tiptap Annotation Magic
⭐
9
An extension for the Tiptap editor, enabling the annotation of text. Comes with support for overlapping annotations, useful for e.g. NLP tokenization.
Text Summarizer
⭐
9
A simple experiment with text summarization in Python
Hanzinlp
⭐
9
An NLP package for Chinese text: preprocessing, tokenization, Chinese fonts, word embeddings, text similarity, and sentiment analysis. A lightweight Chinese natural language processing package.
Nlpcloud Go
⭐
8
NLP Cloud serves high performance pre-trained or custom models for NER, sentiment-analysis, classification, summarization, paraphrasing, intent classification, product description and ad generation, chatbot, grammar and spelling correction, keywords and keyphrases extraction, text generation, image generation, code generation, and much more...
Nodejs
⭐
8
Rosette API Client Library for Node.js
Nlpcloud Ruby
⭐
7
NLP Cloud serves high performance pre-trained or custom models for NER, sentiment-analysis, classification, summarization, paraphrasing, intent classification, product description and ad generation, chatbot, grammar and spelling correction, keywords and keyphrases extraction, text generation, image generation, code generation, and much more...
Twitter Sentiment Analysis With Python
⭐
7
This project analyzes the sentiment of tweets from the Sentiment140 dataset by developing a machine learning sentiment analysis model built on classifiers. The performance of these classifiers is then evaluated using accuracy and F1 scores.
Metacurate Lexicon
⭐
6
A web service that exposes semantic similarity search via a web GUI and a RESTful API.
Bigrams
⭐
6
Non-intrusive n-gram generation
Wikiloader
⭐
6
A package to download and preprocess a Wikipedia dump, in any language.
Taibun
⭐
6
Taiwanese Hokkien Transliterator and Tokeniser
Vietnamese Pos Tagging
⭐
5
Vietnamese part-of-speech tagging using a Hidden Markov model combined with the Viterbi algorithm
Chinese Tokenization
⭐
5
Word segmentation implemented with traditional methods (n-gram, HMM, etc.), neural network methods (CNN, LSTM, etc.), and pre-training methods (BERT, etc.)
R Binding
⭐
5
R client binding for the Rosette API
Ud Toolkit
⭐
5
NLP toolkit built around UDPipe.
Php
⭐
5
Rosette API Client Library for PHP
Copyright 2018-2024 Awesome Open Source. All rights reserved.