Awesome Open Source
Search results for text mining
459 search results found
Awesome Nlp
⭐
15,935
📖 A curated list of resources dedicated to Natural Language Processing (NLP)
Textract
⭐
3,699
extract text from any document. no muss. no fuss.
Texthero
⭐
2,773
Text preprocessing, representation and visualization from zero to hero.
Trafilatura
⭐
2,447
Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
Scattertext
⭐
2,131
Beautiful visualizations of how language differs among document types.
Lazynlp
⭐
1,867
Library to scrape and clean web pages to create massive datasets.
Nlp Roadmap
⭐
1,618
Roadmap (mind map) and keywords for students interested in learning NLP
Datasciencer
⭐
1,497
a curated list of R tutorials for Data Science, NLP and Machine Learning
Konlpy
⭐
1,350
Python package for Korean natural language processing.
Awesome Text Summarization
⭐
1,314
A curated list of resources dedicated to text summarization
Tidy Text Mining
⭐
1,239
Manuscript of the book "Tidy Text Mining with R" by Julia Silge and David Robinson
Tidytext
⭐
1,136
Text mining using tidy tools ✨📄✨
Autophrase
⭐
978
AutoPhrase: Automated Phrase Mining from Massive Text Corpora
Spider
⭐
907
A configurable web spider with an easy-to-use web console
Nlp In Practice
⭐
861
Starter code to solve real world text data problems. Includes: Gensim Word2Vec, phrase embeddings, Text Classification with Logistic Regression, word count with pyspark, simple text preprocessing, pre-trained embeddings and more.
Rake Nltk
⭐
851
Python implementation of the Rapid Automatic Keyword Extraction algorithm using NLTK.
Text2vec
⭐
829
Fast vectorization, topic modeling, distances and GloVe word embeddings in R.
Open Semantic Search
⭐
741
Open Source research tool to search, browse, analyze and explore large document collections by Semantic Search Engine and Open Source Text Mining & Text Analytics platform (Integrates ETL for document processing, OCR for images & PDF, named entity recognition for persons, organizations & locations, metadata management by thesaurus & ontologies, search user interface & search apps for fulltext search, faceted search & knowledge graph)
Nlp Notebooks
⭐
710
A collection of notebooks for Natural Language Processing from NLP Town
Listed Company News Crawl And Text Analysis
⭐
689
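Crawls historical news text data on listed companies (individual stocks) from Sina Finance, NBD (每经网), JRJ (金融界), China Securities Network, and Securities Times Online, for text […]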
Infranodus
⭐
682
A Node.Js / Neo4J tool that translates words and relations into network graphs and shows you how it all connects.
Graphbrain
⭐
551
Language, Knowledge, Cognition
Ldavis
⭐
538
R package for web-based interactive topic model visualization.
Bigartm
⭐
537
Fast topic modeling platform
Awesome Sentiment Analysis
⭐
513
Repository with everything necessary for sentiment analysis and related areas
Text_mining_resources
⭐
511
Resources for learning about Text Mining and Natural Language Processing
Pyshorttextcategorization
⭐
466
Various Algorithms for Short Text Mining
Rmdl
⭐
409
RMDL: Random Multimodel Deep Learning for Classification
Awesome Computational Social Science
⭐
401
A list of awesome resources for Computational Social Science
German Nlp
⭐
360
Curated list of open-access/open-source/off-the-shelf resources and tools developed with a particular focus on German
2018 Machinelearning Lectures Esa
⭐
324
Machine Learning Lectures at the European Space Agency (ESA) in 2018
Artificial Adversary
⭐
317
🗣️ Tool to generate adversarial text examples and test machine learning models against them
Pyss3
⭐
307
A Python package implementing a new interpretable machine learning model for text classification (with visualization tools for Explainable AI)
Nlpython
⭐
302
This repository contains code for Natural Language Processing in Python. All of the code relates to my book, "Python Natural Language Processing"
Khcoder
⭐
295
KH Coder: for Quantitative Content Analysis or Text Mining
Malaysian Dataset
⭐
263
We gather Malaysian datasets! https://malaysian-dataset.readthedocs.io/en/latest
Medacy
⭐
260
🏥 Medical Text Mining and Information Extraction with spaCy
Nlp Labelling
⭐
256
Labelling platform for text using weak supervision.
Hdltex
⭐
252
HDLTex: Hierarchical Deep Learning for Text Classification
Fake_news_detection
⭐
251
Fake News Detection in Python
Textmining
⭐
250
Python text mining system (Research of Text Mining System)
Awesome Bioie
⭐
249
🧫 A curated list of resources relevant to doing Biomedical Information Extraction (including BioNLP)
Multi_rake
⭐
249
Multilingual Rapid Automatic Keyword Extraction (RAKE) for Python
Aravec
⭐
242
AraVec is a pre-trained distributed word representation (word embedding) open source project which aims to provide the Arabic NLP research community with free to use and powerful word embedding models.
Gwu_data_mining
⭐
228
Materials for GWU DNSC 6279 and DNSC 6290.
Nlp_profiler
⭐
227
A simple NLP library for profiling datasets with one or more text columns. Given a dataset and the name of a column containing text data, NLP Profiler returns either high-level insights or low-level, granular statistical information about the text in that column.
Qminer
⭐
217
Analytic platform for real-time large-scale streams containing structured and unstructured data.
Cnn Text Classification Keras
⭐
204
Text Classification by Convolutional Neural Network in Keras
Udpipe
⭐
198
R package for Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing Based on the UDPipe Natural Language Processing Toolkit
Blueprints Text
⭐
198
Jupyter notebooks for our O'Reilly book "Blueprints for Text Analysis Using Python"
Shallowlearn
⭐
196
An experiment in re-implementing supervised learning models based on shallow neural network approaches (e.g. fastText), with some additional exclusive features and a nice API. Written in Python and fully compatible with Scikit-learn.
Awesome Hungarian Nlp
⭐
192
A curated list of NLP resources for Hungarian
Tmtoolkit
⭐
191
Text Mining and Topic Modeling Toolkit for Python with parallel processing power
Breadability
⭐
191
Reworked https://www.readability.com/ parsing library (https://mercury.postlight.com/ is now a living alternative)
Tokenizers
⭐
170
Fast, Consistent Tokenization of Natural Language Text
Kindred
⭐
152
A Python biomedical relation extraction package that uses a supervised approach (i.e. needs training data).
Converse
⭐
147
Conversational text Analysis using various NLP techniques
Bern
⭐
146
A neural named entity recognition and multi-type normalization tool for biomedical text mining
Huspacy
⭐
145
HuSpaCy: industrial-strength Hungarian natural language processing
Awesome Text Classification
⭐
144
Awesome text classification projects, papers, and tutorials.
Textfeatures
⭐
143
👷 A simple package for extracting useful features from character objects 👷
Xioc
⭐
140
Extract indicators of compromise from text, including "escaped" ones.
Qdap
⭐
140
Quantitative Discourse Analysis Package: Bridging the gap between qualitative data and quantitative analysis
Hands On Natural Language Processing With Python
⭐
131
This repository is for my Udemy students. It contains all of the lecture code along with the files mentioned for reading. Feel free to clone it, and raise a question if you run into any problems.
Kate
⭐
130
Code & data accompanying the KDD 2017 paper "KATE: K-Competitive Autoencoder for Text"
Sparselsh
⭐
128
A Locality Sensitive Hashing (LSH) library with an emphasis on large, highly-dimensional datasets.
Support Tickets Classification
⭐
128
This case study shows how to create a model for text analysis and classification and deploy it as a web service in Azure cloud in order to automatically classify support tickets. This project is a proof of concept made by Microsoft (Commercial Software Engineering team) in collaboration with Endava http://endava.com/en
Genius
⭐
122
Easily access song lyrics from Genius in a tibble.
Keywords2vec
⭐
120
Orange3 Text
⭐
120
🍊 📄 Text Mining add-on for Orange3
Extractnet
⭐
118
A fork of Dragnet that also extracts author, headline, date, and keywords from context, with built-in metadata extraction, all in one package
Chemdataextractor
⭐
112
Automatically extract chemical information from scientific documents
R Text Data
⭐
109
List of textual data sources to be used for text mining in R
Cogcomp Nlpy
⭐
108
CogComp's light-weight Python NLP annotators
Textminer
⭐
104
An aid for text mining in R, with a syntax that should be familiar to experienced R users. Provides a wrapper for several topic models that take similarly formatted input and give similarly formatted output. Includes additional functionality for analysis and diagnostics of topic models.
Text_predictor
⭐
99
Char-level RNN LSTM text generator📄.
Article Downloader
⭐
99
Uses publisher APIs to programmatically retrieve scientific journal articles for text mining.
Ruimtehol
⭐
95
R package to Embed All the Things! using StarSpace
Edsnlp
⭐
93
Modular, fast NLP framework, compatible with Pytorch and spaCy, offering tailored support for French clinical notes.
Text Mining Course
⭐
92
Course Notes for Text Mining - Prof. Peter Organisciak
Teanaps
⭐
92
An open-source Python library for natural language processing and text analysis.
Lexicon
⭐
92
A data package containing lexicons and dictionaries for text analysis
Janeaustenr
⭐
91
An R Package for Jane Austen's Complete Novels 📙
Multiplex Plot
⭐
87
Multiplex: visualizations that tell stories—A Python library to create and annotate beautiful network graph visualizations, text visualizations and more.
Intertext
⭐
86
Detect and visualize text reuse
Tf Idf Python
⭐
86
Term frequency–inverse document frequency for Chinese novel/documents implemented in python.
Btm
⭐
83
Biterm Topic Modelling for Short Text with R
Learning Social Media Analytics With R
⭐
81
This repository contains code and bonus content which will be added from time to time for the book "Learning Social Media Analytics with R" by Packt
Lisc
⭐
81
Literature Scanner: Automated collection & analyses of the scientific literature.
Lda Topic Modeling
⭐
78
A PureScript, browser-based implementation of LDA topic modeling.
Awesome Python Machine Learning Resources
⭐
77
A collection of popular, practical machine learning and deep learning Python libraries and tools.
Trex
⭐
76
Efficient string matching with regular expressions
Jate
⭐
76
NEWS: JATE2.0 Beta.11 Released, see details below.
Ngram
⭐
70
Fast n-Gram Tokenization
Sentometrics
⭐
69
An integrated framework in R for textual sentiment time series aggregation and prediction
Pipeit
⭐
67
PipeIt is a text transformation, conversion, cleansing and extraction tool.
Perke
⭐
67
A keyphrase extractor for Persian
Igcloud
⭐
67
*UNSUPPORTED* Use igcloud to generate an Instagram word cloud! 🛫 🛫 ✈ 🔝
Jstorr
⭐
66
Simple text mining of journal articles from JSTOR's Data for Research service
Hands On Python Natural Language Processing
⭐
65
1-100 of 459 search results
Copyright 2018-2024 Awesome Open Source. All rights reserved.