Awesome Open Source
Search results for natural language processing computational linguistics
48 search results found
Pke
⭐
1,431
Python Keyphrase Extraction module
Nlp With Ruby
⭐
1,002
Curated List: Practical Natural Language Processing done in Ruby
Pywsd
⭐
704
Python Implementations of Word Sense Disambiguation (WSD) Technologies.
Pynlpl
⭐
466
PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build a simple language model. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP spec
Nlp Papers With Arxiv
⭐
363
Statistics and accepted paper list of NLP conferences with arXiv link
German Nlp
⭐
360
Curated list of open-access/open-source/off-the-shelf resources and tools developed with a particular focus on German
Rulm
⭐
341
Language modeling and instruction tuning for Russian
Acl Anthology
⭐
304
Data and software for building the ACL Anthology.
Pycantonese
⭐
290
Cantonese Linguistics and NLP
Nlp Conference Compendium
⭐
285
Compendium of the resources available from top NLP conferences.
Wikipron
⭐
256
Massively multilingual pronunciation mining
Bllip Parser
⭐
207
BLLIP reranking parser (also known as Charniak-Johnson parser, Charniak parser, Brown reranking parser). See http://pypi.python.org/pypi/bllipparser/ for the Python module.
Awesome Hungarian Nlp
⭐
192
A curated list of NLP resources for Hungarian
Acl Papers
⭐
178
Paper summaries from the Association for Computational Linguistics (ACL)
Datastories Semeval2017 Task4
⭐
171
Deep-learning model presented in "DataStories at SemEval-2017 Task 4: Deep LSTM with Attention for Message-level and Topic-based Sentiment Analysis".
Thuctc
⭐
167
An Efficient Chinese Text Classifier
Compling_nlp_hse_course
⭐
157
Materials for the computational linguistics course at the School of Linguistics, HSE University
Colibri Core
⭐
122
Colibri Core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e. patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool ``colibri-patternmodeller``, which allows you to build, view, manipulate and query pattern models.
Flat
⭐
105
FoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.github.io/folia), a rich XML-based format for linguistic annotation. Flat allows users to view annotated FoLiA documents and enrich these documents with new annotations; a wide variety of linguistic annotation types is supported through the FoLiA paradigm.
Amr Tutorial
⭐
100
Abstract Meaning Representation (AMR) tutorial slides
Python Tutorial Notebooks
⭐
97
Python tutorials as Jupyter Notebooks for NLP, ML, AI
Datalinguist
⭐
87
Stanford CoreNLP in idiomatic Clojure.
Ruts
⭐
85
Library for extracting statistics from Russian-language texts.
Frog
⭐
73
Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl, the Tilburg memory-based learning software package.
Lamachine
⭐
66
LaMachine - A software distribution of our in-house as well as some 3rd party NLP software - Virtual Machine, Docker, or local compilation/installation script
Ucto
⭐
60
Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation, and splits sentences. It offers several other basic preprocessing steps, such as changing case, all of which you can use to make your text suited for further processing such as indexing, part-of-speech tagging, or machine translation. Ucto comes with tokenisation rules for several languages and can be easily extended to suit other languages. It has been incorporated for tokenizing Dutch text in Frog, our Dutch morpho-s
Vecto
⭐
60
Doing things with embeddings
Folia
⭐
60
FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (including corpora) with linguistic annotations. A wide variety of linguistic annotations are supported, making FoLiA a useful format for NLP tasks and data interchange. Note that the actual Python library for processing FoLiA is implemented as part of PyNLPl; this repository contains higher-level tools that use the library as well as the full documentation, validation schemas,
Emnlp 2023 Papers
⭐
54
EMNLP 2023 Papers: Explore cutting-edge research from EMNLP 2023, the premier conference for advancing empirical methods in natural language processing. Stay updated on the latest in machine learning, deep learning, and natural language processing with code included. ⭐ support NLP!
Python_nlp_tutorial
⭐
47
This repository provides everything to get started with Python for Text Mining / Natural Language Processing (NLP)
Sentiment Analysis Of Tweets In Russian
⭐
46
Sentiment analysis of tweets in Russian using Convolutional Neural Networks (CNN) with Word2Vec embeddings.
Botok
⭐
43
🏷 བོད་ཏོག [pʰøtɔk̚] Tibetan word tokenizer in Python
Piccl
⭐
40
A set of workflows for corpus building through OCR, post-correction and normalisation
Yap
⭐
37
Yet Another (natural language) Parser
Pylangacq
⭐
36
Language Acquisition Research Tools
Word2vec Tsne
⭐
35
Google News and Leo Tolstoy: Visualizing Word2Vec Word Embeddings using t-SNE.
Amr Bibliography
⭐
34
Organized inventory of research using the Abstract Meaning Representation
Cistem
⭐
33
Stemmer for German
Python Arpa
⭐
32
🐍 Python library for n-gram models in ARPA format
C2xg
⭐
29
A Python package for learning, evaluating, annotating, and extracting vector representations of construction grammars
Python Ucto
⭐
29
This is a Python binding to the tokenizer Ucto. Tokenisation is one of the first steps in almost any Natural Language Processing task, yet it is not always as trivial a task as it appears to be. This binding makes the power of the ucto tokeniser available to Python. Ucto itself is a regular-expression-based, extensible, and advanced tokeniser written in C++ (http://ilk.uvt.nl/ucto).
Calamancy
⭐
26
NLP pipelines for Tagalog using spaCy
Docs
⭐
26
DELPH-IN Documentation
Elixir Nlp
⭐
25
A (hopefully helpful) collection of resources for Elixir NLP devs
Linguistics_problems
⭐
22
Natural language processing in examples and games
Lxa5
⭐
22
Linguistica 5: Unsupervised Learning of Linguistic Structure
Sentimentanalysis
⭐
22
Sentiment Analysis: Deep Bi-LSTM+attention model
Mystem Scala
⭐
21
Morphological analyzer `mystem` wrapper for JVM languages
Hades
⭐
20
Repository for the CLiPS HAte speech DEtection System [HADES].
Pybo
⭐
19
🦜 NLP for Tibetan, in Python.
Angel
⭐
19
An Ancient Greek Morphology Tagger
Wlapi
⭐
19
Ruby-based API for the project Wortschatz Leipzig.
Datastories Semeval2017 Task6
⭐
19
Deep-learning model presented in "DataStories at SemEval-2017 Task 6: Siamese LSTM with Attention for Humorous Text Comparison".
Cg3
⭐
19
Tools for the 3rd edition of the Constraint Grammar formalism.
Blabla
⭐
18
Novoic's linguistic feature extraction library
Nytwit
⭐
16
New York Times Word Innovation Types dataset
Arabicprocessingcog
⭐
15
A Python package that does stemming, tokenization, sentence breaking, segmentation, normalization, and POS tagging for the Arabic language.
Uncertainty
⭐
14
A Python implementation of the uncertainty classifier, based on the work of Veronika Vincze.
Foliapy
⭐
14
An extensive Python library for dealing with FoLiA (Format for Linguistic Annotation) documents, a rich XML-based format for linguistic annotation finding application in Natural Language Processing (NLP). This library was formerly part of PyNLPl.
Survey
⭐
13
Survey on machine learning.
Sembei
⭐
13
🍘 Word embeddings without word segmentation 🍘
Esapp
⭐
12
An unsupervised Chinese word segmentation tool.
Arguminsci
⭐
11
Analyze Argumentation and Rhetorical Aspects in Scientific Writing.
Kurdishhunspell
⭐
10
A morphological analyzer and spell checker for Kurdish in Hunspell
Gsoc2019 Text Extraction
⭐
10
GSoC 2019: Development of a Tool for Extracting Quantitative Text Profiles
Jgtextrank
⭐
10
jgtextrank: Yet another Python implementation of TextRank
X Tagger
⭐
9
A Natural Language Processing toolkit for sequence labeling in its simplest form.
Emosense Semeval2019 Task3 Emocontext
⭐
9
Deep-learning system presented in "EmoSense at SemEval-2019 Task 3: Bidirectional LSTM Network for Contextual Emotion Detection in Textual Conversations" at SemEval-2019.
Resper
⭐
9
Computationally Modelling Resisting Strategies in Persuasive Conversations
Eth_ml
⭐
8
Project by an ETH "Projects in Machine Learning" team that uses Mechanical Turk and active learning to solve a word-sense disambiguation task
Nlp Learning Notes
⭐
8
🧠 NLP notes: introductory concepts, fundamentals, research methods, and close readings of top-conference papers
Discoursesegmenter
⭐
8
A collection of various discourse segmenters
Foliatools
⭐
8
A number of command-line tools for working with FoLiA (Format for Linguistic Annotation). Includes validators, converters, visualisers, and more.
Phrasal Composition In Transformers
⭐
7
This repo contains datasets and code for Assessing Phrasal Representation and Composition in Transformers, by Lang Yu and Allyson Ettinger.
Structscribe
⭐
7
Resources and code for: Scalable Micro-planned Generation of Discourse from Structured Data
Kurdishcl
⭐
7
The Computational Linguistics course in Kurdish
Wlp Parser
⭐
7
This repository contains a collection of neural network models that we used to demonstrate the utility of our dataset.
Pun Model
⭐
7
Clean Python implementation of the paper "Computational Model for Linguistic Humor in Puns"
Diachrony_for_russian
⭐
7
Code and dataset for tracing semantic changes in Russian adjectives
Rl3stdlib
⭐
7
The RL3 Standard Library is a collection of modules accessible to an RL3 program to simplify the programming process and remove the need to rewrite commonly used RL3 patterns and predicates.
Latent Aspect Detection
⭐
7
Code and models for the paper "Latent Aspect Detection from Online Unsolicited Customer Reviews"
Docria
⭐
6
Semi-structured Document Model (Next-generation)
Genomenlp
⭐
5
Purplemonkeydishwasher
⭐
5
A public git version of my research projects, i.e. articles and all that
Optimalnumberoftopics
⭐
5
A set of methods for finding an appropriate number of topics in a text collection
Semeval2022 Task8 Tonyx
⭐
5
Deep-learning system proposed by HFL for SemEval-2022 Task 8: Multilingual News Similarity
Docsim
⭐
5
UkrVectōrēs (formerly docsim) – an NLU-powered tool for knowledge discovery, classification, diagnostics and prediction. Entities similarity tool. A "cognitive-semantic calculator" that runs on NLU for the discovery, classification, diagnostics and prediction of knowledge.
Naacl Mpqa Srl4orl
⭐
5
SRL4ORL: Improving Opinion Role Labeling Using Multi-Task Learning With Semantic Role Labeling
Bolde
⭐
5
A collaborative online computational linguistics development environment.
Dependencytrees.jl
⭐
5
Dependency parsing in Julia
Text_analysis_technobabble
⭐
5
NLP Using Star Trek scripts as training data.
Memorable Quotes
⭐
5
The repository for memorable quotes project.
Copyright 2018-2024 Awesome Open Source. All rights reserved.