Awesome Open Source
Search results for natural language processing wikipedia
22 search results found
Sling
⭐
1,873
SLING - A natural language frame semantics parser
Wikipedia2vec
⭐
899
A tool for learning vector representations of words and entities from Wikipedia
Wit
⭐
896
WIT (Wikipedia-based Image Text) Dataset is a large multimodal multilingual dataset comprising 37M+ image-text sets with 11M+ unique images across 100+ languages.
Wordninja
⭐
648
Probabilistically split concatenated words using NLP based on English Wikipedia unigram frequencies.
Chakin
⭐
313
Simple downloader for pre-trained word vectors
Adam_qas
⭐
298
ADAM - A Question Answering System. Inspired by IBM Watson
Aravec
⭐
242
AraVec is a pre-trained distributed word representation (word embedding) open source project that aims to provide the Arabic NLP research community with free-to-use and powerful word embedding models.
Spikex
⭐
220
SpikeX - spaCy Pipes for Knowledge Extraction
Nlp Data Augmentation
⭐
215
Data Augmentation for NLP.
Wp2txt
⭐
160
A command-line toolkit to extract text content and category data from Wikipedia dump files
Qb
⭐
160
QANTA Quiz Bowl AI
Sling
⭐
143
SLING - A natural language frame semantics parser
Quantulum3
⭐
112
Library for unit extraction - fork of quantulum for python3
Wpcorpus
⭐
98
wpcorpus - NLP corpus based on Wikipedia's full article dump
Quantulum
⭐
92
Python library for information extraction of quantities from unstructured text
Doc2vec Api
⭐
92
Document embedding and machine learning script for beginners
Sift
⭐
91
Knowledge extraction from web data
Ambigqa
⭐
86
An original implementation of the EMNLP 2020 paper "AmbigQA: Answering Ambiguous Open-domain Questions"
Ja.text8
⭐
74
Japanese text8 corpus for word embedding.
Text Segmentation
⭐
73
Implementation of the paper: Text Segmentation as a Supervised Learning Task
Wiki Split
⭐
72
One million English sentences, each split into two sentences that together preserve the original meaning, extracted from Wikipedia edits.
Nlp Corpus
⭐
65
Varied English texts for modern NLP testing
Wiki Atomic Edits
⭐
47
A dataset of atomic Wikipedia edits containing insertions and deletions of a contiguous chunk of text in a sentence. This dataset contains ~43 million edits across 8 languages.
Codex
⭐
46
CoDEx: A set of knowledge graph Completion Datasets Extracted from Wikidata and Wikipedia
Jawiki Kana Kanji Dict
⭐
44
Generate SKK/MeCab dictionary from Wikipedia (Japanese edition)
Modern_chinese_nlp
⭐
37
(WIP) My humble contribution to the democratization of the Chinese NLP technology
Mitie_chinese_wikipedia_corpus
⭐
35
Pre-trained Wikipedia corpus by MITIE
Chinese Wikipedia Corpus Creator
⭐
33
Corpus creator for Chinese Wikipedia
Arabic Tagger
⭐
31
AQMAR Arabic Tagger: Sequence tagger with cost-augmented structured perceptron training
Contextuallstm
⭐
26
Contextual LSTM for NLP tasks like word prediction and word embedding creation for Deep Learning
Odia Nlp Resource Catalog
⭐
26
Ml You Can Use
⭐
24
Practical ML and NLP with examples.
Bot
⭐
24
Python Bot using RASA for NLP
Arabic Word Embeddings Word2vec
⭐
19
Arabic Word Embeddings Word2vec
Wikirec
⭐
18
Recommendation engine framework based on Wikipedia data
Politbert
⭐
18
Polish RoBERTa model trained on Polish literature, Wikipedia, and OSCAR. The main assumption is that high-quality text yields a good model.
Gpt2_episode_summary_generator
⭐
18
Utilizing web scraping and state-of-the-art NLP to generate TV show episode summaries.
Word2vec Wikification Py
⭐
16
Disambiguation of Wikipedia article names
Text Summarization
⭐
15
Text summarization using spaCy and NLTK with the TF-IDF algorithm. The code produces a summary of an input article. You can input text directly or from a .txt file, a .pdf file, or a Wikipedia URL.
German Wikipedia Text Corpus
⭐
15
This is a German text corpus from Wikipedia. It is cleaned, preprocessed, and sentence-split. Its purpose is to train NLP embeddings such as fastText or ELMo (deep contextualized word representations).
Infotabs Code
⭐
14
Implementation of the semi-structured inference model in our ACL 2020 paper, INFOTABS: Inference on Tables as Semi-structured Data.
Pyconhk2015 Chinese Nlp
⭐
14
Materials for the talk on Chinese NLP at PyCon HK 2015
German2vec
⭐
13
Language Model and Text Classification for German Language using Deep Learning
Opiec
⭐
12
Reading the data from OPIEC - an Open Information Extraction corpus
Lemmer
⭐
12
English Lemmer interface for Node.js
Knowledge_infotabs
⭐
11
Repository containing code for the NAACL 2021 paper "Incorporating External Knowledge to Enhance Tabular Reasoning"
Bilingualcorpus
⭐
11
Wiki Dump Reader
⭐
10
Extract corpora from Wikipedia dumps
Wned
⭐
10
A system for Named Entity Disambiguation based on Random Walks and Learning to Rank.
Squad
⭐
10
Entity_knowledge_in_bert
⭐
10
This repository contains the code for the CoNLL 2019 paper "Investigating Entity Knowledge in BERT with Simple Neural End-To-End Entity Linking". The code is provided as documentation for the paper and for follow-up research.
Koshik
⭐
9
An NLP framework for large scale processing using Hadoop
Fool Me Twice
⭐
9
Game code and data for Fool Me Twice: Entailment from Wikipedia Gamification https://arxiv.org/abs/2104.04725
Entitypedia
⭐
9
Entitypedia is an Extended Named Entity Dictionary from Wikipedia.
Text Vectorian
⭐
9
Wiki Text Nlp
⭐
8
Extract 'Did you know?' facts from Wikipedia articles
11411 Project
⭐
8
11-411 NLP Project: Wikipedia Article Q&A System
Expanda
⭐
8
The universal integrated corpus-building environment.
Kwiki
⭐
7
Minimal parser for Wikipedia pages with zero dependencies
Opiec Pipeline
⭐
7
Wikipedia2corpus
⭐
7
Wikipedia text corpus for self-supervised NLP model training
Language_model_tf
⭐
6
Language Model in Tensorflow
Wikiloader
⭐
6
A package to download and preprocess a Wikipedia dump, in any language.
Rosettepedia
⭐
6
Augment Rosette API entity extraction results with information from Wikipedia.
Wikitextcorpusdownloader
⭐
6
A Language Independent Wikipedia Text Corpus Downloader
Ereina
⭐
6
Language rules for Persian texts
Tf Similar Sentences
⭐
6
Find similar sentences using Tensorflow Hub for English Wikipedia
Wikitrivia
⭐
5
A trivia game based on NLP-extracted Wikipedia questions
A1 Summit
⭐
5
An All-in-1 summarizer for your news articles, blogs, YouTube videos, study materials, Wikipedia content, etc.
Nepali Nlp Resources
⭐
5
Resources for Nepali Natural Language Processing