Awesome Open Source
Search results for corpus embeddings
98 search results found
Laser
⭐
3,460
Language-Agnostic SEntence Representations
Glove Python
⭐
1,171
Toy Python implementation of http://www-nlp.stanford.edu/projects/glove/
Awesome Persian Nlp Ir
⭐
658
Curated List of Persian Natural Language Processing and Information Retrieval Tools and Resources
Ngram2vec
⭐
638
Four word embedding models implemented in Python. Supporting arbitrary context features
Ner Lstm
⭐
528
Named Entity Recognition using multilayered bidirectional LSTM
Undreamt
⭐
421
Unsupervised Neural Machine Translation
Embedding
⭐
309
Korean Embedding (Sentence Embeddings Using Korean Corpora)
Rel
⭐
279
REL: Radboud Entity Linker
Spanish Word Embeddings
⭐
248
Spanish word embeddings computed with different methods and from different corpora
Sensegram
⭐
211
Making sense embedding out of word embeddings using graph-based word sense induction
Awesome Nlp Polish
⭐
169
A curated list of resources dedicated to Natural Language Processing (NLP) in Polish. Models, tools, datasets.
Wordgcn
⭐
167
ACL 2019: Incorporating Syntactic and Semantic Information in Word Embeddings using Graph Convolutional Networks
Word_embeddings
⭐
112
Code for the blog post "Making Sense of Word2vec"
Sentencerepresentation
⭐
112
Embeddingdynamicstereotypes
⭐
101
Dict2vec
⭐
88
Dict2vec is a framework to learn word embeddings using lexical dictionaries.
Sadedegel
⭐
81
A General Purpose NLP library for Turkish
Wvec
⭐
65
Word vectors
Nlp For Hindi
⭐
59
State of the Art Language models and Classifier for Hindi language (spoken in Indian sub-continent)
Ner Pt
⭐
54
Portuguese Named Entity Recognition
Turkish Glove
⭐
51
Turkish GloVe - Repository for Turkish GloVe Word Embeddings
Redbud Tree Depression
⭐
48
scripts to model depression in speech and text
Flair Lms
⭐
47
Language Models for Zalando's flair library
Bicvm
⭐
42
BiCVM Code
Ewiser
⭐
40
A Word Sense Disambiguation system integrating implicit and explicit external knowledge.
Autoencode
⭐
40
AutoenCODE is a Deep Learning infrastructure for encoding source code fragments into vector representations, which can be used to learn similarities.
Word2vec On Wikipedia
⭐
39
A pipeline for training word embeddings using word2vec on wikipedia corpus.
Text Classification Cn
⭐
33
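Chinese text classification in practice, based on the Sogou news corpus, using traditional machine learning methods as well as pretrained models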
Merge_label
⭐
33
Code for paper "Merge and Label: A Novel Neural Network Architecture for Nested NER", ACL 2019
Cade
⭐
30
Compass-aligned Distributional Embeddings. Align embeddings from different corpora
Spacy Pl
⭐
29
Poleval 2018
⭐
29
Code and data accompanying the paper "Approaching nested named entity recognition with parallel LSTM-CRFs."
Hpca
⭐
29
C++ implementation of the Hellinger PCA for computing word embeddings.
Proqa
⭐
29
Progressively Pretrained Dense Corpus Index for Open-Domain QA and Information Retrieval
Nlp For Tamil
⭐
26
State of the Art Language models and Classifier for Tamil language (spoken in India and a few other South Asian countries)
Twembeddings
⭐
26
Sentence embeddings for unsupervised event detection in the Twitter stream: study on English and French corpora
Hypervec
⭐
25
Hierarchical Embeddings for Hypernymy Detection and Directionality
Exquisite Corpus
⭐
25
Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings.
Ml You Can Use
⭐
24
Practical ML and NLP with examples.
Cmdist
⭐
22
DEPRECATED - The Concept Mover's Distance method is now available in the text2map package. Concept Mover's Distance measures a document's conceptual engagement using word embeddings.
Rectr
⭐
19
💒 Reproducible Extraction of Cross-lingual Topics using R
Form Context Model
⭐
19
This repository contains the code for the Form-Context Model and its Attentive Mimicking variant.
Cdma Ner
⭐
19
Singular
⭐
18
German Elmo Model
⭐
18
This is a German ELMo deep contextualized word representation. It is trained on a special German Wikipedia text corpus.
Embeddings
⭐
17
spark job, sangria server, and react front-end for Word2Vec models
Multi Drug Embedding
⭐
17
Method for drug repurposing from knowledge graphs and literature
Word2vecfz
⭐
17
Dependency-based Word Embeddings (Levy and Goldberg, 2014) with BZ2 compression support.
Entity_embedding
⭐
16
Reference implementation of the paper "Word Embeddings for Entity-annotated Texts"
W2v_ol
⭐
15
Using word embeddings (word2vec) for ontology learning
Speech_embeddings
⭐
15
Using embedding-based loss functions for phonetics/speech recognition.
Probabilistic Rnn Da Classifier
⭐
15
Probabilistic Dialogue Act Classification for the Switchboard Corpus using an LSTM model
Lscdetection
⭐
14
Data Sets and Models for Evaluation of Lexical Semantic Change Detection
Sbwce
⭐
14
Spanish Billion Word Corpus and Embeddings
Word2vec Embeddings For Nepali Language
⭐
13
Word Embeddings (Word2Vec) for Nepali Language
Lasertrain
⭐
11
Vec2topic
⭐
11
A Topic Modeling algorithm that extracts the core topics from a text corpus. Implements the algorithm described in R. S. Randhawa, P. Jain, and G. Madan, Topic Modeling Using Distributed Word Embeddings, http://arxiv.org/abs/1603.04747
Rwe
⭐
10
Repository containing data and code of the ACL-19 paper "Relational Word Embeddings"
Semantic_coherence
⭐
10
Measuring semantic (in)coherence in Ubuntu dialogue corpus using different word and knowledge graph embeddings.
Sentilex
⭐
10
Sentiment Lexicon Generation Suite
Nlp Augment
⭐
10
A collection of utilities used in exploring data augmentation of low-resource parallel corpora.
Cca
⭐
10
Improved Word Embeddings
⭐
10
Improving Word Embeddings by combining word embeddings with their POS (Part Of Speech) tag.
Nc_embeddings
⭐
9
Comparison between various noun compound embeddings
Vcwe
⭐
9
VCWE: Visual Character-Enhanced Word Embeddings (NAACL 2019)
Mednorm Corpus
⭐
9
Naacl_flp_st
⭐
9
System used in the NAACL 2018 Shared Task on Metaphor Detection
Political German Word Embeddings
⭐
9
German word embeddings computed from a corpus of parliamentary transcripts (2017-2019)
Speechtext Wimp Labeler
⭐
9
This project demonstrates the use of generic bidirectional LSTM models for predicting the importance of words in a spoken dialogue for understanding its meaning. The model is trained and evaluated on a human-annotated corpus of word importance. The corpus can be downloaded from: http://latlab.ist.rit.edu/lrec2018
Tensorflow Swivel
⭐
9
Multi Embedding Cws
⭐
8
Multiple Character Embeddings for Chinese Word Segmentation, ACL 2019
Tutorial
⭐
8
Notebooks and overall materials of the HybridNLP 2018@ISWC2018 tutorial (http://expertsystemlab.com/hybridNLP18/)
Vecshare
⭐
8
This library provides functionality for rapidly sharing and retrieving word embeddings over the internet. (EMNLP 2017).
Max Word Embedding Generator
⭐
7
Generate embedding vectors from text files
Wiki_zh_vec
⭐
7
A Python automation tool for training word embeddings on the Chinese wiki corpus using word2vec, GloVe and LexVec.
Int2vec
⭐
7
A playground for embedding spaces for integers using easy to understand corpora.
Pdf2emb_nlp
⭐
7
NLP tool for scraping text from a corpus of PDF files, embedding the sentences in the text and finding semantically similar sentences to a given search query
Wordnet Randomwalk Python
⭐
7
Repository of code used for WordNet random walk embedding experiments
Cluse
⭐
7
Cross-Lingual Unsupervised Sense Embeddings
Chat Simulation Tools
⭐
6
Generate and store a corpus from English and Italian Telegram logs, extract word embeddings and tf-idf models and simulate a chat.
Cuneiform
⭐
6
Machine translation and word embeddings of cuneiform corpora
Airflow Pdf2embeddings
⭐
6
NLP tool for scraping text from a corpus of PDF files, embedding the sentences in the text and finding semantically similar sentences to a given search query.
Principal_word_vectors
⭐
6
We use principal component analysis for word embedding. The method is able to process both annotated and raw corpora.
Nlp For Gujarati
⭐
6
State of the Art Language models and Classifier for Gujarati, which is a language native to the Indian state of Gujarat
Act2vec
⭐
6
Repeval_rivercorners
⭐
6
Code developed for the RepEval 2017 Shared Task by the team Rivercorners
Awesomener
⭐
5
An implementation of bidirectional LSTM-CRF for Named Entity Relationship on custom corpus with custom word embeddings
Miniword2vec
⭐
5
Implementation from scratch of the Skipgram and CBoW (Continuous Bag of Words) model for learning word embeddings from a corpus.
Nlp For Marathi
⭐
5
State of the Art Language models and Classifier for Marathi, which is spoken predominantly by Marathi people of Maharashtra, India
Causalembedding
⭐
5
Acl_srw_2019
⭐
5
This is the code for reproducing the experiments from "Embedding Strategies for Specialized Domains: Application to Clinical Entity Recognition" (El Boukkouri et al.)
Unsuppse
⭐
5
Unsupervised parallel sentence extraction from comparable corpora
Ner_tsd2016
⭐
5
Software and data accompanying the paper "Neural Networks for Featureless Named Entity Recognition in Czech"
Recipe_bucktsong_awe_py3
⭐
5
Unsupervised acoustic word embeddings evaluated on Buckeye English and NCHLT Xitsonga data in Python 3.
Sentiment Analysis With Word Embeddings
⭐
5
A sentiment analysis model in Keras to analyse toxic comments online. The corpus is preprocessed using GloVe word embeddings.
Yorubatwi Embedding
⭐
5
Flair Pos Tagging
⭐
5
Flair Embeddings for PoS Tagging: A Multilingual Evaluation
Capricorn
⭐
5
nlp vocabulary builder and embedding loader
Text Categorization Using Neural Word Embeddings
⭐
5
A practical implementation of neural networks on top of fastText and word2vec word embeddings.
Spanishwordembeddings
⭐
5
Spanish Word Embeddings computed from large corpora and different sizes using fastText.