Awesome Open Source
Search results for jupyter notebook corpus
166 search results found
Laser
⭐
3,460
Language-Agnostic SEntence Representations
Chatbot Retrieval
⭐
1,545
Dual LSTM Encoder for Dialog Response Generation
Ubuntu Ranking Dataset Creator
⭐
570
A script that creates train, valid and test datasets for the ranking task from Ubuntu corpus dialogs.
Document_cluster
⭐
440
A guide to document clustering in Python
Malaysian Dataset
⭐
263
We gather Malaysian dataset! https://malaysian-dataset.readthedocs.io/en/latest
Corus
⭐
254
Links to Russian corpora + Python functions for loading and parsing
Germanwordembeddings
⭐
224
Toolkit to obtain and preprocess German corpora, train models using word2vec (gensim), and evaluate them with generated test sets
Parsbert
⭐
222
🤗 ParsBERT: Transformer-based Model for Persian Language Understanding
Robbert
⭐
180
A Dutch RoBERTa-based language model
Asrframe
⭐
155
An Automatic Speech Recognition Frame: a complete framework for Chinese speech recognition, providing multiple models
Emobank
⭐
154
This repository contains EmoBank, a large-scale text corpus manually annotated with emotion according to the psychological Valence-Arousal-Dominance scheme.
Gossiping Chinese Corpus
⭐
136
Chinese question-answer corpus from the PTT Gossiping board
Fairseq Zh En
⭐
110
NMT for Chinese-English using fairseq
Weak Supervision For Ner
⭐
97
Framework to learn Named Entity Recognition models without labelled data using weak supervision.
Bitcoin Value Predictor
⭐
90
[NOT MAINTAINED] Predicting Bitcoin price using time series analysis and sentiment analysis of tweets about Bitcoin
Gutenberg Poetry Corpus
⭐
83
A corpus of poetry from Project Gutenberg
Pythia
⭐
77
Supervised learning for novelty detection in text
Wasabidataset
⭐
73
Repo for the Wasabi datasets
Turkish Parliament Texts
⭐
61
The transcripts of Grand National Assembly of Turkish Parliament (TBMM) meetings which span nearly a century between 1920 and 2015.
Vietnamese Electra
⭐
59
Electra pre-trained model using Vietnamese corpus
Nlp For Hindi
⭐
59
State-of-the-art language models and classifier for the Hindi language (spoken in the Indian subcontinent)
Wisesight Sentiment
⭐
53
Thai Social Media Sentiment Dataset
Polbert
⭐
52
Polish BERT
Broad_twitter_corpus
⭐
52
The Broad Twitter Corpus, an NER dataset in English stratified for time, location, social media genre, socioeconomic factors
Turkish Glove
⭐
51
Turkish GloVe - Repository for Turkish GloVe Word Embeddings
Bionlp
⭐
45
Repository for student projects within biomedical text mining from Lund University
Recurrent Neural Networks Intro
⭐
42
Implementation of RNN in Python
Nlp Qrmine
⭐
40
Qualitative Research support tools in Python
Corpus To Graph Ml
⭐
39
This repository contains machine learning related work for the corpus to graph project, including Jupyter research notebooks and a Flask webservice to host the model
Corpus Db
⭐
38
A textual corpus database for the digital humanities.
Named Entity Recognition Template
⭐
38
Build a deep learning model for predicting the named entities from text.
Word2vec
⭐
38
Training Chinese word vectors with Word2vec. Word2vec was created by a team of researchers led by Tomas Mikolov at Google.
Topic Labeling
⭐
35
The project proposes a framework to apply topic models on a text-corpus and eventually topic labels on the generated topics.
Tomodapi
⭐
35
Train, evaluate, and use different unsupervised topic modelling algorithms using a RESTful API.
Unmt
⭐
35
Code inspired by Unsupervised Machine Translation Using Monolingual Corpora Only
Shabby Pages
⭐
34
ShabbyPages is a state-of-the-art corpus of born-digital document images with both ground truth and distorted versions appropriate for use in training models to reverse distortions and recover to original denoised documents.
Align Linguistic Alignment
⭐
33
Python library for extracting quantitative, reproducible metrics of multi-level alignment between two speakers in naturalistic language corpora.
Dl4mt Simul Trans
⭐
32
Generating Text Small Corpus
⭐
29
Generating style-specific text from a small corpus of 2.5k sentences using a pre-trained language model. Code in PyTorch
Chatbot Retrieval
⭐
29
Implements a retrieval-based chatbot. See more in this [blog](http://blog.csdn.net/Irving_zhang/article/d
Cometa
⭐
28
Corpus of Online Medical EnTities: the cometA corpus
Drfaq
⭐
28
DrFAQ is a plug-and-play question answering NLP chatbot that can be generally applied to any organisation's text corpora.
Nlp For Tamil
⭐
26
State-of-the-art language models and classifier for the Tamil language (spoken in India and a few other South Asian countries)
Potter
⭐
25
Using NLTK to run an analysis on the Harry Potter corpus - different versions of this talk were given at Codeland in NYC and DjangoCon US in San Diego in 2018.
Retrieval Based_chatbot
⭐
24
Ml You Can Use
⭐
24
Practical ML and NLP with examples.
Word2veclite
⭐
24
Python implementation of Word2Vec
Chinese Gpt
⭐
23
Chinese Transformer Generative Pre-Training Model
How I Extracted Ted Talks For Parallel Corpus
⭐
22
Deepsentipers
⭐
22
Repository for the experiments described in the paper named "DeepSentiPers: Novel Deep Learning Models Trained Over Proposed Augmented Persian Sentiment Corpus"
Arxiv Manatee Publicupdates
⭐
22
This will be a public page to inform about updates to our models being developed to search through Arxiv-Sanity papers
Namu_wiki_db_preprocess
⭐
22
A Python script to convert the Namu Wiki database into a large Korean language corpus
Bioner
⭐
21
Un General Debates
⭐
20
Analysis and experiments on the UN General Debate corpus
Text Scraping Document Clustering Topic Modeling
⭐
19
The objective of this project is to scrape a corpus of news articles from a set of web pages, pre-process the corpus, and then apply unsupervised clustering algorithms to explore and summarise the contents of the corpus. Part 1, Text Data Scraping: this part of the project should be implemented as a Python script. (1) Identify the URLs for all news articles listed on the website: http://mlg.ucd.ie/modules/COMP41680/news/index.htm (2) Retrieve all web pages corresponding to these article URLs.
Depression Detection In Speech
⭐
17
Detecting depression in a conversation using a Convolutional Neural Network
Plot To Poem
⭐
17
"Translate" a plot from Mark Riedl's WikiPlots corpus into a poem. For NaPoGenMo 2017.
Icytranslate_offline
⭐
17
The offline part of icytranslate (an English-Chinese translation platform); the output of this project should be a translation model
Core Stories
⭐
16
All the notebooks for the analysis of Emotional Arcs within the Project Gutenberg corpus, see "The emotional arcs of stories are dominated by six basic shapes"
Haikurnn
⭐
16
A project to generate haikus with recurrent neural networks while enforcing the 5-7-5 syllable structure
Biobertpt
⭐
16
Biomedical and Clinical BERT for Portuguese Language
Brookings Nlp
⭐
16
Teaching materials for the text analytics course
Searchbetter
⭐
16
SearchBetter: query rewriting for search engines on small corpuses (Harvard research project)
Corpus Driven Narrative Generation
⭐
15
Thoughts toward and tutorial on corpus-driven narrative generation
Govuk Lda Tagger
⭐
15
An experiment of using the LDA machine learning algorithm to generate topics from documents and tag them with those topics
Sentiment.datalogue
⭐
14
Sentiment analysis challenge for Datalogue recruiting
Albert_finetune_with_pretrain_on_custom_corpus
⭐
14
1. Pretrain Albert on custom corpus 2. Finetune the pretrained Albert model on downstream task
Pylighter
⭐
14
Annotation tool on Jupyter for Named Entity Recognition tasks
Tacotron Chinese
⭐
14
Metaphor Paraphrase
⭐
14
German2vec
⭐
13
Language Model and Text Classification for German Language using Deep Learning
Prachathai 67k
⭐
13
News Article Corpus from Prachathai.com
Nanotext
⭐
13
Proteins as words, genomes as documents.
Cnn Ld Tf
⭐
13
Convolutional Neural Network for Language Detection in Tensorflow
Spanishtransformerxl
⭐
12
Language model trained on wiki corpus (500M tokens) with fastai v1 acc>42.3% len(vocab)=60K
Capstone
⭐
12
Creation of LDA (Latent Dirichlet Allocation) Topic Model on corpus of books harvested from Project Gutenberg
Nasslli2018 Corpus Linguistics
⭐
12
Course home for "Corpus Linguistics with Python and NLTK", part of NASSLLI 2018
Colab Gensim Mallet
⭐
12
This repository is designed for students in DIGI405 at the University of Canterbury to do topic modeling through their browser using Google Colab. It is relevant for others who want to do topic modeling through a browser with their own corpus.
Unhealthy Conversations
⭐
12
A corpus of comments tagged for multiple attributes of unhealthiness.
Kleis Keyphrase Extraction
⭐
12
Kleis is a python package to label keyphrases in scientific text.
Nltweets
⭐
12
"Our corpus is tweets."
Thaigov V2 Corpus
⭐
11
Thai News Dataset from Thai government website.
Legalst 190
⭐
11
Data, Prediction, and Law
Exp Ml Bert
⭐
11
Albert Mongolian
⭐
11
ALBERT trained on Mongolian text corpus
Lovecraft
⭐
11
A basic NLTK demo, using the collected works of H. P. Lovecraft as a corpus
Nmt En Tr
⭐
11
Neural Machine Translation Between English and Turkish with pre-trained model releases
Perceive
⭐
10
PERCEIVE is a project incubator inspired by Apache Incubator and Stack Exchange's Area 51. It serves as a staging-zone repository for early project ideas.
Gender_novels
⭐
10
Descriptions of Gender in Novels 1770-1922
Gutenberg Analysis
⭐
10
Analysis of the Gutenberg dataset
Distributional Inclusion Vector Embedding
⭐
10
Method Name Prediction
⭐
10
Implementation of 'A Convolutional Attention Network for Extreme Summarization of Source Code'
Childes Db
⭐
10
A SQL interface for the CHILDES child language corpora
Msc
⭐
9
Malayalam Speech Corpus
Deep Learning Nlp Pydata
⭐
9
Thai Law
⭐
9
Thai Law Dataset (Act of Parliament)
Syntactic Generalization
⭐
9
Code and data for "A Systematic Assessment of Syntactic Generalization in Neural Language Models"
Deepguru
⭐
9
Deep learning twitter bot generating funny inspirational quotes
Language Modeling
⭐
9
Language modeling on the Penn Treebank (PTB) corpus using a trigram model with linear interpolation, a neural probabilistic language model, and a regularized LSTM.
Deep Text Eval
⭐
9
Differentiable Readability Measure Regularizer for Neural Network Automatic Text Simplification
1-100 of 166 search results
Copyright 2018-2024 Awesome Open Source. All rights reserved.