Awesome Open Source
Search results for dataset language
66 search results found
Codexglue
⭐
1,297
CodeXGLUE
Githut
⭐
918
Github Language Statistics
Hate Speech And Offensive Language
⭐
698
Repository for the paper "Automated Hate Speech Detection and the Problem of Offensive Language", ICWSM 2017
Tranx
⭐
368
A general-purpose neural semantic parser for mapping natural language queries into machine executable code
Dataset
⭐
194
Darija <-> English dataset
Sign Language Digits Dataset
⭐
179
Turkey Ankara Ayrancı Anadolu High School's Sign Language Digits Dataset
Flatdata
⭐
170
Write-once, read-many, minimal overhead binary structured file format.
Natural Language Summary Generation From Structured Data
⭐
155
Implementation of the paper https://arxiv.org/abs/1709.00155, for converting information present in structured data into natural language text
Danlp
⭐
141
DaNLP is a repository for Natural Language Processing resources for the Danish Language.
Pre Modern_chinese_corpus_dataset
⭐
132
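Pre-modern Chinese corpus dataset: natural language processing, corpora, ancient Chinese, Classical Chinese, digital humanities, computational linguistics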
Personality Prediction
⭐
130
Experiments for automated personality detection using Language Models and psycholinguistic features on various famous personality datasets including the Essays dataset (Big-Five)
How2 Dataset
⭐
125
This repository contains code and metadata of How2 dataset
Spoken_language_identification
⭐
105
Identify a spoken language using artificial intelligence (LID).
Ml Mkqa
⭐
94
We introduce MKQA, an open-domain question answering evaluation set comprising 10k question-answer pairs aligned across 26 typologically diverse languages (260k question-answer pairs in total). The goal of this dataset is to provide a challenging benchmark for question answering quality across a wide set of languages. Please refer to our paper for details, MKQA: A Linguistically Diverse Benchmark for Multilingual Open Domain Question Answering
Pg19
⭐
92
Devdata.io
⭐
76
The Data You Need, The Programming Language You Want
Aida
⭐
71
🤖💬 Tiny experimental NLP deep learning library for text classification and NER. Built with Tensorflow.js, Keras and Chatito. Implemented in JS and Python.
Mmner
⭐
69
Massively Multilingual Transfer for NER
Toolbox
⭐
67
A collection of tools, APIs and other resources to use in creative coding web projects.
Slm Code Generation
⭐
62
TensorFlow code for the neural network presented in the paper: "Structural Language Models of Code" (ICML'2020)
Vocalize Sign Language
⭐
62
Vocalizing sign language with deep learning.
Lc Quad
⭐
61
A data set of natural language queries with corresponding SPARQL queries
Jejueo
⭐
56
Jejueo Datasets for Machine Translation and Speech Synthesis
Id Multi Label Hate Speech And Abusive Language Detection
⭐
55
The Dataset for Multi Label Hate Speech and Abusive Language Detection in Indonesian Twitter
Indic.page
⭐
52
A directory of Indic (Indian) language computing resources.
Cldf
⭐
46
CLDF: Cross-Linguistic Data Formats - the specification
Autocorpus
⭐
38
AutoCorpus is a set of utilities that enable automatic extraction of language corpora and language models from publicly available datasets. Autocorpus utilities follow the Unix design philosophy and integrate easily into custom data processing pipelines.
Living Audio Dataset
⭐
37
A "Crowd-Built" continuously growing speech dataset with transcripts. The dataset contains multiple languages and is intended for anyone to be able to add to it.
Natural Language Object Retrieval Tensorflow
⭐
34
An implementation of Natural Language Object Retrieval in TensorFlow
Nli_generation
⭐
32
Natural Language Inference Dataset Generation
Dspl
⭐
31
Schema and utilities for Google Dataset Publishing Language
Wikilingua
⭐
30
Multilingual abstractive summarization dataset extracted from WikiHow.
Naturallanguageprocessing
⭐
28
Natural Language Processing
Localization Xml Mt
⭐
27
A High-Quality Multilingual Dataset for Structured Documentation Translation
Noisemix
⭐
27
NoiseMix - data generation for natural language
Awesome Kurdish
⭐
25
A curated list of awesome resources and tools for Kurdish language technology
Flatline
⭐
24
Documentation, examples and utilities for Flatline, BigML's dataset transformation and generation language
Awesome Azeri Nlp
⭐
24
Azerbaijani language processing software, models and datasets.
Gesture Recognition For Indian Sign Language Using Kinect
⭐
23
Indian Sign Language static gesture depth dataset
Analects
⭐
23
Public datasets on the Chinese language, accessible from Ruby
Word Language Model
⭐
23
PyTorch word language model (text generation) example for the PTB dataset
Shaman
⭐
23
Programming Language Detector - When you input code, Shaman detects its language
Meta Transfer Learning
⭐
22
Implementation of meta-transfer-learning (ACL 2020)
Ckanext Fluent
⭐
22
Multilingual fields for CKAN
Slta
⭐
22
ACM ICMR 2019: "Cross-Modal Video Moment Retrieval with Spatial and Language-Temporal Attention"
Gutenberg Dialog
⭐
20
Build a dialog dataset from online books in many languages
Kuliya
⭐
17
Algeria's college hierarchy dataset as packages for different languages and platforms
Farsinlp.github.io
⭐
17
Datasets for Farsi (Persian) Natural Language Processing (NLP)
Quizzn
⭐
16
This is an open-source quiz application. The current dataset is for world capitals, but it can easily be adapted for language vocabulary learning.
Spoken_language_dataset
⭐
15
A dataset with English, German and Spanish speech samples.
Kk S Paperlist
⭐
15
A list of papers for machine learning, reinforcement learning, NLP or something interesting
Nlp
⭐
15
Natural-language processing library
Documentclip
⭐
14
Gmql
⭐
14
GMQL - GenoMetric Query Language
Machine Learning Experiments
⭐
13
These are my jupyter notebooks on ML & DL.
Ffr V1
⭐
13
Towards developing a Robust Translation Model for African languages: Pilot Project FFR v1.0.
Retired_comedy_phrases
⭐
13
A Casual Spreadsheets resource
Lidtk
⭐
12
Language Identification Toolkit
Clj Example Nlp Ml
⭐
12
Example Project for Natural Language Processing and Machine Learning Libraries
Datasets Cmudict
⭐
12
The Carnegie Mellon Pronouncing Dictionary (CMUdict).
Tf Idf Iif Top100 Wordlists
⭐
11
These are lists for a variety of languages containing words that are distinctive to each language.
Abusive Language
⭐
11
Dcgan Sign Language
⭐
10
Generating sign language images with DCGAN using our own Sign Language Dataset
Mmid
⭐
10
Words and their images in 98 languages
Langdetect
⭐
9
Language detection software
Language Dataset
⭐
9
Dataset for programming language identification.
Nlp For Odia
⭐
9
State of the Art Language models and Classifier for Odia, which is spoken in the Indian state of Odisha
Wip Lambada Lm
⭐
9
LSTM language model on LAMBADA dataset
Pru
⭐
9
Pyramidal Recurrent Units (PRUs): A New LSTM Unit
Mylist_thainlp_group
⭐
9
Iso639 Databases
⭐
8
ISO 639 tables for managing autonyms, LCIDs and default ISO 15924 scripts in a machine-readable format
Rhxl
⭐
8
Humanitarian Exchange Language (HXL standard) in R
Language Model Recommendation
⭐
8
Resources accompanying the "Zero-Shot Recommendation as Language Modeling" paper (ECIR2022)
Nlp For Kannada
⭐
8
State of the Art Language models and Classifier for Kannada, which is spoken predominantly by Kannada people in India, mainly in the state of Karnataka
Megacov
⭐
8
Mega-COV: A Billion-Scale Dataset of 100+ Languages for COVID-19
Language2motion
⭐
7
The goal of this project is to create a multi-modal implementation of the Transformer architecture in Swift.
Ted Dataset
⭐
7
Az Summarization
⭐
7
Abstractive summarization for Azerbaijani language
Am For Bert
⭐
7
This repository contains the WordNet Language Model Probing (WNLaMPro) dataset introduced in "Rare Words: A Major Problem for Contextualized Embeddings and How to Fix it by Attentive Mimicking".
Arabic Nlp Resources
⭐
7
📚 This project holds an inventory of NLP resources for Arabic.
Babel
⭐
7
Translation without parallel corpora.
Hindi Nli Data
⭐
6
A repository containing the details of a natural language inference dataset in Hindi
Ea Associate Ds
⭐
6
Electronic Arts (EA) NLP Assignment for: Associate Data Scientist
Wikiloop Datasets
⭐
6
M Amr2text
⭐
6
Generate from English-Centric AMR into Multiple Languages.
Any Language Frames
⭐
6
Multilingual datasets for the paper "Any-language frame-semantic parsing"
Language Detection Speech Using Dnn
⭐
6
This is an implementation of a DNN in TensorFlow for language detection in an audio file
Progress In Natural Language Processing
⭐
5
This document aims to track progress in the field of Natural Language Processing (NLP) and give an overview of the state of the art across the most common NLP tasks and their corresponding datasets. It aims to cover both traditional and core NLP tasks such as dependency parsing and part-of-speech tagging as well as more recent ones such as reading comprehension and natural language inference. The main objective is to provide the reader with a quick overview of benchmark datasets and the
Akshar
⭐
5
Free, Open Source Gujarati Language Characters Small Dataset For Deep Learning
Handshape_datasets
⭐
5
A single library to (down)load all existing sign language handshape datasets.
Geobench
⭐
5
A place to put data (in the releases) and code (in any language) to test performance
Universal Joy
⭐
5
A Dataset and Results for Classifying Emotions Across Languages
Nordicdsl
⭐
5
Kerasesim
⭐
5
Keras implementation of ESIM model for Natural Language Inference.
Unified_multilingual_dataset_of_emotional_human_utterances
⭐
5
A unified dataset of multilingual emotional human utterances
Sign_language_datasets
⭐
5
A single library to (down)load all existing sign language video datasets.
Copyright 2018-2024 Awesome Open Source. All rights reserved.