Uncertainty Baselines
⭐
1,293
High-quality implementations of standard and SOTA methods on a variety of tasks.
Githut
⭐
842
Github Language Statistics
Wilayah Administratif Indonesia
⭐
639
Data on the provinces, cities/regencies (kota/kabupaten), districts (kecamatan), and villages (kelurahan/desa) of Indonesia
New Zealand Data
⭐
245
A list of New Zealand Datasets and APIs
Data Set
⭐
199
State-driven, all-in-one data processing for data visualization
Uci Ml Api
⭐
189
Simple API for UCI Machine Learning Dataset Repository (search, download, analyze)
Msmarco
⭐
183
Utilities, baselines, statistics, and descriptions related to the MS MARCO dataset
Whylogs Java
⭐
179
Profile and monitor your ML data pipeline end-to-end
Mtg Jamendo Dataset
⭐
164
Metadata, scripts and baselines for the MTG-Jamendo dataset
Impy
⭐
100
Impy is a Python3 library with features that help you in your computer vision tasks.
Population
⭐
92
Population figures for countries, regions (e.g. Asia) and the world.
Openml R
⭐
90
R package to interface with OpenML
Awesome Wikipedia
⭐
76
A curated list of awesome Wikipedia-related frameworks, libraries, software, datasets and references.
Ddf Gapminder Systema_globalis
⭐
76
Gapminder's factbase with local & global statistics
Nwslr
⭐
59
Datasets and Analytics for the National Women's Soccer League (NWSL)
Shakkelha
⭐
59
Neural Arabic text diacritization
Jupyter Notebooks
⭐
52
Jupyter Notebooks and miscellaneous
Ps Dataset
⭐
52
PhotoSynth Dataset for improving local patch Descriptors
Outlier Utils
⭐
50
Utility library for detecting and removing outliers from normally distributed datasets using the Smirnov-Grubbs test.
Causeinfer
⭐
45
Machine learning based causal inference/uplift in Python
Id2t
⭐
42
Official ID2T repository. ID2T creates labeled IT network datasets that contain user-defined synthetic attacks.
Snowset
⭐
41
Snowflake dataset containing statistics for 70 million queries over a 14-day period
Data Science And Machine Learning Resources
⭐
37
List of data science and machine learning resources that I frequently use
Vtuber Livechat Dataset
⭐
35
📊 VTuber 1B: Billion-scale Live Chat and Moderation Event Dataset
Nomisr
⭐
30
Access UK official statistics from the Nomis database through R.
Haeufige Vornamen Berlin
⭐
29
Open Data on given names for newborn children in Berlin since 2012
Pororoqa
⭐
27
PororoQA, https://arxiv.org/abs/1707.00836
Simpler
⭐
26
Exercises from Verzani's simpleR - Using R for Introductory Statistics
World Cup 2018
⭐
25
An exploratory data analysis and data visualization project for World Cup 2018
Readstattables.jl
⭐
25
Read and write Stata, SAS and SPSS data files with Julia tables
Framester
⭐
24
This repository contains the Framester resource, the main outcome of the framester project.
Taskonomy Sample Model 1
⭐
24
Model, selected at random, from the training set of the paper "Taskonomy: Disentangling Task Transfer Learning"
Evaluation Datasets
⭐
23
Will store links to known evaluation datasets alongside stats to characterize them
Gospn
⭐
22
A free, open-source inference and learning library for Sum-Product Networks (SPN)
Udacity Data Analyst Nanodegree
⭐
19
Arabic Text Diacritization
⭐
18
Benchmark Arabic text diacritization dataset
Infinite_stories_with_data
⭐
16
This repo consists of my analysis of random datasets using various statistical and visualization techniques.
Learning_sequence_motifs
⭐
15
"Representation Learning of Genomic Sequence Motifs with Convolutional Neural Networks" by Peter K. Koo and Sean R. Eddy
Data Scientist In Python
⭐
15
Notes and projects from the Dataquest Data Scientist track coursework.
Semantic_kitti_stats
⭐
14
📉 Get some nice plots with statistics about the Semantic KITTI dataset
Covidstats
⭐
12
COVID-19 statistical analysis simulator app written in R and deployed on shinyapps.io; a clone of the Johns Hopkins University COVID count, with a simulator
Compstats
⭐
12
EPSY 887 Computational Statistics: Institute
Czso
⭐
11
Use Open Data from the Czech Statistical Office in R
Singapore Maritime Dataset Frames Ground Truth Generation And Statistics
⭐
11
Repository for generating frames from the Singapore Maritime Dataset videos and converting the corresponding ground truth files. Finally, some basic statistics are generated.
Pydst
⭐
10
PyDST is a Python module for accessing the API of Statistics Denmark. https://kristianuruplarsen.github.io/pydst/
Datasetops
⭐
10
Fluent dataset operations, compatible with your favorite libraries
Pga Tour Data Science Project
⭐
10
Covid Da
⭐
9
The dataset used in COVID-DA: Deep Domain Adaptation from Typical Pneumonia to COVID-19
India Trade Data
⭐
9
A web scraper written in Python to gather trade data for India across commodities and countries
Lcbench
⭐
8
A learning curve benchmark on OpenML data
Lr Identify
⭐
7
Rl Cache
⭐
7
Presidentielle 2017
⭐
7
Datasets & statistics about the 2017 French presidential election
Laliga Dataset
⭐
6
LaLiga 2018-2019 Season - Advanced Player Statistics Dataset
Mini Project
⭐
5
Mini-Projects in Master's (Big Data & Data Analytics) at Manipal University
Linked Edit Rules
⭐
5
Linked Edit Rules: a methodology to publish, link, combine and execute edit rules on the Web as Linked Data to verify consistency of statistical datasets. Compliant Linked Data Notifications (LDN) sender.
For X In Datasets
⭐
5
Run your statistics/ML procedure on many real datasets... easily!
Dataset
⭐
5
Datasets and processing libraries for JSer.info
Rms Letter Comparison
⭐
5
GitHub petitions regarding removal of rms: contributor comparison data and computational analysis
Spatstat.data
⭐
5
Subpackage of spatstat containing all datasets
List Of Federal And State Datasets
⭐
5
A list of officially verified datasets and statistics on a federal and state level 🇺🇸
Epidemiology Tools
⭐
5
A list of tools to assist with epidemiology research
Vips_code
⭐
4
Jacs Dataset Analysis
⭐
4
Reanalysis of the Schrödinger JACS dataset
Desc
⭐
4
A fast and simple descriptive statistics tool for the UNIX command line.
Jupytercon 2017
⭐
4
Material for my talk at JupyterCon 2017
Awesome R
⭐
3
Awesome R
Qds
⭐
3
Quantile Datacube Structure (Main Source Code)
Outlier2
⭐
3
Find outliers in dataset
Youtubeanalytics
⭐
3
This repository contains the work for the Data Science course (UE18CS203)
Github_analytics_project
⭐
3
Codebase for GitHub Analytics Project
Dsjobtracker
⭐
3
What skills and qualifications are required for a data scientist?
Chi Sq Test
⭐
3
npm package to run Chi-Squared tests on numerical arrays.
Gpam_stats
⭐
3
Implementation of statistics used for paper soon to be cited here
Ckanext Semantic
⭐
3
Integration of LODStats and personalization features based on it
Bls_local_area_unemployment
⭐
3
A scraper and dataset with all Local Area Unemployment data from the US Bureau of Labor Statistics.
Itns
⭐
3
Introduction to the New Statistics datasets
Ultima School_portfolio
⭐
2
Portfolio for the Ultima School data analysis course
Ohtadstats
⭐
2
Tomoka Ohta's D Statistics
Ptt_stock
⭐
2
Data Exploration Of Trending Youtube Video Statistics Dataset
⭐
2
Data Exploration of Trending YouTube Video Statistics DataSet
Pizza_delivery
⭐
2
This repo is for my pizza delivery data analysis and statistics. This is where I tackle simple questions to learn more about statistics and Python with a dataset I collected and know very well: my pizza delivery data!
Statistical Inference
⭐
2
A little exploration of R's power for statistical inference
Fluent Data
⭐
2
Manipulate datasets by chaining methods. Includes capacity to map, filter, sort, group, reduce, and merge data. Built in reducers include multiple regression.
Edtar
⭐
2
Dataset packages for environmental data analysis using R with tidy approach
Dbs Statistics
⭐
2
B9DA101 Statistics for Data Analytics
Rupository
⭐
2
A series of analyses on data from the best reality competition, RuPaul's Drag Race
Applestorer
⭐
2
Apple iTunes mobile app statistics
Overwatch Ranked Data
⭐
2
Dataset of my ranked Overwatch matches
Titanicml
⭐
2
An analysis and deployment of a machine learning algorithm on the Titanic Dataset from Kaggle.com.
Earthquakes
⭐
2
STAT 141B Exploratory Data Analysis Project
Datasets Anscombes Quartet
⭐
2
Anscombe's quartet.
Datasets Harrison Boston House Prices
⭐
2
A dataset derived from information collected by the US Census Service concerning housing in Boston, Massachusetts (1978).
Datasets Nightingales Rose
⭐
2
Dataset for Nightingale's famous polar area diagram.
Datasets Suthaharan Single Hop Sensor Network
⭐
2
Labeled wireless sensor network data set collected from a simple single-hop wireless sensor network deployment using TelosB motes.
Nodejs Planetos
⭐
2
Access the Planet OS API with your Node.js app (unofficial)
Rollercoasters Caret
⭐
2
Multiple regression in R using caret
