Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Awesome Public Datasets | 62,307 | | | | 4 months ago | | | 126 | mit | |
A topic-centric list of HQ open datasets. | ||||||||||
Codesearchnet | 2,202 | | | | 3 years ago | | | 7 | mit | Jupyter Notebook |
Datasets, tools, and benchmarks for representation learning of code. | ||||||||||
Fma | 1,773 | | | | 2 years ago | | | 10 | mit | Jupyter Notebook |
FMA: A Dataset For Music Analysis | ||||||||||
Open Data Registry | 1,271 | | | | a year ago | | | 26 | apache-2.0 | Python |
A registry of publicly available datasets on AWS | ||||||||||
Qri | 1,053 | 1 | | | 3 years ago | 271 | December 13, 2021 | 220 | gpl-3.0 | Go |
you're invited to a data party! | ||||||||||
Data Juicer | 994 | | | | a year ago | 3 | September 28, 2023 | 16 | apache-2.0 | Python |
A one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷 | ||||||||||
Openml | 671 | | | | 4 months ago | | | 364 | bsd-3-clause | PHP |
Open Machine Learning | ||||||||||
Covid 19 Repo Data | 442 | | | | 2 years ago | | | 15 | cc0-1.0 | |
Data archive of identifiable COVID-19 related public projects on GitHub | ||||||||||
Ucf Sst Citysim Dataset | 283 | | | | a year ago | | | | apache-2.0 | Python |
Official GitHub page of the UCF SST CitySim Dataset | ||||||||||
Rsocrata | 227 | 2 | 2 | | 2 years ago | 21 | August 31, 2023 | 40 | other | R |
Provides easier interaction with Socrata open data portals (http://dev.socrata.com). Users can provide a 'Socrata' data set resource URL, a 'Socrata' Open Data API (SoDA) web query, or a 'Socrata' "human-friendly" URL, and the package returns an R data frame. Converts dates to 'POSIX' format. Manages throttling by 'Socrata'. |