| Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language | Description |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Awesome Semantic Segmentation | 9,908 | | | | 2 years ago | | | 13 | | | :metal: awesome-semantic-segmentation |
| Fashion Mnist | 9,856 | | | | a year ago | | | 24 | mit | Python | A MNIST-like fashion product database. Benchmark :point_down: |
| Mmaction2 | 3,169 | | | | a day ago | 22 | May 05, 2022 | 175 | apache-2.0 | Python | OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark |
| Benchmark_results | 2,992 | | | | 3 years ago | | | 17 | | | Visual Tracking Paper List |
| Benchmarking Gnns | 2,137 | | | | 2 months ago | | | 5 | mit | Jupyter Notebook | Repository for benchmarking graph neural networks |
| Benchm Ml | 1,839 | | | | 9 months ago | | | 11 | mit | R | A minimal benchmark for scalability, speed and accuracy of commonly used open source implementations (R packages, Python scikit-learn, H2O, xgboost, Spark MLlib etc.) of the top machine learning algorithms for binary classification (random forests, gradient boosted trees, deep neural networks etc.). |
| Fast Deep Equal | 1,596 | 587,978 | | 1,432 | 8 months ago | 14 | June 08, 2020 | 23 | mit | JavaScript | The fastest deep equality check with Date, RegExp and ES6 Map, Set and typed arrays support |
| Pycm | 1,382 | | 5 | 8 | 7 days ago | 39 | April 27, 2022 | 12 | mit | Python | Multi-class confusion matrix library in Python |
| Avalanche | 1,364 | | | | a day ago | 2 | June 14, 2022 | 106 | mit | Python | Avalanche: an End-to-End Library for Continual Learning based on PyTorch. |
| Deepmoji | 1,331 | | | | a year ago | | | 9 | mit | Python | State-of-the-art deep learning model for analyzing sentiment, emotion, sarcasm etc. |
This repository is no longer being updated.
Please refer to the Diabetic Retinopathy Detection implementation in Google's 'uncertainty-baselines' repo for up-to-date baseline implementations.
In order to make a real-world difference with Bayesian Deep Learning (BDL) tools, the tools must scale to real-world settings. And for that we, the research community, must be able to evaluate our inference tools (and iterate quickly) on real-world benchmark tasks, without necessarily worrying about application-specific domain knowledge, like the expertise often required in medical applications. We require benchmarks that test for inference robustness, performance, and accuracy, in addition to cost and effort of development. These benchmarks should come at a variety of scales, ranging from toy MNIST-scale benchmarks for fast development cycles to large-data benchmarks that are faithful to real-world applications, capturing their constraints.
Our BDL benchmarks should be independent of specific deep learning frameworks (e.g., TensorFlow, PyTorch, etc.), and integrate with the SciPy ecosystem (i.e., NumPy, Pandas, Matplotlib): benchmarks are framework-agnostic, while baselines are framework-dependent.

In this repo we strive to provide such well-needed benchmarks for the BDL community, and to collect and maintain new baselines and benchmarks contributed by the community. A colab notebook demonstrating the MNIST-like workflow of our benchmarks is available here.
We highly encourage you to contribute your models as new baselines for others to compete against, as well as contribute new benchmarks for others to evaluate their models on!
Bayesian Deep Learning Benchmarks (BDL Benchmarks, or `bdlb` for short) is an open-source framework that aims to bridge the gap between the design of deep probabilistic machine learning models and their application to real-world problems. Our currently supported benchmarks are listed below (a minimal loading sketch follows the list):
- [x] Diabetic Retinopathy Diagnosis (in `alpha`, following Leibig et al.)
- [ ] Autonomous Vehicle's Scene Segmentation (in `pre-alpha`, following Mukhoti et al.)
- [ ] Galaxy Zoo (in `pre-alpha`, following Walmsley et al.)
- [ ] Fishyscapes (in `pre-alpha`, following Blum et al.)
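As a rough sketch of the MNIST-like workflow, loading one of these benchmarks might look like the following. The exact `bdlb.load` signature, its argument names, and the `datasets` attribute are assumptions here; consult the colab notebook and the per-benchmark READMEs for the authoritative interface.

```python
# Hedged sketch only -- argument names and return values are assumptions,
# not the definitive bdlb API; see the colab notebook for the real interface.
import bdlb

# Load the Diabetic Retinopathy Diagnosis benchmark. `level` is assumed to
# select the benchmark variant (e.g. a small setting for fast development
# cycles vs. the full real-world setting).
dtask = bdlb.load(
    benchmark="diabetic_retinopathy_diagnosis",
    level="medium",
)

# Framework-agnostic dataset splits exposed by the benchmark (assumed attribute).
ds_train, ds_validation, ds_test = dtask.datasets

# Evaluate an uncertainty-aware predictive function against the benchmark's
# expert-driven metrics (commented out: `model` stands in for your own estimator).
# dtask.evaluate(model.predict, ds_test)
```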
BDL Benchmarks is shipped as a PyPI package (Python 3 compatible), installable with:

```bash
pip3 install git+https://github.com/OATML/bdl-benchmarks.git
```
Data downloading and preparation are benchmark-specific; follow the relevant guide at `baselines/<benchmark>/README.md` (e.g., `baselines/diabetic_retinopathy_diagnosis/README.md`).
For example, the Diabetic Retinopathy Diagnosis benchmark comes with several baselines, including MC Dropout, MFVI, Deep Ensembles, and more. These models are trained on images of the blood vessels in the eye.

The models try to predict diabetic retinopathy, and use their uncertainty for prescreening: patients the model is uncertain about are referred to an expert for further examination. When you implement a new model, you can easily benchmark it against the existing baseline results provided in the repo, and generate plots using expert metrics (such as the AUC on retained data when referring the 50% most uncertain patients to an expert), as in the sketch below:
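The referral metric itself is simple to compute by hand. Below is a minimal, self-contained sketch (not part of `bdlb`; the array names are hypothetical model outputs) that sorts predictions by uncertainty, refers the most uncertain fraction to an expert, and reports AUC on the retained points:

```python
# Minimal sketch of the uncertainty-based referral metric described above.
# `y_true`, `y_prob`, and `uncertainty` are hypothetical arrays produced by
# your model; replace the random data with real predictions.
import numpy as np
from sklearn.metrics import roc_auc_score

def retained_auc(y_true, y_prob, uncertainty, retain_fraction=0.5):
    """AUC on the `retain_fraction` most certain predictions."""
    order = np.argsort(uncertainty)                   # most certain first
    n_keep = int(np.ceil(retain_fraction * len(y_true)))
    keep = order[:n_keep]                             # retained (non-referred) points
    return roc_auc_score(y_true[keep], y_prob[keep])

# Example with random stand-in data.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)                # binary disease labels
y_prob = rng.random(1000)                             # predicted probabilities
uncertainty = y_prob * (1 - y_prob)                   # e.g. a predictive-variance proxy
print(retained_auc(y_true, y_prob, uncertainty, retain_fraction=0.5))
```

Sweeping `retain_fraction` between 0 and 1 traces out the kind of retained-data curve shown in the benchmark's plots.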
You can even play with a colab notebook to see the workflow of the benchmark, and contribute your model for others to benchmark against.
Please cite individual benchmarks when you use them, as well as the baselines you compare against. For the Diabetic Retinopathy Diagnosis benchmark, please see here.
The repository is developed and maintained by the Oxford Applied and Theoretical Machine Learning group, with sponsorship from:
Email us with questions at [email protected], or submit an issue to help improve the framework.