This repository contains code for the following two papers:
The code is authored by Daniela Massiceti and built using PyTorch 1.13.1, TorchVision 0.14.1, and Python 3.7.
cd ORBIT-Dataset
# if using Anaconda
conda env create -f environment.yml
conda activate orbit-dataset
# if using pip
pip install -r requirements.txt
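As an optional sanity check after installing, you can confirm that the expected PyTorch/TorchVision versions (listed above) are active; the one-liner below is just an illustration:

```bash
# Should print versions matching those listed above (e.g. 1.13.1 and 0.14.1)
python3 -c "import torch, torchvision; print(torch.__version__, torchvision.__version__)"
```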
The following script downloads the benchmark dataset into a folder called `orbit_benchmark_<FRAME_SIZE>` at the path `folder/to/save/dataset`. Use `FRAME_SIZE=224` to download the dataset already re-sized to 224x224 frames. For other values of `FRAME_SIZE`, the script will dynamically re-size the frames accordingly:
bash scripts/download_benchmark_dataset.sh folder/to/save/dataset FRAME_SIZE
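For example, to download the dataset re-sized to 224x224 frames into `folder/to/save/dataset/orbit_benchmark_224`, the call would be:

```bash
# Downloads and re-sizes the benchmark to 224x224 frames
bash scripts/download_benchmark_dataset.sh folder/to/save/dataset 224
```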
Alternatively, the 224x224 train/validation/test ZIPs can be manually downloaded here. Each should be unzipped as a separate train/validation/test folder into `folder/to/save/dataset/orbit_benchmark_224`. The full-size (1080x1080) ZIPs can also be manually downloaded, and `scripts/resize_videos.py` can be used to re-size the frames if needed.
The following script summarizes the dataset statistics:
python3 scripts/summarize_dataset.py --data_path path/to/save/dataset/orbit_benchmark_<FRAME_SIZE>
# to aggregate stats across train, validation, and test collectors, add --combine_modes
These should match the values in Table 2 (`combine_modes=True`) and Table A.2 (`combine_modes=False`) in the dataset paper. The Jupyter notebook `scripts/plot_dataset.ipynb` can be used to plot bar charts summarizing the dataset (uses Plotly). These should match Figure 2 (`combine_modes=True`) and Figure A.3/A.4 (`combine_modes=False`) in the dataset paper.
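To open the plotting notebook locally, something along these lines should work (assuming Jupyter and Plotly are installed in the active environment):

```bash
# Launch the dataset plotting notebook
jupyter notebook scripts/plot_dataset.ipynb
```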
The following describes the protocols for training and testing models on the ORBIT Benchmark.
The training protocol is flexible and can leverage any training regime (e.g. episodic learning, self-supervised learning). There are no restrictions on the choice of model/feature extractor, or how users/objects/videos/frames are sampled.
What data can be used:
What data cannot be used:
We have updated the evaluation protocol for the ORBIT benchmark (compared to the original dataset paper) following the ORBIT Few-Shot Object Recognition Challenge 2022:
* Before sampling query frames, each clutter video should be filtered to exclude all frames that do not contain the ground-truth object (i.e. `object_not_present_issue=True`; see Filtering by annotations section). If, after filtering, a clutter video has fewer than 50 valid frames, the video should be excluded from the evaluation. If it has 50-200 valid frames, then all of these frames should be included.

For each test user's task, a model must be personalized to all the user's objects using only the support (clean) videos and associated labels for those objects. Note, any method of personalization can be used (e.g. fine-tuning, parameter generation, metric learning).
What data can be used to personalize:
What data cannot be used to personalize:
Once a model has been personalized to a test user's task, the model should be evaluated on the task's query set, which should contain all of that user's clutter videos. Predictions should be made for 200 randomly sampled frames per clutter video, ensuring that no sampled frames have `object_not_present_issue=True`. For each frame, the personalized model should predict which one object is present from all the user's objects. The frame accuracy metric should be calculated over the 200 randomly sampled frames for each clutter video in the task's query set.

Note, before sampling the 200 frames, the video should be filtered to exclude all frames that do not contain the ground-truth object (i.e. `object_not_present_issue=True`; see Filtering by annotations section). If, after filtering, a clutter video has fewer than 50 valid frames, the video should be excluded from the evaluation. If it has 50-200 valid frames, then all of these frames should be included (a small example of this filtering rule is sketched after the lists below).
What data can be used to make a frame prediction:
What data cannot be used to make a frame prediction:
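As an illustration of the frame-filtering rule above, the snippet below counts the valid (object-present) frames in one clutter video using its annotation JSON (see the Filtering by annotations section below for the file format). It assumes `jq` is installed; the annotation path is illustrative:

```bash
# Count frames whose annotation has object_not_present_issue == false.
# A clutter video with fewer than 50 such frames is excluded from evaluation;
# with 50-200, all valid frames are used; otherwise 200 are sampled at random.
jq '[.[] | select(.object_not_present_issue == false)] | length' \
  path/to/orbit_benchmark_224/annotations/test/P177--bag--clutter--Zj_1HvmNWejSbmYf_m4YzxHhSUUl-ckBtQ-GSThX_4E.json
```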
The following scripts can be used to train and test several baselines on the ORBIT benchmark. We provide support for 224x224 frames and the following feature extractors: `efficientnet_b0` (pre-trained on ImageNet-1K), `efficientnet_v2_s`, `vit_s_32`, and `vit_b_32` (all pre-trained on ImageNet-21K), and `vit_b_32_clip` (pre-trained on LAION-2B).
All other arguments are described in `utils/args.py`. Note that the Clutter Video Evaluation (CLU-VE) setting is run by specifying `--context_video_type clean --target_video_type clutter`. Experiments will be saved in `--checkpoint_dir`. All other implementation details are described in Section 5 and Appendix F of the dataset paper.

Note, before training/testing, remember to activate the conda environment (`conda activate orbit-dataset`) or virtual environment. If you are using Windows (or WSL), you may need to set `workers=0` in `data/queues.py`, as multi-threaded data loading is not supported. You will also need to enable longer file paths, as some file names in the dataset are longer than the system limit.
CNAPS+LITE. Our implementation of the model-based few-shot learner CNAPs (Requeima et al., NeurIPS 2019) is trained with LITE on a Tesla V100 32GB GPU (see Table 1):
python3 single-step-learner.py --data_path folder/to/save/dataset/orbit_benchmark_224 \
--feature_extractor efficientnet_b0 \
--classifier versa --adapt_features \
--context_video_type clean --target_video_type clutter \
--with_lite --num_lite_samples 16 --batch_size 256
Simple CNAPs+LITE. Our implementation of the model-based few-shot learner Simple CNAPs (Bateni et al., CVPR 2020) is trained with LITE on a Tesla V100 32GB GPU (see Table 1):
python3 single-step-learner.py --data_path folder/to/save/dataset/orbit_benchmark_224 \
--feature_extractor efficientnet_b0 \
--classifier mahalanobis --adapt_features \
--context_video_type clean --target_video_type clutter \
--with_lite --num_lite_samples 16 --batch_size 256
ProtoNets+LITE. Our implementation of the metric-based few-shot learner ProtoNets (Snell et al., NeurIPS 2017) is trained with LITE on a Tesla V100 32GB GPU (see Table 1):
python3 single-step-learner.py --data_path folder/to/save/dataset/orbit_benchmark_224 \
--feature_extractor efficientnet_b0 \
--classifier proto --learn_extractor \
--context_video_type clean --target_video_type clutter \
--with_lite --num_lite_samples 16 --batch_size 256
FineTuner.
Given the recent strong performance of finetuning-based few-shot learners, we also provide a finetuning baseline. Here, we simply freeze a pre-trained feature extractor and, using a task's support set, finetune either i) a linear head, or ii) a linear head and FiLM layers (Perez et al., 2017) in the feature extractor (see Table 1). In principle, you could also use a meta-trained checkpoint as an initialization through the `--model_path` argument. Note, only `--mode test` is supported for this baseline (`train_test` is not).
python3 multi-step-learner.py --data_path folder/to/save/dataset/orbit_benchmark_224 \
--feature_extractor efficientnet_b0 \
--mode test \
--classifier linear \
--context_video_type clean --target_video_type clutter \
--personalize_num_grad_steps 50 --personalize_learning_rate 0.001 --personalize_optimizer adam \
--batch_size 1024
Note, we have removed support for further training the feature extractor on the ORBIT train users using standard supervised learning with the objects' broader cluster labels. Please roll back to this commit if you would like to do this. The object clusters can be found in `data/orbit_{train,validation,test}_object_clusters_labels.json` and `data/object_clusters_benchmark.txt`.
MAML. Our implementation of MAML (Finn et al., ICML 2017) is no longer supported. Please roll back to this commit if you need to reproduce the MAML baselines in Table 5 (dataset paper) or Table 1 (LITE paper).
84x84 images. Training/testing on 84x84 images is no longer supported. Please roll back to this commit if you need to reproduce the original baselines in Table 5 (dataset paper).
The GPU memory requirements can be reduced by:
* Using a smaller feature extractor (e.g. `efficientnet_b0`).
* Training with LITE via the `--with_lite` flag. Memory can be further saved by lowering `--num_lite_samples`.
* Lowering the `batch_size`. This is relevant for all baselines (trained with/without LITE).
* Lowering the `--clip_length` argument.
* Setting the `--train_context_clip_method`, `--train_target_clip_method`, or `--test_context_clip_method` arguments to `random`/`random_200`/`uniform` rather than `max`.

The CPU memory requirements can be reduced by:
* Lowering the number of data-loading workers (`num_workers` in `data/queues.py`).
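For example, a lower-memory variant of the ProtoNets+LITE command above might look like the following (the specific values are illustrative, not tuned recommendations):

```bash
# Lower-memory training run: fewer LITE samples, smaller batch, shorter clips,
# and random clip sampling rather than max (values are illustrative)
python3 single-step-learner.py --data_path folder/to/save/dataset/orbit_benchmark_224 \
                               --feature_extractor efficientnet_b0 \
                               --classifier proto --learn_extractor \
                               --context_video_type clean --target_video_type clutter \
                               --with_lite --num_lite_samples 8 --batch_size 64 \
                               --clip_length 4 --train_context_clip_method random
```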
The following checkpoints have been trained on the ORBIT train users using the arguments specified above. The models can be run in test-only mode using the same arguments as above, except adding `--mode test` and providing the path to the checkpoint as `--model_path path/to/checkpoint.pt`. In principle, the memory required for testing should be significantly less than for training, so it should be possible on 1x 12-16GB GPU (or on CPU with `--gpu -1`). The `--batch_size` flag can be used to further reduce memory requirements.
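For example, to evaluate the ProtoNets+LITE checkpoint below in test-only mode, the invocation would look something like this (the checkpoint path is a placeholder):

```bash
# Evaluate a downloaded ProtoNets+LITE checkpoint on the ORBIT test users
python3 single-step-learner.py --data_path folder/to/save/dataset/orbit_benchmark_224 \
                               --feature_extractor efficientnet_b0 \
                               --classifier proto --learn_extractor \
                               --context_video_type clean --target_video_type clutter \
                               --with_lite --num_lite_samples 16 --batch_size 256 \
                               --mode test --model_path path/to/orbit_cluve_protonets_efficientnet_b0_224_lite.pth
```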
Model | Frame size | Feature extractor | Trained with LITE | Frame Accuracy (95% c.i.) | Checkpoint (trained with clean/clutter context/target videos)
---|---|---|---|---|---
CNAPs | 224 | EfficientNet-B0 | Y | 67.68 (0.58) | orbit_cluve_cnaps_efficientnet_b0_224_lite.pth
CNAPs | 224 | ViT-B-32-CLIP | Y | 72.33 (0.54) | orbit_cluve_cnaps_vit_b_32_clip_224_lite.pth
SimpleCNAPs | 224 | EfficientNet-B0 | Y | 66.83 (0.60) | orbit_cluve_simple_cnaps_efficientnet_b0_224_lite.pth
SimpleCNAPs | 224 | ViT-B-32-CLIP | Y | 68.86 (0.56) | orbit_cluve_simple_cnaps_vit_b_32_clip_224_lite.pth
ProtoNets | 224 | EfficientNet-B0 | Y | 67.91 (0.56) | orbit_cluve_protonets_efficientnet_b0_224_lite.pth
ProtoNets | 224 | EfficientNet-V2-S | Y | 72.76 (0.53) | orbit_cluve_protonets_efficientnet_v2_s_224_lite.pth
ProtoNets | 224 | ViT-B-32 | Y | 73.53 (0.51) | orbit_cluve_protonets_vit_b_32_224_lite.pth
ProtoNets | 224 | ViT-B-32-CLIP | Y | 73.95 (0.52) | orbit_cluve_protonets_vit_b_32_clip_224_lite.pth
ProtoNets (cosine) | 224 | EfficientNet-B0 | Y | 67.48 (0.57) | orbit_cluve_protonets_cosine_efficientnet_b0_224_lite.pth
ProtoNets (cosine) | 224 | EfficientNet-V2-S | Y | 73.10 (0.54) | orbit_cluve_protonets_cosine_efficientnet_v2_s_224_lite.pth
ProtoNets (cosine) | 224 | ViT-B-32 | Y | 75.38 (0.51) | orbit_cluve_protonets_cosine_vit_b_32_224_lite.pth
ProtoNets (cosine) | 224 | ViT-B-32-CLIP | Y | 73.54 (0.52) | orbit_cluve_protonets_cosine_vit_b_32_clip_224_lite.pth
FineTuner | 224 | EfficientNet-B0 | N | 64.57 (0.56) | Used pre-trained extractor
FineTuner | 224 | ViT-B-32-CLIP | N | 71.31 (0.55) | Used pre-trained extractor
FineTuner + FiLM | 224 | EfficientNet-B0 | N | 66.63 (0.58) | Used pre-trained extractor
FineTuner + FiLM | 224 | ViT-B-32-CLIP | N | 71.86 (0.55) | Used pre-trained extractor
The VizWiz workshop is hosting the ORBIT Few-Shot Object Recognition Challenge at CVPR 2023. The Challenge will run from 12 January 2023 9am CT to Friday 5 May 2023 9am CT.
To participate, visit the Challenge evaluation server which is hosted on EvalAI. Here you will find all details about the Challenge, including the competition rules and how to register your team. The winning team will be invited to give an in-person or virtual talk at the VizWiz workshop at CVPR 2023. They will also be invited to a virtual event hosted by Microsoft Research in July 2023 to share their work with a wide audience of internal research and product teams working in machine learning and computer vision - a great opportunity to gain industry visibility for future job applicants.
We have provided orbit_challenge_getting_started.ipynb to help get you started. This starter task will step you through how to load the ORBIT validation set, run it through a pre-trained model, and save the results which you can then upload to the evaluation server.
For any questions, please email [email protected].
We provide additional annotations for the ORBIT benchmark dataset in `data/orbit_extra_annotations.zip`. The annotations include per-frame bounding boxes for all clutter videos, and per-frame quality issues for all clean videos. Please read below for further details.

* The annotations should be unzipped into an `annotations` folder in the root dataset directory (e.g. `path/to/orbit_benchmark_224/annotations/{train,validation,test}`).
* There is one JSON file per video (e.g. `P177--bag--clutter--Zj_1HvmNWejSbmYf_m4YzxHhSUUl-ckBtQ-GSThX_4E.json`) which contains keys that correspond to all frames in that video (e.g. `{"P177--bag--clutter--Zj_1HvmNWejSbmYf_m4YzxHhSUUl-ckBtQ-GSThX_4E-00001.jpg": {frame annotations}, "P177--bag--clutter--Zj_1HvmNWejSbmYf_m4YzxHhSUUl-ckBtQ-GSThX_4E-00002.jpg": {frame annotations}, ...}`).
* Frames where the labelled object is not present are flagged with `object_not_present_issue`.

We provide per-frame bounding boxes for all clutter videos. Note, there is one bounding box per frame (i.e. the location of the labelled/target object). Other details:
* Bounding boxes are given in the format `{"P177--bag--clutter--Zj_1HvmNWejSbmYf_m4YzxHhSUUl-ckBtQ-GSThX_4E-00001.jpg": {"object_bounding_box": {"x": int, "y": int, "w": int, "h": int}, "object_not_present_issue": false}, ...}` where `(0,0)` is the top left corner of the bounding box. The coordinates are given for the original 1080x1080 frames, thus `x` and `y` range from `[0,1079]`, and `width` and `height` from `[1,1080]`.
* Frames where the target object is not present are annotated as `{"object_bounding_box": null, "object_not_present_issue": true}`.

We provide per-frame quality issues for all clean videos. Note, a frame can contain any/multiple of the following 7 issues: `object_not_present_issue`, `framing_issue`, `viewpoint_issue`, `blur_issue`, `occlusion_issue`, `overexposed_issue`, `underexposed_issue`. The choice of issues was informed by Chiu et al., 2020. Other details:
* Quality issues are given in the format `{"P177--bag--clean--035eFoVeNqX_d86Vb5rpcNwmk6wIWA0_3ndlrwI6OZU-00001.jpg": {"object_not_present_issue": false, "framing_issue": true, "viewpoint_issue": true, "blur_issue": false, "occlusion_issue": false, "overexposed_issue": false, "underexposed_issue": false}, ...}`.
* Frames where the object is not present are annotated as `{"object_not_present_issue": true, "framing_issue": null, "viewpoint_issue": null, "blur_issue": null, "occlusion_issue": null, "overexposed_issue": null, "underexposed_issue": null}`.

You can use `--annotations_to_load` to load the bounding box and quality issue annotations. The argument can take any/multiple of the following: `object_bounding_box`, `object_not_present_issue`, `framing_issue`, `viewpoint_issue`, `blur_issue`, `occlusion_issue`, `overexposed_issue`, `underexposed_issue`. The specified annotations will be loaded and returned in a dictionary with the task data (note, if a frame does not have one of the specified annotations, then `nan` will appear in its place). At present, the code does not use these annotations for training/testing. To do so, you will need to return them in the `unpack_task` function in `utils/data.py`.
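For example, to load bounding boxes alongside the task data, the flag can be appended to any of the training/testing commands above, along these lines (shown here with the ProtoNets+LITE arguments):

```bash
# Returns per-frame bounding boxes in the task dictionary (not used by the
# models unless you modify unpack_task in utils/data.py)
python3 single-step-learner.py --data_path folder/to/save/dataset/orbit_benchmark_224 \
                               --feature_extractor efficientnet_b0 \
                               --classifier proto --learn_extractor \
                               --context_video_type clean --target_video_type clutter \
                               --with_lite --num_lite_samples 16 --batch_size 256 \
                               --annotations_to_load object_bounding_box
```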
If you would like to filter tasks' context or target sets by specific quality annotations (e.g. remove all frames with no object present), you can use `--train_filter_context`/`--train_filter_target` to filter train tasks, or `--test_filter_context`/`--test_filter_target` to filter validation/test tasks. These arguments accept the same options as above. The filtering is applied to all context/target videos when the data loader is created (see `load_all_users` in `data/dataset.py`).
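For example, to drop all object-absent frames from the target (query) videos of validation/test tasks, the filter can be added to a test command, along these lines (shown with the FineTuner arguments from above):

```bash
# Exclude frames flagged with object_not_present_issue from val/test target videos
python3 multi-step-learner.py --data_path folder/to/save/dataset/orbit_benchmark_224 \
                              --feature_extractor efficientnet_b0 \
                              --mode test \
                              --classifier linear \
                              --context_video_type clean --target_video_type clutter \
                              --personalize_num_grad_steps 50 --personalize_learning_rate 0.001 --personalize_optimizer adam \
                              --batch_size 1024 \
                              --test_filter_target object_not_present_issue
```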
Some collectors/objects/videos did not meet the minimum requirement to be included in the ORBIT benchmark dataset. The full unfiltered ORBIT dataset of 4733 videos (frame size: 1080x1080) of 588 objects can be downloaded and saved to `folder/to/save/dataset/orbit_unfiltered` by running the following script:
bash scripts/download_unfiltered_dataset.sh folder/to/save/dataset
Alternatively, the train/validation/test/other ZIPs can be manually downloaded here. Use `scripts/merge_and_split_benchmark_users.py` to merge the `other` folder (see script for usage details).
To summarize and plot the unfiltered dataset, use `scripts/summarize_dataset.py` (with `--no_modes` rather than `--combine_modes`) and `scripts/plot_dataset.ipynb` (with `no_modes=True`), similar to above.
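Concretely, the summary call for the unfiltered dataset would look like:

```bash
# Summarize the unfiltered dataset (note --no_modes instead of --combine_modes)
python3 scripts/summarize_dataset.py --data_path folder/to/save/dataset/orbit_unfiltered --no_modes
```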
@article{bronskill2021lite,
title={{Memory Efficient Meta-Learning with Large Images}},
author={Bronskill, John and Massiceti, Daniela and Patacchiola, Massimiliano and Hofmann, Katja and Nowozin, Sebastian and Turner, Richard E.},
journal={Proceedings of the 35th Conference on Neural Information Processing Systems (NeurIPS)},
year={2021}}
@inproceedings{massiceti2021orbit,
title={{ORBIT: A Real-World Few-Shot Dataset for Teachable Object Recognition}},
author={Massiceti, Daniela and Zintgraf, Luisa and Bronskill, John and Theodorou, Lida and Harris, Matthew Tobias and Cutrell, Edward and Morrison, Cecily and Hofmann, Katja and Stumpf, Simone},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
year={2021}}
To ask questions or report issues, please open an issue on the Issues tab.
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.