Project Name | Stars | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language | Description |
---|---|---|---|---|---|---|---|---|
Fashion Mnist | 9,856 | a year ago | | | 24 | mit | Python | A MNIST-like fashion product database. Benchmark :point_down: |
Nlp_chinese_corpus | 8,344 | 5 days ago | | | 20 | mit | | Large Scale Chinese Corpus for NLP (大规模中文自然语言处理语料) |
Clue | 3,345 | 5 days ago | | | 73 | | Python | Chinese Language Understanding Evaluation Benchmark (中文语言理解测评基准): datasets, baselines, pre-trained models, corpus and leaderboard |
Benchmarking Gnns | 2,137 | 2 months ago | | | 5 | mit | Jupyter Notebook | Repository for benchmarking graph neural networks |
Deepmoji | 1,331 | a year ago | | | 9 | mit | Python | State-of-the-art deep learning model for analyzing sentiment, emotion, sarcasm, etc. |
Codexglue | 1,124 | 8 days ago | | | 21 | mit | C# | CodeXGLUE |
Beir | 872 | a month ago | 28 | June 30, 2022 | 60 | apache-2.0 | Python | A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets. |
Tdc | 809 | 15 days ago | 26 | February 20, 2022 | 28 | mit | Jupyter Notebook | Therapeutics Data Commons: Artificial Intelligence Foundation for Therapeutic Science |
Medmnist | 764 | a month ago | 3 | May 06, 2022 | | apache-2.0 | Python | [pip install medmnist] 18 MNIST-like Datasets for 2D and 3D Biomedical Image Classification |
Matterport | 746 | 6 months ago | | | 43 | mit | C++ | Matterport3D is a pretty awesome dataset for RGB-D machine learning tasks :) |
Paper | Installation | Quick Example | Datasets | Wiki | Hugging Face
BEIR is a heterogeneous benchmark containing diverse IR tasks. It also provides a common and easy framework for evaluating your NLP-based retrieval models within the benchmark.
For an overview, check out our new wiki page: https://github.com/beir-cellar/beir/wiki.
For models and datasets, check out our HuggingFace (HF) page: https://huggingface.co/BeIR.
For the leaderboard, check out our Eval AI page: https://eval.ai/web/challenges/challenge-page/1897.
For more information, check out our publications:
Install via pip:
pip install beir
If you want to build from source, use:
$ git clone https://github.com/beir-cellar/beir.git
$ cd beir
$ pip install -e .
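Either way, a quick way to confirm the installation succeeded is the minimal sanity check below; it assumes nothing beyond the package being importable.

```python
#### Quick sanity check that the installation worked
import beir
from beir.retrieval import models

print("beir imported from:", beir.__file__)
```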
Tested with Python versions 3.6 and 3.7.
For more code examples, please refer to our Examples and Tutorials Wiki page.
from beir import util, LoggingHandler
from beir.retrieval import models
from beir.datasets.data_loader import GenericDataLoader
from beir.retrieval.evaluation import EvaluateRetrieval
from beir.retrieval.search.dense import DenseRetrievalExactSearch as DRES
import logging
import pathlib, os
#### Just some code to print debug information to stdout
logging.basicConfig(format='%(asctime)s - %(message)s',
datefmt='%Y-%m-%d %H:%M:%S',
level=logging.INFO,
handlers=[LoggingHandler()])
#### /print debug information to stdout
#### Download scifact.zip dataset and unzip the dataset
dataset = "scifact"
url = "https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/{}.zip".format(dataset)
out_dir = os.path.join(pathlib.Path(__file__).parent.absolute(), "datasets")
data_path = util.download_and_unzip(url, out_dir)
#### Provide the data_path where scifact has been downloaded and unzipped
corpus, queries, qrels = GenericDataLoader(data_folder=data_path).load(split="test")
#### Load the SBERT model and retrieve using cosine-similarity
model = DRES(models.SentenceBERT("msmarco-distilbert-base-tas-b"), batch_size=16)
retriever = EvaluateRetrieval(model, score_function="dot") # or "cos_sim" for cosine similarity
results = retriever.retrieve(corpus, queries)
#### Evaluate your model with NDCG@k, MAP@k, Recall@k and Precision@k where k = [1,3,5,10,100,1000]
ndcg, _map, recall, precision = retriever.evaluate(qrels, results, retriever.k_values)
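`retriever.retrieve` returns a nested dictionary mapping each query id to `{doc_id: score}`, and `evaluate` returns one dictionary per metric. Below is a minimal sketch for inspecting both, reusing the variables from the snippet above; the exact keys shown in the comments are illustrative.

```python
#### Print the aggregated metrics, e.g. {"NDCG@10": ..., "NDCG@100": ...}
for metric in (ndcg, _map, recall, precision):
    print(metric)

#### Peek at the top-scored documents for one query
query_id, doc_scores = next(iter(results.items()))
top_docs = sorted(doc_scores.items(), key=lambda item: item[1], reverse=True)[:5]
print("Query:", queries[query_id])
for doc_id, score in top_docs:
    print(f"  {score:.4f}  {corpus[doc_id].get('title', '')}")
```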
Command to generate the md5 hash in a terminal: md5sum filename.zip
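If you prefer to compute the checksum from Python instead of the terminal, here is a minimal sketch using only the standard library; `filename.zip` is a placeholder for whichever dataset archive you downloaded.

```python
import hashlib

def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks so large dataset zips need not fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

print(md5_of_file("filename.zip"))  # compare against the md5 column in the table below
```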
You can view all datasets available here or on HuggingFace.
Dataset | Website | BEIR-Name | Public? | Type | Queries | Corpus | Rel D/Q | Download | md5 |
---|---|---|---|---|---|---|---|---|---|
MSMARCO | Homepage | msmarco | Yes | train, dev, test | 6,980 | 8.84M | 1.1 | Link | 444067daf65d982533ea17ebd59501e4 |
TREC-COVID | Homepage | trec-covid | Yes | test | 50 | 171K | 493.5 | Link | ce62140cb23feb9becf6270d0d1fe6d1 |
NFCorpus | Homepage | nfcorpus | Yes | train, dev, test | 323 | 3.6K | 38.2 | Link | a89dba18a62ef92f7d323ec890a0d38d |
BioASQ | Homepage | bioasq | No | train, test | 500 | 14.91M | 8.05 | No | How to Reproduce? |
NQ | Homepage | nq | Yes | train, test | 3,452 | 2.68M | 1.2 | Link | d4d3d2e48787a744b6f6e691ff534307 |
HotpotQA | Homepage | hotpotqa | Yes | train, dev, test | 7,405 | 5.23M | 2.0 | Link | f412724f78b0d91183a0e86805e16114 |
FiQA-2018 | Homepage | fiqa | Yes | train, dev, test | 648 | 57K | 2.6 | Link | 17918ed23cd04fb15047f73e6c3bd9d9 |
Signal-1M(RT) | Homepage | signal1m | No | test | 97 | 2.86M | 19.6 | No | How to Reproduce? |
TREC-NEWS | Homepage | trec-news | No | test | 57 | 595K | 19.6 | No | How to Reproduce? |
Robust04 | Homepage | robust04 | No | test | 249 | 528K | 69.9 | No | How to Reproduce? |
ArguAna | Homepage | arguana | Yes | test | 1,406 | 8.67K | 1.0 | Link | 8ad3e3c2a5867cdced806d6503f29b99 |
Touche-2020 | Homepage | webis-touche2020 | Yes | test | 49 | 382K | 19.0 | Link | 46f650ba5a527fc69e0a6521c5a23563 |
CQADupstack | Homepage | cqadupstack | Yes | test | 13,145 | 457K | 1.4 | Link | 4e41456d7df8ee7760a7f866133bda78 |
Quora | Homepage | quora | Yes | dev, test | 10,000 | 523K | 1.6 | Link | 18fb154900ba42a600f84b839c173167 |
DBPedia | Homepage | dbpedia-entity | Yes | dev, test | 400 | 4.63M | 38.2 | Link | c2a39eb420a3164af735795df012ac2c |
SCIDOCS | Homepage | scidocs | Yes | test | 1,000 | 25K | 4.9 | Link | 38121350fc3a4d2f48850f6aff52e4a9 |
FEVER | Homepage | fever | Yes | train, dev, test | 6,666 | 5.42M | 1.2 | Link | 5a818580227bfb4b35bb6fa46d9b6c03 |
Climate-FEVER | Homepage | climate-fever | Yes | test | 1,535 | 5.42M | 3.0 | Link | 8b66f0a9126c521bae2bde127b4dc99d |
SciFact | Homepage | scifact | Yes | train, test | 300 | 5K | 1.1 | Link | 5f7d1de60b170fc8027bb7898e2efca1 |
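All public datasets in the table share the download URL pattern used in the quick example, so several of them can be fetched in one loop. Here is a minimal sketch under that assumption; the BEIR-Names are taken from the table above, and the datasets marked "No" have no public download link and are simply not listed.

```python
import os, pathlib
from beir import util

#### BEIR-Names of a few public datasets from the table above
datasets = ["scifact", "nfcorpus", "arguana"]

base_url = "https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/{}.zip"
out_dir = os.path.join(pathlib.Path(__file__).parent.absolute(), "datasets")

for dataset in datasets:
    data_path = util.download_and_unzip(base_url.format(dataset), out_dir)
    print(f"{dataset} available at {data_path}")
```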
We also provide a variety of additional information on our Wiki pages. Please refer to them for the following:
Similar to Tensorflow datasets or HuggingFace's datasets library, we have only downloaded and prepared public datasets. We distribute these datasets in a specific format, but we do not vouch for their quality or fairness, nor do we claim that you have a license to use them. It remains your responsibility to determine whether you have permission to use each dataset under its license and to cite the rightful owner.
If you're a dataset owner and wish to update any part of it, or do not want your dataset to be included in this library, feel free to post an issue here or make a pull request!
If you're a dataset owner and wish to include your dataset or model in this library, feel free to post an issue here or make a pull request!
If you find this repository helpful, feel free to cite our publication BEIR: A Heterogeneous Benchmark for Zero-shot Evaluation of Information Retrieval Models:
@inproceedings{
thakur2021beir,
title={{BEIR}: A Heterogeneous Benchmark for Zero-shot Evaluation of Information Retrieval Models},
author={Nandan Thakur and Nils Reimers and Andreas R{\"u}ckl{\'e} and Abhishek Srivastava and Iryna Gurevych},
booktitle={Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2)},
year={2021},
url={https://openreview.net/forum?id=wCu6T5xFjeJ}
}
The main contributors of this repository are:
Contact person: Nandan Thakur, [email protected]
Don't hesitate to send us an e-mail or report an issue, if something is broken (and it shouldn't be) or if you have further questions.
This repository contains experimental software and is published for the sole purpose of giving additional background details on the respective publication.
The BEIR Benchmark has been made possible due to a collaborative effort of the following universities and organizations:
Thanks go to all these wonderful collaborators for their contributions towards the BEIR benchmark:
Nandan Thakur | Nils Reimers | Iryna Gurevych | Jimmy Lin | Andreas Rücklé | Abhishek Srivastava