ir_datasets: Beir (benchmark suite)

Bibtex:

@article{Thakur2021Beir,
  title = "BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models",
  author = "Thakur, Nandan and Reimers, Nils and Rücklé, Andreas and Srivastava, Abhishek and Gurevych, Iryna",
  journal = "arXiv preprint arXiv:2104.08663",
  month = "4",
  year = "2021",
  url = "https://arxiv.org/abs/2104.08663",
}

A version of the ArguAna Counterargs dataset, for argument retrieval.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/arguana")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
You can find more details about the Python API in the ir_datasets documentation.
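The same access pattern extends beyond queries. A minimal sketch, assuming only that queries_iter() yields namedtuples with the fields shown above; the lookup helper needs the ir_datasets package and downloads data on first use, so it is defined but not invoked here:

```python
from collections import namedtuple

# Stand-in for the rows queries_iter() yields (fields as shown above).
BeirQuery = namedtuple("BeirQuery", ["query_id", "text", "metadata"])

def queries_by_id(dataset_id="beir/arguana"):
    """Build a {query_id: text} lookup.

    Requires the ir_datasets package; load() fetches the data on first
    use. The dataset object also exposes docs_iter() and qrels_iter()
    for documents and relevance judgments.
    """
    import ir_datasets  # imported lazily so the sketch runs without it
    dataset = ir_datasets.load(dataset_id)
    return {q.query_id: q.text for q in dataset.queries_iter()}

# namedtuple rows support attribute access by field name
# (the query text here is a hypothetical placeholder):
q = BeirQuery("q1", "a hypothetical query text", "{}")
```

Attribute access (q.query_id, q.text) is the idiomatic way to consume these rows; index access (q[0]) also works, since they are tuples.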
A version of the CLIMATE-FEVER dataset, for fact verification on claims about climate.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/climate-fever")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the android StackExchange subforum.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/cqadupstack/android")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the english StackExchange subforum.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/cqadupstack/english")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the gaming StackExchange subforum.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/cqadupstack/gaming")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the gis StackExchange subforum.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/cqadupstack/gis")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the mathematica StackExchange subforum.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/cqadupstack/mathematica")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the physics StackExchange subforum.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/cqadupstack/physics")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the programmers StackExchange subforum.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/cqadupstack/programmers")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the stats StackExchange subforum.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/cqadupstack/stats")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the tex StackExchange subforum.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/cqadupstack/tex")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the unix StackExchange subforum.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/cqadupstack/unix")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the webmasters StackExchange subforum.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/cqadupstack/webmasters")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
A version of the CQADupStack dataset, for duplicate question retrieval. This subset is from the wordpress StackExchange subforum.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/cqadupstack/wordpress")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
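The twelve CQADupStack subsets above differ only in the forum segment of the dataset ID, so one loop can cover them all. A small sketch; only the ID construction is shown, since loading each subset requires ir_datasets and a download:

```python
# The twelve StackExchange subforums listed above.
CQADUPSTACK_FORUMS = [
    "android", "english", "gaming", "gis", "mathematica", "physics",
    "programmers", "stats", "tex", "unix", "webmasters", "wordpress",
]

def cqadupstack_id(forum):
    """Map a subforum name to its ir_datasets ID."""
    return f"beir/cqadupstack/{forum}"

dataset_ids = [cqadupstack_id(f) for f in CQADUPSTACK_FORUMS]
# e.g. ir_datasets.load(dataset_ids[0]) would load beir/cqadupstack/android
```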
A version of the DBPedia-Entity-v2 dataset for entity retrieval.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/dbpedia-entity")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
A random sample of 67 queries from the official test set, used as a dev set.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/dbpedia-entity/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
The official test set, without the 67 queries used as a dev set.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/dbpedia-entity/test")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
A version of the FEVER dataset for fact verification. Includes queries from the /train, /dev, and /test subsets.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/fever")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
The official dev set.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/fever/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
The official test set.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/fever/test")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
The official train set.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/fever/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
A version of the FIQA-2018 dataset (financial opinion question answering). Queries include those in the /train, /dev, and /test subsets.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/fiqa")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
Random sample of 500 queries from the official dataset.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/fiqa/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
Random sample of 648 queries from the official dataset.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/fiqa/test")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
Official dataset without the 1148 queries sampled for /dev and /test.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/fiqa/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
A version of the HotpotQA dataset for multi-hop question answering. Queries include all those in /train, /dev, and /test.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/hotpotqa")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
A random selection of 5,447 queries from /train.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/hotpotqa/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
Official dev set from HotpotQA, here used as a test set.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/hotpotqa/test")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
Official train set, without the 5,447 queries randomly selected for /dev.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/hotpotqa/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
A version of the MS MARCO passage ranking dataset. Includes queries from the /train, /dev, and /test sub-datasets.
Note that this version differs from msmarco-passage in that it does not correct the encoding problems in the source documents.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/msmarco")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
A version of the MS MARCO passage ranking dev set.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/msmarco/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
A version of the TREC Deep Learning 2019 set.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/msmarco/test")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
A version of the MS MARCO passage ranking train set.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/msmarco/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
A version of the NF Corpus (Nutrition Facts). Queries use the "title" variant of the query, which here is often a natural-language question. Queries include all those from /train, /dev, and /test.
Data pre-processing may differ from what is done in nfcorpus.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/nfcorpus")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
Combined dev set of NFCorpus.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/nfcorpus/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
Combined test set of NFCorpus.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/nfcorpus/test")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
Combined train set of NFCorpus.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/nfcorpus/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
A version of the Natural Questions dev dataset.
Data pre-processing differs both from what is done in natural-questions and dpr-w100/natural-questions, especially with respect to the document collection and filtering conducted on the queries. See the Beir paper for details.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/nq")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
A version of the Quora duplicate question detection dataset (QQP). Includes queries from /dev and /test sets.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/quora")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
A 5,000-question subset of the original dataset, with no overlap with the other subsets.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/quora/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
A 10,000-question subset of the original dataset, with no overlap with the other subsets.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/quora/test")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
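The quora /dev and /test subsets are described as non-overlapping; with two query-ID sets in hand, that property is a one-line check. A sketch with hypothetical IDs (in practice each set would come from {q.query_id for q in dataset.queries_iter()}):

```python
# Hypothetical query-ID sets standing in for the /dev and /test subsets.
dev_ids = {"q1", "q2", "q3"}
test_ids = {"q4", "q5"}

assert dev_ids.isdisjoint(test_ids)  # no query appears in both subsets
overlap = dev_ids & test_ids  # empty set when the split is clean
```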
A version of the SciDocs dataset, used for citation retrieval.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/scidocs")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
A version of the SciFact dataset, for fact verification. Queries include those from the /train and /test sets.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/scifact")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
The official dev set, used here as a test set.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/scifact/test")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
The official train set.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/scifact/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
A version of the TREC COVID (complete) dataset, with titles and abstracts as documents. Queries are the question variant.
Data pre-processing may differ from what is done in cord19/trec-covid.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/trec-covid")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
A version of the Touché-2020 dataset, for argument retrieval.
Negative relevance judgments from the original dataset are replaced with 0.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("beir/webis-touche2020")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, metadata>
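Because the original Touché negative judgments are mapped to 0 here, filtering on relevance > 0 recovers the positively-judged query-document pairs. A sketch over plain (query_id, doc_id, relevance) tuples; real rows would come from dataset.qrels_iter(), and the IDs below are hypothetical:

```python
# Hypothetical qrels rows as (query_id, doc_id, relevance) tuples.
qrels = [
    ("q1", "d1", 2),
    ("q1", "d2", 0),  # originally a negative judgment, mapped to 0
    ("q2", "d3", 1),
]

# Keep only positively-judged pairs.
positives = [(q, d) for q, d, rel in qrels if rel > 0]
```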