BRIGHT (benchmark suite)

`"bright"`

BRIGHT is a retrieval benchmark in which finding relevant documents requires reasoning rather than surface-level lexical or semantic matching. It spans 12 diverse domains drawn from sources such as StackExchange, coding problems (LeetCode), and math competitions (AoPS, TheoremQA).

Each domain is available as a separate subset. The base subsets use short documents; the -long subsets provide the original long-form documents with relevance judgments mapped accordingly. Queries include the original reasoning rationale, gold answer, and LLM-generated reasoning fields (from Gemini, Claude 3 Opus, GPT-4, GRIT, and Llama3-70B).
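The identifiers used in the examples on this page follow a regular pattern, sketched below. `bright_ids` is a hypothetical helper, not part of ir_datasets; it only rebuilds the strings that appear in the Python API, PyTerrier, and XPM-IR examples. Whether a -long variant exists for a given domain is an assumption to check against the list of subsets.

```python
def bright_ids(domain: str, long: bool = False) -> dict:
    """Derive the identifier strings used on this page for one BRIGHT subset.

    Hypothetical helper: it does not check that the subset actually exists.
    """
    name = f"{domain}-long" if long else domain
    return {
        "ir_datasets": f"bright/{name}",           # argument to ir_datasets.load(...)
        "pyterrier": f"irds:bright/{name}",        # argument to pt.get_dataset(...)
        "xpm_ir": f"irds.bright.{name}",           # argument to prepare_dataset(...)
        "index_path": f"./indices/bright_{name}",  # index directory used in the PyTerrier examples
    }


print(bright_ids("biology", long=True)["ir_datasets"])
```

Building the strings in one place keeps the dataset ID and the index directory in sync when switching between a base subset and its long-document variant.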

Documents: Domain-specific passages (web posts, documentation, problems, etc.)
Queries: Reasoning-intensive natural language questions
Dataset Paper
GitHub
Website

Citation

ir_datasets.bib:

\cite{DBLP:conf/iclr/SuYXSMWLSST0YA025}

Bibtex:

@inproceedings{DBLP:conf/iclr/SuYXSMWLSST0YA025,
  author = {Hongjin Su and Howard Yen and Mengzhou Xia and Weijia Shi and Niklas Muennighoff and Han{-}yu Wang and Haisu Liu and Quan Shi and Zachary S. Siegel and Michael Tang and Ruoxi Sun and Jinsung Yoon and Sercan {\"{O}}. Arik and Danqi Chen and Tao Yu},
  title = {{BRIGHT:} {A} Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval},
  booktitle = {The Thirteenth International Conference on Learning Representations, {ICLR} 2025, Singapore, April 24-28, 2025},
  publisher = {OpenReview.net},
  year = {2025},
  url = {https://openreview.net/forum?id=ykuc5q381b},
  timestamp = {Thu, 15 May 2025 17:19:05 +0200},
  biburl = {https://dblp.org/rec/conf/iclr/SuYXSMWLSST0YA025.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

`"bright/aops"`

Math competition problems from the Art of Problem Solving (AoPS) forum, where relevance requires recognizing shared problem-solving techniques.

Official evaluation measures: nDCG@10
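For readers unfamiliar with the measure, the following is a minimal sketch of nDCG@10 for a single query, assuming binary relevance (only level 1 counts as relevant, as in the relevance levels on this page). The function name and the doc IDs are made up for illustration; evaluation toolkits may differ in details such as tie handling.

```python
import math


def ndcg_at_k(ranked_doc_ids, relevant, k=10):
    """nDCG@k with binary gains: 1 for a relevant doc, 0 otherwise.

    Assumes `ranked_doc_ids` is ordered best first and contains no duplicates.
    """
    # Discounted gain of the relevant documents found in the top k.
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked_doc_ids[:k], start=1)
        if doc_id in relevant
    )
    # Ideal DCG: all relevant documents (at most k) placed at the top.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


ranking = ["d7", "d2", "d9"]  # made-up doc IDs, best first
relevant = {"d2"}             # made-up relevant set for this query
print(ndcg_at_k(ranking, relevant))  # about 0.63: the only relevant doc is at rank 2
```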

queries

111 queries

Language: en

Query type:

BrightQuery: (namedtuple)

query_id: str
text: str
reasoning: str
gold_answer: str
gemini_1_0_reason: str
claude_3_opus_reason: str
gpt4_reason: str
grit_reason: str
llama3_70b_reason: str
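Since every query carries several alternative text fields, a run can be issued with the original text or with one of the LLM-generated reasoning fields. The sketch below shows one way to pick the field. The namedtuple is a stand-in that mirrors the fields listed above (the real type comes from ir_datasets), the values are invented, and falling back to `text` when a field is empty is an assumption of this example rather than documented behavior.

```python
from collections import namedtuple

# Stand-in mirroring the BrightQuery fields listed above.
BrightQuery = namedtuple("BrightQuery", [
    "query_id", "text", "reasoning", "gold_answer", "gemini_1_0_reason",
    "claude_3_opus_reason", "gpt4_reason", "grit_reason", "llama3_70b_reason",
])


def query_text(query, field="text"):
    """Return the chosen field, falling back to the original text if it is empty."""
    value = getattr(query, field)
    return value if value else query.text


# Invented example values, for illustration only.
q = BrightQuery("q1", "original question", "", "answer", "", "", "step-by-step reasoning", "", "")
print(query_text(q, "gpt4_reason"))  # uses the GPT-4 reasoning field
print(query_text(q, "grit_reason"))  # empty here, so the original text is used
```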

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/aops")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI

ir_datasets export bright/aops queries



[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/aops')
index_ref = pt.IndexRef.of('./indices/bright_aops') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR

from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.aops.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs

188K docs

Language: en

Document type:

GenericDoc: (namedtuple)

doc_id: str
text: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/aops")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI

ir_datasets export bright/aops docs



[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/aops')
# Index bright/aops
indexer = pt.IterDictIndexer('./indices/bright_aops', meta={"docno": 62})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR

from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.aops')
for doc in dataset.iter_documents():
    print(doc)  # an AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels

623K qrels

Query relevance judgment type:

TrecQrel: (namedtuple)

query_id: str
doc_id: str
relevance: int
iteration: str

Relevance levels

Rel.	Definition	Count	%
-100	Excluded from evaluation	`623K`	99.9%
1	Relevant	`524`	0.1%
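Almost all judgments in this subset carry the level -100, so they need to be separated from the relevant ones before scoring. The sketch below assumes that "excluded from evaluation" means the listed document is removed from that query's ranking before the measure is computed; this interpretation, the helper names, and the sample data are assumptions of this example. The namedtuple is a stand-in for the TrecQrel type listed above.

```python
from collections import defaultdict, namedtuple

# Stand-in mirroring the TrecQrel fields listed above.
TrecQrel = namedtuple("TrecQrel", ["query_id", "doc_id", "relevance", "iteration"])

EXCLUDED = -100  # relevance level marking documents excluded from evaluation


def split_qrels(qrels):
    """Split qrels into per-query sets of relevant and excluded doc IDs."""
    relevant = defaultdict(set)
    excluded = defaultdict(set)
    for qrel in qrels:
        if qrel.relevance == EXCLUDED:
            excluded[qrel.query_id].add(qrel.doc_id)
        elif qrel.relevance > 0:
            relevant[qrel.query_id].add(qrel.doc_id)
    return relevant, excluded


def drop_excluded(ranked_doc_ids, excluded_for_query):
    """Remove excluded documents from one query's ranking, keeping the order."""
    return [doc_id for doc_id in ranked_doc_ids if doc_id not in excluded_for_query]


# Made-up judgments and ranking, for illustration only.
qrels = [TrecQrel("q1", "d1", 1, "0"), TrecQrel("q1", "d2", -100, "0")]
relevant, excluded = split_qrels(qrels)
print(drop_excluded(["d2", "d3", "d1"], excluded["q1"]))  # ['d3', 'd1']
```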

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/aops")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI

ir_datasets export bright/aops qrels --format tsv



[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/aops')
index_ref = pt.IndexRef.of('./indices/bright_aops') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [nDCG@10]
)

You can find more details about PyTerrier experiments here.

XPM-IR

from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.aops.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # the relevance assessments for one topic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocAssessments.

Metadata

{
  "docs": {
    "count": 188002,
    "fields": {
      "doc_id": {
        "max_len": 62,
        "common_prefix": ""
      }
    }
  },
  "queries": {
    "count": 111
  },
  "qrels": {
    "count": 623280,
    "fields": {
      "relevance": {
        "counts_by_value": {
          "1": 524,
          "-100": 622756
        }
      }
    }
  }
}
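The `max_len` reported for `doc_id` matches the `docno` length passed to `pt.IterDictIndexer` in the indexing example for this subset (62). A minimal sketch of deriving that setting from the metadata follows; `docno_meta` is a hypothetical helper, and the JSON string is an abridged copy of the block above.

```python
import json

# Abridged copy of the metadata shown above.
metadata_json = """
{"docs": {"count": 188002,
          "fields": {"doc_id": {"max_len": 62, "common_prefix": ""}}}}
"""


def docno_meta(metadata):
    """Build a meta dict that reserves enough characters for the longest doc_id."""
    return {"docno": metadata["docs"]["fields"]["doc_id"]["max_len"]}


meta = docno_meta(json.loads(metadata_json))
print(meta)  # {'docno': 62}
```

Reading the length from the metadata avoids hard-coding a different number for every subset.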

`"bright/biology"`

Reasoning-intensive retrieval over biology content sourced from StackExchange.

Official evaluation measures: nDCG@10

queries

103 queries

Language: en

Query type:

BrightQuery: (namedtuple)

query_id: str
text: str
reasoning: str
gold_answer: str
gemini_1_0_reason: str
claude_3_opus_reason: str
gpt4_reason: str
grit_reason: str
llama3_70b_reason: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/biology")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI

ir_datasets export bright/biology queries



[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/biology')
index_ref = pt.IndexRef.of('./indices/bright_biology') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR

from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.biology.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs

57K docs

Language: en

Document type:

GenericDoc: (namedtuple)

doc_id: str
text: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/biology")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI

ir_datasets export bright/biology docs



[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/biology')
# Index bright/biology
indexer = pt.IterDictIndexer('./indices/bright_biology', meta={"docno": 149})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR

from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.biology')
for doc in dataset.iter_documents():
    print(doc)  # an AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels

372 qrels

Query relevance judgment type:

TrecQrel: (namedtuple)

query_id: str
doc_id: str
relevance: int
iteration: str

Relevance levels

Rel.	Definition	Count	%
-100	Excluded from evaluation	`0`	0.0%
1	Relevant	`372`	100.0%

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/biology")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI

ir_datasets export bright/biology qrels --format tsv



[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/biology')
index_ref = pt.IndexRef.of('./indices/bright_biology') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [nDCG@10]
)

You can find more details about PyTerrier experiments here.

XPM-IR

from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.biology.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # the relevance assessments for one topic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocAssessments.

Metadata

{
  "docs": {
    "count": 57359,
    "fields": {
      "doc_id": {
        "max_len": 149,
        "common_prefix": ""
      }
    }
  },
  "queries": {
    "count": 103
  },
  "qrels": {
    "count": 372,
    "fields": {
      "relevance": {
        "counts_by_value": {
          "1": 372
        }
      }
    }
  }
}
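As a quick worked example, the metadata above is enough to compute the average number of relevant documents per query. The helper name is hypothetical, and the dictionary is an abridged copy of the block above.

```python
# Abridged copy of the metadata shown above.
metadata = {
    "queries": {"count": 103},
    "qrels": {"count": 372, "fields": {"relevance": {"counts_by_value": {"1": 372}}}},
}


def relevant_per_query(metadata):
    """Average number of positively judged documents per query."""
    counts = metadata["qrels"]["fields"]["relevance"]["counts_by_value"]
    # Only positive levels count as relevant; -100 entries, if present, are skipped.
    relevant = sum(n for level, n in counts.items() if int(level) > 0)
    return relevant / metadata["queries"]["count"]


print(round(relevant_per_query(metadata), 2))  # 3.61, i.e. 372 / 103
```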

`"bright/biology-long"`

Long-document variant of bright/biology, retrieving the original full-length documents.

Official evaluation measures: Success@1
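Success@1 asks only whether the top-ranked document is relevant. A minimal sketch follows, assuming a run stored as a dict from query ID to a best-first list of doc IDs and qrels stored as a dict from query ID to the set of relevant doc IDs; both layouts, the function names, and the sample data are assumptions of this example.

```python
def success_at_k(ranked_doc_ids, relevant, k=1):
    """Return 1.0 if any of the top-k documents is relevant, else 0.0."""
    return 1.0 if any(doc_id in relevant for doc_id in ranked_doc_ids[:k]) else 0.0


def mean_success(run, qrels, k=1):
    """Average Success@k over the queries in `qrels`; queries missing from the run score 0."""
    scores = [success_at_k(run.get(qid, []), rel, k) for qid, rel in qrels.items()]
    return sum(scores) / len(scores) if scores else 0.0


# Made-up run and judgments, for illustration only.
run = {"q1": ["a", "b"], "q2": ["c", "d"]}
qrels = {"q1": {"a"}, "q2": {"d"}}
print(mean_success(run, qrels))  # 0.5: only q1 has a relevant document at rank 1
```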

queries

103 queries

Language: en

Query type:

BrightQuery: (namedtuple)

query_id: str
text: str
reasoning: str
gold_answer: str
gemini_1_0_reason: str
claude_3_opus_reason: str
gpt4_reason: str
grit_reason: str
llama3_70b_reason: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/biology-long")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI

ir_datasets export bright/biology-long queries



[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/biology-long')
index_ref = pt.IndexRef.of('./indices/bright_biology-long') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR

from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.biology-long.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs

524 docs

Language: en

Document type:

GenericDoc: (namedtuple)

doc_id: str
text: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/biology-long")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI

ir_datasets export bright/biology-long docs



[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/biology-long')
# Index bright/biology-long
indexer = pt.IterDictIndexer('./indices/bright_biology-long', meta={"docno": 144})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR

from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.biology-long')
for doc in dataset.iter_documents():
    print(doc)  # an AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels

134 qrels

Query relevance judgment type:

TrecQrel: (namedtuple)

query_id: str
doc_id: str
relevance: int
iteration: str

Relevance levels

Rel.	Definition	Count	%
-100	Excluded from evaluation	`0`	0.0%
1	Relevant	`134`	100.0%

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/biology-long")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI

ir_datasets export bright/biology-long qrels --format tsv



[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/biology-long')
index_ref = pt.IndexRef.of('./indices/bright_biology-long') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [Success@1]
)

You can find more details about PyTerrier experiments here.

XPM-IR

from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.biology-long.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # the relevance assessments for one topic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocAssessments.

Metadata

{
  "docs": {
    "count": 524,
    "fields": {
      "doc_id": {
        "max_len": 144,
        "common_prefix": ""
      }
    }
  },
  "queries": {
    "count": 103
  },
  "qrels": {
    "count": 134,
    "fields": {
      "relevance": {
        "counts_by_value": {
          "1": 134
        }
      }
    }
  }
}

`"bright/earth-science"`

Reasoning-intensive retrieval over earth science content sourced from StackExchange.

Official evaluation measures: nDCG@10

queries

116 queries

Language: en

Query type:

BrightQuery: (namedtuple)

query_id: str
text: str
reasoning: str
gold_answer: str
gemini_1_0_reason: str
claude_3_opus_reason: str
gpt4_reason: str
grit_reason: str
llama3_70b_reason: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/earth-science")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI

ir_datasets export bright/earth-science queries



[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/earth-science')
index_ref = pt.IndexRef.of('./indices/bright_earth-science') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR

from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.earth-science.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs

121K docs

Language: en

Document type:

GenericDoc: (namedtuple)

doc_id: str
text: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/earth-science")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI

ir_datasets export bright/earth-science docs



[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/earth-science')
# Index bright/earth-science
indexer = pt.IterDictIndexer('./indices/bright_earth-science', meta={"docno": 145})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR

from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.earth-science')
for doc in dataset.iter_documents():
    print(doc)  # an AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels

585 qrels

Query relevance judgment type:

TrecQrel: (namedtuple)

query_id: str
doc_id: str
relevance: int
iteration: str

Relevance levels

Rel.	Definition	Count	%
-100	Excluded from evaluation	`0`	0.0%
1	Relevant	`585`	100.0%

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/earth-science")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI

ir_datasets export bright/earth-science qrels --format tsv



[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/earth-science')
index_ref = pt.IndexRef.of('./indices/bright_earth-science') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [nDCG@10]
)

You can find more details about PyTerrier experiments here.

XPM-IR

from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.earth-science.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # the relevance assessments for one topic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocAssessments.

Metadata

{
  "docs": {
    "count": 121249,
    "fields": {
      "doc_id": {
        "max_len": 145,
        "common_prefix": ""
      }
    }
  },
  "queries": {
    "count": 116
  },
  "qrels": {
    "count": 585,
    "fields": {
      "relevance": {
        "counts_by_value": {
          "1": 585
        }
      }
    }
  }
}

`"bright/earth-science-long"`

Long-document variant of bright/earth-science, retrieving the original full-length documents.

Official evaluation measures: Success@1

queries

116 queries

Language: en

Query type:

BrightQuery: (namedtuple)

query_id: str
text: str
reasoning: str
gold_answer: str
gemini_1_0_reason: str
claude_3_opus_reason: str
gpt4_reason: str
grit_reason: str
llama3_70b_reason: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/earth-science-long")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI

ir_datasets export bright/earth-science-long queries



[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/earth-science-long')
index_ref = pt.IndexRef.of('./indices/bright_earth-science-long') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR

from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.earth-science-long.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs

601 docs

Language: en

Document type:

GenericDoc: (namedtuple)

doc_id: str
text: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/earth-science-long")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI

ir_datasets export bright/earth-science-long docs



[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/earth-science-long')
# Index bright/earth-science-long
indexer = pt.IterDictIndexer('./indices/bright_earth-science-long', meta={"docno": 142})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR

from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.earth-science-long')
for doc in dataset.iter_documents():
    print(doc)  # an AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels

187 qrels

Query relevance judgment type:

TrecQrel: (namedtuple)

query_id: str
doc_id: str
relevance: int
iteration: str

Relevance levels

Rel.	Definition	Count	%
-100	Excluded from evaluation	`0`	0.0%
1	Relevant	`187`	100.0%

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/earth-science-long")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI

ir_datasets export bright/earth-science-long qrels --format tsv



[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/earth-science-long')
index_ref = pt.IndexRef.of('./indices/bright_earth-science-long') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [Success@1]
)

You can find more details about PyTerrier experiments here.

XPM-IR

from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.earth-science-long.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # the relevance assessments for one topic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocAssessments.

Metadata

{
  "docs": {
    "count": 601,
    "fields": {
      "doc_id": {
        "max_len": 142,
        "common_prefix": ""
      }
    }
  },
  "queries": {
    "count": 116
  },
  "qrels": {
    "count": 187,
    "fields": {
      "relevance": {
        "counts_by_value": {
          "1": 187
        }
      }
    }
  }
}

`"bright/economics"`

Reasoning-intensive retrieval over economics content sourced from StackExchange.

Official evaluation measures: nDCG@10

queries

103 queries

Language: en

Query type:

BrightQuery: (namedtuple)

query_id: str
text: str
reasoning: str
gold_answer: str
gemini_1_0_reason: str
claude_3_opus_reason: str
gpt4_reason: str
grit_reason: str
llama3_70b_reason: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/economics")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI

ir_datasets export bright/economics queries



[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/economics')
index_ref = pt.IndexRef.of('./indices/bright_economics') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR

from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.economics.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs

50K docs

Language: en

Document type:

GenericDoc: (namedtuple)

doc_id: str
text: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/economics")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI

ir_datasets export bright/economics docs



[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/economics')
# Index bright/economics
indexer = pt.IterDictIndexer('./indices/bright_economics', meta={"docno": 156})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR

from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.economics')
for doc in dataset.iter_documents():
    print(doc)  # an AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels

800 qrels

Query relevance judgment type:

TrecQrel: (namedtuple)

query_id: str
doc_id: str
relevance: int
iteration: str

Relevance levels

Rel.	Definition	Count	%
-100	Excluded from evaluation	`0`	0.0%
1	Relevant	`800`	100.0%

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/economics")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI

ir_datasets export bright/economics qrels --format tsv



[query_id]    [doc_id]    [relevance]    [iteration]
...
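The tab-separated output above can be read back with Python's standard `csv` module. This is a minimal sketch, assuming the four columns appear in the order shown; `QrelRow` and `parse_qrels_tsv` are hypothetical helpers, not part of ir_datasets, and the sample IDs are made up.

```python
import csv
import io
from typing import Iterator, NamedTuple, TextIO


class QrelRow(NamedTuple):
    query_id: str
    doc_id: str
    relevance: int
    iteration: str


def parse_qrels_tsv(stream: TextIO) -> Iterator[QrelRow]:
    """Yield one QrelRow per non-empty line of tab-separated qrels."""
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row:
            continue  # skip blank lines
        query_id, doc_id, relevance, iteration = row
        yield QrelRow(query_id, doc_id, int(relevance), iteration)


# Illustrative input only; a real file would come from the export command.
sample = io.StringIO("q1\tdocA\t1\t0\nq1\tdocB\t1\t0\n")
rows = list(parse_qrels_tsv(sample))
```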

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/economics')
index_ref = pt.IndexRef.of('./indices/bright_economics') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [nDCG@10]
)

You can find more details about PyTerrier experiments here.

XPM-IR

from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.economics.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # the relevance assessments for one topic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocAssessments.

Citation

ir_datasets.bib:

\cite{DBLP:conf/iclr/SuYXSMWLSST0YA025}

Bibtex:

@inproceedings{DBLP:conf/iclr/SuYXSMWLSST0YA025, author = {Hongjin Su and Howard Yen and Mengzhou Xia and Weijia Shi and Niklas Muennighoff and Han{-}yu Wang and Haisu Liu and Quan Shi and Zachary S. Siegel and Michael Tang and Ruoxi Sun and Jinsung Yoon and Sercan {\"{O}}. Arik and Danqi Chen and Tao Yu}, title = {{BRIGHT:} {A} Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval}, booktitle = {The Thirteenth International Conference on Learning Representations, {ICLR} 2025, Singapore, April 24-28, 2025}, publisher = {OpenReview.net}, year = {2025}, url = {https://openreview.net/forum?id=ykuc5q381b}, timestamp = {Thu, 15 May 2025 17:19:05 +0200}, biburl = {https://dblp.org/rec/conf/iclr/SuYXSMWLSST0YA025.bib}, bibsource = {dblp computer science bibliography, https://dblp.org} }

Metadata

{
  "docs": {
    "count": 50220,
    "fields": {
      "doc_id": {
        "max_len": 156,
        "common_prefix": ""
      }
    }
  },
  "queries": {
    "count": 103
  },
  "qrels": {
    "count": 800,
    "fields": {
      "relevance": {
        "counts_by_value": {
          "1": 800
        }
      }
    }
  }
}

`"bright/economics-long"`

Long-document variant of bright/economics, retrieving the original full-length documents.

Official evaluation measures: Success@1
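Success@1 asks whether the top-ranked document for a query is relevant, and averages that over queries. The following is a minimal sketch under the assumption of binary relevance. `success_at_1` is a hypothetical helper written for illustration, not the implementation used by any evaluation tool, and the IDs are made up.

```python
from typing import Dict, List, Set


def success_at_1(run: Dict[str, List[str]], relevant: Dict[str, Set[str]]) -> float:
    """Fraction of judged queries whose top-ranked document is relevant.

    run maps query_id -> doc_ids ordered best first;
    relevant maps query_id -> set of relevant doc_ids.
    Queries with no ranked documents count as failures.
    """
    if not relevant:
        return 0.0
    hits = 0
    for query_id, rel_docs in relevant.items():
        ranking = run.get(query_id, [])
        if ranking and ranking[0] in rel_docs:
            hits += 1
    return hits / len(relevant)


# Toy data: q1 succeeds at rank 1, q2 finds its relevant document only at rank 2.
relevant = {"q1": {"d1"}, "q2": {"d7", "d8"}}
run = {"q1": ["d1", "d3"], "q2": ["d2", "d7"]}
score = success_at_1(run, relevant)
```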

queries

103 queries

Language: en

Query type:

BrightQuery: (namedtuple)

query_id: str
text: str
reasoning: str
gold_answer: str
gemini_1_0_reason: str
claude_3_opus_reason: str
gpt4_reason: str
grit_reason: str
llama3_70b_reason: str
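Because each query carries several alternative texts, a retrieval run can be issued with the original `text` or with one of the reasoning fields. The sketch below shows one possible way to pick a field and fall back to `text` when it is empty. The fallback rule is an assumption made for this example; the local `BrightQuery` class merely mirrors the fields listed above, and `retrieval_text` is a hypothetical helper.

```python
from typing import NamedTuple


class BrightQuery(NamedTuple):
    # Local stand-in mirroring the documented fields, for illustration only.
    query_id: str
    text: str
    reasoning: str
    gold_answer: str
    gemini_1_0_reason: str
    claude_3_opus_reason: str
    gpt4_reason: str
    grit_reason: str
    llama3_70b_reason: str


def retrieval_text(query: BrightQuery, field: str = "text") -> str:
    """Return the chosen field, falling back to the original text if it is empty."""
    if field not in query._fields:
        raise ValueError(f"unknown field: {field}")
    value = getattr(query, field)
    return value if value else query.text


# Made-up query with only the GPT-4 reasoning field filled in.
q = BrightQuery("0", "original question", "", "", "", "", "expanded by GPT-4", "", "")
chosen = retrieval_text(q, "gpt4_reason")
fallback = retrieval_text(q, "grit_reason")
```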

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/economics-long")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI

ir_datasets export bright/economics-long queries



[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/economics-long')
index_ref = pt.IndexRef.of('./indices/bright_economics-long') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR

from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.economics-long.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs

516 docs

Language: en

Document type:

GenericDoc: (namedtuple)

doc_id: str
text: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/economics-long")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI

ir_datasets export bright/economics-long docs



[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/economics-long')
# Index bright/economics-long
indexer = pt.IterDictIndexer('./indices/bright_economics-long', meta={"docno": 152})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR

from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.economics-long')
for doc in dataset.iter_documents():
    print(doc)  # an AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels

109 qrels

Query relevance judgment type:

TrecQrel: (namedtuple)

query_id: str
doc_id: str
relevance: int
iteration: str

Relevance levels

Rel.	Definition	Count	%
-100	Excluded from evaluation	`0`	0.0%
1	Relevant	`109`	100.0%

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/economics-long")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI

ir_datasets export bright/economics-long qrels --format tsv



[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/economics-long')
index_ref = pt.IndexRef.of('./indices/bright_economics-long') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [Success@1]
)

You can find more details about PyTerrier experiments here.

XPM-IR

from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.economics-long.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # the relevance assessments for one topic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocAssessments.

Citation

ir_datasets.bib:

\cite{DBLP:conf/iclr/SuYXSMWLSST0YA025}

Bibtex:

@inproceedings{DBLP:conf/iclr/SuYXSMWLSST0YA025, author = {Hongjin Su and Howard Yen and Mengzhou Xia and Weijia Shi and Niklas Muennighoff and Han{-}yu Wang and Haisu Liu and Quan Shi and Zachary S. Siegel and Michael Tang and Ruoxi Sun and Jinsung Yoon and Sercan {\"{O}}. Arik and Danqi Chen and Tao Yu}, title = {{BRIGHT:} {A} Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval}, booktitle = {The Thirteenth International Conference on Learning Representations, {ICLR} 2025, Singapore, April 24-28, 2025}, publisher = {OpenReview.net}, year = {2025}, url = {https://openreview.net/forum?id=ykuc5q381b}, timestamp = {Thu, 15 May 2025 17:19:05 +0200}, biburl = {https://dblp.org/rec/conf/iclr/SuYXSMWLSST0YA025.bib}, bibsource = {dblp computer science bibliography, https://dblp.org} }

Metadata

{
  "docs": {
    "count": 516,
    "fields": {
      "doc_id": {
        "max_len": 152,
        "common_prefix": ""
      }
    }
  },
  "queries": {
    "count": 103
  },
  "qrels": {
    "count": 109,
    "fields": {
      "relevance": {
        "counts_by_value": {
          "1": 109
        }
      }
    }
  }
}

`"bright/leetcode"`

Coding problems from LeetCode, where relevant documents share the underlying algorithmic approach needed to solve the query problem.

Official evaluation measures: nDCG@10
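For readers unfamiliar with the measure, nDCG@10 for a single query can be sketched as follows. This uses the common formulation with linear gains and a log2 rank discount; it is an illustrative stand-in, and the evaluation tools referenced on this page may differ in details such as tie handling. `ndcg_at_10` is a hypothetical helper and the IDs are made up.

```python
import math
from typing import Dict, List


def ndcg_at_10(ranking: List[str], gains: Dict[str, int]) -> float:
    """nDCG@10 for one query.

    ranking lists doc_ids best first; gains maps doc_id -> relevance.
    Non-positive relevance values contribute nothing.
    """
    k = 10
    dcg = sum(
        max(gains.get(doc_id, 0), 0) / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranking[:k], start=1)
    )
    # Ideal ranking: all positively judged documents, highest gain first.
    ideal = sorted((g for g in gains.values() if g > 0), reverse=True)[:k]
    idcg = sum(g / math.log2(rank + 1) for rank, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0


# One relevant document, retrieved at rank 2 instead of rank 1.
score = ndcg_at_10(["d2", "d1"], {"d1": 1})
perfect = ndcg_at_10(["d1", "d2"], {"d1": 1})
```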

queries

142 queries

Language: en

Query type:

BrightQuery: (namedtuple)

query_id: str
text: str
reasoning: str
gold_answer: str
gemini_1_0_reason: str
claude_3_opus_reason: str
gpt4_reason: str
grit_reason: str
llama3_70b_reason: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/leetcode")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI

ir_datasets export bright/leetcode queries



[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/leetcode')
index_ref = pt.IndexRef.of('./indices/bright_leetcode') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR

from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.leetcode.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs

414K docs

Language: en

Document type:

GenericDoc: (namedtuple)

doc_id: str
text: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/leetcode")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI

ir_datasets export bright/leetcode docs



[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/leetcode')
# Index bright/leetcode
indexer = pt.IterDictIndexer('./indices/bright_leetcode', meta={"docno": 36})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR

from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.leetcode')
for doc in dataset.iter_documents():
    print(doc)  # an AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels

34K qrels

Query relevance judgment type:

TrecQrel: (namedtuple)

query_id: str
doc_id: str
relevance: int
iteration: str

Relevance levels

Rel.	Definition	Count	%
-100	Excluded from evaluation	`33K`	99.2%
1	Relevant	`262`	0.8%
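The -100 level marks query-document pairs that are excluded from evaluation. One plausible way to honor that, shown here as an assumption since this page does not say how evaluation tools treat the value, is to drop those documents from each query's ranking before scoring. `Qrel`, `excluded_by_query`, and `drop_excluded` are hypothetical names, and the IDs are made up.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Set

EXCLUDED = -100  # relevance value listed as "Excluded from evaluation"


class Qrel(NamedTuple):
    # Local stand-in with the same fields as TrecQrel.
    query_id: str
    doc_id: str
    relevance: int
    iteration: str


def excluded_by_query(qrels: Iterable[Qrel]) -> Dict[str, Set[str]]:
    """Collect, per query, the doc_ids judged with the excluded value."""
    excluded: Dict[str, Set[str]] = defaultdict(set)
    for qrel in qrels:
        if qrel.relevance == EXCLUDED:
            excluded[qrel.query_id].add(qrel.doc_id)
    return dict(excluded)


def drop_excluded(
    run: Dict[str, List[str]], excluded: Dict[str, Set[str]]
) -> Dict[str, List[str]]:
    """Remove excluded documents from each ranking, preserving order."""
    return {
        query_id: [d for d in ranking if d not in excluded.get(query_id, set())]
        for query_id, ranking in run.items()
    }


qrels = [Qrel("q1", "d1", 1, "0"), Qrel("q1", "d2", -100, "0"), Qrel("q2", "d2", 1, "0")]
run = {"q1": ["d2", "d1", "d3"], "q2": ["d2", "d4"]}
excluded = excluded_by_query(qrels)
filtered = drop_excluded(run, excluded)
```

Note that exclusion is per query: `d2` is removed for `q1` but kept for `q2`, where it is judged relevant.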

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/leetcode")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI

ir_datasets export bright/leetcode qrels --format tsv



[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/leetcode')
index_ref = pt.IndexRef.of('./indices/bright_leetcode') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [nDCG@10]
)

You can find more details about PyTerrier experiments here.

XPM-IR

from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.leetcode.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # the relevance assessments for one topic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocAssessments.

Citation

ir_datasets.bib:

\cite{DBLP:conf/iclr/SuYXSMWLSST0YA025}

Bibtex:

@inproceedings{DBLP:conf/iclr/SuYXSMWLSST0YA025, author = {Hongjin Su and Howard Yen and Mengzhou Xia and Weijia Shi and Niklas Muennighoff and Han{-}yu Wang and Haisu Liu and Quan Shi and Zachary S. Siegel and Michael Tang and Ruoxi Sun and Jinsung Yoon and Sercan {\"{O}}. Arik and Danqi Chen and Tao Yu}, title = {{BRIGHT:} {A} Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval}, booktitle = {The Thirteenth International Conference on Learning Representations, {ICLR} 2025, Singapore, April 24-28, 2025}, publisher = {OpenReview.net}, year = {2025}, url = {https://openreview.net/forum?id=ykuc5q381b}, timestamp = {Thu, 15 May 2025 17:19:05 +0200}, biburl = {https://dblp.org/rec/conf/iclr/SuYXSMWLSST0YA025.bib}, bibsource = {dblp computer science bibliography, https://dblp.org} }

Metadata

{
  "docs": {
    "count": 413932,
    "fields": {
      "doc_id": {
        "max_len": 36,
        "common_prefix": "leetcode/"
      }
    }
  },
  "queries": {
    "count": 142
  },
  "qrels": {
    "count": 33747,
    "fields": {
      "relevance": {
        "counts_by_value": {
          "1": 262,
          "-100": 33485
        }
      }
    }
  }
}
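The percentages in the Relevance levels table for this subset follow directly from these counts. As a small worked check, the sketch below recomputes them from the `qrels` part of the metadata; `relevance_shares` is a hypothetical helper.

```python
from typing import Dict

# The "qrels" portion of the metadata shown above.
qrels_meta = {
    "count": 33747,
    "fields": {"relevance": {"counts_by_value": {"1": 262, "-100": 33485}}},
}


def relevance_shares(meta: dict) -> Dict[str, float]:
    """Percentage of qrels at each relevance value, rounded to one decimal."""
    total = meta["count"]
    counts = meta["fields"]["relevance"]["counts_by_value"]
    return {value: round(100 * n / total, 1) for value, n in counts.items()}


shares = relevance_shares(qrels_meta)
```

Here 262 of 33,747 judgments are relevant (about 0.8%), and the remaining 33,485 (about 99.2%) are excluded pairs.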

`"bright/pony"`

Retrieval over documentation for the Pony programming language.

Official evaluation measures: nDCG@10

queries

112 queries

Language: en

Query type:

BrightQuery: (namedtuple)

query_id: str
text: str
reasoning: str
gold_answer: str
gemini_1_0_reason: str
claude_3_opus_reason: str
gpt4_reason: str
grit_reason: str
llama3_70b_reason: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/pony")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI

ir_datasets export bright/pony queries



[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/pony')
index_ref = pt.IndexRef.of('./indices/bright_pony') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR

from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.pony.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs

7.9K docs

Language: en

Document type:

GenericDoc: (namedtuple)

doc_id: str
text: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/pony")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI

ir_datasets export bright/pony docs



[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/pony')
# Index bright/pony
indexer = pt.IterDictIndexer('./indices/bright_pony', meta={"docno": 60})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR

from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.pony')
for doc in dataset.iter_documents():
    print(doc)  # an AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels

2.2K qrels

Query relevance judgment type:

TrecQrel: (namedtuple)

query_id: str
doc_id: str
relevance: int
iteration: str

Relevance levels

Rel.	Definition	Count	%
-100	Excluded from evaluation	`0`	0.0%
1	Relevant	`2.2K`	100.0%

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/pony")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI

ir_datasets export bright/pony qrels --format tsv



[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/pony')
index_ref = pt.IndexRef.of('./indices/bright_pony') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [nDCG@10]
)

You can find more details about PyTerrier experiments here.

XPM-IR

from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.pony.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # the relevance assessments for one topic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocAssessments.

Citation

ir_datasets.bib:

\cite{DBLP:conf/iclr/SuYXSMWLSST0YA025}

Bibtex:

@inproceedings{DBLP:conf/iclr/SuYXSMWLSST0YA025, author = {Hongjin Su and Howard Yen and Mengzhou Xia and Weijia Shi and Niklas Muennighoff and Han{-}yu Wang and Haisu Liu and Quan Shi and Zachary S. Siegel and Michael Tang and Ruoxi Sun and Jinsung Yoon and Sercan {\"{O}}. Arik and Danqi Chen and Tao Yu}, title = {{BRIGHT:} {A} Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval}, booktitle = {The Thirteenth International Conference on Learning Representations, {ICLR} 2025, Singapore, April 24-28, 2025}, publisher = {OpenReview.net}, year = {2025}, url = {https://openreview.net/forum?id=ykuc5q381b}, timestamp = {Thu, 15 May 2025 17:19:05 +0200}, biburl = {https://dblp.org/rec/conf/iclr/SuYXSMWLSST0YA025.bib}, bibsource = {dblp computer science bibliography, https://dblp.org} }

Metadata

{
  "docs": {
    "count": 7894,
    "fields": {
      "doc_id": {
        "max_len": 60,
        "common_prefix": "Pony/"
      }
    }
  },
  "queries": {
    "count": 112
  },
  "qrels": {
    "count": 2219,
    "fields": {
      "relevance": {
        "counts_by_value": {
          "1": 2219
        }
      }
    }
  }
}
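The `common_prefix` entry appears to record the leading string shared by every `doc_id` (here `"Pony/"`). Under that assumption, the sketch below shows how such a prefix could be computed and stripped. The helper names and the sample IDs are hypothetical, and `os.path.commonprefix` compares character by character rather than by path component.

```python
import os
from typing import Iterable, List


def common_prefix(doc_ids: Iterable[str]) -> str:
    """Longest leading string shared by all doc_ids ('' if there are none)."""
    return os.path.commonprefix(list(doc_ids))


def strip_prefix(doc_ids: Iterable[str], prefix: str) -> List[str]:
    """Remove the prefix from each doc_id that starts with it."""
    return [d[len(prefix):] if d.startswith(prefix) else d for d in doc_ids]


# Made-up IDs in the style suggested by the metadata.
ids = ["Pony/a.txt", "Pony/b/c.txt", "Pony/zz.txt"]
prefix = common_prefix(ids)
short_ids = strip_prefix(ids, prefix)
```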

`"bright/pony-long"`

Long-document variant of bright/pony, retrieving the original full-length documents.

Official evaluation measures: Success@1

queries

112 queries

Language: en

Query type:

BrightQuery: (namedtuple)

query_id: str
text: str
reasoning: str
gold_answer: str
gemini_1_0_reason: str
claude_3_opus_reason: str
gpt4_reason: str
grit_reason: str
llama3_70b_reason: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/pony-long")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI

ir_datasets export bright/pony-long queries



[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/pony-long')
index_ref = pt.IndexRef.of('./indices/bright_pony-long') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR

from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.pony-long.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs

577 docs

Language: en

Document type:

GenericDoc: (namedtuple)

doc_id: str
text: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/pony-long")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI

ir_datasets export bright/pony-long docs



[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/pony-long')
# Index bright/pony-long
indexer = pt.IterDictIndexer('./indices/bright_pony-long', meta={"docno": 58})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR

from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.pony-long')
for doc in dataset.iter_documents():
    print(doc)  # an AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels

769 qrels

Query relevance judgment type:

TrecQrel: (namedtuple)

query_id: str
doc_id: str
relevance: int
iteration: str

Relevance levels

Rel.	Definition	Count	%
-100	Excluded from evaluation	`0`	0.0%
1	Relevant	`769`	100.0%

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/pony-long")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI

ir_datasets export bright/pony-long qrels --format tsv



[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/pony-long')
index_ref = pt.IndexRef.of('./indices/bright_pony-long') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [Success@1]
)

You can find more details about PyTerrier experiments here.

XPM-IR

from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.pony-long.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # the relevance assessments for one topic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocAssessments.

Citation

ir_datasets.bib:

\cite{DBLP:conf/iclr/SuYXSMWLSST0YA025}

Bibtex:

@inproceedings{DBLP:conf/iclr/SuYXSMWLSST0YA025, author = {Hongjin Su and Howard Yen and Mengzhou Xia and Weijia Shi and Niklas Muennighoff and Han{-}yu Wang and Haisu Liu and Quan Shi and Zachary S. Siegel and Michael Tang and Ruoxi Sun and Jinsung Yoon and Sercan {\"{O}}. Arik and Danqi Chen and Tao Yu}, title = {{BRIGHT:} {A} Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval}, booktitle = {The Thirteenth International Conference on Learning Representations, {ICLR} 2025, Singapore, April 24-28, 2025}, publisher = {OpenReview.net}, year = {2025}, url = {https://openreview.net/forum?id=ykuc5q381b}, timestamp = {Thu, 15 May 2025 17:19:05 +0200}, biburl = {https://dblp.org/rec/conf/iclr/SuYXSMWLSST0YA025.bib}, bibsource = {dblp computer science bibliography, https://dblp.org} }

Metadata

{
  "docs": {
    "count": 577,
    "fields": {
      "doc_id": {
        "max_len": 58,
        "common_prefix": "Pony/"
      }
    }
  },
  "queries": {
    "count": 112
  },
  "qrels": {
    "count": 769,
    "fields": {
      "relevance": {
        "counts_by_value": {
          "1": 769
        }
      }
    }
  }
}

`"bright/psychology"`

Reasoning-intensive retrieval over psychology content sourced from StackExchange.

Official evaluation measures: nDCG@10

queries

101 queries

Language: en

Query type:

BrightQuery: (namedtuple)

query_id: str
text: str
reasoning: str
gold_answer: str
gemini_1_0_reason: str
claude_3_opus_reason: str
gpt4_reason: str
grit_reason: str
llama3_70b_reason: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/psychology")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI

ir_datasets export bright/psychology queries



[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/psychology')
index_ref = pt.IndexRef.of('./indices/bright_psychology') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR

from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.psychology.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs

53K docs

Language: en

Document type:

GenericDoc: (namedtuple)

doc_id: str
text: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/psychology")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI

ir_datasets export bright/psychology docs



[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/psychology')
# Index bright/psychology
indexer = pt.IterDictIndexer('./indices/bright_psychology', meta={"docno": 165})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR

from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.psychology')
for doc in dataset.iter_documents():
    print(doc)  # an AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels

692 qrels

Query relevance judgment type:

TrecQrel: (namedtuple)

query_id: str
doc_id: str
relevance: int
iteration: str

Relevance levels

Rel.	Definition	Count	%
-100	Excluded from evaluation	`0`	0.0%
1	Relevant	`692`	100.0%

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/psychology")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI

ir_datasets export bright/psychology qrels --format tsv



[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/psychology')
index_ref = pt.IndexRef.of('./indices/bright_psychology') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [nDCG@10]
)

You can find more details about PyTerrier experiments here.

XPM-IR

from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.psychology.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # the relevance assessments for one topic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocAssessments.

Citation

ir_datasets.bib:

\cite{DBLP:conf/iclr/SuYXSMWLSST0YA025}

Bibtex:

@inproceedings{DBLP:conf/iclr/SuYXSMWLSST0YA025, author = {Hongjin Su and Howard Yen and Mengzhou Xia and Weijia Shi and Niklas Muennighoff and Han{-}yu Wang and Haisu Liu and Quan Shi and Zachary S. Siegel and Michael Tang and Ruoxi Sun and Jinsung Yoon and Sercan {\"{O}}. Arik and Danqi Chen and Tao Yu}, title = {{BRIGHT:} {A} Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval}, booktitle = {The Thirteenth International Conference on Learning Representations, {ICLR} 2025, Singapore, April 24-28, 2025}, publisher = {OpenReview.net}, year = {2025}, url = {https://openreview.net/forum?id=ykuc5q381b}, timestamp = {Thu, 15 May 2025 17:19:05 +0200}, biburl = {https://dblp.org/rec/conf/iclr/SuYXSMWLSST0YA025.bib}, bibsource = {dblp computer science bibliography, https://dblp.org} }

Metadata

{
  "docs": {
    "count": 52835,
    "fields": {
      "doc_id": {
        "max_len": 165,
        "common_prefix": ""
      }
    }
  },
  "queries": {
    "count": 101
  },
  "qrels": {
    "count": 692,
    "fields": {
      "relevance": {
        "counts_by_value": {
          "1": 692
        }
      }
    }
  }
}

`"bright/psychology-long"`

Long-document variant of bright/psychology, retrieving the original full-length documents.

Official evaluation measures: Success@1

queries

101 queries

Language: en

Query type:

BrightQuery: (namedtuple)

query_id: str
text: str
reasoning: str
gold_answer: str
gemini_1_0_reason: str
claude_3_opus_reason: str
gpt4_reason: str
grit_reason: str
llama3_70b_reason: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/psychology-long")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI

ir_datasets export bright/psychology-long queries



[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/psychology-long')
index_ref = pt.IndexRef.of('./indices/bright_psychology-long') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR

from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.psychology-long.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs

512 docs

Language: en

Document type:

GenericDoc: (namedtuple)

doc_id: str
text: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/psychology-long")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI

ir_datasets export bright/psychology-long docs



[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/psychology-long')
# Index bright/psychology-long
indexer = pt.IterDictIndexer('./indices/bright_psychology-long', meta={"docno": 162})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.
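
The value 162 in `meta={"docno": 162}` equals the `max_len` reported for `doc_id` in this subset's metadata, so every identifier fits in the index's `docno` field. As a rough sketch, the longest identifier could be measured as below. The helper `max_doc_id_len` is hypothetical, and the documents are placeholders rather than dataset records.

```python
from collections import namedtuple

# Stand-in with the same fields as GenericDoc, so the sketch runs without downloading data.
GenericDoc = namedtuple("GenericDoc", ["doc_id", "text"])

def max_doc_id_len(docs):
    """Length of the longest doc_id in an iterable of documents (0 if empty)."""
    return max((len(doc.doc_id) for doc in docs), default=0)

# Placeholder documents for illustration only.
sample_docs = [GenericDoc("a/1.txt", "first"), GenericDoc("a/long_name.txt", "second")]
print(max_doc_id_len(sample_docs))
```

With the real data, `dataset.docs_iter()` from the Python API example could be passed in place of `sample_docs`.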

XPM-IR

from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.psychology-long')
for doc in dataset.iter_documents():
    print(doc)  # an AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels

116 qrels

Query relevance judgment type:

TrecQrel: (namedtuple)

query_id: str
doc_id: str
relevance: int
iteration: str

Relevance levels

Rel.	Definition	Count	%
-100	Excluded from evaluation	`0`	0.0%
1	Relevant	`116`	100.0%
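
The counts in this table can be recomputed from the judgments themselves. The sketch below assumes a stand-in namedtuple with the same fields as TrecQrel and uses placeholder judgments. The helpers `relevance_counts` and `drop_excluded` are hypothetical, and dropping the -100 level is only one simple way to set such judgments aside.

```python
from collections import Counter, namedtuple

# Stand-in with the same fields as TrecQrel, so the sketch runs without downloading data.
TrecQrel = namedtuple("TrecQrel", ["query_id", "doc_id", "relevance", "iteration"])

def relevance_counts(qrels):
    """Count judgments per relevance value."""
    return Counter(qrel.relevance for qrel in qrels)

def drop_excluded(qrels, excluded_value=-100):
    """Keep only judgments that are not marked as excluded from evaluation."""
    return [qrel for qrel in qrels if qrel.relevance != excluded_value]

# Placeholder judgments for illustration only.
sample = [
    TrecQrel("q1", "d1", 1, "0"),
    TrecQrel("q1", "d2", -100, "0"),
    TrecQrel("q2", "d3", 1, "0"),
]
print(relevance_counts(sample))
print(len(drop_excluded(sample)))
```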

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/psychology-long")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI

ir_datasets export bright/psychology-long qrels --format tsv



[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/psychology-long')
index_ref = pt.IndexRef.of('./indices/bright_psychology-long') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [Success@1]
)

You can find more details about PyTerrier experiments here.

XPM-IR

from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.psychology-long.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # the assessments for one topic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocAssessments.

Citation

ir_datasets.bib:

\cite{DBLP:conf/iclr/SuYXSMWLSST0YA025}

Bibtex:

@inproceedings{DBLP:conf/iclr/SuYXSMWLSST0YA025, author = {Hongjin Su and Howard Yen and Mengzhou Xia and Weijia Shi and Niklas Muennighoff and Han{-}yu Wang and Haisu Liu and Quan Shi and Zachary S. Siegel and Michael Tang and Ruoxi Sun and Jinsung Yoon and Sercan {\"{O}}. Arik and Danqi Chen and Tao Yu}, title = {{BRIGHT:} {A} Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval}, booktitle = {The Thirteenth International Conference on Learning Representations, {ICLR} 2025, Singapore, April 24-28, 2025}, publisher = {OpenReview.net}, year = {2025}, url = {https://openreview.net/forum?id=ykuc5q381b}, timestamp = {Thu, 15 May 2025 17:19:05 +0200}, biburl = {https://dblp.org/rec/conf/iclr/SuYXSMWLSST0YA025.bib}, bibsource = {dblp computer science bibliography, https://dblp.org} }

Metadata

{
  "docs": {
    "count": 512,
    "fields": {
      "doc_id": {
        "max_len": 162,
        "common_prefix": ""
      }
    }
  },
  "queries": {
    "count": 101
  },
  "qrels": {
    "count": 116,
    "fields": {
      "relevance": {
        "counts_by_value": {
          "1": 116
        }
      }
    }
  }
}

`"bright/robotics"`

Reasoning-intensive retrieval over robotics content sourced from StackExchange.

Official evaluation measures: nDCG@10
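
For reference, nDCG@10 with binary relevance can be sketched in a few lines. This is an illustrative implementation under stated assumptions (each ranking contains no duplicate doc_ids, and every relevant document has gain 1), not the code used by any particular evaluation tool, whose details may differ. The function name `ndcg_at_k` is hypothetical.

```python
import math

def ndcg_at_k(ranked_doc_ids, relevant, k=10):
    """nDCG@k with binary gains; `relevant` is the set of relevant doc_ids for one query."""
    # Discounted gain of the relevant documents found in the top k.
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked_doc_ids[:k], start=1)
        if doc_id in relevant
    )
    # Ideal ranking: all relevant documents (up to k) placed at the top.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0

# Placeholder ranking: the only relevant document appears at rank 2.
print(ndcg_at_k(["x", "a", "y"], {"a"}))
```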

queries

101 queries

Language: en

Query type:

BrightQuery: (namedtuple)

query_id: str
text: str
reasoning: str
gold_answer: str
gemini_1_0_reason: str
claude_3_opus_reason: str
gpt4_reason: str
grit_reason: str
llama3_70b_reason: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/robotics")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI

ir_datasets export bright/robotics queries



[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/robotics')
index_ref = pt.IndexRef.of('./indices/bright_robotics') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR

from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.robotics.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs

62K docs

Language: en

Document type:

GenericDoc: (namedtuple)

doc_id: str
text: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/robotics")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI

ir_datasets export bright/robotics docs



[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/robotics')
# Index bright/robotics
indexer = pt.IterDictIndexer('./indices/bright_robotics', meta={"docno": 54})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR

from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.robotics')
for doc in dataset.iter_documents():
    print(doc)  # an AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels

520 qrels

Query relevance judgment type:

TrecQrel: (namedtuple)

query_id: str
doc_id: str
relevance: int
iteration: str

Relevance levels

Rel.	Definition	Count	%
-100	Excluded from evaluation	`0`	0.0%
1	Relevant	`520`	100.0%

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/robotics")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI

ir_datasets export bright/robotics qrels --format tsv



[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/robotics')
index_ref = pt.IndexRef.of('./indices/bright_robotics') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [nDCG@10]
)

You can find more details about PyTerrier experiments here.

XPM-IR

from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.robotics.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # the assessments for one topic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocAssessments.

Citation

ir_datasets.bib:

\cite{DBLP:conf/iclr/SuYXSMWLSST0YA025}

Bibtex:

@inproceedings{DBLP:conf/iclr/SuYXSMWLSST0YA025, author = {Hongjin Su and Howard Yen and Mengzhou Xia and Weijia Shi and Niklas Muennighoff and Han{-}yu Wang and Haisu Liu and Quan Shi and Zachary S. Siegel and Michael Tang and Ruoxi Sun and Jinsung Yoon and Sercan {\"{O}}. Arik and Danqi Chen and Tao Yu}, title = {{BRIGHT:} {A} Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval}, booktitle = {The Thirteenth International Conference on Learning Representations, {ICLR} 2025, Singapore, April 24-28, 2025}, publisher = {OpenReview.net}, year = {2025}, url = {https://openreview.net/forum?id=ykuc5q381b}, timestamp = {Thu, 15 May 2025 17:19:05 +0200}, biburl = {https://dblp.org/rec/conf/iclr/SuYXSMWLSST0YA025.bib}, bibsource = {dblp computer science bibliography, https://dblp.org} }

Metadata

{
  "docs": {
    "count": 61961,
    "fields": {
      "doc_id": {
        "max_len": 54,
        "common_prefix": ""
      }
    }
  },
  "queries": {
    "count": 101
  },
  "qrels": {
    "count": 520,
    "fields": {
      "relevance": {
        "counts_by_value": {
          "1": 520
        }
      }
    }
  }
}
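
The counts in a metadata block like the one above can serve as a quick sanity check after loading a subset. The sketch below assumes a metadata dict with the same `docs`/`queries`/`qrels` layout and uses tiny placeholder inputs. The helper `counts_match` is hypothetical and not part of ir_datasets.

```python
def counts_match(metadata, docs, queries, qrels):
    """Compare observed record counts with the 'count' entries of a metadata dict."""
    observed = {
        "docs": sum(1 for _ in docs),
        "queries": sum(1 for _ in queries),
        "qrels": sum(1 for _ in qrels),
    }
    return {name: observed[name] == metadata[name]["count"] for name in observed}

# Tiny placeholder inputs for illustration only, not the real subset.
toy_metadata = {"docs": {"count": 3}, "queries": {"count": 2}, "qrels": {"count": 2}}
result = counts_match(toy_metadata, ["d1", "d2", "d3"], ["q1", "q2"], ["j1"])
print(result)
```

With the real data, the iterators `dataset.docs_iter()`, `dataset.queries_iter()`, and `dataset.qrels_iter()` from the Python API examples could be passed instead of the placeholder lists.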

`"bright/robotics-long"`

Long-document variant of bright/robotics, retrieving the original full-length documents.

Official evaluation measures: Success@1
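
Success@1 checks whether the top-ranked document for a query is relevant, averaged over queries. The following is a minimal sketch under stated assumptions: `run` maps each query_id to a ranked list of doc_ids, `qrels` maps each query_id to its set of relevant doc_ids, and queries without a ranking count as failures. The function name `success_at_1` is hypothetical, and evaluation tools may treat missing queries differently.

```python
def success_at_1(run, qrels):
    """Fraction of judged queries whose first retrieved document is relevant."""
    if not qrels:
        return 0.0
    hits = sum(
        1
        for query_id, relevant in qrels.items()
        if run.get(query_id) and run[query_id][0] in relevant
    )
    return hits / len(qrels)

# Placeholder run and judgments for illustration only.
toy_run = {"q1": ["d1", "d2"], "q2": ["d9", "d3"]}
toy_qrels = {"q1": {"d1"}, "q2": {"d3"}}
print(success_at_1(toy_run, toy_qrels))
```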

queries

101 queries

Language: en

Query type:

BrightQuery: (namedtuple)

query_id: str
text: str
reasoning: str
gold_answer: str
gemini_1_0_reason: str
claude_3_opus_reason: str
gpt4_reason: str
grit_reason: str
llama3_70b_reason: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/robotics-long")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI

ir_datasets export bright/robotics-long queries



[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/robotics-long')
index_ref = pt.IndexRef.of('./indices/bright_robotics-long') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR

from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.robotics-long.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs

508 docs

Language: en

Document type:

GenericDoc: (namedtuple)

doc_id: str
text: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/robotics-long")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI

ir_datasets export bright/robotics-long docs



[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/robotics-long')
# Index bright/robotics-long
indexer = pt.IterDictIndexer('./indices/bright_robotics-long', meta={"docno": 50})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR

from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.robotics-long')
for doc in dataset.iter_documents():
    print(doc)  # an AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels

106 qrels

Query relevance judgment type:

TrecQrel: (namedtuple)

query_id: str
doc_id: str
relevance: int
iteration: str

Relevance levels

Rel.	Definition	Count	%
-100	Excluded from evaluation	`0`	0.0%
1	Relevant	`106`	100.0%

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/robotics-long")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI

ir_datasets export bright/robotics-long qrels --format tsv



[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/robotics-long')
index_ref = pt.IndexRef.of('./indices/bright_robotics-long') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [Success@1]
)

You can find more details about PyTerrier experiments here.

XPM-IR

from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.robotics-long.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # the assessments for one topic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocAssessments.

Citation

ir_datasets.bib:

\cite{DBLP:conf/iclr/SuYXSMWLSST0YA025}

Bibtex:

@inproceedings{DBLP:conf/iclr/SuYXSMWLSST0YA025, author = {Hongjin Su and Howard Yen and Mengzhou Xia and Weijia Shi and Niklas Muennighoff and Han{-}yu Wang and Haisu Liu and Quan Shi and Zachary S. Siegel and Michael Tang and Ruoxi Sun and Jinsung Yoon and Sercan {\"{O}}. Arik and Danqi Chen and Tao Yu}, title = {{BRIGHT:} {A} Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval}, booktitle = {The Thirteenth International Conference on Learning Representations, {ICLR} 2025, Singapore, April 24-28, 2025}, publisher = {OpenReview.net}, year = {2025}, url = {https://openreview.net/forum?id=ykuc5q381b}, timestamp = {Thu, 15 May 2025 17:19:05 +0200}, biburl = {https://dblp.org/rec/conf/iclr/SuYXSMWLSST0YA025.bib}, bibsource = {dblp computer science bibliography, https://dblp.org} }

Metadata

{
  "docs": {
    "count": 508,
    "fields": {
      "doc_id": {
        "max_len": 50,
        "common_prefix": ""
      }
    }
  },
  "queries": {
    "count": 101
  },
  "qrels": {
    "count": 106,
    "fields": {
      "relevance": {
        "counts_by_value": {
          "1": 106
        }
      }
    }
  }
}

`"bright/stackoverflow"`

Reasoning-intensive retrieval over programming content sourced from StackOverflow.

Official evaluation measures: nDCG@10

queries

117 queries

Language: en

Query type:

BrightQuery: (namedtuple)

query_id: str
text: str
reasoning: str
gold_answer: str
gemini_1_0_reason: str
claude_3_opus_reason: str
gpt4_reason: str
grit_reason: str
llama3_70b_reason: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/stackoverflow")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI

ir_datasets export bright/stackoverflow queries



[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/stackoverflow')
index_ref = pt.IndexRef.of('./indices/bright_stackoverflow') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR

from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.stackoverflow.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs

107K docs

Language: en

Document type:

GenericDoc: (namedtuple)

doc_id: str
text: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/stackoverflow")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI

ir_datasets export bright/stackoverflow docs



[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/stackoverflow')
# Index bright/stackoverflow
indexer = pt.IterDictIndexer('./indices/bright_stackoverflow', meta={"docno": 145})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR

from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.stackoverflow')
for doc in dataset.iter_documents():
    print(doc)  # an AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels

478 qrels

Query relevance judgment type:

TrecQrel: (namedtuple)

query_id: str
doc_id: str
relevance: int
iteration: str

Relevance levels

Rel.	Definition	Count	%
-100	Excluded from evaluation	`0`	0.0%
1	Relevant	`478`	100.0%

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/stackoverflow")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI

ir_datasets export bright/stackoverflow qrels --format tsv



[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.
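
The exported TSV can be read back with the standard library. The sketch below assumes that the export has no header row and that the columns appear in the order shown above (query_id, doc_id, relevance, iteration). The helper `read_qrels_tsv` is hypothetical, and the sample lines are placeholders rather than dataset records.

```python
import io

def read_qrels_tsv(handle):
    """Parse tab-separated qrels lines into {query_id: {doc_id: relevance}}."""
    qrels = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            # Skip blank or malformed lines.
            continue
        query_id, doc_id, relevance = fields[0], fields[1], int(fields[2])
        qrels.setdefault(query_id, {})[doc_id] = relevance
    return qrels

# Placeholder export content for illustration only.
sample_tsv = io.StringIO("q1\td1\t1\t0\nq1\td2\t1\t0\nq2\td3\t1\t0\n")
parsed = read_qrels_tsv(sample_tsv)
print(parsed)
```

In practice the handle would be a file opened on the saved output of the export command.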

PyTerrier

import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/stackoverflow')
index_ref = pt.IndexRef.of('./indices/bright_stackoverflow') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [nDCG@10]
)

You can find more details about PyTerrier experiments here.

XPM-IR

from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.stackoverflow.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # the assessments for one topic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocAssessments.

Citation

ir_datasets.bib:

\cite{DBLP:conf/iclr/SuYXSMWLSST0YA025}

Bibtex:

@inproceedings{DBLP:conf/iclr/SuYXSMWLSST0YA025, author = {Hongjin Su and Howard Yen and Mengzhou Xia and Weijia Shi and Niklas Muennighoff and Han{-}yu Wang and Haisu Liu and Quan Shi and Zachary S. Siegel and Michael Tang and Ruoxi Sun and Jinsung Yoon and Sercan {\"{O}}. Arik and Danqi Chen and Tao Yu}, title = {{BRIGHT:} {A} Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval}, booktitle = {The Thirteenth International Conference on Learning Representations, {ICLR} 2025, Singapore, April 24-28, 2025}, publisher = {OpenReview.net}, year = {2025}, url = {https://openreview.net/forum?id=ykuc5q381b}, timestamp = {Thu, 15 May 2025 17:19:05 +0200}, biburl = {https://dblp.org/rec/conf/iclr/SuYXSMWLSST0YA025.bib}, bibsource = {dblp computer science bibliography, https://dblp.org} }

Metadata

{
  "docs": {
    "count": 107081,
    "fields": {
      "doc_id": {
        "max_len": 145,
        "common_prefix": ""
      }
    }
  },
  "queries": {
    "count": 117
  },
  "qrels": {
    "count": 478,
    "fields": {
      "relevance": {
        "counts_by_value": {
          "1": 478
        }
      }
    }
  }
}

`"bright/stackoverflow-long"`

Long-document variant of bright/stackoverflow, retrieving the original full-length documents.

Official evaluation measures: Success@1

queries

117 queries

Language: en

Query type:

BrightQuery: (namedtuple)

query_id: str
text: str
reasoning: str
gold_answer: str
gemini_1_0_reason: str
claude_3_opus_reason: str
gpt4_reason: str
grit_reason: str
llama3_70b_reason: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/stackoverflow-long")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI

ir_datasets export bright/stackoverflow-long queries



[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/stackoverflow-long')
index_ref = pt.IndexRef.of('./indices/bright_stackoverflow-long') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR

from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.stackoverflow-long.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs

1.9K docs

Language: en

Document type:

GenericDoc: (namedtuple)

doc_id: str
text: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/stackoverflow-long")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI

ir_datasets export bright/stackoverflow-long docs



[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/stackoverflow-long')
# Index bright/stackoverflow-long
indexer = pt.IterDictIndexer('./indices/bright_stackoverflow-long', meta={"docno": 140})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR

from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.stackoverflow-long')
for doc in dataset.iter_documents():
    print(doc)  # an AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels

129 qrels

Query relevance judgment type:

TrecQrel: (namedtuple)

query_id: str
doc_id: str
relevance: int
iteration: str

Relevance levels

Rel.	Definition	Count	%
-100	Excluded from evaluation	`0`	0.0%
1	Relevant	`129`	100.0%

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/stackoverflow-long")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI

ir_datasets export bright/stackoverflow-long qrels --format tsv



[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/stackoverflow-long')
index_ref = pt.IndexRef.of('./indices/bright_stackoverflow-long') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [Success@1]
)

You can find more details about PyTerrier experiments here.

XPM-IR

from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.stackoverflow-long.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # the assessments for one topic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocAssessments.

Citation

ir_datasets.bib:

\cite{DBLP:conf/iclr/SuYXSMWLSST0YA025}

Bibtex:

@inproceedings{DBLP:conf/iclr/SuYXSMWLSST0YA025, author = {Hongjin Su and Howard Yen and Mengzhou Xia and Weijia Shi and Niklas Muennighoff and Han{-}yu Wang and Haisu Liu and Quan Shi and Zachary S. Siegel and Michael Tang and Ruoxi Sun and Jinsung Yoon and Sercan {\"{O}}. Arik and Danqi Chen and Tao Yu}, title = {{BRIGHT:} {A} Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval}, booktitle = {The Thirteenth International Conference on Learning Representations, {ICLR} 2025, Singapore, April 24-28, 2025}, publisher = {OpenReview.net}, year = {2025}, url = {https://openreview.net/forum?id=ykuc5q381b}, timestamp = {Thu, 15 May 2025 17:19:05 +0200}, biburl = {https://dblp.org/rec/conf/iclr/SuYXSMWLSST0YA025.bib}, bibsource = {dblp computer science bibliography, https://dblp.org} }

Metadata

{
  "docs": {
    "count": 1858,
    "fields": {
      "doc_id": {
        "max_len": 140,
        "common_prefix": ""
      }
    }
  },
  "queries": {
    "count": 117
  },
  "qrels": {
    "count": 129,
    "fields": {
      "relevance": {
        "counts_by_value": {
          "1": 129
        }
      }
    }
  }
}
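
Comparing this metadata with that of bright/stackoverflow shows how the long variant changes the task: the 117 queries are the same in number, but the corpus and the judgments per query are much smaller. The sketch below copies the counts from the two metadata blocks into plain dicts. The helper `qrels_per_query` is hypothetical.

```python
# Counts copied from the metadata blocks of the two subsets.
short_meta = {"docs": 107081, "queries": 117, "qrels": 478}  # bright/stackoverflow
long_meta = {"docs": 1858, "queries": 117, "qrels": 129}     # bright/stackoverflow-long

def qrels_per_query(meta):
    """Average number of relevance judgments per query."""
    return meta["qrels"] / meta["queries"]

for name, meta in [("short", short_meta), ("long", long_meta)]:
    print(name, meta["docs"], round(qrels_per_query(meta), 2))
```

On average this gives roughly four relevant passages per query in the short variant and roughly one relevant document per query in the long variant.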

`"bright/sustainable-living"`

Reasoning-intensive retrieval over sustainable living content sourced from StackExchange.

Official evaluation measures: nDCG@10

queries

108 queries

Language: en

Query type:

BrightQuery: (namedtuple)

query_id: str
text: str
reasoning: str
gold_answer: str
gemini_1_0_reason: str
claude_3_opus_reason: str
gpt4_reason: str
grit_reason: str
llama3_70b_reason: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/sustainable-living")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI

ir_datasets export bright/sustainable-living queries



[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/sustainable-living')
index_ref = pt.IndexRef.of('./indices/bright_sustainable-living') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR

from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.sustainable-living.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs

61K docs

Language: en

Document type:

GenericDoc: (namedtuple)

doc_id: str
text: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/sustainable-living")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI

ir_datasets export bright/sustainable-living docs



[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/sustainable-living')
# Index bright/sustainable-living
indexer = pt.IterDictIndexer('./indices/bright_sustainable-living', meta={"docno": 260})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR

from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.sustainable-living')
for doc in dataset.iter_documents():
    print(doc)  # an AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels

576 qrels

Query relevance judgment type:

TrecQrel: (namedtuple)

query_id: str
doc_id: str
relevance: int
iteration: str

Relevance levels

Rel.	Definition	Count	%
-100	Excluded from evaluation	`0`	0.0%
1	Relevant	`576`	100.0%

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/sustainable-living")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.
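
For evaluation it is often convenient to group the judgments by query. The following is a small sketch, assuming a stand-in namedtuple with the same fields as TrecQrel and placeholder judgments. The helper `relevant_by_query` is hypothetical and keeps only judgments with positive relevance.

```python
from collections import defaultdict, namedtuple

# Stand-in with the same fields as TrecQrel, so the sketch runs without downloading data.
TrecQrel = namedtuple("TrecQrel", ["query_id", "doc_id", "relevance", "iteration"])

def relevant_by_query(qrels):
    """Map each query_id to the set of doc_ids judged relevant (relevance > 0)."""
    grouped = defaultdict(set)
    for qrel in qrels:
        if qrel.relevance > 0:
            grouped[qrel.query_id].add(qrel.doc_id)
    return dict(grouped)

# Placeholder judgments for illustration only.
sample_qrels = [
    TrecQrel("q1", "d1", 1, "0"),
    TrecQrel("q1", "d2", 1, "0"),
    TrecQrel("q2", "d3", -100, "0"),
]
grouped = relevant_by_query(sample_qrels)
print(grouped)
```

With the real data, `dataset.qrels_iter()` could be passed in place of `sample_qrels`.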

CLI

ir_datasets export bright/sustainable-living qrels --format tsv



[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/sustainable-living')
index_ref = pt.IndexRef.of('./indices/bright_sustainable-living') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [nDCG@10]
)

You can find more details about PyTerrier experiments here.

XPM-IR

from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.sustainable-living.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # the assessments for one topic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocAssessments.

Citation

ir_datasets.bib:

\cite{DBLP:conf/iclr/SuYXSMWLSST0YA025}

Bibtex:

@inproceedings{DBLP:conf/iclr/SuYXSMWLSST0YA025, author = {Hongjin Su and Howard Yen and Mengzhou Xia and Weijia Shi and Niklas Muennighoff and Han{-}yu Wang and Haisu Liu and Quan Shi and Zachary S. Siegel and Michael Tang and Ruoxi Sun and Jinsung Yoon and Sercan {\"{O}}. Arik and Danqi Chen and Tao Yu}, title = {{BRIGHT:} {A} Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval}, booktitle = {The Thirteenth International Conference on Learning Representations, {ICLR} 2025, Singapore, April 24-28, 2025}, publisher = {OpenReview.net}, year = {2025}, url = {https://openreview.net/forum?id=ykuc5q381b}, timestamp = {Thu, 15 May 2025 17:19:05 +0200}, biburl = {https://dblp.org/rec/conf/iclr/SuYXSMWLSST0YA025.bib}, bibsource = {dblp computer science bibliography, https://dblp.org} }

Metadata

{
  "docs": {
    "count": 60792,
    "fields": {
      "doc_id": {
        "max_len": 260,
        "common_prefix": ""
      }
    }
  },
  "queries": {
    "count": 108
  },
  "qrels": {
    "count": 576,
    "fields": {
      "relevance": {
        "counts_by_value": {
          "1": 576
        }
      }
    }
  }
}

`"bright/sustainable-living-long"`

Long-document variant of bright/sustainable-living, retrieving the original full-length documents.

Official evaluation measures: Success@1
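
As a rough illustration of what Success@1 rewards, the sketch below counts the fraction of queries whose top-ranked document is judged relevant. It is a minimal stand-in under stated assumptions, not the implementation used by ir_measures or PyTerrier; the `run` and `qrels` dictionary layouts and all IDs are made up for the example.

```python
def success_at_k(run, qrels, k=1, min_rel=1):
    """Fraction of queries with at least one relevant document in the top k.

    run:   {query_id: [doc_id, ...]} ranked best-first (hypothetical layout)
    qrels: {query_id: {doc_id: relevance}}
    """
    if not qrels:
        return 0.0
    hits = 0
    for query_id, judged in qrels.items():
        top_k = run.get(query_id, [])[:k]
        # A query counts as a success if any top-k document is relevant.
        if any(judged.get(doc_id, 0) >= min_rel for doc_id in top_k):
            hits += 1
    return hits / len(qrels)


# Made-up IDs, for illustration only.
qrels = {"q1": {"d1": 1}, "q2": {"d7": 1, "d9": 1}}
run = {"q1": ["d1", "d2"], "q2": ["d3", "d7"]}
score = success_at_k(run, qrels, k=1)
print(score)
```

Here only `q1` has a relevant document at rank 1, so half of the queries succeed; raising `k` to 2 would also credit `q2`.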

queries

108 queries

Language: en

Query type:

BrightQuery: (namedtuple)

query_id: str
text: str
reasoning: str
gold_answer: str
gemini_1_0_reason: str
claude_3_opus_reason: str
gpt4_reason: str
grit_reason: str
llama3_70b_reason: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/sustainable-living-long")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.
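
Because each query carries several alternative texts (the original `text` plus the LLM-generated reasoning fields), a retrieval script has to decide which one to send to the retriever. The sketch below shows one possible way to make that choice. `BrightQuery` is redefined locally as a stand-in with the field names listed above so the snippet runs without downloading data; the helper `retrieval_text` and its fall-back behaviour are assumptions for illustration, not part of ir_datasets.

```python
from collections import namedtuple

# Local stand-in mirroring the documented BrightQuery fields.
BrightQuery = namedtuple("BrightQuery", [
    "query_id", "text", "reasoning", "gold_answer",
    "gemini_1_0_reason", "claude_3_opus_reason", "gpt4_reason",
    "grit_reason", "llama3_70b_reason",
])


def retrieval_text(query, field="text"):
    """Return the chosen field, falling back to the original text if it is empty."""
    value = getattr(query, field)
    return value if value else query.text


# Made-up query content, for illustration only.
query = BrightQuery(
    query_id="0",
    text="original question",
    reasoning="",
    gold_answer="",
    gemini_1_0_reason="",
    claude_3_opus_reason="",
    gpt4_reason="rewritten question with reasoning",
    grit_reason="",
    llama3_70b_reason="",
)
print(retrieval_text(query, "gpt4_reason"))
print(retrieval_text(query, "grit_reason"))
```

With real data, the same helper could be applied to each item yielded by `dataset.queries_iter()`.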

CLI

ir_datasets export bright/sustainable-living-long queries



[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/sustainable-living-long')
index_ref = pt.IndexRef.of('./indices/bright_sustainable-living-long') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR

from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.sustainable-living-long.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs

554 docs

Language: en

Document type:

GenericDoc: (namedtuple)

doc_id: str
text: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/sustainable-living-long")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI

ir_datasets export bright/sustainable-living-long docs



[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/sustainable-living-long')
# Index bright/sustainable-living-long
indexer = pt.IterDictIndexer('./indices/bright_sustainable-living-long', meta={"docno": 257})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.
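
The `meta={"docno": 257}` setting above matches the longest `doc_id` reported in this subset's metadata. If you index a different or filtered corpus, a length can be derived the same way. The sketch below is a minimal illustration: `GenericDoc` is redefined locally as a stand-in, the helper name `max_doc_id_length` is hypothetical, and the document IDs are made up.

```python
from collections import namedtuple

# Local stand-in mirroring the documented GenericDoc fields.
GenericDoc = namedtuple("GenericDoc", ["doc_id", "text"])


def max_doc_id_length(docs):
    """Length of the longest doc_id, or 0 for an empty collection."""
    return max((len(doc.doc_id) for doc in docs), default=0)


# Made-up documents, for illustration only.
docs = [
    GenericDoc("a/b.txt", "short id"),
    GenericDoc("longer/path/doc_0.txt", "longer id"),
]
meta = {"docno": max_doc_id_length(docs)}
print(meta)
```

The resulting dictionary has the same shape as the `meta` argument passed to the indexer in the example above.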

XPM-IR

from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.sustainable-living-long')
for doc in dataset.iter_documents():
    print(doc)  # a single document from the AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels

129 qrels

Query relevance judgment type:

TrecQrel: (namedtuple)

query_id: str
doc_id: str
relevance: int
iteration: str

Relevance levels

Rel.	Definition	Count	%
-100	Excluded from evaluation	`0`	0.0%
1	Relevant	`129`	100.0%

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/sustainable-living-long")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI

ir_datasets export bright/sustainable-living-long qrels --format tsv



[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/sustainable-living-long')
index_ref = pt.IndexRef.of('./indices/bright_sustainable-living-long') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [Success@1]
)

You can find more details about PyTerrier experiments here.

XPM-IR

from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.sustainable-living-long.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # the relevance assessments for one topic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocAssessments.

Citation

ir_datasets.bib:

\cite{DBLP:conf/iclr/SuYXSMWLSST0YA025}

Bibtex:

@inproceedings{DBLP:conf/iclr/SuYXSMWLSST0YA025, author = {Hongjin Su and Howard Yen and Mengzhou Xia and Weijia Shi and Niklas Muennighoff and Han{-}yu Wang and Haisu Liu and Quan Shi and Zachary S. Siegel and Michael Tang and Ruoxi Sun and Jinsung Yoon and Sercan {\"{O}}. Arik and Danqi Chen and Tao Yu}, title = {{BRIGHT:} {A} Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval}, booktitle = {The Thirteenth International Conference on Learning Representations, {ICLR} 2025, Singapore, April 24-28, 2025}, publisher = {OpenReview.net}, year = {2025}, url = {https://openreview.net/forum?id=ykuc5q381b}, timestamp = {Thu, 15 May 2025 17:19:05 +0200}, biburl = {https://dblp.org/rec/conf/iclr/SuYXSMWLSST0YA025.bib}, bibsource = {dblp computer science bibliography, https://dblp.org} }

Metadata

{
  "docs": {
    "count": 554,
    "fields": {
      "doc_id": {
        "max_len": 257,
        "common_prefix": ""
      }
    }
  },
  "queries": {
    "count": 108
  },
  "qrels": {
    "count": 129,
    "fields": {
      "relevance": {
        "counts_by_value": {
          "1": 129
        }
      }
    }
  }
}

`"bright/theoremqa-questions"`

TheoremQA questions, where relevance requires retrieving questions that rely on the same underlying theorem.

Official evaluation measures: nDCG@10
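
For readers unfamiliar with the measure, the sketch below computes nDCG@10 for a single query using one common formulation (linear gains, log2 rank discount). It is an illustration under those assumptions, not the implementation used by ir_measures or PyTerrier, and the document IDs are made up.

```python
import math


def ndcg_at_k(ranking, judged, k=10):
    """nDCG@k for one query with linear gains and log2 discounts.

    ranking: doc_ids ordered best-first
    judged:  {doc_id: relevance}; unjudged or non-positive labels give gain 0
    """
    def dcg(gains):
        # Rank 1 is discounted by log2(2) = 1, rank 2 by log2(3), and so on.
        return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains))

    gains = [max(judged.get(doc_id, 0), 0) for doc_id in ranking[:k]]
    ideal = sorted((r for r in judged.values() if r > 0), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Made-up judgments and rankings, for illustration only.
judged = {"d1": 1, "d2": 1}
perfect = ndcg_at_k(["d1", "d2", "d3"], judged)
partial = ndcg_at_k(["d3", "d1", "d4"], judged)
print(perfect, partial)
```

The first ranking places both relevant documents on top and reaches the maximum score; the second finds only one of them, at rank 2, and scores lower.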

queries

194 queries

Language: en

Query type:

BrightQuery: (namedtuple)

query_id: str
text: str
reasoning: str
gold_answer: str
gemini_1_0_reason: str
claude_3_opus_reason: str
gpt4_reason: str
grit_reason: str
llama3_70b_reason: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/theoremqa-questions")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI

ir_datasets export bright/theoremqa-questions queries



[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/theoremqa-questions')
index_ref = pt.IndexRef.of('./indices/bright_theoremqa-questions') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR

from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.theoremqa-questions.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs

188K docs

Language: en

Document type:

GenericDoc: (namedtuple)

doc_id: str
text: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/theoremqa-questions")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI

ir_datasets export bright/theoremqa-questions docs



[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/theoremqa-questions')
# Index bright/theoremqa-questions
indexer = pt.IterDictIndexer('./indices/bright_theoremqa-questions', meta={"docno": 62})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR

from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.theoremqa-questions')
for doc in dataset.iter_documents():
    print(doc)  # a single document from the AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels

607K qrels

Query relevance judgment type:

TrecQrel: (namedtuple)

query_id: str
doc_id: str
relevance: int
iteration: str

Relevance levels

Rel.	Definition	Count	%
-100	Excluded from evaluation	`607K`	99.9%
1	Relevant	`617`	0.1%
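
Almost all judgments in this subset carry the -100 label, which marks a document as excluded from evaluation for that query. One plausible way to honour this is to drop those documents from each ranking before scoring. The sketch below assumes that handling; it is not taken from the BRIGHT or ir_datasets code, so check the official evaluation scripts before relying on it. `TrecQrel` is redefined locally as a stand-in, and the helper `drop_excluded`, the `run` layout, and all IDs are hypothetical.

```python
from collections import namedtuple

# Local stand-in mirroring the documented TrecQrel fields.
TrecQrel = namedtuple("TrecQrel", ["query_id", "doc_id", "relevance", "iteration"])

EXCLUDED = -100  # label documented above as "Excluded from evaluation"


def drop_excluded(run, qrels):
    """Remove (query, doc) pairs labelled EXCLUDED from each ranking.

    run:   {query_id: [doc_id, ...]} ranked best-first (hypothetical layout)
    qrels: iterable of TrecQrel-like tuples
    """
    excluded = {(q.query_id, q.doc_id) for q in qrels if q.relevance == EXCLUDED}
    return {
        query_id: [d for d in ranking if (query_id, d) not in excluded]
        for query_id, ranking in run.items()
    }


# Made-up judgments and ranking, for illustration only.
qrels = [
    TrecQrel("q1", "d1", 1, "0"),
    TrecQrel("q1", "d2", -100, "0"),
]
run = {"q1": ["d2", "d1", "d3"]}
filtered = drop_excluded(run, qrels)
print(filtered)
```

After filtering, the excluded document no longer occupies rank 1, so the relevant document moves to the top of the ranking.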

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/theoremqa-questions")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI

ir_datasets export bright/theoremqa-questions qrels --format tsv



[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/theoremqa-questions')
index_ref = pt.IndexRef.of('./indices/bright_theoremqa-questions') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [nDCG@10]
)

You can find more details about PyTerrier experiments here.

XPM-IR

from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.theoremqa-questions.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # the relevance assessments for one topic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocAssessments.

Citation

ir_datasets.bib:

\cite{DBLP:conf/iclr/SuYXSMWLSST0YA025}

Bibtex:

@inproceedings{DBLP:conf/iclr/SuYXSMWLSST0YA025, author = {Hongjin Su and Howard Yen and Mengzhou Xia and Weijia Shi and Niklas Muennighoff and Han{-}yu Wang and Haisu Liu and Quan Shi and Zachary S. Siegel and Michael Tang and Ruoxi Sun and Jinsung Yoon and Sercan {\"{O}}. Arik and Danqi Chen and Tao Yu}, title = {{BRIGHT:} {A} Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval}, booktitle = {The Thirteenth International Conference on Learning Representations, {ICLR} 2025, Singapore, April 24-28, 2025}, publisher = {OpenReview.net}, year = {2025}, url = {https://openreview.net/forum?id=ykuc5q381b}, timestamp = {Thu, 15 May 2025 17:19:05 +0200}, biburl = {https://dblp.org/rec/conf/iclr/SuYXSMWLSST0YA025.bib}, bibsource = {dblp computer science bibliography, https://dblp.org} }

Metadata

{
  "docs": {
    "count": 188002,
    "fields": {
      "doc_id": {
        "max_len": 62,
        "common_prefix": ""
      }
    }
  },
  "queries": {
    "count": 194
  },
  "qrels": {
    "count": 607140,
    "fields": {
      "relevance": {
        "counts_by_value": {
          "1": 617,
          "-100": 606523
        }
      }
    }
  }
}
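
The metadata can be cross-checked against the relevance table for this subset. The sketch below parses an abridged copy of the JSON above (the `doc_id` field statistics are omitted for brevity) and confirms that the per-label counts add up to the total number of qrels, and that relevant judgments make up roughly 0.1% of them.

```python
import json

# Abridged copy of the metadata shown above.
metadata = json.loads("""
{
  "docs": {"count": 188002},
  "queries": {"count": 194},
  "qrels": {
    "count": 607140,
    "fields": {"relevance": {"counts_by_value": {"1": 617, "-100": 606523}}}
  }
}
""")

qrels = metadata["qrels"]
by_value = qrels["fields"]["relevance"]["counts_by_value"]
total = sum(by_value.values())
relevant_share = by_value["1"] / qrels["count"]
print(total == qrels["count"], f"{relevant_share:.1%}")
```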

`"bright/theoremqa-theorems"`

TheoremQA theorems, where relevance requires retrieving the theorem(s) needed to answer the query.

Official evaluation measures: nDCG@10

queries

76 queries

Language: en

Query type:

BrightQuery: (namedtuple)

query_id: str
text: str
reasoning: str
gold_answer: str
gemini_1_0_reason: str
claude_3_opus_reason: str
gpt4_reason: str
grit_reason: str
llama3_70b_reason: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/theoremqa-theorems")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI

ir_datasets export bright/theoremqa-theorems queries



[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/theoremqa-theorems')
index_ref = pt.IndexRef.of('./indices/bright_theoremqa-theorems') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR

from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.theoremqa-theorems.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs

24K docs

Language: en

Document type:

GenericDoc: (namedtuple)

doc_id: str
text: str

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/theoremqa-theorems")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI

ir_datasets export bright/theoremqa-theorems docs



[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/theoremqa-theorems')
# Index bright/theoremqa-theorems
indexer = pt.IterDictIndexer('./indices/bright_theoremqa-theorems')
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR

from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.theoremqa-theorems')
for doc in dataset.iter_documents():
    print(doc)  # a single document from the AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels

151 qrels

Query relevance judgment type:

TrecQrel: (namedtuple)

query_id: str
doc_id: str
relevance: int
iteration: str

Relevance levels

Rel.	Definition	Count	%
-100	Excluded from evaluation	`0`	0.0%
1	Relevant	`151`	100.0%

Examples:

Python API

import ir_datasets
dataset = ir_datasets.load("bright/theoremqa-theorems")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI

ir_datasets export bright/theoremqa-theorems qrels --format tsv



[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier

import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/theoremqa-theorems')
index_ref = pt.IndexRef.of('./indices/bright_theoremqa-theorems') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [nDCG@10]
)

You can find more details about PyTerrier experiments here.

XPM-IR

from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.theoremqa-theorems.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # the relevance assessments for one topic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocAssessments.

Citation

ir_datasets.bib:

\cite{DBLP:conf/iclr/SuYXSMWLSST0YA025}

Bibtex:

@inproceedings{DBLP:conf/iclr/SuYXSMWLSST0YA025, author = {Hongjin Su and Howard Yen and Mengzhou Xia and Weijia Shi and Niklas Muennighoff and Han{-}yu Wang and Haisu Liu and Quan Shi and Zachary S. Siegel and Michael Tang and Ruoxi Sun and Jinsung Yoon and Sercan {\"{O}}. Arik and Danqi Chen and Tao Yu}, title = {{BRIGHT:} {A} Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval}, booktitle = {The Thirteenth International Conference on Learning Representations, {ICLR} 2025, Singapore, April 24-28, 2025}, publisher = {OpenReview.net}, year = {2025}, url = {https://openreview.net/forum?id=ykuc5q381b}, timestamp = {Thu, 15 May 2025 17:19:05 +0200}, biburl = {https://dblp.org/rec/conf/iclr/SuYXSMWLSST0YA025.bib}, bibsource = {dblp computer science bibliography, https://dblp.org} }

Metadata

{
  "docs": {
    "count": 23839,
    "fields": {
      "doc_id": {
        "max_len": 5,
        "common_prefix": ""
      }
    }
  },
  "queries": {
    "count": 76
  },
  "qrels": {
    "count": 151,
    "fields": {
      "relevance": {
        "counts_by_value": {
          "1": 151
        }
      }
    }
  }
}

ir_datasets: BRIGHT (benchmark suite)

"bright"

"bright/aops"

"bright/biology"

"bright/biology-long"

"bright/earth-science"

"bright/earth-science-long"

"bright/economics"

"bright/economics-long"

"bright/leetcode"

"bright/pony"

"bright/pony-long"

"bright/psychology"

"bright/psychology-long"

"bright/robotics"

"bright/robotics-long"

"bright/stackoverflow"

"bright/stackoverflow-long"

"bright/sustainable-living"

"bright/sustainable-living-long"

"bright/theoremqa-questions"

"bright/theoremqa-theorems"
