Github: datasets/bright.py

ir_datasets: BRIGHT (benchmark suite)

Index
  1. bright
  2. bright/aops
  3. bright/biology
  4. bright/biology-long
  5. bright/earth-science
  6. bright/earth-science-long
  7. bright/economics
  8. bright/economics-long
  9. bright/leetcode
  10. bright/pony
  11. bright/pony-long
  12. bright/psychology
  13. bright/psychology-long
  14. bright/robotics
  15. bright/robotics-long
  16. bright/stackoverflow
  17. bright/stackoverflow-long
  18. bright/sustainable-living
  19. bright/sustainable-living-long
  20. bright/theoremqa-questions
  21. bright/theoremqa-theorems

"bright"

BRIGHT is a retrieval benchmark in which finding relevant documents requires reasoning rather than surface-level lexical or semantic matching. It spans 12 diverse domains drawn from sources such as StackExchange, coding problems (LeetCode), and math competitions (AoPS, TheoremQA).

Each domain is available as a separate subset. The base subsets use short documents; the -long subsets provide the original long-form documents, with relevance judgments mapped to those documents. Each query carries the original reasoning rationale, the gold answer, and LLM-generated reasoning fields (from Gemini, Claude 3 Opus, GPT-4, GRIT, and Llama3-70B).

  • Documents: Domain-specific passages (web posts, documentation, problems, etc.)
  • Queries: Reasoning-intensive natural language questions
  • Dataset Paper
  • GitHub
  • Website
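
As a rough illustration of how the per-query reasoning fields might be used, the sketch below picks one field of a query as the retrieval text and falls back to the original text when that field is empty. The BrightQuery namedtuple here is a local stand-in that mirrors the field names documented for this dataset; query_text and the sample record are hypothetical and not part of ir_datasets.

```python
from collections import namedtuple

# Local stand-in mirroring the BrightQuery fields listed on this page,
# so the sketch runs without downloading any data.
BrightQuery = namedtuple("BrightQuery", [
    "query_id", "text", "reasoning", "gold_answer", "gemini_1_0_reason",
    "claude_3_opus_reason", "gpt4_reason", "grit_reason", "llama3_70b_reason",
])

def query_text(query, field="text"):
    """Return the chosen field, falling back to the original text if it is empty."""
    value = getattr(query, field)
    return value if value else query.text

# Made-up example record (not taken from the dataset).
example = BrightQuery(
    "q0", "Why do leaves change color?", "", "", "", "",
    "Chlorophyll breaks down in autumn ...", "", "",
)

print(query_text(example, "gpt4_reason"))  # uses the LLM-generated reasoning
print(query_text(example, "grit_reason"))  # empty field, so the original text is used
```

With real data, the same helper could be applied to each item yielded by dataset.queries_iter().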
Citation

ir_datasets.bib:

\cite{DBLP:conf/iclr/SuYXSMWLSST0YA025}

Bibtex:

@inproceedings{DBLP:conf/iclr/SuYXSMWLSST0YA025, author = {Hongjin Su and Howard Yen and Mengzhou Xia and Weijia Shi and Niklas Muennighoff and Han{-}yu Wang and Haisu Liu and Quan Shi and Zachary S. Siegel and Michael Tang and Ruoxi Sun and Jinsung Yoon and Sercan {\"{O}}. Arik and Danqi Chen and Tao Yu}, title = {{BRIGHT:} {A} Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval}, booktitle = {The Thirteenth International Conference on Learning Representations, {ICLR} 2025, Singapore, April 24-28, 2025}, publisher = {OpenReview.net}, year = {2025}, url = {https://openreview.net/forum?id=ykuc5q381b}, timestamp = {Thu, 15 May 2025 17:19:05 +0200}, biburl = {https://dblp.org/rec/conf/iclr/SuYXSMWLSST0YA025.bib}, bibsource = {dblp computer science bibliography, https://dblp.org} }

"bright/aops"

Math competition problems from the Art of Problem Solving (AoPS) forum, where relevance requires recognizing shared problem-solving techniques.

Official evaluation measures: nDCG@10
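
For reference, nDCG@10 on binary judgments can be sketched as follows. This is the common log2-discount formulation written from scratch for illustration; the evaluation tool you actually use may differ in details such as tie handling. The function ndcg_at_k and the sample ranking are hypothetical.

```python
import math

def ndcg_at_k(ranked_doc_ids, relevant, k=10):
    """nDCG@k with binary gains; `relevant` is the set of relevant doc_ids."""
    # Discounted gain of the relevant documents found in the top k.
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked_doc_ids[:k], start=1)
        if doc_id in relevant
    )
    # Ideal DCG: all relevant documents (up to k) ranked first.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0

ranking = ["d3", "d1", "d7", "d2"]   # made-up system output for one query
relevant = {"d1", "d2"}              # made-up relevant set for that query
score = ndcg_at_k(ranking, relevant)
print(round(score, 4))
```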

queries
111 queries

Language: en

Query type:
BrightQuery: (namedtuple)
  1. query_id: str
  2. text: str
  3. reasoning: str
  4. gold_answer: str
  5. gemini_1_0_reason: str
  6. claude_3_opus_reason: str
  7. gpt4_reason: str
  8. grit_reason: str
  9. llama3_70b_reason: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/aops")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI
ir_datasets export bright/aops queries
[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/aops')
index_ref = pt.IndexRef.of('./indices/bright_aops') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR
from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.aops.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs
188K docs

Language: en

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/aops")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI
ir_datasets export bright/aops docs
[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/aops')
# Index bright/aops
indexer = pt.IterDictIndexer('./indices/bright_aops', meta={"docno": 62})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.
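
The meta={"docno": 62} argument reserves space for the docno metadata field; in PyTerrier the number is the maximum number of characters stored per value. A plausible reading is that 62 is the length of the longest doc_id in this corpus, though this page does not say so. The sketch below shows how such a value could be derived; max_docno_length, the GenericDoc stand-in, and the sample doc_ids are hypothetical.

```python
from collections import namedtuple

# Local stand-in for the GenericDoc type documented on this page.
GenericDoc = namedtuple("GenericDoc", ["doc_id", "text"])

def max_docno_length(docs):
    """Length of the longest doc_id, usable as the 'docno' meta size."""
    return max((len(doc.doc_id) for doc in docs), default=0)

# Made-up documents; real ones would come from dataset.docs_iter().
sample_docs = [
    GenericDoc("aops/2001.txt", "..."),
    GenericDoc("aops/12.txt", "..."),
]
meta = {"docno": max_docno_length(sample_docs)}
print(meta)
```

A value that is too small would truncate doc_ids in the index, so erring on the larger side is the safer choice.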

XPM-IR
from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.aops')
for doc in dataset.iter_documents():
    print(doc)  # a single document from the AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels
623K qrels
Query relevance judgment type:
TrecQrel: (namedtuple)
  1. query_id: str
  2. doc_id: str
  3. relevance: int
  4. iteration: str

Relevance levels

Rel.   Definition                 Count   %
-100   Excluded from evaluation   623K    99.9%
1      Relevant                   524     0.1%
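
The -100 level marks query-document pairs that are excluded from evaluation. One way to honor this, assuming the intent is that an excluded document should not count for or against a system on that query, is to drop those documents from the query's ranking before computing a measure. The sketch below separates the two levels and filters a ranking; split_qrels, drop_excluded, and the sample data are hypothetical, and TrecQrel is a local stand-in for the type documented above.

```python
from collections import defaultdict, namedtuple

# Local stand-in for the TrecQrel type documented on this page.
TrecQrel = namedtuple("TrecQrel", ["query_id", "doc_id", "relevance", "iteration"])

EXCLUDED = -100  # relevance level listed as "Excluded from evaluation"

def split_qrels(qrels):
    """Group qrels into per-query sets of relevant and excluded doc_ids."""
    relevant = defaultdict(set)
    excluded = defaultdict(set)
    for qrel in qrels:
        if qrel.relevance == EXCLUDED:
            excluded[qrel.query_id].add(qrel.doc_id)
        elif qrel.relevance > 0:
            relevant[qrel.query_id].add(qrel.doc_id)
    return relevant, excluded

def drop_excluded(ranking, excluded_for_query):
    """Remove excluded doc_ids from a ranked list, preserving order."""
    return [doc_id for doc_id in ranking if doc_id not in excluded_for_query]

# Made-up judgments and ranking.
qrels = [
    TrecQrel("q1", "d1", 1, "0"),
    TrecQrel("q1", "d9", -100, "0"),
    TrecQrel("q2", "d4", 1, "0"),
]
relevant, excluded = split_qrels(qrels)
filtered = drop_excluded(["d9", "d1", "d5"], excluded.get("q1", set()))
print(filtered)
```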

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/aops")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI
ir_datasets export bright/aops qrels --format tsv
[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.
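
If the exported qrels are consumed outside ir_datasets, the tab-separated lines can be read back into tuples. This is a minimal sketch, assuming the export has no header row and that each line carries the four columns in the order shown above; parse_qrels_tsv and the sample lines are illustrative only.

```python
import io
from collections import namedtuple

# Local stand-in for the TrecQrel type documented on this page.
TrecQrel = namedtuple("TrecQrel", ["query_id", "doc_id", "relevance", "iteration"])

def parse_qrels_tsv(stream):
    """Yield TrecQrel tuples from tab-separated lines, skipping malformed rows."""
    for line in stream:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 4:
            continue  # not a four-column qrel line
        query_id, doc_id, relevance, iteration = parts
        yield TrecQrel(query_id, doc_id, int(relevance), iteration)

# Made-up export content; a real file would be opened with open(path).
sample = io.StringIO("q1\td1\t1\t0\nq1\td2\t-100\t0\nbroken line\n")
parsed = list(parse_qrels_tsv(sample))
print(parsed)
```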

PyTerrier
import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/aops')
index_ref = pt.IndexRef.of('./indices/bright_aops') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [nDCG@10]
)

You can find more details about PyTerrier experiments here.

XPM-IR
from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.aops.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # the assessments for one topic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocAssessments.


"bright/biology"

Reasoning-intensive retrieval over biology content sourced from StackExchange.

Official evaluation measures: nDCG@10

queries
103 queries

Language: en

Query type:
BrightQuery: (namedtuple)
  1. query_id: str
  2. text: str
  3. reasoning: str
  4. gold_answer: str
  5. gemini_1_0_reason: str
  6. claude_3_opus_reason: str
  7. gpt4_reason: str
  8. grit_reason: str
  9. llama3_70b_reason: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/biology")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI
ir_datasets export bright/biology queries
[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/biology')
index_ref = pt.IndexRef.of('./indices/bright_biology') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR
from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.biology.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs
57K docs

Language: en

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/biology")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI
ir_datasets export bright/biology docs
[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/biology')
# Index bright/biology
indexer = pt.IterDictIndexer('./indices/bright_biology', meta={"docno": 149})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR
from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.biology')
for doc in dataset.iter_documents():
    print(doc)  # a single document from the AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels
372 qrels
Query relevance judgment type:
TrecQrel: (namedtuple)
  1. query_id: str
  2. doc_id: str
  3. relevance: int
  4. iteration: str

Relevance levels

Rel.   Definition                 Count   %
-100   Excluded from evaluation   0       0.0%
1      Relevant                   372     100.0%

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/biology")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI
ir_datasets export bright/biology qrels --format tsv
[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/biology')
index_ref = pt.IndexRef.of('./indices/bright_biology') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [nDCG@10]
)

You can find more details about PyTerrier experiments here.

XPM-IR
from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.biology.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # the assessments for one topic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocAssessments.


"bright/biology-long"

Long-document variant of bright/biology, retrieving the original full-length documents.

Official evaluation measures: Success@1
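
Success@1 asks whether the top-ranked document for a query is relevant. A minimal from-scratch sketch is given below for illustration; the evaluation tool you actually use may differ in details, and success_at_k, mean_success_at_k, and the sample run are hypothetical.

```python
def success_at_k(ranked_doc_ids, relevant, k=1):
    """Return 1.0 if any of the top-k ranked documents is relevant, else 0.0."""
    return 1.0 if any(doc_id in relevant for doc_id in ranked_doc_ids[:k]) else 0.0

def mean_success_at_k(run, relevant_by_query, k=1):
    """Average Success@k over queries that have at least one relevant document."""
    scores = [
        success_at_k(run.get(query_id, []), relevant, k)
        for query_id, relevant in relevant_by_query.items()
        if relevant
    ]
    return sum(scores) / len(scores) if scores else 0.0

# Made-up rankings and relevant sets for two queries.
run = {"q1": ["d2", "d1"], "q2": ["d5", "d4"]}
relevant_by_query = {"q1": {"d2"}, "q2": {"d4"}}
print(mean_success_at_k(run, relevant_by_query))
```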

queries
103 queries

Language: en

Query type:
BrightQuery: (namedtuple)
  1. query_id: str
  2. text: str
  3. reasoning: str
  4. gold_answer: str
  5. gemini_1_0_reason: str
  6. claude_3_opus_reason: str
  7. gpt4_reason: str
  8. grit_reason: str
  9. llama3_70b_reason: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/biology-long")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI
ir_datasets export bright/biology-long queries
[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/biology-long')
index_ref = pt.IndexRef.of('./indices/bright_biology-long') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR
from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.biology-long.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs
524 docs

Language: en

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/biology-long")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI
ir_datasets export bright/biology-long docs
[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/biology-long')
# Index bright/biology-long
indexer = pt.IterDictIndexer('./indices/bright_biology-long', meta={"docno": 144})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR
from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.biology-long')
for doc in dataset.iter_documents():
    print(doc)  # a single document from the AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels
134 qrels
Query relevance judgment type:
TrecQrel: (namedtuple)
  1. query_id: str
  2. doc_id: str
  3. relevance: int
  4. iteration: str

Relevance levels

Rel.   Definition                 Count   %
-100   Excluded from evaluation   0       0.0%
1      Relevant                   134     100.0%

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/biology-long")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI
ir_datasets export bright/biology-long qrels --format tsv
[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/biology-long')
index_ref = pt.IndexRef.of('./indices/bright_biology-long') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [Success@1]
)

You can find more details about PyTerrier experiments here.

XPM-IR
from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.biology-long.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # the assessments for one topic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocAssessments.


"bright/earth-science"

Reasoning-intensive retrieval over earth science content sourced from StackExchange.

Official evaluation measures: nDCG@10

queries
116 queries

Language: en

Query type:
BrightQuery: (namedtuple)
  1. query_id: str
  2. text: str
  3. reasoning: str
  4. gold_answer: str
  5. gemini_1_0_reason: str
  6. claude_3_opus_reason: str
  7. gpt4_reason: str
  8. grit_reason: str
  9. llama3_70b_reason: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/earth-science")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI
ir_datasets export bright/earth-science queries
[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/earth-science')
index_ref = pt.IndexRef.of('./indices/bright_earth-science') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR
from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.earth-science.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs
121K docs

Language: en

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/earth-science")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI
ir_datasets export bright/earth-science docs
[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/earth-science')
# Index bright/earth-science
indexer = pt.IterDictIndexer('./indices/bright_earth-science', meta={"docno": 145})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR
from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.earth-science')
for doc in dataset.iter_documents():
    print(doc)  # a single document from the AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels
585 qrels
Query relevance judgment type:
TrecQrel: (namedtuple)
  1. query_id: str
  2. doc_id: str
  3. relevance: int
  4. iteration: str

Relevance levels

Rel.   Definition                 Count   %
-100   Excluded from evaluation   0       0.0%
1      Relevant                   585     100.0%

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/earth-science")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI
ir_datasets export bright/earth-science qrels --format tsv
[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/earth-science')
index_ref = pt.IndexRef.of('./indices/bright_earth-science') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [nDCG@10]
)

You can find more details about PyTerrier experiments here.

XPM-IR
from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.earth-science.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # the assessments for one topic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocAssessments.


"bright/earth-science-long"

Long-document variant of bright/earth-science, retrieving the original full-length documents.

Official evaluation measures: Success@1

queries
116 queries

Language: en

Query type:
BrightQuery: (namedtuple)
  1. query_id: str
  2. text: str
  3. reasoning: str
  4. gold_answer: str
  5. gemini_1_0_reason: str
  6. claude_3_opus_reason: str
  7. gpt4_reason: str
  8. grit_reason: str
  9. llama3_70b_reason: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/earth-science-long")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI
ir_datasets export bright/earth-science-long queries
[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/earth-science-long')
index_ref = pt.IndexRef.of('./indices/bright_earth-science-long') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR
from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.earth-science-long.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs
601 docs

Language: en

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/earth-science-long")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI
ir_datasets export bright/earth-science-long docs
[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/earth-science-long')
# Index bright/earth-science-long
indexer = pt.IterDictIndexer('./indices/bright_earth-science-long', meta={"docno": 142})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR
from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.earth-science-long')
for doc in dataset.iter_documents():
    print(doc)  # a single document from the AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels
187 qrels
Query relevance judgment type:
TrecQrel: (namedtuple)
  1. query_id: str
  2. doc_id: str
  3. relevance: int
  4. iteration: str

Relevance levels

Rel.   Definition                 Count   %
-100   Excluded from evaluation   0       0.0%
1      Relevant                   187     100.0%

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/earth-science-long")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI
ir_datasets export bright/earth-science-long qrels --format tsv
[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/earth-science-long')
index_ref = pt.IndexRef.of('./indices/bright_earth-science-long') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [Success@1]
)

You can find more details about PyTerrier experiments here.

XPM-IR
from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.earth-science-long.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # the assessments for one topic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocAssessments.


"bright/economics"

Reasoning-intensive retrieval over economics content sourced from StackExchange.

Official evaluation measures: nDCG@10

queries
103 queries

Language: en

Query type:
BrightQuery: (namedtuple)
  1. query_id: str
  2. text: str
  3. reasoning: str
  4. gold_answer: str
  5. gemini_1_0_reason: str
  6. claude_3_opus_reason: str
  7. gpt4_reason: str
  8. grit_reason: str
  9. llama3_70b_reason: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/economics")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI
ir_datasets export bright/economics queries
[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/economics')
index_ref = pt.IndexRef.of('./indices/bright_economics') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR
from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.economics.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs
50K docs

Language: en

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/economics")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI
ir_datasets export bright/economics docs
[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/economics')
# Index bright/economics
indexer = pt.IterDictIndexer('./indices/bright_economics', meta={"docno": 156})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR
from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.economics')
for doc in dataset.iter_documents():
    print(doc)  # a single document from the AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels
800 qrels
Query relevance judgment type:
TrecQrel: (namedtuple)
  1. query_id: str
  2. doc_id: str
  3. relevance: int
  4. iteration: str

Relevance levels

Rel.   Definition                 Count   %
-100   Excluded from evaluation   0       0.0%
1      Relevant                   800     100.0%

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/economics")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI
ir_datasets export bright/economics qrels --format tsv
[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/economics')
index_ref = pt.IndexRef.of('./indices/bright_economics') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [nDCG@10]
)

You can find more details about PyTerrier experiments here.

XPM-IR
from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.economics.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # the relevance assessments for one topic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocAssessments.

"bright/economics-long"

Long-document variant of bright/economics, retrieving the original full-length documents.

Official evaluation measures: Success@1
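Success@1 asks whether the top-ranked document for a query is relevant, averaged over queries. The sketch below uses the usual definition of Success@k and is not the code that produces official BRIGHT scores. success_at_k, the run layout (query_id mapped to doc_ids ranked best-first), and the toy data are assumptions for illustration.

```python
def success_at_k(run, relevant, k=1):
    """Fraction of judged queries with at least one relevant doc in the top k.

    run: dict mapping query_id -> list of doc_ids, best first.
    relevant: dict mapping query_id -> set of relevant doc_ids.
    """
    if not relevant:
        return 0.0
    hits = 0
    for query_id, relevant_docs in relevant.items():
        top_k = run.get(query_id, [])[:k]
        if any(doc_id in relevant_docs for doc_id in top_k):
            hits += 1
    return hits / len(relevant)


# Toy data: q1 is answered at rank 1, q2 only at rank 2.
toy_run = {"q1": ["d1", "d2"], "q2": ["d9", "d3"]}
toy_relevant = {"q1": {"d1"}, "q2": {"d3"}}

s_at_1 = success_at_k(toy_run, toy_relevant, k=1)
s_at_2 = success_at_k(toy_run, toy_relevant, k=2)
```

Queries missing from the run count as failures here, which keeps the score comparable across systems that skip queries.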

queries
103 queries

Language: en

Query type:
BrightQuery: (namedtuple)
  1. query_id: str
  2. text: str
  3. reasoning: str
  4. gold_answer: str
  5. gemini_1_0_reason: str
  6. claude_3_opus_reason: str
  7. gpt4_reason: str
  8. grit_reason: str
  9. llama3_70b_reason: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/economics-long")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.
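Because each query carries several alternative texts (the original question plus the reasoning fields), a retrieval run can use any one of them as the query string. The sketch below picks a field by name and falls back to text when the chosen field is empty. query_variant is a hypothetical helper, and the BrightQuery tuple here is a stand-in that only mirrors the field names listed above.

```python
from collections import namedtuple

# Stand-in with the same field names as the BrightQuery type listed above.
BrightQuery = namedtuple("BrightQuery", [
    "query_id", "text", "reasoning", "gold_answer", "gemini_1_0_reason",
    "claude_3_opus_reason", "gpt4_reason", "grit_reason", "llama3_70b_reason",
])


def query_variant(query, field="text"):
    """Return the requested field, or query.text if it is missing or blank."""
    value = getattr(query, field, None)
    if isinstance(value, str) and value.strip():
        return value
    return query.text


# Made-up query: only the gpt4_reason field is filled in.
toy = BrightQuery("0", "Why do prices rise?", "", "", "", "",
                  "Think about inflation drivers.", "", "")
variants = {f: query_variant(toy, f) for f in ("text", "gpt4_reason", "grit_reason")}
```

With ir_datasets installed, the same helper can be applied to each item yielded by dataset.queries_iter().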

CLI
ir_datasets export bright/economics-long queries
[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/economics-long')
index_ref = pt.IndexRef.of('./indices/bright_economics-long') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR
from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.economics-long.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs
516 docs

Language: en

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/economics-long")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI
ir_datasets export bright/economics-long docs
[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/economics-long')
# Index bright/economics-long
indexer = pt.IterDictIndexer('./indices/bright_economics-long', meta={"docno": 152})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR
from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.economics-long')
for doc in dataset.iter_documents():
    print(doc)  # a single document from the AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels
109 qrels
Query relevance judgment type:
TrecQrel: (namedtuple)
  1. query_id: str
  2. doc_id: str
  3. relevance: int
  4. iteration: str

Relevance levels

Rel.    Definition                  Count    %
-100    Excluded from evaluation    0        0.0%
1       Relevant                    109      100.0%

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/economics-long")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI
ir_datasets export bright/economics-long qrels --format tsv
[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/economics-long')
index_ref = pt.IndexRef.of('./indices/bright_economics-long') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [Success@1]
)

You can find more details about PyTerrier experiments here.

XPM-IR
from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.economics-long.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # the relevance assessments for one topic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocAssessments.

"bright/leetcode"

Coding problems from LeetCode, where relevant documents share the underlying algorithmic approach needed to solve the query problem.

Official evaluation measures: nDCG@10
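For binary judgments such as the single "Relevant" level used here, nDCG@10 can be computed directly. The sketch below uses the common formulation with gain 1 per relevant document and a log2(rank + 1) discount. It is illustrative only: scores from an evaluation toolkit may differ in edge cases, and ndcg_at_k and the toy rankings are assumptions.

```python
import math


def ndcg_at_k(ranked_doc_ids, relevant_doc_ids, k=10):
    """Binary-gain nDCG@k for one query.

    ranked_doc_ids: list of distinct doc_ids, best first.
    relevant_doc_ids: set of doc_ids judged relevant for the query.
    """
    # enumerate() is 0-based, so rank = i + 1 and the discount is log2(rank + 1).
    dcg = sum(
        1.0 / math.log2(i + 2)
        for i, doc_id in enumerate(ranked_doc_ids[:k])
        if doc_id in relevant_doc_ids
    )
    ideal_hits = min(len(relevant_doc_ids), k)
    ideal_dcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / ideal_dcg if ideal_dcg > 0 else 0.0


perfect = ndcg_at_k(["d1", "d2", "d3"], {"d1"})
second_place = ndcg_at_k(["x", "d1"], {"d1"})
```

A benchmark-level score is then the mean of the per-query values.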

queries
142 queries

Language: en

Query type:
BrightQuery: (namedtuple)
  1. query_id: str
  2. text: str
  3. reasoning: str
  4. gold_answer: str
  5. gemini_1_0_reason: str
  6. claude_3_opus_reason: str
  7. gpt4_reason: str
  8. grit_reason: str
  9. llama3_70b_reason: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/leetcode")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI
ir_datasets export bright/leetcode queries
[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/leetcode')
index_ref = pt.IndexRef.of('./indices/bright_leetcode') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR
from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.leetcode.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs
414K docs

Language: en

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/leetcode")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI
ir_datasets export bright/leetcode docs
[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/leetcode')
# Index bright/leetcode
indexer = pt.IterDictIndexer('./indices/bright_leetcode', meta={"docno": 36})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR
from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.leetcode')
for doc in dataset.iter_documents():
    print(doc)  # a single document from the AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels
34K qrels
Query relevance judgment type:
TrecQrel: (namedtuple)
  1. query_id: str
  2. doc_id: str
  3. relevance: int
  4. iteration: str

Relevance levels

Rel.    Definition                  Count    %
-100    Excluded from evaluation    33K      99.2%
1       Relevant                    262      0.8%
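Most qrels in this subset carry the -100 level, so they need handling before scoring. The sketch below assumes that "Excluded from evaluation" means the (query, document) pair should be dropped from a system's ranking before a measure is computed. That reading is an assumption, not something this page states, and split_qrels, drop_excluded, and the toy data are hypothetical.

```python
from collections import defaultdict, namedtuple

EXCLUDED = -100  # level listed in the table as "Excluded from evaluation"

TrecQrel = namedtuple("TrecQrel", ["query_id", "doc_id", "relevance", "iteration"])


def split_qrels(qrels):
    """Return (relevant, excluded): dicts mapping query_id -> set of doc_ids."""
    relevant, excluded = defaultdict(set), defaultdict(set)
    for qrel in qrels:
        if qrel.relevance == EXCLUDED:
            excluded[qrel.query_id].add(qrel.doc_id)
        elif qrel.relevance > 0:
            relevant[qrel.query_id].add(qrel.doc_id)
    return dict(relevant), dict(excluded)


def drop_excluded(run, excluded):
    """Remove excluded doc_ids from each ranking (query_id -> doc_ids, best first)."""
    return {
        query_id: [d for d in doc_ids if d not in excluded.get(query_id, set())]
        for query_id, doc_ids in run.items()
    }


# Made-up judgments: d2 is excluded for q1 only.
toy_qrels = [
    TrecQrel("q1", "d1", 1, "0"),
    TrecQrel("q1", "d2", -100, "0"),
    TrecQrel("q2", "d3", 1, "0"),
]
relevant, excluded = split_qrels(toy_qrels)
filtered_run = drop_excluded({"q1": ["d2", "d1"], "q2": ["d2", "d3"]}, excluded)
```

Exclusions are applied per query, so d2 is removed from the ranking for q1 but kept for q2.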

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/leetcode")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI
ir_datasets export bright/leetcode qrels --format tsv
[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/leetcode')
index_ref = pt.IndexRef.of('./indices/bright_leetcode') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [nDCG@10]
)

You can find more details about PyTerrier experiments here.

XPM-IR
from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.leetcode.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # the relevance assessments for one topic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocAssessments.

"bright/pony"

Retrieval over documentation for the Pony programming language.

Official evaluation measures: nDCG@10

queries
112 queries

Language: en

Query type:
BrightQuery: (namedtuple)
  1. query_id: str
  2. text: str
  3. reasoning: str
  4. gold_answer: str
  5. gemini_1_0_reason: str
  6. claude_3_opus_reason: str
  7. gpt4_reason: str
  8. grit_reason: str
  9. llama3_70b_reason: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/pony")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI
ir_datasets export bright/pony queries
[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/pony')
index_ref = pt.IndexRef.of('./indices/bright_pony') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR
from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.pony.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs
7.9K docs

Language: en

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/pony")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.
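To inspect what counts as relevant for a query, doc_ids from the relevance judgments have to be resolved to document text. The sketch below does this join in memory on stand-in data. relevant_texts is a hypothetical helper, and the two namedtuples only mirror the GenericDoc and TrecQrel field names used on this page.

```python
from collections import namedtuple

# Stand-ins mirroring the GenericDoc and TrecQrel field names on this page.
GenericDoc = namedtuple("GenericDoc", ["doc_id", "text"])
TrecQrel = namedtuple("TrecQrel", ["query_id", "doc_id", "relevance", "iteration"])


def relevant_texts(docs, qrels, query_id):
    """Return the texts of documents judged relevant (relevance > 0) for query_id."""
    text_by_id = {doc.doc_id: doc.text for doc in docs}
    return [
        text_by_id[qrel.doc_id]
        for qrel in qrels
        if qrel.query_id == query_id
        and qrel.relevance > 0
        and qrel.doc_id in text_by_id
    ]


# Made-up documents and judgments for illustration.
toy_docs = [GenericDoc("a", "actors and behaviours"), GenericDoc("b", "reference capabilities")]
toy_qrels = [TrecQrel("q1", "b", 1, "0"), TrecQrel("q2", "a", 1, "0")]
texts_for_q1 = relevant_texts(toy_docs, toy_qrels, "q1")
```

For larger corpora, building the full dictionary may be wasteful. ir_datasets also offers random access by doc_id through dataset.docs_store(), which avoids holding every document in memory.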

CLI
ir_datasets export bright/pony docs
[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/pony')
# Index bright/pony
indexer = pt.IterDictIndexer('./indices/bright_pony', meta={"docno": 60})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR
from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.pony')
for doc in dataset.iter_documents():
    print(doc)  # a single document from the AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels
2.2K qrels
Query relevance judgment type:
TrecQrel: (namedtuple)
  1. query_id: str
  2. doc_id: str
  3. relevance: int
  4. iteration: str

Relevance levels

Rel.    Definition                  Count    %
-100    Excluded from evaluation    0        0.0%
1       Relevant                    2.2K     100.0%

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/pony")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI
ir_datasets export bright/pony qrels --format tsv
[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/pony')
index_ref = pt.IndexRef.of('./indices/bright_pony') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [nDCG@10]
)

You can find more details about PyTerrier experiments here.

XPM-IR
from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.pony.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # the relevance assessments for one topic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocAssessments.

"bright/pony-long"

Long-document variant of bright/pony, retrieving the original full-length documents.

Official evaluation measures: Success@1

queries
112 queries

Language: en

Query type:
BrightQuery: (namedtuple)
  1. query_id: str
  2. text: str
  3. reasoning: str
  4. gold_answer: str
  5. gemini_1_0_reason: str
  6. claude_3_opus_reason: str
  7. gpt4_reason: str
  8. grit_reason: str
  9. llama3_70b_reason: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/pony-long")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI
ir_datasets export bright/pony-long queries
[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/pony-long')
index_ref = pt.IndexRef.of('./indices/bright_pony-long') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR
from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.pony-long.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs
577 docs

Language: en

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/pony-long")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI
ir_datasets export bright/pony-long docs
[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/pony-long')
# Index bright/pony-long
indexer = pt.IterDictIndexer('./indices/bright_pony-long', meta={"docno": 58})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR
from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.pony-long')
for doc in dataset.iter_documents():
    print(doc)  # a single document from the AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels
769 qrels
Query relevance judgment type:
TrecQrel: (namedtuple)
  1. query_id: str
  2. doc_id: str
  3. relevance: int
  4. iteration: str

Relevance levels

Rel.    Definition                  Count    %
-100    Excluded from evaluation    0        0.0%
1       Relevant                    769      100.0%

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/pony-long")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI
ir_datasets export bright/pony-long qrels --format tsv
[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/pony-long')
index_ref = pt.IndexRef.of('./indices/bright_pony-long') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [Success@1]
)

You can find more details about PyTerrier experiments here.

XPM-IR
from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.pony-long.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # the relevance assessments for one topic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocAssessments.

"bright/psychology"

Reasoning-intensive retrieval over psychology content sourced from StackExchange.

Official evaluation measures: nDCG@10

queries
101 queries

Language: en

Query type:
BrightQuery: (namedtuple)
  1. query_id: str
  2. text: str
  3. reasoning: str
  4. gold_answer: str
  5. gemini_1_0_reason: str
  6. claude_3_opus_reason: str
  7. gpt4_reason: str
  8. grit_reason: str
  9. llama3_70b_reason: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/psychology")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI
ir_datasets export bright/psychology queries
[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/psychology')
index_ref = pt.IndexRef.of('./indices/bright_psychology') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR
from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.psychology.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs
53K docs

Language: en

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/psychology")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI
ir_datasets export bright/psychology docs
[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/psychology')
# Index bright/psychology
indexer = pt.IterDictIndexer('./indices/bright_psychology', meta={"docno": 165})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR
from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.psychology')
for doc in dataset.iter_documents():
    print(doc)  # a single document from the AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels
692 qrels
Query relevance judgment type:
TrecQrel: (namedtuple)
  1. query_id: str
  2. doc_id: str
  3. relevance: int
  4. iteration: str

Relevance levels

Rel.    Definition                  Count    %
-100    Excluded from evaluation    0        0.0%
1       Relevant                    692      100.0%

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/psychology")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI
ir_datasets export bright/psychology qrels --format tsv
[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/psychology')
index_ref = pt.IndexRef.of('./indices/bright_psychology') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [nDCG@10]
)

You can find more details about PyTerrier experiments here.

XPM-IR
from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.psychology.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # the relevance assessments for one topic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocAssessments.

"bright/psychology-long"

Long-document variant of bright/psychology, retrieving the original full-length documents.

Official evaluation measures: Success@1

queries
101 queries

Language: en

Query type:
BrightQuery: (namedtuple)
  1. query_id: str
  2. text: str
  3. reasoning: str
  4. gold_answer: str
  5. gemini_1_0_reason: str
  6. claude_3_opus_reason: str
  7. gpt4_reason: str
  8. grit_reason: str
  9. llama3_70b_reason: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/psychology-long")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI
ir_datasets export bright/psychology-long queries
[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/psychology-long')
index_ref = pt.IndexRef.of('./indices/bright_psychology-long') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR
from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.psychology-long.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs
512 docs

Language: en

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/psychology-long")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI
ir_datasets export bright/psychology-long docs
[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/psychology-long')
# Index bright/psychology-long
indexer = pt.IterDictIndexer('./indices/bright_psychology-long', meta={"docno": 162})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR
from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.psychology-long')
for doc in dataset.iter_documents():
    print(doc)  # a single document from the AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels
116 qrels
Query relevance judgment type:
TrecQrel: (namedtuple)
  1. query_id: str
  2. doc_id: str
  3. relevance: int
  4. iteration: str

Relevance levels

Rel.    Definition                  Count    %
-100    Excluded from evaluation    0        0.0%
1       Relevant                    116      100.0%

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/psychology-long")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI
ir_datasets export bright/psychology-long qrels --format tsv
[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/psychology-long')
index_ref = pt.IndexRef.of('./indices/bright_psychology-long') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [Success@1]
)

You can find more details about PyTerrier experiments here.

XPM-IR
from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.psychology-long.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation for AdhocAssessments.


"bright/robotics"

Reasoning-intensive retrieval over robotics content sourced from StackExchange.

Official evaluation measures: nDCG@10
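
As a rough illustration of what nDCG@10 rewards, here is a minimal sketch assuming binary gains, which matches the single positive relevance level (1) in these qrels. The function name `ndcg_at_k` and its arguments are hypothetical and not part of ir_datasets, and the sketch assumes the ranked list contains no duplicate document IDs. For reported numbers, an established evaluation library is the safer choice.

```python
import math


def ndcg_at_k(ranked_doc_ids, relevant_doc_ids, k=10):
    """nDCG@k with binary gains: 1 for a judged-relevant doc, 0 otherwise."""
    relevant = set(relevant_doc_ids)
    # Discounted gain of the relevant documents found in the top k.
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked_doc_ids[:k], start=1)
        if doc_id in relevant
    )
    # Ideal ranking: all relevant documents (up to k) placed first.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


# Invented IDs: the only relevant document is ranked second.
print(ndcg_at_k(["x", "a", "y"], ["a"]))
```

A relevant document at rank 1 contributes a full gain of 1, and each lower rank is discounted logarithmically, so moving a single relevant document from rank 1 to rank 2 already lowers the score noticeably.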

queries
101 queries

Language: en

Query type:
BrightQuery: (namedtuple)
  1. query_id: str
  2. text: str
  3. reasoning: str
  4. gold_answer: str
  5. gemini_1_0_reason: str
  6. claude_3_opus_reason: str
  7. gpt4_reason: str
  8. grit_reason: str
  9. llama3_70b_reason: str
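
One plausible use of the extra fields is to swap the raw question for one of the LLM-generated reasoning texts before retrieval. The sketch below assumes that use case. `BrightQuery` is redefined locally as a stand-in with the field order listed above, `query_text` is a hypothetical helper, and the example record is invented rather than taken from the dataset.

```python
from collections import namedtuple

# Local stand-in mirroring the fields listed above (an assumption for
# illustration; the real BrightQuery type is provided by ir_datasets).
BrightQuery = namedtuple(
    "BrightQuery",
    [
        "query_id", "text", "reasoning", "gold_answer",
        "gemini_1_0_reason", "claude_3_opus_reason", "gpt4_reason",
        "grit_reason", "llama3_70b_reason",
    ],
)


def query_text(query, field="text"):
    """Return the requested field, falling back to `text` if it is empty."""
    value = getattr(query, field)
    return value if value else query.text


# Made-up record, not taken from the dataset.
example = BrightQuery(
    "q0", "Why does my PID loop oscillate?", "", "",
    "", "", "Think about gain and delay.", "", "",
)
print(query_text(example, "gpt4_reason"))  # uses the reasoning text
print(query_text(example, "grit_reason"))  # empty field, falls back to text
```

Falling back to `text` keeps every query usable even when a particular reasoning field is empty.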

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/robotics")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI
ir_datasets export bright/robotics queries
[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/robotics')
index_ref = pt.IndexRef.of('./indices/bright_robotics') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR
from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.robotics.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation for AdhocTopics.

docs
62K docs

Language: en

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/robotics")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI
ir_datasets export bright/robotics docs
[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/robotics')
# Index bright/robotics
indexer = pt.IterDictIndexer('./indices/bright_robotics', meta={"docno": 54})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.
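
The `meta={"docno": 54}` argument in the indexing example sets how many characters PyTerrier reserves for each stored document ID, so it should be at least as long as the longest `doc_id` in the corpus. If you index a filtered or modified corpus, a suitable value can be derived as in the sketch below. `GenericDoc` is a local stand-in for the type listed above, `max_doc_id_length` is a hypothetical helper, and the sample documents are invented.

```python
from collections import namedtuple

# Local stand-in for the GenericDoc type listed above.
GenericDoc = namedtuple("GenericDoc", ["doc_id", "text"])


def max_doc_id_length(docs):
    """Length in characters of the longest doc_id, or 0 for an empty corpus."""
    return max((len(doc.doc_id) for doc in docs), default=0)


# Invented documents for illustration only.
sample_docs = [
    GenericDoc("robotics/post_12.txt", "first passage"),
    GenericDoc("robotics/long_identifier_0001.txt", "second passage"),
]
docno_length = max_doc_id_length(sample_docs)
print({"docno": docno_length})  # candidate value for the `meta` argument
```

With the real dataset, the same helper could be applied to `dataset.docs_iter()` before building the index.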

XPM-IR
from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.robotics')
for doc in dataset.iter_documents():
    print(doc)  # an AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation for AdhocDocumentStore.

qrels
520 qrels
Query relevance judgment type:
TrecQrel: (namedtuple)
  1. query_id: str
  2. doc_id: str
  3. relevance: int
  4. iteration: str

Relevance levels

Rel.   Definition                 Count   %
-100   Excluded from evaluation   0       0.0%
1      Relevant                   520     100.0%
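
Since only two relevance levels are defined, a common first step is to collect the relevant document IDs per query and ignore anything marked as excluded. The following is a minimal sketch of that grouping. `TrecQrel` is redefined locally as a stand-in, `relevant_docs_by_query` is a hypothetical helper, and the sample judgments are invented.

```python
from collections import defaultdict, namedtuple

# Local stand-in for the TrecQrel type listed above.
TrecQrel = namedtuple("TrecQrel", ["query_id", "doc_id", "relevance", "iteration"])


def relevant_docs_by_query(qrels, min_relevance=1):
    """Map each query_id to the set of doc_ids judged at or above min_relevance."""
    grouped = defaultdict(set)
    for qrel in qrels:
        if qrel.relevance >= min_relevance:
            grouped[qrel.query_id].add(qrel.doc_id)
    return dict(grouped)


# Invented judgments; the -100 entry is dropped by the relevance filter.
sample_qrels = [
    TrecQrel("q1", "docA", 1, "0"),
    TrecQrel("q1", "docB", 1, "0"),
    TrecQrel("q1", "docC", -100, "0"),
    TrecQrel("q2", "docA", 1, "0"),
]
print(relevant_docs_by_query(sample_qrels))
```

With the real dataset, `dataset.qrels_iter()` can be passed in place of `sample_qrels`.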

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/robotics")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI
ir_datasets export bright/robotics qrels --format tsv
[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/robotics')
index_ref = pt.IndexRef.of('./indices/bright_robotics') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [nDCG@10]
)

You can find more details about PyTerrier experiments here.

XPM-IR
from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.robotics.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation for AdhocAssessments.


"bright/robotics-long"

Long-document variant of bright/robotics, retrieving the original full-length documents.

Official evaluation measures: Success@1
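
Success@1 asks only whether the top-ranked document is relevant. The following is a minimal sketch of that check for a single query, generalized to an arbitrary cutoff. The function name `success_at_k` is hypothetical and not part of ir_datasets, and the document IDs are invented.

```python
def success_at_k(ranked_doc_ids, relevant_doc_ids, k=1):
    """Return 1.0 if any of the top-k ranked documents is relevant, else 0.0."""
    relevant = set(relevant_doc_ids)
    return 1.0 if any(doc_id in relevant for doc_id in ranked_doc_ids[:k]) else 0.0


# Invented ranking: the relevant document sits at rank 2.
ranking = ["doc7", "doc3", "doc9"]
print(success_at_k(ranking, ["doc3"], k=1))
print(success_at_k(ranking, ["doc3"], k=2))
```

Averaging this value over all queries gives the score for a system.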

queries
101 queries

Language: en

Query type:
BrightQuery: (namedtuple)
  1. query_id: str
  2. text: str
  3. reasoning: str
  4. gold_answer: str
  5. gemini_1_0_reason: str
  6. claude_3_opus_reason: str
  7. gpt4_reason: str
  8. grit_reason: str
  9. llama3_70b_reason: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/robotics-long")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI
ir_datasets export bright/robotics-long queries
[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/robotics-long')
index_ref = pt.IndexRef.of('./indices/bright_robotics-long') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR
from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.robotics-long.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation for AdhocTopics.

docs
508 docs

Language: en

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/robotics-long")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI
ir_datasets export bright/robotics-long docs
[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/robotics-long')
# Index bright/robotics-long
indexer = pt.IterDictIndexer('./indices/bright_robotics-long', meta={"docno": 50})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR
from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.robotics-long')
for doc in dataset.iter_documents():
    print(doc)  # an AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation for AdhocDocumentStore.

qrels
106 qrels
Query relevance judgment type:
TrecQrel: (namedtuple)
  1. query_id: str
  2. doc_id: str
  3. relevance: int
  4. iteration: str

Relevance levels

Rel.   Definition                 Count   %
-100   Excluded from evaluation   0       0.0%
1      Relevant                   106     100.0%
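
The counts reported for the two robotics subsets give a quick sense of how their judgments differ: both have 101 queries, but bright/robotics lists 520 qrels while bright/robotics-long lists 106. The arithmetic below uses only those numbers to work out the average number of relevant documents per query. Reading the result as a reason for the different official measures is an interpretation, not something this page states.

```python
# Query and qrel counts as reported on this page for the two robotics subsets.
subsets = {
    "bright/robotics": {"queries": 101, "qrels": 520},
    "bright/robotics-long": {"queries": 101, "qrels": 106},
}

# Average number of relevant documents per query for each subset.
averages = {
    name: counts["qrels"] / counts["queries"] for name, counts in subsets.items()
}

for name, average in averages.items():
    print(f"{name}: {average:.2f} relevant documents per query on average")
```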

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/robotics-long")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI
ir_datasets export bright/robotics-long qrels --format tsv
[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/robotics-long')
index_ref = pt.IndexRef.of('./indices/bright_robotics-long') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [Success@1]
)

You can find more details about PyTerrier experiments here.

XPM-IR
from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.robotics-long.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation for AdhocAssessments.


"bright/stackoverflow"

Reasoning-intensive retrieval over programming content sourced from StackOverflow.

Official evaluation measures: nDCG@10

queries
117 queries

Language: en

Query type:
BrightQuery: (namedtuple)
  1. query_id: str
  2. text: str
  3. reasoning: str
  4. gold_answer: str
  5. gemini_1_0_reason: str
  6. claude_3_opus_reason: str
  7. gpt4_reason: str
  8. grit_reason: str
  9. llama3_70b_reason: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/stackoverflow")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI
ir_datasets export bright/stackoverflow queries
[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/stackoverflow')
index_ref = pt.IndexRef.of('./indices/bright_stackoverflow') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR
from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.stackoverflow.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation for AdhocTopics.

docs
107K docs

Language: en

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/stackoverflow")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI
ir_datasets export bright/stackoverflow docs
[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/stackoverflow')
# Index bright/stackoverflow
indexer = pt.IterDictIndexer('./indices/bright_stackoverflow', meta={"docno": 145})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR
from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.stackoverflow')
for doc in dataset.iter_documents():
    print(doc)  # an AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation for AdhocDocumentStore.

qrels
478 qrels
Query relevance judgment type:
TrecQrel: (namedtuple)
  1. query_id: str
  2. doc_id: str
  3. relevance: int
  4. iteration: str

Relevance levels

Rel.   Definition                 Count   %
-100   Excluded from evaluation   0       0.0%
1      Relevant                   478     100.0%

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/stackoverflow")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI
ir_datasets export bright/stackoverflow qrels --format tsv
[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.
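
If the TSV output of the export command is saved to a file, it can be read back into the same four-field structure. The sketch below assumes one judgment per line, tab-separated in the column order shown above, with no header row. `TrecQrel` is a local stand-in, `parse_qrels_tsv` is a hypothetical helper, and the sample lines are invented.

```python
import csv
from collections import namedtuple

# Local stand-in for the TrecQrel type listed above.
TrecQrel = namedtuple("TrecQrel", ["query_id", "doc_id", "relevance", "iteration"])


def parse_qrels_tsv(lines):
    """Yield TrecQrel tuples from tab-separated lines, skipping blank lines."""
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row:
            continue
        query_id, doc_id, relevance, iteration = row
        yield TrecQrel(query_id, doc_id, int(relevance), iteration)


# Invented lines standing in for the contents of an exported file.
sample_lines = ["q1\tdocA\t1\t0\n", "\n", "q2\tdocB\t1\t0\n"]
for qrel in parse_qrels_tsv(sample_lines):
    print(qrel)
```

An open file object can be passed directly, since `csv.reader` accepts any iterable of lines.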

PyTerrier
import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/stackoverflow')
index_ref = pt.IndexRef.of('./indices/bright_stackoverflow') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [nDCG@10]
)

You can find more details about PyTerrier experiments here.

XPM-IR
from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.stackoverflow.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation for AdhocAssessments.


"bright/stackoverflow-long"

Long-document variant of bright/stackoverflow, retrieving the original full-length documents.

Official evaluation measures: Success@1

queries
117 queries

Language: en

Query type:
BrightQuery: (namedtuple)
  1. query_id: str
  2. text: str
  3. reasoning: str
  4. gold_answer: str
  5. gemini_1_0_reason: str
  6. claude_3_opus_reason: str
  7. gpt4_reason: str
  8. grit_reason: str
  9. llama3_70b_reason: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/stackoverflow-long")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI
ir_datasets export bright/stackoverflow-long queries
[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/stackoverflow-long')
index_ref = pt.IndexRef.of('./indices/bright_stackoverflow-long') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR
from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.stackoverflow-long.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation for AdhocTopics.

docs
1.9K docs

Language: en

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/stackoverflow-long")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI
ir_datasets export bright/stackoverflow-long docs
[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/stackoverflow-long')
# Index bright/stackoverflow-long
indexer = pt.IterDictIndexer('./indices/bright_stackoverflow-long', meta={"docno": 140})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR
from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.stackoverflow-long')
for doc in dataset.iter_documents():
    print(doc)  # an AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation for AdhocDocumentStore.

qrels
129 qrels
Query relevance judgment type:
TrecQrel: (namedtuple)
  1. query_id: str
  2. doc_id: str
  3. relevance: int
  4. iteration: str

Relevance levels

Rel.   Definition                 Count   %
-100   Excluded from evaluation   0       0.0%
1      Relevant                   129     100.0%

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/stackoverflow-long")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI
ir_datasets export bright/stackoverflow-long qrels --format tsv
[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/stackoverflow-long')
index_ref = pt.IndexRef.of('./indices/bright_stackoverflow-long') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [Success@1]
)

You can find more details about PyTerrier experiments here.

XPM-IR
from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.stackoverflow-long.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation for AdhocAssessments.


"bright/sustainable-living"

Reasoning-intensive retrieval over sustainable living content sourced from StackExchange.

Official evaluation measures: nDCG@10

queries
108 queries

Language: en

Query type:
BrightQuery: (namedtuple)
  1. query_id: str
  2. text: str
  3. reasoning: str
  4. gold_answer: str
  5. gemini_1_0_reason: str
  6. claude_3_opus_reason: str
  7. gpt4_reason: str
  8. grit_reason: str
  9. llama3_70b_reason: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/sustainable-living")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI
ir_datasets export bright/sustainable-living queries
[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/sustainable-living')
index_ref = pt.IndexRef.of('./indices/bright_sustainable-living') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR
from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.sustainable-living.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation for AdhocTopics.

docs
61K docs

Language: en

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/sustainable-living")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.
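
With roughly 61K short passages, this corpus is small enough that an in-memory lookup from `doc_id` to text is a reasonable way to inspect judged documents. The sketch below assumes that approach. `GenericDoc` is a local stand-in for the type listed above, `build_doc_lookup` is a hypothetical helper, and the sample documents are invented.

```python
from collections import namedtuple

# Local stand-in for the GenericDoc type listed above.
GenericDoc = namedtuple("GenericDoc", ["doc_id", "text"])


def build_doc_lookup(docs):
    """Map each doc_id to its text, rejecting duplicate IDs."""
    lookup = {}
    for doc in docs:
        if doc.doc_id in lookup:
            raise ValueError(f"duplicate doc_id: {doc.doc_id}")
        lookup[doc.doc_id] = doc.text
    return lookup


# Invented documents for illustration only.
sample_docs = [
    GenericDoc("d1", "Composting basics for small apartments."),
    GenericDoc("d2", "Comparing insulation materials by lifetime impact."),
]
lookup = build_doc_lookup(sample_docs)
print(lookup["d2"])
```

With the real dataset, `dataset.docs_iter()` can be passed in place of `sample_docs`.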

CLI
ir_datasets export bright/sustainable-living docs
[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/sustainable-living')
# Index bright/sustainable-living
indexer = pt.IterDictIndexer('./indices/bright_sustainable-living', meta={"docno": 260})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR
from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.sustainable-living')
for doc in dataset.iter_documents():
    print(doc)  # an AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation for AdhocDocumentStore.

qrels
576 qrels
Query relevance judgment type:
TrecQrel: (namedtuple)
  1. query_id: str
  2. doc_id: str
  3. relevance: int
  4. iteration: str

Relevance levels

Rel.   Definition                 Count   %
-100   Excluded from evaluation   0       0.0%
1      Relevant                   576     100.0%

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/sustainable-living")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI
ir_datasets export bright/sustainable-living qrels --format tsv
[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/sustainable-living')
index_ref = pt.IndexRef.of('./indices/bright_sustainable-living') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [nDCG@10]
)

You can find more details about PyTerrier experiments here.

XPM-IR
from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.sustainable-living.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation for AdhocAssessments.


"bright/sustainable-living-long"

Long-document variant of bright/sustainable-living, retrieving the original full-length documents.

Official evaluation measures: Success@1

queries
108 queries

Language: en

Query type:
BrightQuery: (namedtuple)
  1. query_id: str
  2. text: str
  3. reasoning: str
  4. gold_answer: str
  5. gemini_1_0_reason: str
  6. claude_3_opus_reason: str
  7. gpt4_reason: str
  8. grit_reason: str
  9. llama3_70b_reason: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/sustainable-living-long")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI
ir_datasets export bright/sustainable-living-long queries
[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/sustainable-living-long')
index_ref = pt.IndexRef.of('./indices/bright_sustainable-living-long') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR
from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.sustainable-living-long.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation for AdhocTopics.

docs
554 docs

Language: en

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/sustainable-living-long")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI
ir_datasets export bright/sustainable-living-long docs
[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/sustainable-living-long')
# Index bright/sustainable-living-long
indexer = pt.IterDictIndexer('./indices/bright_sustainable-living-long', meta={"docno": 257})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR
from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.sustainable-living-long')
for doc in dataset.iter_documents():
    print(doc)  # an AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation for AdhocDocumentStore.

qrels
129 qrels
Query relevance judgment type:
TrecQrel: (namedtuple)
  1. query_id: str
  2. doc_id: str
  3. relevance: int
  4. iteration: str

Relevance levels

Rel.   Definition                 Count   %
-100   Excluded from evaluation   0       0.0%
1      Relevant                   129     100.0%

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/sustainable-living-long")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI
ir_datasets export bright/sustainable-living-long qrels --format tsv
[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/sustainable-living-long')
index_ref = pt.IndexRef.of('./indices/bright_sustainable-living-long') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [Success@1]
)

You can find more details about PyTerrier experiments here.

XPM-IR
from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.sustainable-living-long.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation for AdhocAssessments.


"bright/theoremqa-questions"

TheoremQA questions, where relevance requires retrieving questions that rely on the same underlying theorem.

Official evaluation measures: nDCG@10
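
As a reference for what nDCG@10 measures, the following is a minimal sketch of one common formulation: linear gain with a log2 rank discount, normalised by the ideal ranking of the judged relevant documents. This is an illustrative assumption rather than the exact implementation used by any particular evaluation library. Negative relevance values are clamped to zero gain here, and the sample IDs are made up.

```python
import math
from typing import Dict, List


def dcg(gains: List[float]) -> float:
    """Discounted cumulative gain with a log2(rank + 1) discount (ranks start at 1)."""
    return sum(gain / math.log2(rank + 2) for rank, gain in enumerate(gains))


def ndcg_at_k(ranking: List[str], judged: Dict[str, int], k: int = 10) -> float:
    """nDCG@k for a single query; unjudged documents contribute zero gain."""
    gains = [max(judged.get(doc_id, 0), 0) for doc_id in ranking[:k]]
    ideal = sorted((rel for rel in judged.values() if rel > 0), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical ranking in which the two relevant documents sit at ranks 2 and 4.
ranking = ["d3", "d1", "d7", "d2"]
judged = {"d1": 1, "d2": 1}
score = ndcg_at_k(ranking, judged)
print(round(score, 4))
```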

queries
194 queries

Language: en

Query type:
BrightQuery: (namedtuple)
  1. query_id: str
  2. text: str
  3. reasoning: str
  4. gold_answer: str
  5. gemini_1_0_reason: str
  6. claude_3_opus_reason: str
  7. gpt4_reason: str
  8. grit_reason: str
  9. llama3_70b_reason: str
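
One way the extra fields might be used is to append an LLM-generated reasoning trace to the query text before retrieval. The sketch below shows that idea only. The `BrightQuery` class is a local stand-in with the nine fields listed above, the `augmented_text` helper is hypothetical, simple concatenation is an assumed strategy rather than a documented one, and the sample values are invented.

```python
from typing import NamedTuple


class BrightQuery(NamedTuple):
    # Local stand-in mirroring the nine fields listed above.
    query_id: str
    text: str
    reasoning: str
    gold_answer: str
    gemini_1_0_reason: str
    claude_3_opus_reason: str
    gpt4_reason: str
    grit_reason: str
    llama3_70b_reason: str


def augmented_text(query: BrightQuery, reason_field: str = "gpt4_reason") -> str:
    """Append the chosen reasoning field to the query text, if it is non-empty."""
    reason = getattr(query, reason_field).strip()
    return f"{query.text}\n{reason}" if reason else query.text


# Invented example values; real queries come from dataset.queries_iter().
example = BrightQuery(
    query_id="0",
    text="What is 2 + 2?",
    reasoning="",
    gold_answer="4",
    gemini_1_0_reason="",
    claude_3_opus_reason="",
    gpt4_reason="Add the two integers.",
    grit_reason="",
    llama3_70b_reason="",
)
expanded = augmented_text(example)
print(expanded)
```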

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/theoremqa-questions")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI
ir_datasets export bright/theoremqa-questions queries
[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/theoremqa-questions')
index_ref = pt.IndexRef.of('./indices/bright_theoremqa-questions') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR
from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.theoremqa-questions.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs
188K docs

Language: en

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/theoremqa-questions")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI
ir_datasets export bright/theoremqa-questions docs
[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/theoremqa-questions')
# Index bright/theoremqa-questions
indexer = pt.IterDictIndexer('./indices/bright_theoremqa-questions', meta={"docno": 62})
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR
from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.theoremqa-questions')
for doc in dataset.iter_documents():
    print(doc)  # an AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels
607K qrels
Query relevance judgment type:
TrecQrel: (namedtuple)
  1. query_id: str
  2. doc_id: str
  3. relevance: int
  4. iteration: str

Relevance levels

Rel.    Definition                  Count    %
-100    Excluded from evaluation    607K     99.9%
1       Relevant                    617      0.1%
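
The table does not say how a scorer should treat the -100 level. One plausible reading, assumed here, is that a document judged -100 for a query is removed from that query's ranking before any measure is computed, rather than being counted as non-relevant. A minimal sketch of that filtering step follows. The `EXCLUDED` constant and `drop_excluded` helper are hypothetical names, and the IDs are made up.

```python
from typing import Dict, List

EXCLUDED = -100  # the level labelled "Excluded from evaluation" in the table above


def drop_excluded(ranking: List[str], judged: Dict[str, int]) -> List[str]:
    """Remove documents marked as excluded for this query, keeping rank order."""
    return [doc_id for doc_id in ranking if judged.get(doc_id) != EXCLUDED]


# Hypothetical judgments and ranking for a single query.
judged_for_query = {"d1": 1, "d5": EXCLUDED}
ranking = ["d5", "d9", "d1"]
filtered = drop_excluded(ranking, judged_for_query)
print(filtered)
```

Under this assumption the relevant document moves from rank 3 to rank 2 once the excluded document is dropped, which changes rank-sensitive measures such as nDCG@10.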

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/theoremqa-questions")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI
ir_datasets export bright/theoremqa-questions qrels --format tsv
[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/theoremqa-questions')
index_ref = pt.IndexRef.of('./indices/bright_theoremqa-questions') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [nDCG@10]
)

You can find more details about PyTerrier experiments here.

XPM-IR
from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.theoremqa-questions.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # the relevance assessments for one topic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocAssessments.


"bright/theoremqa-theorems"

TheoremQA theorems, where relevance requires retrieving the theorem(s) needed to answer the query.

Official evaluation measures: nDCG@10

queries
76 queries

Language: en

Query type:
BrightQuery: (namedtuple)
  1. query_id: str
  2. text: str
  3. reasoning: str
  4. gold_answer: str
  5. gemini_1_0_reason: str
  6. claude_3_opus_reason: str
  7. gpt4_reason: str
  8. grit_reason: str
  9. llama3_70b_reason: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/theoremqa-theorems")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text, reasoning, gold_answer, gemini_1_0_reason, claude_3_opus_reason, gpt4_reason, grit_reason, llama3_70b_reason>

You can find more details about the Python API here.

CLI
ir_datasets export bright/theoremqa-theorems queries
[query_id]    [text]    [reasoning]    [gold_answer]    [gemini_1_0_reason]    [claude_3_opus_reason]    [gpt4_reason]    [grit_reason]    [llama3_70b_reason]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/theoremqa-theorems')
index_ref = pt.IndexRef.of('./indices/bright_theoremqa-theorems') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pipeline(dataset.get_topics('text'))

You can find more details about PyTerrier retrieval here.

XPM-IR
from datamaestro import prepare_dataset
topics = prepare_dataset('irds.bright.theoremqa-theorems.queries')  # AdhocTopics
for topic in topics.iter():
    print(topic)  # An AdhocTopic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocTopics.

docs
24K docs

Language: en

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/theoremqa-theorems")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.

CLI
ir_datasets export bright/theoremqa-theorems docs
[doc_id]    [text]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
pt.init()
dataset = pt.get_dataset('irds:bright/theoremqa-theorems')
# Index bright/theoremqa-theorems
indexer = pt.IterDictIndexer('./indices/bright_theoremqa-theorems')
index_ref = indexer.index(dataset.get_corpus_iter(), fields=['text'])

You can find more details about PyTerrier indexing here.

XPM-IR
from datamaestro import prepare_dataset
dataset = prepare_dataset('irds.bright.theoremqa-theorems')
for doc in dataset.iter_documents():
    print(doc)  # an AdhocDocumentStore
    break

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocDocumentStore.

qrels
151 qrels
Query relevance judgment type:
TrecQrel: (namedtuple)
  1. query_id: str
  2. doc_id: str
  3. relevance: int
  4. iteration: str

Relevance levels

Rel.    Definition                  Count    %
-100    Excluded from evaluation    0        0.0%
1       Relevant                    151      100.0%

Examples:

Python API
import ir_datasets
dataset = ir_datasets.load("bright/theoremqa-theorems")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

You can find more details about the Python API here.

CLI
ir_datasets export bright/theoremqa-theorems qrels --format tsv
[query_id]    [doc_id]    [relevance]    [iteration]
...

You can find more details about the CLI here.

PyTerrier
import pyterrier as pt
from pyterrier.measures import *
pt.init()
dataset = pt.get_dataset('irds:bright/theoremqa-theorems')
index_ref = pt.IndexRef.of('./indices/bright_theoremqa-theorems') # assumes you have already built an index
pipeline = pt.BatchRetrieve(index_ref, wmodel='BM25')
# (optionally other pipeline components)
pt.Experiment(
    [pipeline],
    dataset.get_topics('text'),
    dataset.get_qrels(),
    [nDCG@10]
)

You can find more details about PyTerrier experiments here.

XPM-IR
from datamaestro import prepare_dataset
qrels = prepare_dataset('irds.bright.theoremqa-theorems.qrels')  # AdhocAssessments
for topic_qrels in qrels.iter():
    print(topic_qrels)  # the relevance assessments for one topic

This example requires that experimaestro-ir be installed. For more information about the returned object, see the documentation about AdhocAssessments.
