GitHub: datasets/trec_robust04.py

ir_datasets: TREC Robust 2004

Index
  1. trec-robust04
  2. trec-robust04/fold1
  3. trec-robust04/fold2
  4. trec-robust04/fold3
  5. trec-robust04/fold4
  6. trec-robust04/fold5

"trec-robust04"

The TREC Robust retrieval task focuses on "improving the consistency of retrieval technology by focusing on poorly performing topics."

The TREC Robust document collection comes from TREC disks 4 and 5. Because the documents are copyrighted, the collection is available for research use only, and agreements must be filed with NIST to obtain it. See details here.

queries

Language: en

Query type:
TrecQuery: (namedtuple)
  1. query_id: str
  2. title: str
  3. description: str
  4. narrative: str

Example

import ir_datasets
dataset = ir_datasets.load('trec-robust04')
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, description, narrative>
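
Retrieval experiments often use only one of the query fields, such as the short title or the longer description. The sketch below maps each query_id to one field's text. The helper name `queries_by_id` and the sample records are hypothetical and not part of ir_datasets; the `TrecQuery` stand-in only mirrors the fields listed above so the code runs without the licensed collection.

```python
from collections import namedtuple

# Stand-in with the same fields as the TrecQuery type documented above.
TrecQuery = namedtuple("TrecQuery", ["query_id", "title", "description", "narrative"])


def queries_by_id(queries, field="title"):
    """Map query_id -> whitespace-normalized text of one query field."""
    if field not in ("title", "description", "narrative"):
        raise ValueError(f"unknown query field: {field}")
    return {q.query_id: " ".join(getattr(q, field).split()) for q in queries}


# Made-up records for illustration only; they are not real TREC topics.
sample = [
    TrecQuery("q1", "example title", "An example\n description.", "Example narrative."),
    TrecQuery("q2", "another title", "Another description.", "Another narrative."),
]
titles = queries_by_id(sample)
descriptions = queries_by_id(sample, field="description")
```

With the real data, `dataset.queries_iter()` would presumably take the place of `sample`.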

docs

Language: en

Document type:
TrecDoc: (namedtuple)
  1. doc_id: str
  2. text: str
  3. marked_up_doc: str

Example

import ir_datasets
dataset = ir_datasets.load('trec-robust04')
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text, marked_up_doc>

qrels

Query relevance judgment type:
TrecQrel: (namedtuple)
  1. query_id: str
  2. doc_id: str
  3. relevance: int
  4. iteration: str

Relevance levels

Rel.  Definition
0     not relevant
1     relevant
2     highly relevant
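
Because the judgments are graded, it can help to check how many fall at each level and to decide explicitly where the binary "relevant" cutoff sits. The following is a minimal sketch under that assumption. `count_by_level`, `is_relevant`, and the sample judgments are hypothetical, and the `TrecQrel` stand-in only mirrors the fields listed above.

```python
from collections import Counter, namedtuple

# Stand-in with the same fields as the TrecQrel type documented above.
TrecQrel = namedtuple("TrecQrel", ["query_id", "doc_id", "relevance", "iteration"])

# Labels copied from the relevance levels table.
RELEVANCE_LABELS = {0: "not relevant", 1: "relevant", 2: "highly relevant"}


def count_by_level(qrels):
    """Count judgments per relevance label (raises KeyError on an unlisted level)."""
    return dict(Counter(RELEVANCE_LABELS[q.relevance] for q in qrels))


def is_relevant(qrel, min_relevance=1):
    """Binarize a graded judgment; min_relevance=2 keeps only 'highly relevant'."""
    return qrel.relevance >= min_relevance


# Made-up judgments for illustration only.
sample = [
    TrecQrel("q1", "d1", 0, "0"),
    TrecQrel("q1", "d2", 1, "0"),
    TrecQrel("q1", "d3", 2, "0"),
    TrecQrel("q2", "d1", 1, "0"),
]
level_counts = count_by_level(sample)
binary = [is_relevant(q) for q in sample]
```

The default cutoff of 1 is a choice made for this sketch; some evaluations may prefer to count only level 2 as relevant.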

Example

import ir_datasets
dataset = ir_datasets.load('trec-robust04')
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>
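
Evaluation code often wants judgments grouped by query rather than as a flat stream. As a rough sketch, the hypothetical helper `qrels_to_dict` below collects them into a nested {query_id: {doc_id: relevance}} mapping. The `TrecQrel` stand-in and the sample judgments are illustrative and not taken from the dataset.

```python
from collections import defaultdict, namedtuple

# Stand-in with the same fields as the TrecQrel type documented above.
TrecQrel = namedtuple("TrecQrel", ["query_id", "doc_id", "relevance", "iteration"])


def qrels_to_dict(qrels):
    """Group judgments as {query_id: {doc_id: relevance}}."""
    grouped = defaultdict(dict)
    for qrel in qrels:
        grouped[qrel.query_id][qrel.doc_id] = qrel.relevance
    return dict(grouped)


# Made-up judgments for illustration only.
sample = [
    TrecQrel("q1", "d1", 0, "0"),
    TrecQrel("q1", "d2", 2, "0"),
    TrecQrel("q2", "d1", 1, "0"),
]
judged = qrels_to_dict(sample)
```

With the real data, `dataset.qrels_iter()` would presumably take the place of `sample`. The iteration field is dropped here because the sketch only needs the relevance grade.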

Citation

bibtex:
@inproceedings{Voorhees2004OverviewRobust,
  title={Overview of the TREC 2004 Robust Retrieval Track},
  author={Ellen Voorhees},
  booktitle={TREC},
  year={2004}
}

"trec-robust04/fold1"

Fold 1 of 5, as used in various works.
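
One plausible way to use the five folds is cross-validation: hold out one fold for testing and train or tune on the other four. This is an assumption about usage, not something this page prescribes. The sketch below only builds the dataset ID splits; the helper name `cross_validation_splits` is hypothetical.

```python
# Dataset IDs as listed in the index at the top of this page.
FOLD_IDS = [f"trec-robust04/fold{i}" for i in range(1, 6)]


def cross_validation_splits(fold_ids):
    """Yield (train_ids, test_id) pairs, holding out each fold exactly once."""
    for held_out in fold_ids:
        train = [fold for fold in fold_ids if fold != held_out]
        yield train, held_out


splits = list(cross_validation_splits(FOLD_IDS))
```

Each ID in a split can then be passed to `ir_datasets.load(...)`, as in the examples on this page.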

queries

Language: en

Query type:
TrecQuery: (namedtuple)
  1. query_id: str
  2. title: str
  3. description: str
  4. narrative: str

Example

import ir_datasets
dataset = ir_datasets.load('trec-robust04/fold1')
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, description, narrative>

docs

Language: en

Document type:
TrecDoc: (namedtuple)
  1. doc_id: str
  2. text: str
  3. marked_up_doc: str

Example

import ir_datasets
dataset = ir_datasets.load('trec-robust04/fold1')
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text, marked_up_doc>

qrels

Query relevance judgment type:
TrecQrel: (namedtuple)
  1. query_id: str
  2. doc_id: str
  3. relevance: int
  4. iteration: str

Relevance levels

Rel.  Definition
0     not relevant
1     relevant
2     highly relevant

Example

import ir_datasets
dataset = ir_datasets.load('trec-robust04/fold1')
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

"trec-robust04/fold2"

Fold 2 of 5, as used in various works.

queries

Language: en

Query type:
TrecQuery: (namedtuple)
  1. query_id: str
  2. title: str
  3. description: str
  4. narrative: str

Example

import ir_datasets
dataset = ir_datasets.load('trec-robust04/fold2')
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, description, narrative>

docs

Language: en

Document type:
TrecDoc: (namedtuple)
  1. doc_id: str
  2. text: str
  3. marked_up_doc: str

Example

import ir_datasets
dataset = ir_datasets.load('trec-robust04/fold2')
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text, marked_up_doc>

qrels

Query relevance judgment type:
TrecQrel: (namedtuple)
  1. query_id: str
  2. doc_id: str
  3. relevance: int
  4. iteration: str

Relevance levels

Rel.  Definition
0     not relevant
1     relevant
2     highly relevant

Example

import ir_datasets
dataset = ir_datasets.load('trec-robust04/fold2')
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

"trec-robust04/fold3"

Fold 3 of 5, as used in various works.

queries

Language: en

Query type:
TrecQuery: (namedtuple)
  1. query_id: str
  2. title: str
  3. description: str
  4. narrative: str

Example

import ir_datasets
dataset = ir_datasets.load('trec-robust04/fold3')
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, description, narrative>

docs

Language: en

Document type:
TrecDoc: (namedtuple)
  1. doc_id: str
  2. text: str
  3. marked_up_doc: str

Example

import ir_datasets
dataset = ir_datasets.load('trec-robust04/fold3')
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text, marked_up_doc>

qrels

Query relevance judgment type:
TrecQrel: (namedtuple)
  1. query_id: str
  2. doc_id: str
  3. relevance: int
  4. iteration: str

Relevance levels

Rel.  Definition
0     not relevant
1     relevant
2     highly relevant

Example

import ir_datasets
dataset = ir_datasets.load('trec-robust04/fold3')
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

"trec-robust04/fold4"

Fold 4 of 5, as used in various works.

queries

Language: en

Query type:
TrecQuery: (namedtuple)
  1. query_id: str
  2. title: str
  3. description: str
  4. narrative: str

Example

import ir_datasets
dataset = ir_datasets.load('trec-robust04/fold4')
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, description, narrative>

docs

Language: en

Document type:
TrecDoc: (namedtuple)
  1. doc_id: str
  2. text: str
  3. marked_up_doc: str

Example

import ir_datasets
dataset = ir_datasets.load('trec-robust04/fold4')
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text, marked_up_doc>

qrels

Query relevance judgment type:
TrecQrel: (namedtuple)
  1. query_id: str
  2. doc_id: str
  3. relevance: int
  4. iteration: str

Relevance levels

Rel.  Definition
0     not relevant
1     relevant
2     highly relevant

Example

import ir_datasets
dataset = ir_datasets.load('trec-robust04/fold4')
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>

"trec-robust04/fold5"

Fold 5 of 5, as used in various works.

queries

Language: en

Query type:
TrecQuery: (namedtuple)
  1. query_id: str
  2. title: str
  3. description: str
  4. narrative: str

Example

import ir_datasets
dataset = ir_datasets.load('trec-robust04/fold5')
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, description, narrative>

docs

Language: en

Document type:
TrecDoc: (namedtuple)
  1. doc_id: str
  2. text: str
  3. marked_up_doc: str

Example

import ir_datasets
dataset = ir_datasets.load('trec-robust04/fold5')
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text, marked_up_doc>

qrels

Query relevance judgment type:
TrecQrel: (namedtuple)
  1. query_id: str
  2. doc_id: str
  3. relevance: int
  4. iteration: str

Relevance levels

Rel.  Definition
0     not relevant
1     relevant
2     highly relevant

Example

import ir_datasets
dataset = ir_datasets.load('trec-robust04/fold5')
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>