ir_datasets: CODEC
CODEC Document Ranking sub-task.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("codec")
for query in dataset.queries_iter():
    query  # namedtuple<query_id, query, narrative>
You can find more details about the Python API here.
ir_datasets export codec queries
[query_id] [query] [narrative]
...
You can find more details about the CLI here.
No example available for PyTerrier
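If the CLI export is saved to a file, it can be read back into records with the same three fields. The following is a minimal sketch, assuming the export is tab-separated with one query per line; the `Query` class and `read_queries_tsv` helper are hypothetical names introduced here, and the sample rows are made-up placeholders rather than real CODEC topics.

```python
import csv
import io
from typing import Iterator, NamedTuple


class Query(NamedTuple):
    # Field names follow the namedtuple shown in the Python example.
    query_id: str
    query: str
    narrative: str


def read_queries_tsv(handle) -> Iterator[Query]:
    """Yield one Query per tab-separated line, skipping blank lines."""
    # QUOTE_NONE keeps quote characters inside the text untouched.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row:
            continue
        query_id, query, narrative = row
        yield Query(query_id, query, narrative)


# Made-up placeholder rows (assumption), NOT real CODEC topics.
sample = io.StringIO(
    "q-001\texample query text\texample narrative text\n"
    "q-002\tanother query\tanother narrative\n"
)
queries = list(read_queries_tsv(sample))
print(queries[0].query_id, len(queries))
```

A row that does not have exactly three columns raises a `ValueError` here, which is a deliberate choice for surfacing malformed lines early.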
Relevance levels
Rel. | Definition | Count | % |
---|---|---|---|
0 | Not Relevant. Not useful or on topic. | 2.1K | 40.4% |
1 | Not Valuable. Consists of definitions or background. | 1.8K | 35.5% |
2 | Somewhat Valuable. Includes valuable topic-specific arguments, evidence, or knowledge. | 924 | 18.0% |
3 | Very Valuable. Includes central topic-specific arguments, evidence, or knowledge. This does not include general definitions or background. | 312 | 6.1% |
Examples:
import ir_datasets
dataset = ir_datasets.load("codec")
for qrel in dataset.qrels_iter():
    qrel  # namedtuple<query_id, doc_id, relevance, iteration>
You can find more details about the Python API here.
ir_datasets export codec qrels --format tsv
[query_id] [doc_id] [relevance] [iteration]
...
You can find more details about the CLI here.
No example available for PyTerrier
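Evaluation code often wants judgments keyed by query rather than as a flat stream. The sketch below groups any iterable of qrel records into a nested dictionary; the `Qrel` stand-in and the `group_qrels` helper are hypothetical names introduced for illustration, and the sample judgments are made-up placeholders rather than real CODEC data.

```python
from collections import defaultdict, namedtuple
from typing import Dict, Iterable

# Stand-in record with the same fields as the qrels namedtuple shown above.
Qrel = namedtuple("Qrel", ["query_id", "doc_id", "relevance", "iteration"])


def group_qrels(qrels: Iterable) -> Dict[str, Dict[str, int]]:
    """Return {query_id: {doc_id: relevance}} from an iterable of qrels."""
    grouped = defaultdict(dict)
    for qrel in qrels:
        grouped[qrel.query_id][qrel.doc_id] = qrel.relevance
    return dict(grouped)


# Made-up placeholder judgments (assumption), NOT real CODEC data.
sample = [
    Qrel("q-001", "doc-a", 3, "0"),
    Qrel("q-001", "doc-b", 0, "0"),
    Qrel("q-002", "doc-a", 1, "0"),
]
by_query = group_qrels(sample)
print(by_query["q-001"])  # {'doc-a': 3, 'doc-b': 0}
```

Because the helper only reads the `query_id`, `doc_id`, and `relevance` attributes, it should accept the records yielded by `dataset.qrels_iter()` as well, assuming they expose those fields as shown above.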
{ "queries": { "count": 36 }, "qrels": { "count": 5130, "fields": { "relevance": { "counts_by_value": { "0": 2075, "1": 1819, "2": 924, "3": 312 } } } } }
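The percentage column of the relevance table can be derived directly from this metadata. As a small worked example, the sketch below parses the counts for `codec`, checks that they add up to the total, and computes each level's share; the variable names are illustrative only.

```python
import json

# Metadata for "codec", copied from the block above.
metadata = json.loads(
    '{"queries": {"count": 36}, "qrels": {"count": 5130, "fields": '
    '{"relevance": {"counts_by_value": '
    '{"0": 2075, "2": 924, "1": 1819, "3": 312}}}}}'
)

total = metadata["qrels"]["count"]
counts = metadata["qrels"]["fields"]["relevance"]["counts_by_value"]

# The per-level counts should account for every judgment.
assert sum(counts.values()) == total

# Share of each relevance level, in ascending order of level.
shares = {int(level): 100 * n / total for level, n in sorted(counts.items())}
for level, pct in shares.items():
    print(f"{level}: {pct:.1f}%")  # 40.4%, 35.5%, 18.0%, 6.1%

# Average number of judgments per query.
per_query = total / metadata["queries"]["count"]
print(f"{per_query:.1f} judgments per query")  # 142.5
```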
Subset of codec that only contains topics about economics.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("codec/economics")
for query in dataset.queries_iter():
    query  # namedtuple<query_id, query, narrative>
You can find more details about the Python API here.
ir_datasets export codec/economics queries
[query_id] [query] [narrative]
...
You can find more details about the CLI here.
No example available for PyTerrier
Relevance levels
Rel. | Definition | Count | % |
---|---|---|---|
0 | Not Relevant. Not useful or on topic. | 596 | 37.4% |
1 | Not Valuable. Consists of definitions or background. | 545 | 34.2% |
2 | Somewhat Valuable. Includes valuable topic-specific arguments, evidence, or knowledge. | 330 | 20.7% |
3 | Very Valuable. Includes central topic-specific arguments, evidence, or knowledge. This does not include general definitions or background. | 121 | 7.6% |
Examples:
import ir_datasets
dataset = ir_datasets.load("codec/economics")
for qrel in dataset.qrels_iter():
    qrel  # namedtuple<query_id, doc_id, relevance>
You can find more details about the Python API here.
ir_datasets export codec/economics qrels --format tsv
[query_id] [doc_id] [relevance]
...
You can find more details about the CLI here.
No example available for PyTerrier
{ "queries": { "count": 12 }, "qrels": { "count": 1592, "fields": { "relevance": { "counts_by_value": { "0": 596, "1": 545, "2": 330, "3": 121 } } } } }
Subset of codec that only contains topics about history.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("codec/history")
for query in dataset.queries_iter():
    query  # namedtuple<query_id, query, narrative>
You can find more details about the Python API here.
ir_datasets export codec/history queries
[query_id] [query] [narrative]
...
You can find more details about the CLI here.
No example available for PyTerrier
Relevance levels
Rel. | Definition | Count | % |
---|---|---|---|
0 | Not Relevant. Not useful or on topic. | 870 | 51.3% |
1 | Not Valuable. Consists of definitions or background. | 509 | 30.0% |
2 | Somewhat Valuable. Includes valuable topic-specific arguments, evidence, or knowledge. | 235 | 13.9% |
3 | Very Valuable. Includes central topic-specific arguments, evidence, or knowledge. This does not include general definitions or background. | 81 | 4.8% |
Examples:
import ir_datasets
dataset = ir_datasets.load("codec/history")
for qrel in dataset.qrels_iter():
    qrel  # namedtuple<query_id, doc_id, relevance>
You can find more details about the Python API here.
ir_datasets export codec/history qrels --format tsv
[query_id] [doc_id] [relevance]
...
You can find more details about the CLI here.
No example available for PyTerrier
{ "queries": { "count": 12 }, "qrels": { "count": 1695, "fields": { "relevance": { "counts_by_value": { "0": 870, "1": 509, "2": 235, "3": 81 } } } } }
Subset of codec that only contains topics about politics.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("codec/politics")
for query in dataset.queries_iter():
    query  # namedtuple<query_id, query, narrative>
You can find more details about the Python API here.
ir_datasets export codec/politics queries
[query_id] [query] [narrative]
...
You can find more details about the CLI here.
No example available for PyTerrier
Relevance levels
Rel. | Definition | Count | % |
---|---|---|---|
0 | Not Relevant. Not useful or on topic. | 609 | 33.0% |
1 | Not Valuable. Consists of definitions or background. | 765 | 41.5% |
2 | Somewhat Valuable. Includes valuable topic-specific arguments, evidence, or knowledge. | 359 | 19.5% |
3 | Very Valuable. Includes central topic-specific arguments, evidence, or knowledge. This does not include general definitions or background. | 110 | 6.0% |
Examples:
import ir_datasets
dataset = ir_datasets.load("codec/politics")
for qrel in dataset.qrels_iter():
    qrel  # namedtuple<query_id, doc_id, relevance>
You can find more details about the Python API here.
ir_datasets export codec/politics qrels --format tsv
[query_id] [doc_id] [relevance]
...
You can find more details about the CLI here.
No example available for PyTerrier
{ "queries": { "count": 12 }, "qrels": { "count": 1843, "fields": { "relevance": { "counts_by_value": { "0": 609, "1": 765, "2": 359, "3": 110 } } } } }
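Since the three topic subsets partition `codec`, their counts should add up to the full dataset's counts. The following sketch checks this using only the numbers from the metadata blocks above; the variable names are illustrative.

```python
from collections import Counter

# Per-level judgment counts copied from the metadata blocks above.
full = Counter({0: 2075, 1: 1819, 2: 924, 3: 312})
subsets = {
    "codec/economics": Counter({0: 596, 1: 545, 2: 330, 3: 121}),
    "codec/history": Counter({0: 870, 1: 509, 2: 235, 3: 81}),
    "codec/politics": Counter({0: 609, 1: 765, 2: 359, 3: 110}),
}

# Adding Counters sums the counts level by level.
combined = sum(subsets.values(), Counter())
print(combined == full)  # True

# Query counts: 12 per subset, 36 in the full dataset.
subset_queries = {name: 12 for name in subsets}
print(sum(subset_queries.values()) == 36)  # True
```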