Github: allenai/ir_datasets

ir_measures & ir_datasets

ir-measures is a Python package that interfaces with several IR evaluation tools, including pytrec_eval, gdeval, trectools, and others.

To get started with ir-measures, see this guide.

Basic Usage

ir-measures accepts qrels provided by ir_datasets directly in its Python API.

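import ir_datasets
import ir_measures
from ir_measures import nDCG, P, Judged  # measure objects used below

# Relevance judgments come straight from ir_datasets
qrels = ir_datasets.load('trec-robust04').qrels_iter()
# System output in TREC run format
run = ir_measures.read_trec_run('path/to/run')
ir_measures.calc_aggregate([nDCG@10, P@5, P(rel=2)@5, Judged@10], qrels, run)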

{
  nDCG@10: 0.3793,
  P@5: 0.4185,
  P(rel=2)@5: 0.0803,
  Judged@10: 0.9628
}
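
To make the measure names above more concrete, here is a minimal pure-Python sketch of what P@k, P(rel=r)@k, and Judged@k compute for a single query. It is an illustration only, not how ir-measures implements them: the helper names (`top_k`, `precision_at_k`, `judged_at_k`) and the toy data are assumptions, ties in scores are not handled the way the evaluation tools handle them, and the library additionally averages the per-query values over all queries.

```python
def top_k(run, k):
    """Return the k highest-scoring doc IDs from a {doc_id: score} mapping."""
    ranked = sorted(run.items(), key=lambda item: item[1], reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]


def precision_at_k(qrels, run, k, rel=1):
    """Fraction of the top k whose judged relevance is at least `rel`."""
    hits = sum(1 for doc_id in top_k(run, k) if qrels.get(doc_id, 0) >= rel)
    return hits / k


def judged_at_k(qrels, run, k):
    """Fraction of the top k that have any relevance judgment at all."""
    judged = sum(1 for doc_id in top_k(run, k) if doc_id in qrels)
    return judged / k


# Toy data for one query (made up for illustration):
# {doc_id: relevance} and {doc_id: score}
qrels = {"d1": 2, "d2": 1, "d3": 0}
run = {"d1": 9.0, "d4": 8.0, "d2": 7.0, "d3": 6.0, "d5": 5.0}

print(precision_at_k(qrels, run, k=5))         # d1 and d2 are relevant: 0.4
print(precision_at_k(qrels, run, k=5, rel=2))  # only d1 has rel >= 2: 0.2
print(judged_at_k(qrels, run, k=5))            # d1, d2, d3 are judged: 0.6
```

Unjudged documents (d4 and d5 here) count as non-relevant for precision, which is why Judged@k is useful to report alongside it.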

The ir-measures CLI accepts dataset IDs from ir_datasets directly in place of the qrels file.

ir_measures trec-robust04 path/to/run 'nDCG@10 P@5 P(rel=2)@5 Judged@10'

nDCG@10 0.3793
P@5     0.4185
P(rel=2)@5      0.0803
Judged@10       0.9628

Note that if a file with the same name as the dataset ID exists, the file will be used instead.
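
The precedence rule described above can be sketched as follows. This is a hypothetical helper written for illustration, not the actual code of the ir-measures CLI; the function name `resolve_qrels_source` and its return values are assumptions.

```python
import os


def resolve_qrels_source(arg):
    """Hypothetical helper: decide how a qrels argument would be interpreted.

    An existing file wins; otherwise the argument is treated as a dataset ID.
    """
    if os.path.exists(arg):
        return ("file", arg)
    return ("dataset", arg)


# The result depends on the current working directory: if a file named
# 'trec-robust04' exists there, it shadows the dataset of the same name.
print(resolve_qrels_source("trec-robust04"))
```

If the shadowing is unintended, renaming or moving the local file restores the dataset lookup.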

Alternatively, you can save the qrels output of the ir_datasets export command as a file and use this file as input to ir-measures.
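
A qrels file can also be written from Python. The sketch below assumes each qrel record exposes `query_id`, `doc_id`, and `relevance` attributes (plus an optional `iteration`), and writes the standard four-column TREC qrels layout. The function name `write_trec_qrels` and the stand-in `Qrel` tuple are hypothetical, used here so the example runs without downloading a dataset; with ir_datasets installed, the iterable would instead come from `qrels_iter()`.

```python
import os
import tempfile
from collections import namedtuple

# Stand-in record type (an assumption for illustration)
Qrel = namedtuple("Qrel", ["query_id", "doc_id", "relevance", "iteration"])


def write_trec_qrels(qrels, path):
    """Write qrels as 'query_id iteration doc_id relevance' lines.

    Returns the number of lines written.
    """
    count = 0
    with open(path, "w", encoding="utf-8") as out:
        for qrel in qrels:
            # Fall back to "0" when a record has no iteration field
            iteration = getattr(qrel, "iteration", "0")
            out.write(f"{qrel.query_id} {iteration} {qrel.doc_id} {qrel.relevance}\n")
            count += 1
    return count


# Made-up judgments for illustration
sample = [Qrel("q1", "doc-a", 1, "0"), Qrel("q1", "doc-b", 0, "0")]
path = os.path.join(tempfile.mkdtemp(), "sample.qrels")
print(write_trec_qrels(sample, path))  # 2
```

The resulting file can then be passed to ir-measures in place of the dataset ID.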
