ir_datasets: TREC Arabic
A collection of news articles in Arabic, used for multi-lingual evaluation in TREC 2001 and TREC 2002.
Document collection from LDC2001T55.
Language: ar
Example
import ir_datasets
dataset = ir_datasets.load('trec-arabic')
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text, marked_up_doc>
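To look at a handful of documents without walking the whole iterator, the loop can be capped with `itertools.islice`. The following is a minimal sketch, assuming records expose the `doc_id` and `text` fields shown in the namedtuple above. `Doc`, `sample_docs`, and `preview_docs` are hypothetical names introduced for illustration (they are not part of ir_datasets), and the sample records are made up so the snippet runs without the LDC data.

```python
from collections import namedtuple
from itertools import islice

# Hypothetical stand-in mirroring the documented fields; not an ir_datasets type.
Doc = namedtuple("Doc", ["doc_id", "text", "marked_up_doc"])

def preview_docs(docs, limit=2, width=40):
    """Return (doc_id, truncated text) pairs for the first `limit` documents."""
    return [(d.doc_id, d.text[:width]) for d in islice(docs, limit)]

# Made-up records for illustration only.
sample_docs = [
    Doc("D1", "خبر أول", "<DOC>placeholder 1</DOC>"),
    Doc("D2", "خبر ثان", "<DOC>placeholder 2</DOC>"),
    Doc("D3", "خبر ثالث", "<DOC>placeholder 3</DOC>"),
]

preview = preview_docs(iter(sample_docs), limit=2)
for doc_id, snippet in preview:
    print(doc_id, snippet)
```

With the real collection available, the same helper could presumably be called as `preview_docs(dataset.docs_iter())`.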
Arabic benchmark from TREC 2001.
Language: ar
Example
import ir_datasets
dataset = ir_datasets.load('trec-arabic/ar2001')
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, description, narrative>
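Each query carries three text fields, so a retrieval experiment has to pick which one to use as the query string. The sketch below shows one way to build a `query_id` to text mapping for a chosen field. `Query`, `sample_queries`, and `query_texts` are hypothetical names for illustration, and the records are made up so the snippet runs without the dataset.

```python
from collections import namedtuple

# Hypothetical stand-in mirroring the documented fields; not an ir_datasets type.
Query = namedtuple("Query", ["query_id", "title", "description", "narrative"])

def query_texts(queries, field="title"):
    """Map each query_id to the text of one field (title, description, or narrative)."""
    return {q.query_id: getattr(q, field) for q in queries}

# Made-up records for illustration only.
sample_queries = [
    Query("Q1", "عنوان أول", "وصف أول", "سرد أول"),
    Query("Q2", "عنوان ثان", "وصف ثان", "سرد ثان"),
]

titles = query_texts(sample_queries)
descriptions = query_texts(sample_queries, field="description")
```

With the real benchmark, `dataset.queries_iter()` would take the place of `sample_queries`.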
Language: ar
Example
import ir_datasets
dataset = ir_datasets.load('trec-arabic/ar2001')
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text, marked_up_doc>
Relevance levels
Rel. | Definition |
---|---|
0 | not relevant |
1 | relevant |
Example
import ir_datasets
dataset = ir_datasets.load('trec-arabic/ar2001')
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>
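Evaluation code usually needs the judgments grouped by query rather than as a flat stream. A minimal sketch of that grouping follows, assuming records with the `query_id`, `doc_id`, and `relevance` fields listed above. `Qrel`, `sample_qrels`, and `group_qrels` are hypothetical names for illustration, and the judgments are made up.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in mirroring the documented fields; not an ir_datasets type.
Qrel = namedtuple("Qrel", ["query_id", "doc_id", "relevance", "iteration"])

def group_qrels(qrels):
    """Group judgments as {query_id: {doc_id: relevance}}."""
    grouped = defaultdict(dict)
    for qrel in qrels:
        grouped[qrel.query_id][qrel.doc_id] = qrel.relevance
    return dict(grouped)

# Made-up judgments for illustration only.
sample_qrels = [
    Qrel("Q1", "D1", 1, "0"),
    Qrel("Q1", "D2", 0, "0"),
    Qrel("Q2", "D3", 1, "0"),
]

judgments = group_qrels(sample_qrels)
# Count the relevant documents (relevance 1) per query.
relevant_counts = {qid: sum(docs.values()) for qid, docs in judgments.items()}
```

With the real benchmark, `dataset.qrels_iter()` would take the place of `sample_qrels`.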
Arabic benchmark from TREC 2002.
Language: ar
Example
import ir_datasets
dataset = ir_datasets.load('trec-arabic/ar2002')
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, description, narrative>
Language: ar
Example
import ir_datasets
dataset = ir_datasets.load('trec-arabic/ar2002')
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text, marked_up_doc>
Relevance levels
Rel. | Definition |
---|---|
0 | not relevant |
1 | relevant |
Example
import ir_datasets
dataset = ir_datasets.load('trec-arabic/ar2002')
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>
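Since the relevance levels are binary (0 or 1), the judgments plug directly into set-based measures. As an illustration, here is a sketch of precision at k for a single query. The ranking and judgments are made up, `precision_at_k` is a hypothetical helper, and treating unjudged documents as not relevant is an assumption of this sketch rather than something the dataset prescribes.

```python
def precision_at_k(ranked_doc_ids, judged, k, min_rel=1):
    """Fraction of the top-k ranked documents judged relevant.

    `judged` maps doc_id to a relevance level; documents missing from it
    are treated as not relevant (an assumption of this sketch).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranked_doc_ids[:k] if judged.get(doc_id, 0) >= min_rel)
    return hits / k

# Made-up judgments and system ranking for one query.
judged = {"D1": 1, "D2": 0, "D3": 1}
ranking = ["D3", "D9", "D1", "D2"]

p_at_2 = precision_at_k(ranking, judged, k=2)  # D3 relevant, D9 unjudged
p_at_3 = precision_at_k(ranking, judged, k=3)  # D3 and D1 relevant
```

For real experiments, an established evaluation tool is likely a better choice than a hand-rolled metric; this only shows how the 0/1 levels are interpreted.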