ir_datasets: CORD-19
Collection of scientific articles related to COVID-19.
Uses the 2020-07-16 version of the dataset, corresponding to the "complete" collection used for TREC COVID.
Note that this version of the document collection provides only article metadata. To get the full text, use cord19/fulltext.
Version of the cord19 dataset that includes article full texts. This dataset takes longer to load than the version that includes only article metadata.
Language: en
Example
import ir_datasets
dataset = ir_datasets.load('cord19/fulltext')
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, title, doi, date, abstract, body>
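Because the full-text version takes longer to load, it can help to look at only the first few documents before iterating over the whole collection. The following is a minimal sketch under stated assumptions, not part of ir_datasets: preview_docs and FakeDataset are hypothetical names introduced for illustration, and body is assumed to be a string. The stand-in dataset lets the sketch run without downloading CORD-19.

```python
from collections import namedtuple
from itertools import islice


def preview_docs(dataset, n=3):
    """Return (doc_id, title, body length) for the first n documents.

    `dataset` is any object exposing docs_iter(), for example the result of
    ir_datasets.load('cord19/fulltext'). This helper is hypothetical and is
    not provided by ir_datasets.
    """
    # islice stops after n documents instead of walking the whole collection.
    # Assumption: doc.body is a string, so len() gives its character count.
    return [(doc.doc_id, doc.title, len(doc.body))
            for doc in islice(dataset.docs_iter(), n)]


# Stand-in data with the same fields as the namedtuple shown above.
Doc = namedtuple('Doc', ['doc_id', 'title', 'doi', 'date', 'abstract', 'body'])


class FakeDataset:
    """Hypothetical substitute for a loaded dataset, used only for this demo."""

    def docs_iter(self):
        yield Doc('d1', 'First title', '', '2020-01-01', 'abs', 'full text one')
        yield Doc('d2', 'Second title', '', '2020-02-01', 'abs', 'two')


if __name__ == '__main__':
    for row in preview_docs(FakeDataset(), n=2):
        print(row)
```

With ir_datasets installed, the same helper should accept the object returned by ir_datasets.load('cord19/fulltext') in place of FakeDataset().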
Version of the cord19/trec-covid dataset that includes article full texts. This dataset takes longer to load than the version that includes only article metadata.
Queries and qrels are the same as cord19/trec-covid; it just uses the extended documents from cord19/fulltext.
Language: en
Example
import ir_datasets
dataset = ir_datasets.load('cord19/fulltext/trec-covid')
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, description, narrative>
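Since the queries and qrels are shared with cord19/trec-covid, a natural next step is to combine the two, for instance by counting how many judged documents each query has. The sketch below assumes that qrels_iter() yields records with query_id, doc_id, relevance and iteration fields in the usual TREC style; those field names are an assumption here, as are the hypothetical names judged_counts and FakeTrecCovid. The stand-in dataset keeps the example runnable without any download.

```python
from collections import Counter, namedtuple


def judged_counts(dataset):
    """Map each query_id to the number of judged documents it has.

    `dataset` is any object exposing queries_iter() and qrels_iter().
    This helper is hypothetical and is not provided by ir_datasets.
    """
    counts = Counter(qrel.query_id for qrel in dataset.qrels_iter())
    # Report every query, including ones that have no judgments at all.
    return {q.query_id: counts.get(q.query_id, 0)
            for q in dataset.queries_iter()}


# Stand-in records; the qrel field names are assumed, not taken from the text.
Query = namedtuple('Query', ['query_id', 'title', 'description', 'narrative'])
Qrel = namedtuple('Qrel', ['query_id', 'doc_id', 'relevance', 'iteration'])


class FakeTrecCovid:
    """Hypothetical substitute for a loaded dataset, used only for this demo."""

    def queries_iter(self):
        yield Query('1', 'coronavirus origin', 'desc', 'narr')
        yield Query('2', 'coronavirus weather', 'desc', 'narr')

    def qrels_iter(self):
        yield Qrel('1', 'd1', 2, '1')
        yield Qrel('1', 'd2', 0, '1')


if __name__ == '__main__':
    print(judged_counts(FakeTrecCovid()))
```

Counting over qrels first and then looking up each query keeps the pass over the judgments to a single iteration, which matters when the judgment set is large.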
The TREC COVID collection. Queries related to COVID-19, including deep relevance judgments.