GitHub: datasets/clinicaltrials.py

ir_datasets: Clinical Trials

Index
  1. clinicaltrials
  2. clinicaltrials/2017
  3. clinicaltrials/2017/trec-pm-2017
  4. clinicaltrials/2017/trec-pm-2018
  5. clinicaltrials/2019
  6. clinicaltrials/2019/trec-pm-2019
  7. clinicaltrials/2021
  8. clinicaltrials/2021/trec-ct-2021
  9. clinicaltrials/2021/trec-ct-2022

"clinicaltrials"

Clinical trial records from ClinicalTrials.gov, used for the Clinical Trials subtasks of TREC Precision Medicine and for the TREC Clinical Trials track.


"clinicaltrials/2017"

A snapshot of ClinicalTrials.gov from April 2017 for use with the clinicaltrials/2017/trec-pm-2017 and clinicaltrials/2017/trec-pm-2018 Clinical Trials subtasks.

docs | Metadata
241K docs

Language: en

Document type:
ClinicalTrialsDoc: (namedtuple)
  1. doc_id: str
  2. title: str
  3. condition: str
  4. summary: str
  5. detailed_description: str
  6. eligibility: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("clinicaltrials/2017")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, title, condition, summary, detailed_description, eligibility>
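
A common next step is to flatten each document into a single string for indexing. The following is a minimal sketch of that idea, not part of ir_datasets: `doc_to_text` is a hypothetical helper, the namedtuple is a local stand-in that mirrors the fields listed above so the code runs without downloading the corpus, and the record is made up.

```python
from collections import namedtuple

# Hypothetical stand-in mirroring the ClinicalTrialsDoc fields listed above,
# so the sketch runs without downloading the corpus.
ClinicalTrialsDoc = namedtuple(
    "ClinicalTrialsDoc",
    ["doc_id", "title", "condition", "summary", "detailed_description", "eligibility"],
)

TEXT_FIELDS = ("title", "condition", "summary", "detailed_description", "eligibility")


def doc_to_text(doc, fields=TEXT_FIELDS):
    """Join the non-empty text fields of a document, one field per line."""
    parts = (getattr(doc, name).strip() for name in fields)
    return "\n".join(part for part in parts if part)


# Made-up record for illustration only (not taken from the corpus).
example = ClinicalTrialsDoc(
    doc_id="NCT00000000",
    title="Example Trial",
    condition="Example Condition",
    summary="  A short summary.  ",
    detailed_description="",
    eligibility="Adults aged 18 and over.",
)
text = doc_to_text(example)
```

With the real corpus, the same helper could be applied to each `doc` yielded by `dataset.docs_iter()`.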

More details about the Python API are available in the ir_datasets documentation.


"clinicaltrials/2017/trec-pm-2017"

The TREC 2017 Precision Medicine clinical trials subtask.

queries | docs | qrels | Citation | Metadata
30 queries

Language: en

Query type:
TrecPm2017Query: (namedtuple)
  1. query_id: str
  2. disease: str
  3. gene: str
  4. demographic: str
  5. other: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("clinicaltrials/2017/trec-pm-2017")
for query in dataset.queries_iter():
    query # namedtuple<query_id, disease, gene, demographic, other>
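
Because these queries are structured rather than free text, a retrieval system usually needs them turned into a single query string first. The sketch below shows one simple way to do that, assuming plain concatenation of the non-empty fields is acceptable. `query_to_text` is a hypothetical helper, the namedtuple is a local stand-in for the type listed above, and the topic values are made up.

```python
from collections import namedtuple

# Hypothetical stand-in mirroring the TrecPm2017Query fields listed above.
TrecPm2017Query = namedtuple(
    "TrecPm2017Query", ["query_id", "disease", "gene", "demographic", "other"]
)


def query_to_text(query, fields=("disease", "gene", "demographic", "other")):
    """Concatenate the non-empty structured fields into one keyword query."""
    # getattr with a default lets the helper accept query types lacking a field.
    values = (getattr(query, name, "") for name in fields)
    return " ".join(value.strip() for value in values if value and value.strip())


# Made-up topic for illustration only (not taken from the TREC topics).
example = TrecPm2017Query(
    query_id="1",
    disease="melanoma",
    gene="BRAF (V600E)",
    demographic="64-year-old female",
    other="",
)
text = query_to_text(example)
```

The empty `other` field is skipped, so the resulting string contains only the disease, gene, and demographic values.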

More details about the Python API are available in the ir_datasets documentation.


"clinicaltrials/2017/trec-pm-2018"

The TREC 2018 Precision Medicine clinical trials subtask.

queries | docs | qrels | Citation | Metadata
50 queries

Language: en

Query type:
TrecPmQuery: (namedtuple)
  1. query_id: str
  2. disease: str
  3. gene: str
  4. demographic: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("clinicaltrials/2017/trec-pm-2018")
for query in dataset.queries_iter():
    query # namedtuple<query_id, disease, gene, demographic>
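
The `demographic` field is a plain string, so matching it against trial eligibility criteria may require parsing it first. The sketch below assumes the value follows an "<age>-year-old <gender>" pattern; that format is an assumption and is not stated on this page, so the parser returns `None` for anything else. `parse_demographic` is a hypothetical helper, not part of ir_datasets.

```python
import re

# ASSUMPTION: demographic strings look like "64-year-old female".
# Values in any other shape are reported as unparseable (None).
_DEMOGRAPHIC = re.compile(r"^\s*(\d+)-year-old\s+(\w+)\s*$", re.IGNORECASE)


def parse_demographic(demographic):
    """Return (age, gender) from a demographic string, or None if it does not match."""
    match = _DEMOGRAPHIC.match(demographic)
    if match is None:
        return None
    return int(match.group(1)), match.group(2).lower()


# Made-up values for illustration only.
parsed = parse_demographic("64-year-old Female")
unparsed = parse_demographic("age not reported")
```

Returning `None` rather than raising keeps the decision about unparseable topics with the caller.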

More details about the Python API are available in the ir_datasets documentation.


"clinicaltrials/2019"

A snapshot of ClinicalTrials.gov from May 2019 for use with the clinicaltrials/2019/trec-pm-2019 Clinical Trials subtask.

docs | Metadata
306K docs

Language: en

Document type:
ClinicalTrialsDoc: (namedtuple)
  1. doc_id: str
  2. title: str
  3. condition: str
  4. summary: str
  5. detailed_description: str
  6. eligibility: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("clinicaltrials/2019")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, title, condition, summary, detailed_description, eligibility>

More details about the Python API are available in the ir_datasets documentation.


"clinicaltrials/2019/trec-pm-2019"

The TREC 2019 Precision Medicine clinical trials subtask.

queries | docs | qrels | Citation | Metadata
40 queries

Language: en

Query type:
TrecPmQuery: (namedtuple)
  1. query_id: str
  2. disease: str
  3. gene: str
  4. demographic: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("clinicaltrials/2019/trec-pm-2019")
for query in dataset.queries_iter():
    query # namedtuple<query_id, disease, gene, demographic>

More details about the Python API are available in the ir_datasets documentation.


"clinicaltrials/2021"

A snapshot of ClinicalTrials.gov from April 2021 for use with the clinicaltrials/2021/trec-ct-2021 and clinicaltrials/2021/trec-ct-2022 TREC Clinical Trials tracks.

docs | Metadata
376K docs

Language: en

Document type:
ClinicalTrialsDoc: (namedtuple)
  1. doc_id: str
  2. title: str
  3. condition: str
  4. summary: str
  5. detailed_description: str
  6. eligibility: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("clinicaltrials/2021")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, title, condition, summary, detailed_description, eligibility>
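
Since iteration yields one namedtuple per trial, a subset of the corpus can be selected with an ordinary generator. The sketch below filters on the `condition` field with a case-insensitive substring match, which is only one possible matching rule. `filter_by_condition` is a hypothetical helper, the namedtuple is a local stand-in for the type listed above, and the records are made up.

```python
from collections import namedtuple

# Hypothetical stand-in mirroring the ClinicalTrialsDoc fields listed above.
ClinicalTrialsDoc = namedtuple(
    "ClinicalTrialsDoc",
    ["doc_id", "title", "condition", "summary", "detailed_description", "eligibility"],
)


def filter_by_condition(docs, keyword):
    """Yield documents whose condition field contains keyword (case-insensitive)."""
    needle = keyword.lower()
    for doc in docs:
        if needle in doc.condition.lower():
            yield doc


# Made-up records for illustration only (not taken from the corpus).
docs = [
    ClinicalTrialsDoc("NCT00000001", "Trial A", "Breast Cancer", "", "", ""),
    ClinicalTrialsDoc("NCT00000002", "Trial B", "Type 2 Diabetes", "", "", ""),
]
matches = [doc.doc_id for doc in filter_by_condition(docs, "cancer")]
```

Because the helper is a generator, it could wrap `dataset.docs_iter()` without loading the whole snapshot into memory.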

More details about the Python API are available in the ir_datasets documentation.


"clinicaltrials/2021/trec-ct-2021"

The TREC Clinical Trials 2021 track.

queries | docs | qrels | Metadata
75 queries

Language: en

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("clinicaltrials/2021/trec-ct-2021")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
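
When joining queries with run files or relevance judgments, it is often convenient to look up query text by id. The sketch below builds such a lookup from any iterable of `(query_id, text)` namedtuples. `index_queries` is a hypothetical helper, the namedtuple is a local stand-in for the type listed above, and the topics are made up.

```python
from collections import namedtuple

# Hypothetical stand-in mirroring the GenericQuery fields listed above.
GenericQuery = namedtuple("GenericQuery", ["query_id", "text"])


def index_queries(queries):
    """Map query_id -> text, rejecting duplicate ids."""
    lookup = {}
    for query in queries:
        if query.query_id in lookup:
            raise ValueError(f"duplicate query_id: {query.query_id}")
        lookup[query.query_id] = query.text
    return lookup


# Made-up topics for illustration only (not taken from the TREC topics).
queries = [
    GenericQuery("1", "first made-up topic"),
    GenericQuery("2", "second made-up topic"),
]
lookup = index_queries(queries)
```

With the real dataset, `index_queries(dataset.queries_iter())` would be the corresponding call.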

More details about the Python API are available in the ir_datasets documentation.


"clinicaltrials/2021/trec-ct-2022"

The TREC Clinical Trials 2022 track.

queries | docs | Metadata
50 queries

Language: en

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("clinicaltrials/2021/trec-ct-2022")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

More details about the Python API are available in the ir_datasets documentation.