ir_datasets: KILT
KILT is a corpus used for various "knowledge intensive language tasks".
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("kilt")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, title, text, text_pieces, anchors, categories, wikidata_id, history_revid, history_timestamp, history_parentid, history_pageid, history_url>
You can find more details about the Python API here.
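As a rough illustration of consuming the document records, the sketch below previews the first few `(doc_id, title)` pairs from any iterable of documents. `preview_titles`, the reduced `Doc` stand-in (three of the fields listed above, assumed here to be strings), and the sample records are all hypothetical, so the snippet runs without downloading the corpus.

```python
from collections import namedtuple
from itertools import islice

# Hypothetical stand-in exposing three of the fields listed above, so the
# sketch runs without downloading the corpus.
Doc = namedtuple("Doc", ["doc_id", "title", "text"])


def preview_titles(docs, limit=3):
    """Return (doc_id, title) pairs for the first `limit` documents."""
    # islice stops early, so a lazy iterator is never consumed in full.
    return [(doc.doc_id, doc.title) for doc in islice(docs, limit)]


# Made-up records for illustration only; these are not real KILT data.
sample_docs = [
    Doc("d1", "First page", "Body of the first page."),
    Doc("d2", "Second page", "Body of the second page."),
    Doc("d3", "Third page", "Body of the third page."),
    Doc("d4", "Fourth page", "Body of the fourth page."),
]

preview = preview_titles(iter(sample_docs), limit=2)
print(preview)
```

With the real corpus, `dataset.docs_iter()` would presumably take the place of `iter(sample_docs)`.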
CODEC Entity Ranking sub-task.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("kilt/codec")
for query in dataset.queries_iter():
    query # namedtuple<query_id, query, narrative>
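One common pattern with query records is to index them by `query_id` for later lookup. The sketch below shows this under stated assumptions: `index_queries` is a hypothetical helper, `Query` is a stand-in mirroring the three fields listed above, and the sample records are made up rather than taken from CODEC.

```python
from collections import namedtuple

# Hypothetical stand-in mirroring the query fields listed above.
Query = namedtuple("Query", ["query_id", "query", "narrative"])


def index_queries(queries):
    """Map each query_id to its full query record."""
    return {q.query_id: q for q in queries}


# Made-up records for illustration only; these are not real CODEC topics.
sample_queries = [
    Query("q1", "example topic one", "What a relevant entity looks like for topic one."),
    Query("q2", "example topic two", "What a relevant entity looks like for topic two."),
]

lookup = index_queries(sample_queries)
print(lookup["q2"].query)
```

With the real dataset, `dataset.queries_iter()` would presumably be passed in place of `sample_queries`.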
Subset of kilt/codec that only contains topics about economics.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("kilt/codec/economics")
for query in dataset.queries_iter():
    query # namedtuple<query_id, query, narrative>
Subset of kilt/codec that only contains topics about history.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("kilt/codec/history")
for query in dataset.queries_iter():
    query # namedtuple<query_id, query, narrative>
Subset of kilt/codec that only contains topics about politics.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("kilt/codec/politics")
for query in dataset.queries_iter():
    query # namedtuple<query_id, query, narrative>
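Because the three topic subsets share the same query interface, they can be handled in one loop. The following is a minimal sketch that counts queries per subset. The loader is passed in as a parameter so the example runs offline: `count_queries`, `FakeDataset`, and `fake_load` are hypothetical names, and the counts they produce are made up, not the real subset sizes.

```python
from collections import namedtuple

# Dataset IDs taken from the examples above.
SUBSET_IDS = [
    "kilt/codec/economics",
    "kilt/codec/history",
    "kilt/codec/politics",
]


def count_queries(load, dataset_ids):
    """Return {dataset_id: number of queries}, opening each dataset via `load`."""
    counts = {}
    for dataset_id in dataset_ids:
        dataset = load(dataset_id)
        # Counting by iteration relies only on queries_iter(), shown above.
        counts[dataset_id] = sum(1 for _ in dataset.queries_iter())
    return counts


# Hypothetical stand-ins so the sketch runs without ir_datasets installed.
Query = namedtuple("Query", ["query_id", "query", "narrative"])


class FakeDataset:
    def __init__(self, queries):
        self._queries = queries

    def queries_iter(self):
        return iter(self._queries)


def fake_load(dataset_id):
    topic = dataset_id.rsplit("/", 1)[-1]
    # One made-up query per subset; not real CODEC data.
    return FakeDataset([Query(f"{topic}-1", f"made-up {topic} topic", "made-up narrative")])


counts = count_queries(fake_load, SUBSET_IDS)
print(counts)
```

For real data, `ir_datasets.load` would be passed as the `load` argument instead of `fake_load`.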