ir_datasets: ANTIQUE
"ANTIQUE is a non-factoid question answering dataset based on the questions and answers of Yahoo! Webscope L6."
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("antique")
for doc in dataset.docs_iter():
    doc  # namedtuple<doc_id, text>
You can find more details about the Python API here.
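As a minimal sketch of working with these records, the helper below collects the first few documents into a dictionary keyed by doc_id. The `Doc` namedtuple, the `peek_docs` function, and the sample data are hypothetical stand-ins introduced here so the snippet runs without downloading the dataset; the real records come from `dataset.docs_iter()` and expose the same `doc_id` and `text` fields.

```python
from collections import namedtuple
from itertools import islice

# Hypothetical stand-in for the records yielded by dataset.docs_iter();
# the real type is provided by ir_datasets and has the same two fields.
Doc = namedtuple("Doc", ["doc_id", "text"])


def peek_docs(docs, n=3):
    """Return up to the first n documents as a {doc_id: text} dict."""
    return {doc.doc_id: doc.text for doc in islice(docs, n)}


# Made-up sample data; these are not real ANTIQUE documents.
sample_docs = [
    Doc("d1", "first answer"),
    Doc("d2", "second answer"),
    Doc("d3", "third answer"),
    Doc("d4", "fourth answer"),
]

# islice stops after n items, so a large iterator is not fully consumed.
preview = peek_docs(iter(sample_docs), n=2)
print(preview)
```

With the dataset available locally, the same helper should accept `ir_datasets.load("antique").docs_iter()` in place of the sample list.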
Official test set of the ANTIQUE dataset.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("antique/test")
for query in dataset.queries_iter():
    query  # namedtuple<query_id, text>
antique/test without a set of queries deemed by the authors of ANTIQUE to be "offensive (and noisy)."
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("antique/test/non-offensive")
for query in dataset.queries_iter():
    query  # namedtuple<query_id, text>
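The relationship between antique/test and antique/test/non-offensive amounts to removing a fixed set of query IDs. The sketch below illustrates that kind of filter on made-up data. The `Query` namedtuple, the `drop_queries` helper, and the example IDs are all hypothetical; the actual list of excluded queries is defined by the ANTIQUE authors and is not reproduced here.

```python
from collections import namedtuple

# Hypothetical stand-in for the records yielded by dataset.queries_iter().
Query = namedtuple("Query", ["query_id", "text"])


def drop_queries(queries, excluded_ids):
    """Return the queries whose query_id is not in excluded_ids, in order."""
    excluded = set(excluded_ids)
    return [q for q in queries if q.query_id not in excluded]


# Made-up sample data; these are not real ANTIQUE query IDs.
test_queries = [
    Query("q1", "why is the sky blue?"),
    Query("q2", "how do magnets work?"),
    Query("q3", "what causes tides?"),
]

kept = drop_queries(test_queries, excluded_ids={"q2"})
print([q.query_id for q in kept])
```

In practice there is no need to filter by hand: loading "antique/test/non-offensive" directly yields the already filtered queries.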
Official train set of the ANTIQUE dataset.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("antique/train")
for query in dataset.queries_iter():
    query  # namedtuple<query_id, text>
antique/train without the 200 queries used by antique/train/split200-valid.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("antique/train/split200-train")
for query in dataset.queries_iter():
    query  # namedtuple<query_id, text>
A held-out subset of 200 queries from antique/train. Use in conjunction with antique/train/split200-train.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("antique/train/split200-valid")
for query in dataset.queries_iter():
    query  # namedtuple<query_id, text>
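The split200-train and split200-valid subsets described above partition antique/train into a training portion and 200 held-out queries. The following sketch shows that kind of partition on made-up data and checks that the two parts are disjoint and together cover the input. The `Query` namedtuple, the `split_by_ids` helper, and the IDs are hypothetical; how the 200 validation queries were actually chosen is not stated here.

```python
from collections import namedtuple

# Hypothetical stand-in for the records yielded by dataset.queries_iter().
Query = namedtuple("Query", ["query_id", "text"])


def split_by_ids(queries, valid_ids):
    """Partition queries into (train, valid) by membership in valid_ids."""
    valid_ids = set(valid_ids)
    train, valid = [], []
    for q in queries:
        if q.query_id in valid_ids:
            valid.append(q)
        else:
            train.append(q)
    return train, valid


# Made-up sample data; these are not real ANTIQUE query IDs.
all_train = [Query(str(i), f"question {i}") for i in range(10)]
train_part, valid_part = split_by_ids(all_train, valid_ids={"2", "5"})

train_ids = {q.query_id for q in train_part}
held_out_ids = {q.query_id for q in valid_part}

# The two parts share no queries and together cover the original set.
assert train_ids.isdisjoint(held_out_ids)
assert len(train_part) + len(valid_part) == len(all_train)
```

The same disjointness check could be run on the real subsets by building the ID sets from `ir_datasets.load("antique/train/split200-train").queries_iter()` and `ir_datasets.load("antique/train/split200-valid").queries_iter()`.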