ir_datasets: neuMARCO
A version of msmarco-passage for cross-language information retrieval, provided by JHU HLTCOE, with documents translated to other languages using a Sockeye 2 translation model.
The msmarco-passage corpus, translated to Persian (Farsi).
Language: fa
Examples:
import ir_datasets
dataset = ir_datasets.load("neumarco/fa")
for doc in dataset.docs_iter():
    doc  # namedtuple<doc_id, text>
You can find more details about the Python API here.
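The translated corpus is a full copy of the msmarco-passage collection, so it can help to inspect a handful of passages before iterating over everything. The following is a minimal sketch rather than part of ir_datasets: `preview_docs` is a hypothetical helper, and it assumes only that each document exposes the `doc_id` and `text` fields shown above. The sample records are placeholders so the sketch runs without downloading anything.

```python
from collections import namedtuple
from itertools import islice


def preview_docs(docs, n=3, width=60):
    """Return (doc_id, shortened text) pairs for the first n documents.

    `docs` is any iterable of objects with `doc_id` and `text` fields,
    such as the namedtuples yielded by dataset.docs_iter().
    """
    preview = []
    for doc in islice(docs, n):  # stop after n items without reading the rest
        text = doc.text if len(doc.text) <= width else doc.text[:width] + "..."
        preview.append((doc.doc_id, text))
    return preview


# Placeholder records (not real corpus content) standing in for docs_iter().
Doc = namedtuple("Doc", ["doc_id", "text"])
sample_docs = [Doc("0", "short text"), Doc("1", "x" * 100), Doc("2", "unused")]
sample_preview = preview_docs(sample_docs, n=2, width=10)
```

With the real corpus, the same helper would be called as `preview_docs(dataset.docs_iter())`.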
A version of msmarco-passage/dev, with the corpus translated to Persian (Farsi).
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("neumarco/fa/dev")
for query in dataset.queries_iter():
    query  # namedtuple<query_id, text>
A version of msmarco-passage/dev/judged, with the corpus translated to Persian (Farsi).
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("neumarco/fa/dev/judged")
for query in dataset.queries_iter():
    query  # namedtuple<query_id, text>
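To illustrate what restricting queries to those with relevance judgments can look like, here is a rough sketch. It is not how the judged subset was built. It assumes relevance records with `query_id`, `doc_id`, and `relevance` fields, which is what I would expect from `dataset.qrels_iter()` in ir_datasets, although that call is not shown in this section. The helper names and the sample records are hypothetical.

```python
from collections import defaultdict, namedtuple


def relevant_docs_by_query(qrels, min_relevance=1):
    """Map each query_id to the set of doc_ids judged relevant."""
    relevant = defaultdict(set)
    for qrel in qrels:
        if qrel.relevance >= min_relevance:
            relevant[qrel.query_id].add(qrel.doc_id)
    return dict(relevant)


def queries_with_relevant_docs(queries, relevant):
    """Keep only the queries that have at least one relevant document."""
    return {q.query_id: q.text for q in queries if q.query_id in relevant}


# Placeholder records (not real dataset content) so the sketch runs offline.
Query = namedtuple("Query", ["query_id", "text"])
Qrel = namedtuple("Qrel", ["query_id", "doc_id", "relevance"])
sample_queries = [Query("q1", "placeholder query one"),
                  Query("q2", "placeholder query two")]
sample_qrels = [Qrel("q1", "d7", 1), Qrel("q1", "d9", 1), Qrel("q2", "d3", 0)]
sample_relevant = relevant_docs_by_query(sample_qrels)
sample_kept = queries_with_relevant_docs(sample_queries, sample_relevant)
```

Note that the queries stay in English while the matching `doc_id` values point into the translated corpus, which is what makes the setup cross-language.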
A version of msmarco-passage/dev/small, with the corpus translated to Persian (Farsi).
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("neumarco/fa/dev/small")
for query in dataset.queries_iter():
    query  # namedtuple<query_id, text>
A version of msmarco-passage/train, with the corpus translated to Persian (Farsi).
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("neumarco/fa/train")
for query in dataset.queries_iter():
    query  # namedtuple<query_id, text>
A version of msmarco-passage/train/judged, with the corpus translated to Persian (Farsi).
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("neumarco/fa/train/judged")
for query in dataset.queries_iter():
    query  # namedtuple<query_id, text>
The msmarco-passage corpus, translated to Russian.
Language: ru
Examples:
import ir_datasets
dataset = ir_datasets.load("neumarco/ru")
for doc in dataset.docs_iter():
    doc  # namedtuple<doc_id, text>
A version of msmarco-passage/dev, with the corpus translated to Russian.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("neumarco/ru/dev")
for query in dataset.queries_iter():
    query  # namedtuple<query_id, text>
A version of msmarco-passage/dev/judged, with the corpus translated to Russian.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("neumarco/ru/dev/judged")
for query in dataset.queries_iter():
    query  # namedtuple<query_id, text>
A version of msmarco-passage/dev/small, with the corpus translated to Russian.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("neumarco/ru/dev/small")
for query in dataset.queries_iter():
    query  # namedtuple<query_id, text>
A version of msmarco-passage/train, with the corpus translated to Russian.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("neumarco/ru/train")
for query in dataset.queries_iter():
    query  # namedtuple<query_id, text>
A version of msmarco-passage/train/judged, with the corpus translated to Russian.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("neumarco/ru/train/judged")
for query in dataset.queries_iter():
    query  # namedtuple<query_id, text>
The msmarco-passage corpus, translated to Chinese.
Language: zh
Examples:
import ir_datasets
dataset = ir_datasets.load("neumarco/zh")
for doc in dataset.docs_iter():
    doc  # namedtuple<doc_id, text>
A version of msmarco-passage/dev, with the corpus translated to Chinese.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("neumarco/zh/dev")
for query in dataset.queries_iter():
    query  # namedtuple<query_id, text>
A version of msmarco-passage/dev/judged, with the corpus translated to Chinese.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("neumarco/zh/dev/judged")
for query in dataset.queries_iter():
    query  # namedtuple<query_id, text>
A version of msmarco-passage/dev/small, with the corpus translated to Chinese.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("neumarco/zh/dev/small")
for query in dataset.queries_iter():
    query  # namedtuple<query_id, text>
A version of msmarco-passage/train, with the corpus translated to Chinese.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("neumarco/zh/train")
for query in dataset.queries_iter():
    query  # namedtuple<query_id, text>
A version of msmarco-passage/train/judged, with the corpus translated to Chinese.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("neumarco/zh/train/judged")
for query in dataset.queries_iter():
    query  # namedtuple<query_id, text>
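The dataset IDs on this page follow a regular pattern: a language code (`fa`, `ru`, `zh`) optionally followed by a subset path. As a small sketch of that pattern, the hypothetical helper `neumarco_ids` below enumerates the IDs listed above. It does not query ir_datasets for what is registered, so treat the list as a reading of this page rather than an authoritative registry.

```python
LANGUAGES = ("fa", "ru", "zh")
SUBSETS = ("dev", "dev/judged", "dev/small", "train", "train/judged")


def neumarco_ids(languages=LANGUAGES, subsets=SUBSETS):
    """Build the neuMARCO dataset IDs: one corpus ID per language plus its subsets."""
    ids = []
    for lang in languages:
        ids.append(f"neumarco/{lang}")  # the translated corpus itself
        ids.extend(f"neumarco/{lang}/{subset}" for subset in subsets)
    return ids


all_ids = neumarco_ids()
```

Each resulting string can be passed to `ir_datasets.load(...)` in the same way as in the examples above.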