ir_datasets: mMARCO
A version of the MS MARCO passage dataset (msmarco-passage) with the queries and documents automatically translated into several languages.
Bibtex:
@article{Bonifacio2021MMarco,
  title={{mMARCO}: A Multilingual Version of {MS MARCO} Passage Ranking Dataset},
  author={Luiz Henrique Bonifacio and Israel Campiotti and Roberto Lotufo and Rodrigo Nogueira},
  year={2021},
  journal={arXiv:2108.13897}
}
Version of msmarco-passage, with documents translated into German.
Language: de
Examples:
import ir_datasets
dataset = ir_datasets.load("mmarco/de")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>
You can find more details about the Python API here.
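Iterating the whole passage collection can take a while, so it is often useful to look at only the first few records. The following is a minimal sketch, not part of the ir_datasets documentation: `head` and `preview_docs` are hypothetical helper names, and the `ir_datasets` import is kept inside the function so the snippet loads even where the package is not installed.

```python
from itertools import islice


def head(iterable, n=3):
    """Return the first n items of any iterable as a list."""
    return list(islice(iterable, n))


def preview_docs(dataset_id="mmarco/de", n=3):
    """Print the first n documents of a dataset (hypothetical helper).

    Assumes ir_datasets is installed; calling this may trigger a
    dataset download on first use.
    """
    import ir_datasets

    dataset = ir_datasets.load(dataset_id)
    for doc in head(dataset.docs_iter(), n):
        # Each doc is a namedtuple<doc_id, text>, as in the example above.
        print(doc.doc_id, doc.text[:80])
```

Calling `preview_docs("mmarco/de")` would print the IDs and the first 80 characters of the first three German passages.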
Version of msmarco-passage/dev, with queries and documents translated into German.
Language: de
Examples:
import ir_datasets
dataset = ir_datasets.load("mmarco/de/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
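Since each query is a namedtuple<query_id, text>, the iterator can be collected into a lookup table when queries need to be fetched by ID. This is a small sketch under that assumption: `queries_by_id` is a hypothetical helper, and the sample records below are made up for illustration rather than taken from the dataset.

```python
from collections import namedtuple


def queries_by_id(queries):
    """Map query_id -> text for any iterable of records with those two fields."""
    return {q.query_id: q.text for q in queries}


# Stand-in records with the same fields as namedtuple<query_id, text>;
# the IDs and texts are invented placeholders, not real mMARCO queries.
Query = namedtuple("Query", ["query_id", "text"])
sample = [Query("1", "erste beispielanfrage"), Query("2", "zweite beispielanfrage")]

lookup = queries_by_id(sample)
print(lookup["2"])
```

With the real data, the same helper could be applied as `queries_by_id(dataset.queries_iter())`.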
Version of msmarco-passage/train, with queries and documents translated into German.
Language: de
Examples:
import ir_datasets
dataset = ir_datasets.load("mmarco/de/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
Version of msmarco-passage, with documents translated into Spanish.
Language: es
Examples:
import ir_datasets
dataset = ir_datasets.load("mmarco/es")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>
Version of msmarco-passage/dev, with queries and documents translated into Spanish.
Language: es
Examples:
import ir_datasets
dataset = ir_datasets.load("mmarco/es/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
Version of msmarco-passage/train, with queries and documents translated into Spanish.
Language: es
Examples:
import ir_datasets
dataset = ir_datasets.load("mmarco/es/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
Version of msmarco-passage, with documents translated into French.
Language: fr
Examples:
import ir_datasets
dataset = ir_datasets.load("mmarco/fr")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>
Version of msmarco-passage/dev, with queries and documents translated into French.
Language: fr
Examples:
import ir_datasets
dataset = ir_datasets.load("mmarco/fr/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
Version of msmarco-passage/train, with queries and documents translated into French.
Language: fr
Examples:
import ir_datasets
dataset = ir_datasets.load("mmarco/fr/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
Version of msmarco-passage, with documents translated into Indonesian.
Language: id
Examples:
import ir_datasets
dataset = ir_datasets.load("mmarco/id")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>
Version of msmarco-passage/dev, with queries and documents translated into Indonesian.
Language: id
Examples:
import ir_datasets
dataset = ir_datasets.load("mmarco/id/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
Version of msmarco-passage/train, with queries and documents translated into Indonesian.
Language: id
Examples:
import ir_datasets
dataset = ir_datasets.load("mmarco/id/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
Version of msmarco-passage, with documents translated into Italian.
Language: it
Examples:
import ir_datasets
dataset = ir_datasets.load("mmarco/it")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>
Version of msmarco-passage/dev, with queries and documents translated into Italian.
Language: it
Examples:
import ir_datasets
dataset = ir_datasets.load("mmarco/it/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
Version of msmarco-passage/train, with queries and documents translated into Italian.
Language: it
Examples:
import ir_datasets
dataset = ir_datasets.load("mmarco/it/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
Version of msmarco-passage, with documents translated into Portuguese.
Language: pt
Examples:
import ir_datasets
dataset = ir_datasets.load("mmarco/pt")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>
Version of msmarco-passage/dev, with queries and documents translated into Portuguese.
Language: pt
Examples:
import ir_datasets
dataset = ir_datasets.load("mmarco/pt/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
Version of msmarco-passage/train, with queries and documents translated into Portuguese.
Language: pt
Examples:
import ir_datasets
dataset = ir_datasets.load("mmarco/pt/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
Version of msmarco-passage, with documents translated into Russian.
Language: ru
Examples:
import ir_datasets
dataset = ir_datasets.load("mmarco/ru")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>
Version of msmarco-passage/dev, with queries and documents translated into Russian.
Language: ru
Examples:
import ir_datasets
dataset = ir_datasets.load("mmarco/ru/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
Version of msmarco-passage/train, with queries and documents translated into Russian.
Language: ru
Examples:
import ir_datasets
dataset = ir_datasets.load("mmarco/ru/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
Version of msmarco-passage, with documents translated into Chinese.
Language: zh
Examples:
import ir_datasets
dataset = ir_datasets.load("mmarco/zh")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>
Version of msmarco-passage/dev, with queries and documents translated into Chinese.
Language: zh
Examples:
import ir_datasets
dataset = ir_datasets.load("mmarco/zh/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
Version of msmarco-passage/train, with queries and documents translated into Chinese.
Language: zh
Examples:
import ir_datasets
dataset = ir_datasets.load("mmarco/zh/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
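The dataset identifiers used on this page follow a regular pattern: `mmarco/<lang>` for the translated documents and `mmarco/<lang>/<split>` for the dev and train queries. As a rough sketch, the identifiers can be generated rather than typed out. The names `LANGUAGES`, `SPLITS`, `mmarco_id`, and `all_mmarco_ids` are hypothetical, and the lists cover only the language codes and splits that appear in this section; the library may offer others.

```python
# Language codes and splits taken from the entries listed in this section.
LANGUAGES = ("de", "es", "fr", "id", "it", "pt", "ru", "zh")
SPLITS = ("dev", "train")


def mmarco_id(lang, split=None):
    """Build a dataset ID such as "mmarco/de" or "mmarco/de/dev"."""
    if lang not in LANGUAGES:
        raise ValueError(f"unknown language code: {lang!r}")
    if split is None:
        return f"mmarco/{lang}"
    if split not in SPLITS:
        raise ValueError(f"unknown split: {split!r}")
    return f"mmarco/{lang}/{split}"


def all_mmarco_ids():
    """List every ID covered here: one base ID plus one per split, per language."""
    ids = []
    for lang in LANGUAGES:
        ids.append(mmarco_id(lang))
        ids.extend(mmarco_id(lang, split) for split in SPLITS)
    return ids
```

Each resulting string can be passed to `ir_datasets.load(...)` exactly as in the examples above, for instance `ir_datasets.load(mmarco_id("pt", "train"))`.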