GitHub: datasets/mmarco.py

ir_datasets: mMARCO

Index
  1. mmarco
  2. mmarco/de
  3. mmarco/de/dev
  4. mmarco/de/dev/small
  5. mmarco/de/train
  6. mmarco/es
  7. mmarco/es/dev
  8. mmarco/es/dev/small
  9. mmarco/es/train
  10. mmarco/fr
  11. mmarco/fr/dev
  12. mmarco/fr/dev/small
  13. mmarco/fr/train
  14. mmarco/id
  15. mmarco/id/dev
  16. mmarco/id/dev/small
  17. mmarco/id/train
  18. mmarco/it
  19. mmarco/it/dev
  20. mmarco/it/dev/small
  21. mmarco/it/train
  22. mmarco/pt
  23. mmarco/pt/dev
  24. mmarco/pt/dev/small
  25. mmarco/pt/dev/small/v1.1
  26. mmarco/pt/dev/v1.1
  27. mmarco/pt/train
  28. mmarco/pt/train/v1.1
  29. mmarco/ru
  30. mmarco/ru/dev
  31. mmarco/ru/dev/small
  32. mmarco/ru/train
  33. mmarco/v2/ar
  34. mmarco/v2/ar/dev
  35. mmarco/v2/ar/dev/small
  36. mmarco/v2/ar/train
  37. mmarco/v2/de
  38. mmarco/v2/de/dev
  39. mmarco/v2/de/dev/small
  40. mmarco/v2/de/train
  41. mmarco/v2/dt
  42. mmarco/v2/dt/dev
  43. mmarco/v2/dt/dev/small
  44. mmarco/v2/dt/train
  45. mmarco/v2/es
  46. mmarco/v2/es/dev
  47. mmarco/v2/es/dev/small
  48. mmarco/v2/es/train
  49. mmarco/v2/fr
  50. mmarco/v2/fr/dev
  51. mmarco/v2/fr/dev/small
  52. mmarco/v2/fr/train
  53. mmarco/v2/hi
  54. mmarco/v2/hi/dev
  55. mmarco/v2/hi/dev/small
  56. mmarco/v2/hi/train
  57. mmarco/v2/id
  58. mmarco/v2/id/dev
  59. mmarco/v2/id/dev/small
  60. mmarco/v2/id/train
  61. mmarco/v2/it
  62. mmarco/v2/it/dev
  63. mmarco/v2/it/dev/small
  64. mmarco/v2/it/train
  65. mmarco/v2/ja
  66. mmarco/v2/ja/dev
  67. mmarco/v2/ja/dev/small
  68. mmarco/v2/ja/train
  69. mmarco/v2/pt
  70. mmarco/v2/pt/dev
  71. mmarco/v2/pt/dev/small
  72. mmarco/v2/pt/train
  73. mmarco/v2/ru
  74. mmarco/v2/ru/dev
  75. mmarco/v2/ru/dev/small
  76. mmarco/v2/ru/train
  77. mmarco/v2/vi
  78. mmarco/v2/vi/dev
  79. mmarco/v2/vi/dev/small
  80. mmarco/v2/vi/train
  81. mmarco/v2/zh
  82. mmarco/v2/zh/dev
  83. mmarco/v2/zh/dev/small
  84. mmarco/v2/zh/train
  85. mmarco/zh
  86. mmarco/zh/dev
  87. mmarco/zh/dev/small
  88. mmarco/zh/dev/small/v1.1
  89. mmarco/zh/dev/v1.1
  90. mmarco/zh/train
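
The IDs above follow a regular pattern: `mmarco`, an optional `v2` segment, a language code, and an optional subset path such as `dev/small` or `train/v1.1`. A minimal sketch of splitting an ID into those parts, assuming the pattern holds for every entry; `MMarcoId` and `parse_mmarco_id` are hypothetical helpers, not part of ir_datasets:

```python
from typing import NamedTuple, Optional


class MMarcoId(NamedTuple):
    """Parts of an mMARCO dataset ID (hypothetical helper type)."""
    version: Optional[str]   # "v2" when the ID has that segment, else None
    language: Optional[str]  # e.g. "de", "pt"; None for the bare "mmarco" ID
    subset: Optional[str]    # e.g. "dev/small", "train/v1.1"; None for docs-only IDs


def parse_mmarco_id(dataset_id: str) -> MMarcoId:
    parts = dataset_id.split("/")
    if parts[0] != "mmarco":
        raise ValueError(f"not an mMARCO dataset ID: {dataset_id!r}")
    rest = parts[1:]
    version = None
    if rest and rest[0] == "v2":
        version, rest = "v2", rest[1:]
    language = rest[0] if rest else None
    subset = "/".join(rest[1:]) or None
    return MMarcoId(version, language, subset)
```

Note that the trailing `v1.1` segment is kept as part of the subset path, since it marks a corrected file rather than the `v2` translation.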

"mmarco"

A version of the MS MARCO passage dataset (msmarco-passage) with the queries and documents automatically translated into several languages.

  • Documents: Short passages (from web), translated from English
  • Queries: Natural language questions (from query log), translated from English
  • Repository
  • Dataset Paper

Citation

ir_datasets.bib:

\cite{Bonifacio2021MMarco}

Bibtex:

@article{Bonifacio2021MMarco,
  title={{mMARCO}: A Multilingual Version of {MS MARCO} Passage Ranking Dataset},
  author={Luiz Henrique Bonifacio and Israel Campiotti and Roberto Lotufo and Rodrigo Nogueira},
  year={2021},
  journal={arXiv:2108.13897}
}

"mmarco/de"

Version of msmarco-passage, with documents translated into German.

docs | Citation | Metadata
8.8M docs

Language: de

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/de")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.
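
Because each record is a namedtuple with the fields `doc_id` and `text`, it can be read by attribute name or unpacked positionally. A minimal sketch of previewing the first few records; it uses a stand-in type built with `collections.namedtuple` and invented sample text so that it runs without downloading the 8.8M-passage corpus, and `preview` is a hypothetical helper:

```python
from collections import namedtuple
from itertools import islice

# Stand-in with the same fields as the GenericDoc type listed above.
GenericDoc = namedtuple("GenericDoc", ["doc_id", "text"])


def preview(docs, n=2, width=40):
    """Return (doc_id, truncated text) for the first n records of a docs iterator."""
    return [(d.doc_id, d.text[:width]) for d in islice(docs, n)]


# Invented sample records, for illustration only.
sample = iter([
    GenericDoc("0", "Beispieltext eins"),
    GenericDoc("1", "Beispieltext zwei"),
    GenericDoc("2", "Beispieltext drei"),
])
rows = preview(sample)
```

With real data, `dataset.docs_iter()` would take the place of `sample`.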


"mmarco/de/dev"

Version of msmarco-passage/dev, with queries and documents translated into German.

queries | docs | qrels | Citation | Metadata
101K queries

Language: de

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/de/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.
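
This subset also lists qrels, although the page does not show their record type. A minimal sketch of building a lookup from query to relevant passages, assuming TREC-style records with `query_id`, `doc_id` and `relevance` fields; the `Qrel` stand-in, the `relevant_docs` helper and the sample values are all hypothetical:

```python
from collections import defaultdict, namedtuple

# Assumed record shape; the page itself does not list the qrels fields.
Qrel = namedtuple("Qrel", ["query_id", "doc_id", "relevance"])


def relevant_docs(qrels, min_relevance=1):
    """Map each query_id to the set of doc_ids judged at or above min_relevance."""
    by_query = defaultdict(set)
    for q in qrels:
        if q.relevance >= min_relevance:
            by_query[q.query_id].add(q.doc_id)
    return dict(by_query)


# Invented sample judgments, for illustration only.
sample_qrels = [Qrel("q1", "d3", 1), Qrel("q1", "d7", 1), Qrel("q2", "d5", 0)]
lookup = relevant_docs(sample_qrels)
```

Queries with no judgment at or above the threshold are simply absent from the result.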


"mmarco/de/dev/small"

Version of msmarco-passage/dev/small, with queries and documents translated into German.

queries | docs | qrels | scoreddocs | Citation | Metadata
7.0K queries

Language: de

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/de/dev/small")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.
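
This subset additionally lists scoreddocs. A minimal sketch of turning such records into a ranked list per query, assuming each record carries `query_id`, `doc_id` and a numeric `score`; the page does not show these fields, so the `ScoredDoc` stand-in, the `top_k_per_query` helper and the sample values are hypothetical:

```python
from collections import defaultdict, namedtuple

# Assumed record shape; the page itself does not list the scoreddocs fields.
ScoredDoc = namedtuple("ScoredDoc", ["query_id", "doc_id", "score"])


def top_k_per_query(scoreddocs, k=2):
    """Return the k highest-scoring doc_ids for each query_id."""
    grouped = defaultdict(list)
    for sd in scoreddocs:
        grouped[sd.query_id].append(sd)
    return {
        qid: [sd.doc_id for sd in sorted(items, key=lambda s: s.score, reverse=True)[:k]]
        for qid, items in grouped.items()
    }


# Invented sample scores, for illustration only.
sample_scored = [
    ScoredDoc("q1", "d1", 3.2),
    ScoredDoc("q1", "d2", 7.5),
    ScoredDoc("q1", "d3", 5.0),
    ScoredDoc("q2", "d9", 1.0),
]
ranking = top_k_per_query(sample_scored)
```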


"mmarco/de/train"

Version of msmarco-passage/train, with queries and documents translated into German.

queries | docs | qrels | docpairs | Citation | Metadata
809K queries

Language: de

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/de/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.
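
The train subset also lists docpairs. A minimal sketch of resolving such pairs into text triples for training, assuming each pair has `query_id`, `doc_id_a` and `doc_id_b` fields; the page does not show these fields, so the `DocPair` stand-in, the `to_text_triples` helper and the sample values are hypothetical, and which of the two passages is the preferred one should be confirmed before use:

```python
from collections import namedtuple

# Assumed record shape; the page itself does not list the docpairs fields.
DocPair = namedtuple("DocPair", ["query_id", "doc_id_a", "doc_id_b"])


def to_text_triples(docpairs, query_text, doc_text):
    """Yield (query text, text of doc a, text of doc b) for each pair."""
    for pair in docpairs:
        yield (
            query_text[pair.query_id],
            doc_text[pair.doc_id_a],
            doc_text[pair.doc_id_b],
        )


# Invented lookups and pairs, for illustration only.
query_text = {"q1": "was ist ein datensatz"}
doc_text = {"d1": "Text von Passage eins", "d2": "Text von Passage zwei"}
triples = list(to_text_triples([DocPair("q1", "d1", "d2")], query_text, doc_text))
```

Plain dictionaries are used here for brevity; with 8.8M passages a disk-backed lookup would be more practical.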


"mmarco/es"

Version of msmarco-passage, with documents translated into Spanish.

docs | Citation | Metadata
8.8M docs

Language: es

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/es")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.


"mmarco/es/dev"

Version of msmarco-passage/dev, with queries and documents translated into Spanish.

queries | docs | qrels | Citation | Metadata
101K queries

Language: es

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/es/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.
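
Since every language subset is a translation of the same msmarco-passage data, query IDs presumably line up across languages; that is an assumption here, not something the page states. Under that assumption, a minimal sketch of pairing the Spanish and German versions of each query by `query_id`, with a stand-in record type, a hypothetical `align_by_id` helper and invented sample text:

```python
from collections import namedtuple

# Stand-in with the same fields as the GenericQuery type listed above.
GenericQuery = namedtuple("GenericQuery", ["query_id", "text"])


def align_by_id(queries_a, queries_b):
    """Map query_id to (text in a, text in b) for IDs present in both iterables."""
    text_b = {q.query_id: q.text for q in queries_b}
    return {
        q.query_id: (q.text, text_b[q.query_id])
        for q in queries_a
        if q.query_id in text_b
    }


# Invented sample queries, for illustration only.
spanish = [GenericQuery("1", "consulta uno"), GenericQuery("2", "consulta dos")]
german = [GenericQuery("1", "Anfrage eins")]
aligned = align_by_id(spanish, german)
```

With real data, the two arguments would be the `queries_iter()` results of `mmarco/es/dev` and `mmarco/de/dev`.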


"mmarco/es/dev/small"

Version of msmarco-passage/dev/small, with queries and documents translated into Spanish.

queries | docs | qrels | scoreddocs | Citation | Metadata
7.0K queries

Language: es

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/es/dev/small")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/es/train"

Version of msmarco-passage/train, with queries and documents translated into Spanish.

queries | docs | qrels | docpairs | Citation | Metadata
809K queries

Language: es

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/es/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/fr"

Version of msmarco-passage, with documents translated into French.

docs | Citation | Metadata
8.8M docs

Language: fr

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/fr")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.


"mmarco/fr/dev"

Version of msmarco-passage/dev, with queries and documents translated into French.

queries | docs | qrels | Citation | Metadata
101K queries

Language: fr

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/fr/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/fr/dev/small"

Version of msmarco-passage/dev/small, with queries and documents translated into French.

queries | docs | qrels | scoreddocs | Citation | Metadata
7.0K queries

Language: fr

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/fr/dev/small")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/fr/train"

Version of msmarco-passage/train, with queries and documents translated into French.

queries | docs | qrels | docpairs | Citation | Metadata
809K queries

Language: fr

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/fr/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/id"

Version of msmarco-passage, with documents translated into Indonesian.

docs | Citation | Metadata
8.8M docs

Language: id

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/id")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.


"mmarco/id/dev"

Version of msmarco-passage/dev, with queries and documents translated into Indonesian.

queries | docs | qrels | Citation | Metadata
101K queries

Language: id

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/id/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/id/dev/small"

Version of msmarco-passage/dev/small, with queries and documents translated into Indonesian.

queries | docs | qrels | scoreddocs | Citation | Metadata
7.0K queries

Language: id

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/id/dev/small")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/id/train"

Version of msmarco-passage/train, with queries and documents translated into Indonesian.

queries | docs | qrels | docpairs | Citation | Metadata
809K queries

Language: id

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/id/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/it"

Version of msmarco-passage, with documents translated into Italian.

docs | Citation | Metadata
8.8M docs

Language: it

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/it")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.


"mmarco/it/dev"

Version of msmarco-passage/dev, with queries and documents translated into Italian.

queries | docs | qrels | Citation | Metadata
101K queries

Language: it

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/it/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/it/dev/small"

Version of msmarco-passage/dev/small, with queries and documents translated into Italian.

queries | docs | qrels | scoreddocs | Citation | Metadata
7.0K queries

Language: it

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/it/dev/small")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/it/train"

Version of msmarco-passage/train, with queries and documents translated into Italian.

queries | docs | qrels | docpairs | Citation | Metadata
809K queries

Language: it

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/it/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/pt"

Version of msmarco-passage, with documents translated into Portuguese.

docs | Citation | Metadata
8.8M docs

Language: pt

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/pt")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.


"mmarco/pt/dev"

Version of msmarco-passage/dev, with queries and documents translated into Portuguese.

queries | docs | qrels | Citation | Metadata
102K queries

Language: pt

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/pt/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/pt/dev/small"

Version of msmarco-passage/dev/small, with queries and documents translated into Portuguese.

queries | docs | qrels | Citation | Metadata
7.0K queries

Language: pt

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/pt/dev/small")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/pt/dev/small/v1.1"

Version of msmarco-passage/dev/small, with queries and documents translated into Portuguese.

Version 1.1 of this file includes manual corrections from the authors of the translated files. See discussion here. It also removes some duplicated query IDs.

queries | docs | qrels | scoreddocs | Citation | Metadata
7.0K queries

Language: pt

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/pt/dev/small/v1.1")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/pt/dev/v1.1"

Version of msmarco-passage/dev, with queries and documents translated into Portuguese.

Version 1.1 of this file includes manual corrections from the authors of the translated files. See discussion here. It also removes some duplicated query IDs.

queries | docs | qrels | Citation | Metadata
101K queries

Language: pt

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/pt/dev/v1.1")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.
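
The v1.1 note mentions removed duplicate query IDs, which fits the counts on this page (102K queries for mmarco/pt/dev, 101K for v1.1). A minimal sketch of checking a queries iterator for repeated IDs, using a stand-in record type, a hypothetical `duplicated_query_ids` helper and invented sample data:

```python
from collections import Counter, namedtuple

# Stand-in with the same fields as the GenericQuery type listed above.
GenericQuery = namedtuple("GenericQuery", ["query_id", "text"])


def duplicated_query_ids(queries):
    """Return the sorted query_ids that occur more than once."""
    counts = Counter(q.query_id for q in queries)
    return sorted(qid for qid, n in counts.items() if n > 1)


# Invented sample queries, for illustration only.
sample_queries = [
    GenericQuery("1", "consulta um"),
    GenericQuery("2", "consulta dois"),
    GenericQuery("2", "consulta dois, repetida"),
]
dupes = duplicated_query_ids(sample_queries)
```

With real data, `dataset.queries_iter()` would take the place of `sample_queries`.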


"mmarco/pt/train"

Version of msmarco-passage/train, with queries and documents translated into Portuguese.

queries | docs | qrels | docpairs | Citation | Metadata
812K queries

Language: pt

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/pt/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/pt/train/v1.1"

Version of msmarco-passage/train, with queries and documents translated into Portuguese.

Version 1.1 of this file includes manual corrections from the authors of the translated files. See discussion here. It also removes some duplicated query IDs.

queries | docs | qrels | docpairs | Citation | Metadata
809K queries

Language: pt

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/pt/train/v1.1")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/ru"

Version of msmarco-passage, with documents translated into Russian.

docs | Citation | Metadata
8.8M docs

Language: ru

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/ru")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.


"mmarco/ru/dev"

Version of msmarco-passage/dev, with queries and documents translated into Russian.

queries | docs | qrels | Citation | Metadata
101K queries

Language: ru

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/ru/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/ru/dev/small"

Version of msmarco-passage/dev/small, with queries and documents translated into Russian.

queries | docs | qrels | scoreddocs | Citation | Metadata
7.0K queries

Language: ru

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/ru/dev/small")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/ru/train"

Version of msmarco-passage/train, with queries and documents translated into Russian.

queries | docs | qrels | docpairs | Citation | Metadata
809K queries

Language: ru

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/ru/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/ar"

Version of msmarco-passage, with queries and documents translated into Arabic.

docs | Citation | Metadata
8.8M docs

Language: ar

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/ar")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.


"mmarco/v2/ar/dev"

Version of msmarco-passage/dev, with queries and documents translated into Arabic.

queries | docs | qrels | Citation | Metadata
101K queries

Language: ar

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/ar/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/ar/dev/small"

Version of msmarco-passage/dev/small, with queries and documents translated into Arabic.

queries | docs | qrels | scoreddocs | Citation | Metadata
7.0K queries

Language: ar

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/ar/dev/small")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/ar/train"

Version of msmarco-passage/train, with queries and documents translated into Arabic.

queries | docs | qrels | docpairs | Citation | Metadata
809K queries

Language: ar

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/ar/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/de"

Version of msmarco-passage, with queries and documents translated into German.

docs | Citation | Metadata
8.8M docs

Language: de

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/de")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.


"mmarco/v2/de/dev"

Version of msmarco-passage/dev, with queries and documents translated into German.

queries | docs | qrels | Citation | Metadata
101K queries

Language: de

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/de/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/de/dev/small"

Version of msmarco-passage/dev/small, with queries and documents translated into German.

queries | docs | qrels | scoreddocs | Citation | Metadata
7.0K queries

Language: de

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/de/dev/small")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/de/train"

Version of msmarco-passage/train, with queries and documents translated into German.

queries | docs | qrels | docpairs | Citation | Metadata
809K queries

Language: de

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/de/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/dt"

Version of msmarco-passage, with queries and documents translated into Dutch.

docs | Citation | Metadata
8.8M docs

Language: dt

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/dt")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.
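
The dataset labels Dutch as `dt`, whereas the ISO 639-1 code for Dutch is `nl`; the other language codes shown on this page match ISO 639-1. A minimal sketch of normalizing the dataset's code before handing it to language-aware tooling such as a tokenizer or stemmer; `ISO_OVERRIDES` and `to_iso_639_1` are hypothetical names, and the ID must keep `dt` when loading the dataset:

```python
# Dataset language codes that differ from ISO 639-1 (only Dutch, among the
# codes shown on this page).
ISO_OVERRIDES = {"dt": "nl"}


def to_iso_639_1(dataset_lang: str) -> str:
    """Translate an mMARCO language code to its ISO 639-1 equivalent."""
    return ISO_OVERRIDES.get(dataset_lang, dataset_lang)


dutch = to_iso_639_1("dt")
german = to_iso_639_1("de")
```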


"mmarco/v2/dt/dev"

Version of msmarco-passage/dev, with queries and documents translated into Dutch.

queries | docs | qrels | Citation | Metadata
101K queries

Language: dt

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/dt/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/dt/dev/small"

Version of msmarco-passage/dev/small, with queries and documents translated into Dutch.

queries | docs | qrels | scoreddocs | Citation | Metadata
7.0K queries

Language: dt

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/dt/dev/small")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/dt/train"

Version of msmarco-passage/train, with queries and documents translated into Dutch.

queries | docs | qrels | docpairs | Citation | Metadata
809K queries

Language: dt

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/dt/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/es"

Version of msmarco-passage, with queries and documents translated into Spanish.

docs | Citation | Metadata
8.8M docs

Language: es

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/es")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.


"mmarco/v2/es/dev"

Version of msmarco-passage/dev, with queries and documents translated into Spanish.

queries | docs | qrels | Citation | Metadata
101K queries

Language: es

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/es/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/es/dev/small"

Version of msmarco-passage/dev/small, with queries and documents translated into Spanish.

queries | docs | qrels | scoreddocs | Citation | Metadata
7.0K queries

Language: es

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/es/dev/small")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/es/train"

Version of msmarco-passage/train, with queries and documents translated into Spanish.

queries | docs | qrels | docpairs | Citation | Metadata
809K queries

Language: es

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/es/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/fr"

Version of msmarco-passage, with queries and documents translated into French.

docs | Citation | Metadata
8.8M docs

Language: fr

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/fr")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.


"mmarco/v2/fr/dev"

Version of msmarco-passage/dev, with queries and documents translated into French.

queries | docs | qrels | Citation | Metadata
101K queries

Language: fr

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/fr/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/fr/dev/small"

Version of msmarco-passage/dev/small, with queries and documents translated into French.

queries | docs | qrels | scoreddocs | Citation | Metadata
7.0K queries

Language: fr

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/fr/dev/small")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/fr/train"

Version of msmarco-passage/train, with queries and documents translated into French.

queries | docs | qrels | docpairs | Citation | Metadata
809K queries

Language: fr

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/fr/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/hi"

Version of msmarco-passage, with queries and documents translated into Hindi.

docs | Citation | Metadata
8.8M docs

Language: hi

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/hi")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.


"mmarco/v2/hi/dev"

Version of msmarco-passage/dev, with queries and documents translated into Hindi.

queries | docs | qrels | Citation | Metadata
101K queries

Language: hi

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/hi/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/hi/dev/small"

Version of msmarco-passage/dev/small, with queries and documents translated into Hindi.

queries | docs | qrels | scoreddocs | Citation | Metadata
7.0K queries

Language: hi

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/hi/dev/small")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/hi/train"

Version of msmarco-passage/train, with queries and documents translated into Hindi.

queries | docs | qrels | docpairs | Citation | Metadata
809K queries

Language: hi

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/hi/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/id"

Version of msmarco-passage, with queries and documents translated into Indonesian.

docs | Citation | Metadata
8.8M docs

Language: id

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/id")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.


"mmarco/v2/id/dev"

Version of msmarco-passage/dev, with queries and documents translated into Indonesian.

queries | docs | qrels | Citation | Metadata
101K queries

Language: id

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/id/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/id/dev/small"

Version of msmarco-passage/dev/small, with queries and documents translated into Indonesian.

queries | docs | qrels | scoreddocs | Citation | Metadata
7.0K queries

Language: id

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/id/dev/small")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/id/train"

Version of msmarco-passage/train, with queries and documents translated into Indonesian.

queries | docs | qrels | docpairs | Citation | Metadata
809K queries

Language: id

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/id/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/it"

Version of msmarco-passage, with queries and documents translated into Italian.

docs | Citation | Metadata
8.8M docs

Language: it

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/it")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.


"mmarco/v2/it/dev"

Version of msmarco-passage/dev, with queries and documents translated into Italian.

queries | docs | qrels | Citation | Metadata
101K queries

Language: it

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/it/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.
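The dev subset also lists qrels, which pair queries with judged documents. The sketch below collects the relevant documents per query and scores one ranking with the reciprocal rank at a cutoff of 10, a measure commonly reported on MS MARCO passage dev sets. It assumes each qrels record exposes query_id, doc_id, and relevance; `Qrel`, `relevant_docs`, `reciprocal_rank`, and the sample judgments are illustrative.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Sequence, Set


class Qrel(NamedTuple):
    """Stand-in for a qrels record: assumed fields query_id, doc_id, relevance."""
    query_id: str
    doc_id: str
    relevance: int


def relevant_docs(qrels: Iterable, min_relevance: int = 1) -> Dict[str, Set[str]]:
    """Collect, per query, the doc IDs judged at or above min_relevance."""
    relevant = defaultdict(set)
    for qrel in qrels:
        if qrel.relevance >= min_relevance:
            relevant[qrel.query_id].add(qrel.doc_id)
    return dict(relevant)


def reciprocal_rank(ranking: Sequence[str], relevant: Set[str], cutoff: int = 10) -> float:
    """Return 1/rank of the first relevant doc within the cutoff, else 0.0."""
    for rank, doc_id in enumerate(ranking[:cutoff], start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


# Made-up judgments; with ir_datasets you would pass dataset.qrels_iter().
sample_qrels = [Qrel("q1", "d7", 1), Qrel("q1", "d9", 0), Qrel("q2", "d3", 1)]
rel = relevant_docs(sample_qrels)
rr = reciprocal_rank(["d9", "d7", "d1"], rel["q1"])
```

Averaging `reciprocal_rank` over all judged queries gives the mean reciprocal rank for a run.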


"mmarco/v2/it/dev/small"

Version of msmarco-passage/dev/small, with queries and documents translated into Italian.

queries | docs | qrels | scoreddocs | Citation | Metadata
7.0K queries

Language: it

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/it/dev/small")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/it/train"

Version of msmarco-passage/train, with queries and documents translated into Italian.

queries | docs | qrels | docpairs | Citation | Metadata
809K queries

Language: it

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/it/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/ja"

Version of msmarco-passage, with queries and documents translated into Japanese.

docs | Citation | Metadata
8.8M docs

Language: ja

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/ja")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.


"mmarco/v2/ja/dev"

Version of msmarco-passage/dev, with queries and documents translated into Japanese.

queries | docs | qrels | Citation | Metadata
101K queries

Language: ja

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/ja/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.
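Since every language variant is described as a translation of the same msmarco-passage subset, queries from two languages can be lined up by ID. The sketch below assumes the translations keep the original query_id values, which this page does not state explicitly. `align_queries` and the sample queries are illustrative.

```python
from typing import Dict, Iterable, NamedTuple, Tuple


class Query(NamedTuple):
    """Stand-in with the same fields as GenericQuery (query_id, text)."""
    query_id: str
    text: str


def align_queries(queries_a: Iterable, queries_b: Iterable) -> Dict[str, Tuple[str, str]]:
    """Pair up the texts of queries that share a query_id in both inputs."""
    text_b = {q.query_id: q.text for q in queries_b}
    return {
        q.query_id: (q.text, text_b[q.query_id])
        for q in queries_a
        if q.query_id in text_b
    }


# Made-up records standing in for queries_iter() of two language variants,
# for example "mmarco/v2/ja/dev" and "mmarco/v2/it/dev".
japanese = [Query("10", "東京の天気"), Query("11", "パスタの作り方")]
italian = [Query("11", "come fare la pasta"), Query("12", "meteo a Roma")]
aligned = align_queries(japanese, italian)
```

IDs present in only one input are dropped, so the result also shows how much the two query sets overlap.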


"mmarco/v2/ja/dev/small"

Version of msmarco-passage/dev/small, with queries and documents translated into Japanese.

queries | docs | qrels | scoreddocs | Citation | Metadata
7.0K queries

Language: ja

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/ja/dev/small")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/ja/train"

Version of msmarco-passage/train, with queries and documents translated into Japanese.

queries | docs | qrels | docpairs | Citation | Metadata
809K queries

Language: ja

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/ja/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/pt"

Version of msmarco-passage, with queries and documents translated into Portuguese.

docs | Citation | Metadata
8.8M docs

Language: pt

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/pt")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.


"mmarco/v2/pt/dev"

Version of msmarco-passage/dev, with queries and documents translated into Portuguese.

queries | docs | qrels | Citation | Metadata
101K queries

Language: pt

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/pt/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/pt/dev/small"

Version of msmarco-passage/dev/small, with queries and documents translated into Portuguese.

queries | docs | qrels | scoreddocs | Citation | Metadata
7.0K queries

Language: pt

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/pt/dev/small")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/pt/train"

Version of msmarco-passage/train, with queries and documents translated into Portuguese.

queries | docs | qrels | docpairs | Citation | Metadata
809K queries

Language: pt

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/pt/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/ru"

Version of msmarco-passage, with queries and documents translated into Russian.

docs | Citation | Metadata
8.8M docs

Language: ru

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/ru")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.


"mmarco/v2/ru/dev"

Version of msmarco-passage/dev, with queries and documents translated into Russian.

queries | docs | qrels | Citation | Metadata
101K queries

Language: ru

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/ru/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/ru/dev/small"

Version of msmarco-passage/dev/small, with queries and documents translated into Russian.

queries | docs | qrels | scoreddocs | Citation | Metadata
7.0K queries

Language: ru

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/ru/dev/small")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/ru/train"

Version of msmarco-passage/train, with queries and documents translated into Russian.

queries | docs | qrels | docpairs | Citation | Metadata
809K queries

Language: ru

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/ru/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/vi"

Version of msmarco-passage, with queries and documents translated into Vietnamese.

docs | Citation | Metadata
8.8M docs

Language: vi

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/vi")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.


"mmarco/v2/vi/dev"

Version of msmarco-passage/dev, with queries and documents translated into Vietnamese.

queries | docs | qrels | Citation | Metadata
101K queries

Language: vi

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/vi/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/vi/dev/small"

Version of msmarco-passage/dev/small, with queries and documents translated into Vietnamese.

queries | docs | qrels | scoreddocs | Citation | Metadata
7.0K queries

Language: vi

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/vi/dev/small")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/vi/train"

Version of msmarco-passage/train, with queries and documents translated into Vietnamese.

queries | docs | qrels | docpairs | Citation | Metadata
809K queries

Language: vi

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/vi/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/zh"

Version of msmarco-passage, with queries and documents translated into Chinese.

docs | Citation | Metadata
8.8M docs

Language: zh

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/zh")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.


"mmarco/v2/zh/dev"

Version of msmarco-passage/dev, with queries and documents translated into Chinese.

queries | docs | qrels | Citation | Metadata
101K queries

Language: zh

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/zh/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/zh/dev/small"

Version of msmarco-passage/dev/small, with queries and documents translated into Chinese.

queries | docs | qrels | scoreddocs | Citation | Metadata
7.0K queries

Language: zh

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/zh/dev/small")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/v2/zh/train"

Version of msmarco-passage/train, with queries and documents translated into Chinese.

queries | docs | qrels | docpairs | Citation | Metadata
809K queries

Language: zh

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/v2/zh/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/zh"

Version of msmarco-passage, with documents translated into Chinese.

docs | Citation | Metadata
8.8M docs

Language: zh

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/zh")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.


"mmarco/zh/dev"

Version of msmarco-passage/dev, with queries and documents translated into Chinese.

queries | docs | qrels | Citation | Metadata
101K queries

Language: zh

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/zh/dev")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/zh/dev/small"

Version of msmarco-passage/dev/small, with queries and documents translated into Chinese.

queries | docs | qrels | Citation | Metadata
7.0K queries

Language: zh

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/zh/dev/small")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/zh/dev/small/v1.1"

Version of msmarco-passage/dev/small, with queries and documents translated into Chinese.

Version 1.1 of this file includes manual corrections from the authors of the translated files. See discussion here.

queries | docs | qrels | scoreddocs | Citation | Metadata
7.0K queries

Language: zh

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/zh/dev/small/v1.1")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.
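To see what the manual corrections changed, the queries of this version can be compared with those of "mmarco/zh/dev/small". The sketch below lists the queries whose text differs between two versions. It assumes both versions share query_id values; whether the corrections affect queries, documents, or both is not stated here. `changed_queries` and the sample records are illustrative.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Query(NamedTuple):
    """Stand-in with the same fields as GenericQuery (query_id, text)."""
    query_id: str
    text: str


def changed_queries(old: Iterable, new: Iterable) -> List[Tuple[str, str, str]]:
    """Return (query_id, old text, new text) for queries whose text differs."""
    old_text = {q.query_id: q.text for q in old}
    changes = []
    for q in new:
        before = old_text.get(q.query_id)
        # Skip IDs that exist only in the new version, and unchanged texts.
        if before is not None and before != q.text:
            changes.append((q.query_id, before, q.text))
    return changes


# Made-up records standing in for queries_iter() of the two versions.
original = [Query("1", "什么是信息检索"), Query("2", "北京天气怎么样")]
corrected = [Query("1", "什么是信息检索"), Query("2", "北京的天气怎么样"), Query("3", "新查询")]
diff = changed_queries(original, corrected)
```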


"mmarco/zh/dev/v1.1"

Version of msmarco-passage/dev, with queries and documents translated into Chinese.

Version 1.1 of this file includes manual corrections from the authors of the translated files. See discussion here.

queries | docs | qrels | Citation | Metadata
101K queries

Language: zh

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/zh/dev/v1.1")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.


"mmarco/zh/train"

Version of msmarco-passage/train, with queries and documents translated into Chinese.

queries | docs | qrels | docpairs | Citation | Metadata
809K queries

Language: zh

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("mmarco/zh/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>

You can find more details about the Python API here.