GitHub: datasets/wikiclir.py

ir_datasets: WikiCLIR

Index
  1. wikiclir
  2. wikiclir/ar
  3. wikiclir/ca
  4. wikiclir/cs
  5. wikiclir/de
  6. wikiclir/en-simple
  7. wikiclir/es
  8. wikiclir/fi
  9. wikiclir/fr
  10. wikiclir/it
  11. wikiclir/ja
  12. wikiclir/ko
  13. wikiclir/nl
  14. wikiclir/nn
  15. wikiclir/no
  16. wikiclir/pl
  17. wikiclir/pt
  18. wikiclir/ro
  19. wikiclir/ru
  20. wikiclir/sv
  21. wikiclir/sw
  22. wikiclir/tl
  23. wikiclir/tr
  24. wikiclir/uk
  25. wikiclir/vi
  26. wikiclir/zh

"wikiclir"

A Cross-Language IR (CLIR) collection that pairs English queries with documents in other languages, built from Wikipedia.
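
Each subset in the index above is addressed as "wikiclir/" followed by a language code. A minimal sketch of building those IDs programmatically follows; the helper names `wikiclir_id` and `load_wikiclir` are hypothetical and not part of ir_datasets, and only `ir_datasets.load` is the library's own call.

```python
# Language codes copied from the index above; each names one WikiCLIR subset.
WIKICLIR_LANGS = [
    "ar", "ca", "cs", "de", "en-simple", "es", "fi", "fr", "it", "ja",
    "ko", "nl", "nn", "no", "pl", "pt", "ro", "ru", "sv", "sw",
    "tl", "tr", "uk", "vi", "zh",
]


def wikiclir_id(lang: str) -> str:
    """Return the dataset ID for one language subset, e.g. "wikiclir/ar"."""
    if lang not in WIKICLIR_LANGS:
        raise ValueError(f"unknown WikiCLIR subset: {lang!r}")
    return f"wikiclir/{lang}"


def load_wikiclir(lang: str):
    """Load one subset (hypothetical helper).

    ir_datasets is imported lazily so that wikiclir_id stays usable
    even when the package is not installed.
    """
    import ir_datasets
    return ir_datasets.load(wikiclir_id(lang))
```

Validating the code before calling `ir_datasets.load` turns a typo in the language code into an immediate `ValueError` that names the bad value.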

Citation

ir_datasets.bib:

\cite{sasaki-etal-2018-cross}

Bibtex:

@inproceedings{sasaki-etal-2018-cross,
  title = "Cross-Lingual Learning-to-Rank with Shared Representations",
  author = "Sasaki, Shota and Sun, Shuo and Schamoni, Shigehiko and Duh, Kevin and Inui, Kentaro",
  booktitle = "Proceedings of the 2018 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)",
  month = jun,
  year = "2018",
  address = "New Orleans, Louisiana",
  publisher = "Association for Computational Linguistics",
  url = "https://aclanthology.org/N18-2073",
  doi = "10.18653/v1/N18-2073",
  pages = "458--463"
}

"wikiclir/ar"

WikiCLIR with Arabic documents.

324K queries

Language: en

Query type:
WikiClirQuery: (namedtuple)
  1. query_id: str
  2. title: str
  3. first_sent: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("wikiclir/ar")
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, first_sent>

You can find more details about the Python API here.
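
Each query carries three string fields: query_id, title and first_sent. The sketch below shows one way to turn them into a single search string. `ExampleQuery` is a local stand-in with the same field names (an assumption made so the code runs without downloading the dataset), and `query_text` is a hypothetical helper, not an ir_datasets function.

```python
from collections import namedtuple

# Local stand-in with the same fields as WikiClirQuery, so the sketch runs
# without downloading anything. It is not the library's own class.
ExampleQuery = namedtuple("ExampleQuery", ["query_id", "title", "first_sent"])


def query_text(query, use_first_sent=True):
    """Join a query's title and, optionally, its first_sent field."""
    parts = [query.title]
    if use_first_sent and query.first_sent:
        parts.append(query.first_sent)
    return " ".join(parts)


example = ExampleQuery("0", "Example title", "An example first sentence.")
print(query_text(example))
print(query_text(example, use_first_sent=False))
```

Because the helper only reads attributes by name, it should work the same way on the objects yielded by `dataset.queries_iter()`.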


"wikiclir/ca"

WikiCLIR with Catalan documents.

340K queries

Language: en

Query type:
WikiClirQuery: (namedtuple)
  1. query_id: str
  2. title: str
  3. first_sent: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("wikiclir/ca")
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, first_sent>
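
The loop above visits queries in order. When queries need to be fetched by ID, one option is to build an in-memory dictionary first. This is a minimal sketch: `index_by_id` and `ExampleQuery` are hypothetical names, the stand-in tuple mirrors the WikiClirQuery fields, and holding every query of a subset in memory is an assumption that may not suit the larger subsets.

```python
from collections import namedtuple

# Stand-in for WikiClirQuery so the sketch runs without the dataset.
ExampleQuery = namedtuple("ExampleQuery", ["query_id", "title", "first_sent"])


def index_by_id(queries):
    """Map query_id -> query for any iterable of objects with a query_id field."""
    return {q.query_id: q for q in queries}


sample = [
    ExampleQuery("10", "First title", "First sentence."),
    ExampleQuery("20", "Second title", "Second sentence."),
]
lookup = index_by_id(sample)
print(lookup["20"].title)

# With ir_datasets installed, the same helper could be applied as:
#   lookup = index_by_id(dataset.queries_iter())
```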



"wikiclir/cs"

WikiCLIR with Czech documents.

234K queries

Language: en

Query type:
WikiClirQuery: (namedtuple)
  1. query_id: str
  2. title: str
  3. first_sent: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("wikiclir/cs")
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, first_sent>



"wikiclir/de"

WikiCLIR with German documents.

938K queries

Language: en

Query type:
WikiClirQuery: (namedtuple)
  1. query_id: str
  2. title: str
  3. first_sent: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("wikiclir/de")
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, first_sent>



"wikiclir/en-simple"

WikiCLIR with Simple English documents.

115K queries

Language: en

Query type:
WikiClirQuery: (namedtuple)
  1. query_id: str
  2. title: str
  3. first_sent: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("wikiclir/en-simple")
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, first_sent>



"wikiclir/es"

WikiCLIR with Spanish documents.

782K queries

Language: en

Query type:
WikiClirQuery: (namedtuple)
  1. query_id: str
  2. title: str
  3. first_sent: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("wikiclir/es")
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, first_sent>



"wikiclir/fi"

WikiCLIR with Finnish documents.

274K queries

Language: en

Query type:
WikiClirQuery: (namedtuple)
  1. query_id: str
  2. title: str
  3. first_sent: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("wikiclir/fi")
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, first_sent>



"wikiclir/fr"

WikiCLIR with French documents.

1.1M queries

Language: en

Query type:
WikiClirQuery: (namedtuple)
  1. query_id: str
  2. title: str
  3. first_sent: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("wikiclir/fr")
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, first_sent>
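
With roughly 1.1M queries in this subset, it can be handy to look at only the first few while exploring. The sketch below uses `itertools.islice` from the standard library. `head` is a hypothetical helper, and a generator stands in for `dataset.queries_iter()` so the code runs without the dataset.

```python
from itertools import islice


def head(iterable, n=5):
    """Return the first n items of an iterable without consuming the rest."""
    return list(islice(iterable, n))


# With ir_datasets installed, the helper could be used like this:
#   import ir_datasets
#   dataset = ir_datasets.load("wikiclir/fr")
#   for query in head(dataset.queries_iter(), 3):
#       print(query.query_id, query.title)

# Stand-in generator so the sketch runs on its own.
fake_queries = (("q%d" % i, "title %d" % i) for i in range(1_000_000))
print(head(fake_queries, 3))
```

Since `islice` stops pulling items after the requested count, the remaining entries of the iterator are never materialised.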



"wikiclir/it"

WikiCLIR with Italian documents.

809K queries

Language: en

Query type:
WikiClirQuery: (namedtuple)
  1. query_id: str
  2. title: str
  3. first_sent: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("wikiclir/it")
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, first_sent>



"wikiclir/ja"

WikiCLIR with Japanese documents.

426K queries

Language: en

Query type:
WikiClirQuery: (namedtuple)
  1. query_id: str
  2. title: str
  3. first_sent: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("wikiclir/ja")
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, first_sent>



"wikiclir/ko"

WikiCLIR with Korean documents.

225K queries

Language: en

Query type:
WikiClirQuery: (namedtuple)
  1. query_id: str
  2. title: str
  3. first_sent: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("wikiclir/ko")
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, first_sent>



"wikiclir/nl"

WikiCLIR with Dutch documents.

688K queries

Language: en

Query type:
WikiClirQuery: (namedtuple)
  1. query_id: str
  2. title: str
  3. first_sent: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("wikiclir/nl")
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, first_sent>



"wikiclir/nn"

WikiCLIR with Norwegian (Nynorsk) documents.

99K queries

Language: en

Query type:
WikiClirQuery: (namedtuple)
  1. query_id: str
  2. title: str
  3. first_sent: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("wikiclir/nn")
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, first_sent>



"wikiclir/no"

WikiCLIR with Norwegian (Bokmål) documents.

300K queries

Language: en

Query type:
WikiClirQuery: (namedtuple)
  1. query_id: str
  2. title: str
  3. first_sent: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("wikiclir/no")
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, first_sent>



"wikiclir/pl"

WikiCLIR with Polish documents.

694K queries

Language: en

Query type:
WikiClirQuery: (namedtuple)
  1. query_id: str
  2. title: str
  3. first_sent: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("wikiclir/pl")
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, first_sent>



"wikiclir/pt"

WikiCLIR with Portuguese documents.

612K queries

Language: en

Query type:
WikiClirQuery: (namedtuple)
  1. query_id: str
  2. title: str
  3. first_sent: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("wikiclir/pt")
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, first_sent>



"wikiclir/ro"

WikiCLIR with Romanian documents.

199K queries

Language: en

Query type:
WikiClirQuery: (namedtuple)
  1. query_id: str
  2. title: str
  3. first_sent: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("wikiclir/ro")
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, first_sent>



"wikiclir/ru"

WikiCLIR with Russian documents.

665K queries

Language: en

Query type:
WikiClirQuery: (namedtuple)
  1. query_id: str
  2. title: str
  3. first_sent: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("wikiclir/ru")
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, first_sent>



"wikiclir/sv"

WikiCLIR with Swedish documents.

639K queries

Language: en

Query type:
WikiClirQuery: (namedtuple)
  1. query_id: str
  2. title: str
  3. first_sent: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("wikiclir/sv")
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, first_sent>



"wikiclir/sw"

WikiCLIR with Swahili documents.

23K queries

Language: en

Query type:
WikiClirQuery: (namedtuple)
  1. query_id: str
  2. title: str
  3. first_sent: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("wikiclir/sw")
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, first_sent>



"wikiclir/tl"

WikiCLIR with Tagalog documents.

49K queries

Language: en

Query type:
WikiClirQuery: (namedtuple)
  1. query_id: str
  2. title: str
  3. first_sent: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("wikiclir/tl")
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, first_sent>



"wikiclir/tr"

WikiCLIR with Turkish documents.

185K queries

Language: en

Query type:
WikiClirQuery: (namedtuple)
  1. query_id: str
  2. title: str
  3. first_sent: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("wikiclir/tr")
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, first_sent>



"wikiclir/uk"

WikiCLIR with Ukrainian documents.

348K queries

Language: en

Query type:
WikiClirQuery: (namedtuple)
  1. query_id: str
  2. title: str
  3. first_sent: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("wikiclir/uk")
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, first_sent>



"wikiclir/vi"

WikiCLIR with Vietnamese documents.

354K queries

Language: en

Query type:
WikiClirQuery: (namedtuple)
  1. query_id: str
  2. title: str
  3. first_sent: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("wikiclir/vi")
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, first_sent>



"wikiclir/zh"

WikiCLIR with Chinese documents.

463K queries

Language: en

Query type:
WikiClirQuery: (namedtuple)
  1. query_id: str
  2. title: str
  3. first_sent: str

Examples:

Python API:
import ir_datasets
dataset = ir_datasets.load("wikiclir/zh")
for query in dataset.queries_iter():
    query # namedtuple<query_id, title, first_sent>
