Github: datasets/lotte.py

ir_datasets: LoTTE

Index
  1. lotte
  2. lotte/lifestyle/dev
  3. lotte/lifestyle/dev/forum
  4. lotte/lifestyle/dev/search
  5. lotte/lifestyle/test
  6. lotte/lifestyle/test/forum
  7. lotte/lifestyle/test/search
  8. lotte/pooled/dev
  9. lotte/pooled/dev/forum
  10. lotte/pooled/dev/search
  11. lotte/pooled/test
  12. lotte/pooled/test/forum
  13. lotte/pooled/test/search
  14. lotte/recreation/dev
  15. lotte/recreation/dev/forum
  16. lotte/recreation/dev/search
  17. lotte/recreation/test
  18. lotte/recreation/test/forum
  19. lotte/recreation/test/search
  20. lotte/science/dev
  21. lotte/science/dev/forum
  22. lotte/science/dev/search
  23. lotte/science/test
  24. lotte/science/test/forum
  25. lotte/science/test/search
  26. lotte/technology/dev
  27. lotte/technology/dev/forum
  28. lotte/technology/dev/search
  29. lotte/technology/test
  30. lotte/technology/test/forum
  31. lotte/technology/test/search
  32. lotte/writing/dev
  33. lotte/writing/dev/forum
  34. lotte/writing/dev/search
  35. lotte/writing/test
  36. lotte/writing/test/forum
  37. lotte/writing/test/search
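The identifiers in the index follow a regular pattern: a domain, a split (dev or test), and optionally a query type (forum or search). As a minimal sketch, the helper below rebuilds the same list; the function and constant names are illustrative and are not part of ir_datasets.

```python
# Illustrative constants taken from the index above (names are hypothetical).
DOMAINS = ["lifestyle", "pooled", "recreation", "science", "technology", "writing"]
SPLITS = ["dev", "test"]
QUERY_TYPES = ["forum", "search"]


def lotte_dataset_ids():
    """Return the dataset IDs in the same order as the index."""
    ids = ["lotte"]
    for domain in DOMAINS:
        for split in SPLITS:
            corpus_id = f"lotte/{domain}/{split}"
            ids.append(corpus_id)  # corpus-level entry (documents)
            # query subsets that share this corpus
            ids.extend(f"{corpus_id}/{query_type}" for query_type in QUERY_TYPES)
    return ids


if __name__ == "__main__":
    for dataset_id in lotte_dataset_ids():
        print(dataset_id)
```

Each returned string can be passed to ir_datasets.load, as in the examples further down the page.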

"lotte"

LoTTE (Long-Tail Topic-stratified Evaluation) is a set of test collections focused on out-of-domain evaluation. It consists of data from several StackExchange sites. An answer is assumed relevant if it has at least one upvote or was selected as the accepted answer by the question's author.
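The relevance assumption can be written as a small predicate. This is only a sketch: the Answer record and its field names are hypothetical and do not correspond to any ir_datasets type or to the original LoTTE construction code.

```python
from dataclasses import dataclass


@dataclass
class Answer:
    """Hypothetical record for a StackExchange answer (illustration only)."""
    answer_id: str
    upvotes: int
    is_accepted: bool


def is_assumed_relevant(answer: Answer) -> bool:
    # Relevant if the answer has at least one upvote
    # or the question's author accepted it.
    return answer.upvotes >= 1 or answer.is_accepted


if __name__ == "__main__":
    print(is_assumed_relevant(Answer("a1", upvotes=0, is_accepted=True)))   # True
    print(is_assumed_relevant(Answer("a2", upvotes=0, is_accepted=False)))  # False
```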

Note that the dev and test corpora are disjoint to avoid leakage.

  • Documents: Answers to StackExchange questions
  • Queries: Natural language questions
  • Dataset Paper
Citation

ir_datasets.bib:

\cite{Santhanam2021ColBERTv2}

Bibtex:

@article{Santhanam2021ColBERTv2,
  title = "ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction",
  author = "Keshav Santhanam and Omar Khattab and Jon Saad-Falcon and Christopher Potts and Matei Zaharia",
  journal = "arXiv preprint arXiv:2112.01488",
  year = "2021",
  url = "https://arxiv.org/abs/2112.01488"
}

"lotte/lifestyle/dev"

Answers from lifestyle-focused forums, including bicycles, coffee, crafts, diy, gardening, lifehacks, mechanics, music, outdoors, parenting, pets, sports, and travel.

269K docs

Language: en

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/lifestyle/dev")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>

You can find more details about the Python API here.


"lotte/lifestyle/dev/forum"

Forum queries for lotte/lifestyle/dev.

Official evaluation measures: Success@5
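Success@5 is the fraction of queries for which at least one relevant document appears among the top 5 retrieved results. The sketch below computes it in plain Python; the run and qrels dictionaries are toy data, and their layout (query ID mapped to a ranked list of doc IDs, and query ID mapped to a set of relevant doc IDs) is an assumption made for illustration. With ir_datasets, the relevant sets could be collected from dataset.qrels_iter().

```python
from typing import Dict, List, Set


def success_at_k(run: Dict[str, List[str]],
                 qrels: Dict[str, Set[str]],
                 k: int = 5) -> float:
    """Fraction of judged queries with a relevant doc in the top k results."""
    if not qrels:
        return 0.0
    hits = 0
    for query_id, relevant in qrels.items():
        top_k = run.get(query_id, [])[:k]  # missing queries count as misses
        if any(doc_id in relevant for doc_id in top_k):
            hits += 1
    return hits / len(qrels)


if __name__ == "__main__":
    # Toy data: q1 has a relevant doc at rank 2, q2 only at rank 6.
    run = {
        "q1": ["d9", "d3", "d7"],
        "q2": ["d1", "d2", "d4", "d5", "d6", "d8"],
    }
    qrels = {"q1": {"d3"}, "q2": {"d8"}}
    print(success_at_k(run, qrels))  # 0.5
```

Averaging over the judged queries (the keys of qrels) means that a query missing from the run counts as a failure rather than being silently skipped.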

2.1K queries

Language: en

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/lifestyle/dev/forum")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>



"lotte/lifestyle/dev/search"

Search queries for lotte/lifestyle/dev.

Official evaluation measures: Success@5

417 queries

Language: en

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/lifestyle/dev/search")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>



"lotte/lifestyle/test"

Queries and answers from lifestyle-focused forums, including bicycles, coffee, crafts, diy, gardening, lifehacks, mechanics, music, outdoors, parenting, pets, sports, and travel.

119K docs

Language: en

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/lifestyle/test")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>



"lotte/lifestyle/test/forum"

Forum queries for lotte/lifestyle/test.

Official evaluation measures: Success@5

2.0K queries

Language: en

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/lifestyle/test/forum")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>



"lotte/lifestyle/test/search"

Search queries for lotte/lifestyle/test.

Official evaluation measures: Success@5

661 queries

Language: en

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/lifestyle/test/search")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>



"lotte/pooled/dev"

2.4M docs

Language: en

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/pooled/dev")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>



"lotte/pooled/dev/forum"

Forum queries for lotte/pooled/dev.

Official evaluation measures: Success@5

10K queries

Language: en

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/pooled/dev/forum")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>



"lotte/pooled/dev/search"

Search queries for lotte/pooled/dev.

Official evaluation measures: Success@5

2.9K queries

Language: en

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/pooled/dev/search")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>



"lotte/pooled/test"

2.8M docs

Language: en

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/pooled/test")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>



"lotte/pooled/test/forum"

Forum queries for lotte/pooled/test.

Official evaluation measures: Success@5

10K queries

Language: en

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/pooled/test/forum")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>



"lotte/pooled/test/search"

Search queries for lotte/pooled/test.

Official evaluation measures: Success@5

3.9K queries

Language: en

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/pooled/test/search")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>



"lotte/recreation/dev"

Answers from recreation-focused forums, including anime, boardgames, gaming, movies, photo, rpg, and scifi.

263K docs

Language: en

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/recreation/dev")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>



"lotte/recreation/dev/forum"

Forum queries for lotte/recreation/dev.

Official evaluation measures: Success@5

2.0K queries

Language: en

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/recreation/dev/forum")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>



"lotte/recreation/dev/search"

Search queries for lotte/recreation/dev.

Official evaluation measures: Success@5

563 queries

Language: en

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/recreation/dev/search")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>



"lotte/recreation/test"

Answers from recreation-focused forums, including anime, boardgames, gaming, movies, photo, rpg, and scifi.

167K docs

Language: en

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/recreation/test")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>



"lotte/recreation/test/forum"

Forum queries for lotte/recreation/test.

Official evaluation measures: Success@5

2.0K queries

Language: en

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/recreation/test/forum")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>



"lotte/recreation/test/search"

Search queries for lotte/recreation/test.

Official evaluation measures: Success@5

924 queries

Language: en

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/recreation/test/search")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>



"lotte/science/dev"

Answers from science-focused forums, including academia, astronomy, biology, chemistry, datascience, earthscience, engineering, math, philosophy, physics, and stats.

344K docs

Language: en

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/science/dev")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>



"lotte/science/dev/forum"

Forum queries for lotte/science/dev.

Official evaluation measures: Success@5

2.0K queries

Language: en

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/science/dev/forum")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>



"lotte/science/dev/search"

Search queries for lotte/science/dev.

Official evaluation measures: Success@5

538 queries

Language: en

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/science/dev/search")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>



"lotte/science/test"

Answers from science-focused forums, including academia, astronomy, biology, chemistry, datascience, earthscience, engineering, math, philosophy, physics, and stats.

1.7M docs

Language: en

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/science/test")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>



"lotte/science/test/forum"

Forum queries for lotte/science/test.

Official evaluation measures: Success@5

2.0K queries

Language: en

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/science/test/forum")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>



"lotte/science/test/search"

Search queries for lotte/science/test.

Official evaluation measures: Success@5

617 queries

Language: en

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/science/test/search")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>



"lotte/technology/dev"

Answers from technology-focused forums, including android, apple, askubuntu, electronics, networkengineering, security, serverfault, softwareengineering, superuser, unix, and webapps.

1.3M docs

Language: en

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/technology/dev")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>



"lotte/technology/dev/forum"

Forum queries for lotte/technology/dev.

Official evaluation measures: Success@5

2.0K queries

Language: en

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/technology/dev/forum")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>



"lotte/technology/dev/search"

Search queries for lotte/technology/dev.

Official evaluation measures: Success@5

916 queries

Language: en

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/technology/dev/search")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>



"lotte/technology/test"

Answers from technology-focused forums, including android, apple, askubuntu, electronics, networkengineering, security, serverfault, softwareengineering, superuser, unix, and webapps.

639K docs

Language: en

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/technology/test")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>



"lotte/technology/test/forum"

Forum queries for lotte/technology/test.

Official evaluation measures: Success@5

2.0K queries

Language: en

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/technology/test/forum")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>



"lotte/technology/test/search"

Search queries for lotte/technology/test.

Official evaluation measures: Success@5

596 queries

Language: en

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/technology/test/search")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>



"lotte/writing/dev"

Answers from writing-focused forums, including ell, english, linguistics, literature, worldbuilding, and writing.

277K docs

Language: en

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/writing/dev")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>



"lotte/writing/dev/forum"

Forum queries for lotte/writing/dev.

Official evaluation measures: Success@5

2.0K queries

Language: en

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/writing/dev/forum")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>



"lotte/writing/dev/search"

Search queries for lotte/writing/dev.

Official evaluation measures: Success@5

497 queries

Language: en

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/writing/dev/search")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>



"lotte/writing/test"

Answers from writing-focused forums, including ell, english, linguistics, literature, worldbuilding, and writing.

200K docs

Language: en

Document type:
GenericDoc: (namedtuple)
  1. doc_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/writing/test")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>



"lotte/writing/test/forum"

Forum queries for lotte/writing/test.

Official evaluation measures: Success@5

2.0K queries

Language: en

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/writing/test/forum")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>



"lotte/writing/test/search"

Search queries for lotte/writing/test.

Official evaluation measures: Success@5

1.1K queries

Language: en

Query type:
GenericQuery: (namedtuple)
  1. query_id: str
  2. text: str

Examples:

import ir_datasets
dataset = ir_datasets.load("lotte/writing/test/search")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
