ir_datasets: LoTTE
LoTTE (Long-Tail Topic-stratified Evaluation) is a set of test collections focused on out-of-domain evaluation. It consists of data from several StackExchanges, with relevance assumed either from upvotes (at least 1) or from being selected as the accepted answer by the question's author.
Note that the dev and test corpora are disjoint to avoid leakage.
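The relevance assumption described above can be sketched in a few lines. This is only an illustration of the stated rule: the `Answer` class and its `upvotes` and `accepted` fields are hypothetical names, not part of ir_datasets or of the LoTTE release.

```python
from dataclasses import dataclass


@dataclass
class Answer:
    """Hypothetical record for one StackExchange answer (illustrative only)."""
    upvotes: int
    accepted: bool


def is_assumed_relevant(answer: Answer) -> bool:
    # Rule as stated above: at least one upvote, or accepted by the question's author.
    return answer.accepted or answer.upvotes >= 1


examples = [Answer(upvotes=3, accepted=False),
            Answer(upvotes=0, accepted=True),
            Answer(upvotes=0, accepted=False)]
labels = [is_assumed_relevant(a) for a in examples]
```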
Bibtex:
@article{Santhanam2021ColBERTv2,
  title = "ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction",
  author = "Keshav Santhanam and Omar Khattab and Jon Saad-Falcon and Christopher Potts and Matei Zaharia",
  journal = "arXiv preprint arXiv:2112.01488",
  year = "2021",
  url = "https://arxiv.org/abs/2112.01488"
}
Answers from lifestyle-focused forums, including bicycles, coffee, crafts, diy, gardening, lifehacks, mechanics, music, outdoors, parenting, pets, sports, and travel.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/lifestyle/dev")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>
You can find more details about the Python API here.
Forum queries for lotte/lifestyle/dev.
Official evaluation measures: Success@5
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/lifestyle/dev/forum")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
You can find more details about the Python API here.
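Success@5, listed as the official measure for the query subsets on this page, is the fraction of queries for which at least one relevant document appears among the top 5 results. A minimal sketch of that computation follows; the function name `success_at_k`, the dictionary layout, and the toy IDs are assumptions made for illustration and are not part of the ir_datasets API.

```python
from typing import Dict, List, Set


def success_at_k(run: Dict[str, List[str]],
                 qrels: Dict[str, Set[str]],
                 k: int = 5) -> float:
    """Fraction of judged queries with at least one relevant doc in the top k.

    run   maps query_id -> ranked list of doc_ids (best first)
    qrels maps query_id -> set of relevant doc_ids
    """
    if not qrels:
        return 0.0
    hits = 0
    for query_id, relevant in qrels.items():
        top_k = run.get(query_id, [])[:k]
        if any(doc_id in relevant for doc_id in top_k):
            hits += 1
    return hits / len(qrels)


# Toy data (made up): three judged queries.
qrels = {"0": {"12", "40"}, "1": {"7"}, "2": {"3"}}
run = {
    "0": ["5", "40", "9", "1", "2"],       # relevant doc at rank 2: counts
    "1": ["1", "2", "3", "4", "5", "7"],   # relevant doc at rank 6: does not count
    # query "2" returned nothing: does not count
}
score = success_at_k(run, qrels)
```

Queries that have judgments but no retrieved results are counted as failures here, which is one reasonable convention; an evaluation tool may handle them differently.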
Search queries for lotte/lifestyle/dev.
Official evaluation measures: Success@5
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/lifestyle/dev/search")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
You can find more details about the Python API here.
Queries and answers from lifestyle-focused forums, including bicycles, coffee, crafts, diy, gardening, lifehacks, mechanics, music, outdoors, parenting, pets, sports, and travel.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/lifestyle/test")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>
You can find more details about the Python API here.
Forum queries for lotte/lifestyle/test.
Official evaluation measures: Success@5
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/lifestyle/test/forum")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
You can find more details about the Python API here.
Search queries for lotte/lifestyle/test.
Official evaluation measures: Success@5
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/lifestyle/test/search")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
You can find more details about the Python API here.
Combined version of lotte/lifestyle/dev, lotte/recreation/dev, lotte/science/dev, lotte/technology/dev, and lotte/writing/dev.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/pooled/dev")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>
You can find more details about the Python API here.
Forum queries for lotte/pooled/dev.
Official evaluation measures: Success@5
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/pooled/dev/forum")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
You can find more details about the Python API here.
Search queries for lotte/pooled/dev.
Official evaluation measures: Success@5
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/pooled/dev/search")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
You can find more details about the Python API here.
Combined version of lotte/lifestyle/test, lotte/recreation/test, lotte/science/test, lotte/technology/test, and lotte/writing/test.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/pooled/test")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>
You can find more details about the Python API here.
Forum queries for lotte/pooled/test.
Official evaluation measures: Success@5
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/pooled/test/forum")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
You can find more details about the Python API here.
Search queries for lotte/pooled/test.
Official evaluation measures: Success@5
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/pooled/test/search")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
You can find more details about the Python API here.
Answers from recreation-focused forums, including anime, boardgames, gaming, movies, photo, rpg, and scifi.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/recreation/dev")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>
You can find more details about the Python API here.
Forum queries for lotte/recreation/dev.
Official evaluation measures: Success@5
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/recreation/dev/forum")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
You can find more details about the Python API here.
Search queries for lotte/recreation/dev.
Official evaluation measures: Success@5
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/recreation/dev/search")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
You can find more details about the Python API here.
Answers from recreation-focused forums, including anime, boardgames, gaming, movies, photo, rpg, and scifi.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/recreation/test")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>
You can find more details about the Python API here.
Forum queries for lotte/recreation/test.
Official evaluation measures: Success@5
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/recreation/test/forum")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
You can find more details about the Python API here.
Search queries for lotte/recreation/test.
Official evaluation measures: Success@5
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/recreation/test/search")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
You can find more details about the Python API here.
Answers from science-focused forums, including academia, astronomy, biology, chemistry, datascience, earthscience, engineering, math, philosophy, physics, and stats.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/science/dev")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>
You can find more details about the Python API here.
Forum queries for lotte/science/dev.
Official evaluation measures: Success@5
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/science/dev/forum")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
You can find more details about the Python API here.
Search queries for lotte/science/dev.
Official evaluation measures: Success@5
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/science/dev/search")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
You can find more details about the Python API here.
Answers from science-focused forums, including academia, astronomy, biology, chemistry, datascience, earthscience, engineering, math, philosophy, physics, and stats.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/science/test")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>
You can find more details about the Python API here.
Forum queries for lotte/science/test.
Official evaluation measures: Success@5
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/science/test/forum")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
You can find more details about the Python API here.
Search queries for lotte/science/test.
Official evaluation measures: Success@5
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/science/test/search")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
You can find more details about the Python API here.
Answers from technology-focused forums, including android, apple, askubuntu, electronics, networkengineering, security, serverfault, softwareengineering, superuser, unix, and webapps.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/technology/dev")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>
You can find more details about the Python API here.
Forum queries for lotte/technology/dev.
Official evaluation measures: Success@5
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/technology/dev/forum")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
You can find more details about the Python API here.
Search queries for lotte/technology/dev.
Official evaluation measures: Success@5
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/technology/dev/search")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
You can find more details about the Python API here.
Answers from technology-focused forums, including android, apple, askubuntu, electronics, networkengineering, security, serverfault, softwareengineering, superuser, unix, and webapps.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/technology/test")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>
You can find more details about the Python API here.
Forum queries for lotte/technology/test.
Official evaluation measures: Success@5
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/technology/test/forum")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
You can find more details about the Python API here.
Search queries for lotte/technology/test.
Official evaluation measures: Success@5
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/technology/test/search")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
You can find more details about the Python API here.
Answers from writing-focused forums, including ell, english, linguistics, literature, worldbuilding, and writing.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/writing/dev")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>
You can find more details about the Python API here.
Forum queries for lotte/writing/dev.
Official evaluation measures: Success@5
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/writing/dev/forum")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
You can find more details about the Python API here.
Search queries for lotte/writing/dev.
Official evaluation measures: Success@5
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/writing/dev/search")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
You can find more details about the Python API here.
Answers from writing-focused forums, including ell, english, linguistics, literature, worldbuilding, and writing.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/writing/test")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, text>
You can find more details about the Python API here.
Forum queries for lotte/writing/test.
Official evaluation measures: Success@5
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/writing/test/forum")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
You can find more details about the Python API here.
Search queries for lotte/writing/test.
Official evaluation measures: Success@5
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("lotte/writing/test/search")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
You can find more details about the Python API here.
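The dataset IDs on this page follow a regular pattern: `lotte/<domain>/<split>` for each corpus, with `/forum` and `/search` suffixes for the query subsets. A small sketch that enumerates them is shown below; the helper name `lotte_dataset_ids` and the constant names are made up for illustration, and the lists are simply transcribed from the IDs above. Each resulting string could then be passed to `ir_datasets.load`.

```python
from itertools import product
from typing import List

# Values transcribed from the dataset IDs listed on this page.
DOMAINS = ["lifestyle", "pooled", "recreation", "science", "technology", "writing"]
SPLITS = ["dev", "test"]
QUERY_TYPES = ["forum", "search"]


def lotte_dataset_ids() -> List[str]:
    """Return every LoTTE dataset ID named on this page, in page order."""
    ids = []
    for domain, split in product(DOMAINS, SPLITS):
        corpus_id = f"lotte/{domain}/{split}"
        ids.append(corpus_id)  # corpus only (documents)
        ids.extend(f"{corpus_id}/{qt}" for qt in QUERY_TYPES)  # query subsets
    return ids


ALL_IDS = lotte_dataset_ids()
```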