ir_datasets: CodeSearchNet
A benchmark for semantic code search. Uses
Language: multiple/other/unknown
Examples:
import ir_datasets
dataset = ir_datasets.load("codesearchnet")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, repo, path, func_name, code, language>
You can find more details about the Python API here.
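As a minimal sketch of what can be done with the document records, the snippet below tallies documents per language. It does not call ir_datasets: `Doc` is a hypothetical stand-in namedtuple that only mirrors the fields listed above, `count_by_language` is an illustrative helper, and the sample values are made up.

```python
from collections import Counter, namedtuple

# Hypothetical stand-in mirroring the doc fields listed above.
# With the library installed, real records would come from dataset.docs_iter().
Doc = namedtuple("Doc", ["doc_id", "repo", "path", "func_name", "code", "language"])


def count_by_language(docs):
    """Return a Counter mapping each language to its number of documents."""
    return Counter(doc.language for doc in docs)


# Made-up sample records, for illustration only.
sample_docs = [
    Doc("d1", "org/a", "a.py", "f", "def f(): pass", "python"),
    Doc("d2", "org/b", "b.go", "G", "func G() {}", "go"),
    Doc("d3", "org/a", "c.py", "h", "def h(): pass", "python"),
]

language_counts = count_by_language(sample_docs)
print(language_counts.most_common())
```

Because the helper accepts any iterable, passing `dataset.docs_iter()` in place of `sample_docs` should work the same way, streaming one record at a time.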
Official challenge set, with keyword queries and deep relevance assessments.
Language: multiple/other/unknown
Examples:
import ir_datasets
dataset = ir_datasets.load("codesearchnet/challenge")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
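Since the challenge set pairs keyword queries with relevance assessments, a rough sketch of joining the two may help. This page only shows the query record, so the `Qrel` fields below (`query_id`, `doc_id`, `relevance`) are an assumption, as are the helper name `attach_judgments` and all sample values. Both namedtuples are stand-ins, not the library's own classes.

```python
from collections import defaultdict, namedtuple

# Stand-in for the query record shown above.
Query = namedtuple("Query", ["query_id", "text"])
# Assumed shape of a relevance assessment; not confirmed by this page.
Qrel = namedtuple("Qrel", ["query_id", "doc_id", "relevance"])


def attach_judgments(queries, qrels):
    """Map query_id -> (query text, {doc_id: relevance}) for every query."""
    judgments = defaultdict(dict)
    for qrel in qrels:
        judgments[qrel.query_id][qrel.doc_id] = qrel.relevance
    # Queries without any assessment get an empty dict.
    return {q.query_id: (q.text, judgments.get(q.query_id, {})) for q in queries}


# Made-up sample data, for illustration only.
sample_queries = [Query("q1", "convert int to string"), Query("q2", "read csv file")]
sample_qrels = [Qrel("q1", "d1", 2), Qrel("q1", "d2", 0)]

joined = attach_judgments(sample_queries, sample_qrels)
print(joined)
```

With the real dataset, the queries would come from `dataset.queries_iter()`, and the assessments would presumably come from the library's qrels iterator.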
Official test set, using queries inferred from docstrings.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("codesearchnet/test")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
Official train set, using queries inferred from docstrings.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("codesearchnet/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
Official validation set, using queries inferred from docstrings.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("codesearchnet/valid")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
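The train, test, and validation examples all iterate over query records of the same shape. When only a quick look at a split is needed, one option is to take the first few records and stop. The sketch below uses `itertools.islice` for that. `fake_queries` is a hypothetical generator standing in for `dataset.queries_iter()`, and `peek` is an illustrative helper, not part of ir_datasets.

```python
from collections import namedtuple
from itertools import islice

# Stand-in for the query record shown above.
Query = namedtuple("Query", ["query_id", "text"])


def peek(records, n=3):
    """Return the first n records from an iterator without reading the rest."""
    return list(islice(records, n))


def fake_queries():
    # Hypothetical generator standing in for dataset.queries_iter().
    for i in range(1000):
        yield Query(f"q{i}", f"docstring-derived query {i}")


first = peek(fake_queries(), n=2)
print(first)
```

Because `islice` stops pulling from the iterator after `n` items, the remaining records are never generated.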