ir_datasets: CodeSearchNet
A benchmark for semantic code search. Uses
Language: multiple/other/unknown
Examples:
import ir_datasets
dataset = ir_datasets.load("codesearchnet")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, repo, path, func_name, code, language>
You can find more details about the Python API here.
ir_datasets export codesearchnet docs
[doc_id]    [repo]    [path]    [func_name]    [code]    [language]
...
You can find more details about the CLI here.
No example available for PyTerrier
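As a small illustration of working with the document fields listed above, the sketch below tallies documents per `language`. The helper `count_languages` and the toy `Doc` records are hypothetical: they only mirror the field names shown in the namedtuple comment and are not taken from the dataset.

```python
from collections import Counter, namedtuple

# Hypothetical stand-in that mirrors the document fields listed above.
Doc = namedtuple("Doc", ["doc_id", "repo", "path", "func_name", "code", "language"])


def count_languages(docs):
    """Tally documents per value of the `language` field."""
    return Counter(doc.language for doc in docs)


# Toy records for illustration only; they are not real dataset entries.
toy_docs = [
    Doc("d1", "org/a", "a.py", "f", "def f(): pass", "python"),
    Doc("d2", "org/b", "b.go", "G", "func G() {}", "go"),
    Doc("d3", "org/a", "c.py", "h", "def h(): pass", "python"),
]

language_counts = count_languages(toy_docs)
print(language_counts.most_common())
```

With ir_datasets installed, the same helper should accept `dataset.docs_iter()` directly, since it only reads the `language` attribute. Note that this iterates over the whole corpus, which may take a while.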
Official challenge set, with keyword queries and deep relevance assessments.
Language: multiple/other/unknown
Examples:
import ir_datasets
dataset = ir_datasets.load("codesearchnet/challenge")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
You can find more details about the Python API here.
ir_datasets export codesearchnet/challenge queries
[query_id]    [text]
...
Language: multiple/other/unknown
Examples:
import ir_datasets
dataset = ir_datasets.load("codesearchnet/challenge")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, repo, path, func_name, code, language>
You can find more details about the Python API here.
ir_datasets export codesearchnet/challenge docs
[doc_id]    [repo]    [path]    [func_name]    [code]    [language]
...
Relevance levels
| Rel. | Definition | 
|---|---|
| 0 | Irrelevant | 
| 1 | Weak Match | 
| 2 | Strong Match | 
| 3 | Exact Match | 
Examples:
import ir_datasets
dataset = ir_datasets.load("codesearchnet/challenge")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, note>
You can find more details about the Python API here.
ir_datasets export codesearchnet/challenge qrels --format tsv
[query_id]    [doc_id]    [relevance]    [note]
...
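Because the challenge qrels are graded (0 to 3), a graded metric such as nDCG is a natural fit. The following is a minimal sketch, not an official evaluation script: the helper names, the toy `Qrel` records, the linear gain, and the choice to treat unjudged documents as relevance 0 are all assumptions made for illustration.

```python
import math
from collections import defaultdict, namedtuple

# Hypothetical stand-in that mirrors the qrel fields listed above.
Qrel = namedtuple("Qrel", ["query_id", "doc_id", "relevance", "note"])


def group_qrels(qrels):
    """Build {query_id: {doc_id: relevance}} from an iterable of qrels."""
    by_query = defaultdict(dict)
    for qrel in qrels:
        by_query[qrel.query_id][qrel.doc_id] = qrel.relevance
    return dict(by_query)


def dcg(gains):
    """Discounted cumulative gain with linear gain and a log2 rank discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg_at_k(ranked_doc_ids, judged, k):
    """nDCG@k for one query; unjudged documents count as relevance 0 (assumption)."""
    gains = [judged.get(doc_id, 0) for doc_id in ranked_doc_ids[:k]]
    ideal = sorted(judged.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Toy judgments for illustration only.
toy_qrels = [
    Qrel("q1", "d1", 3, ""),
    Qrel("q1", "d2", 1, ""),
    Qrel("q1", "d3", 0, ""),
]
judged = group_qrels(toy_qrels)["q1"]

perfect = ndcg_at_k(["d1", "d2", "d3"], judged, k=3)
reversed_order = ndcg_at_k(["d3", "d2", "d1"], judged, k=3)
print(perfect, reversed_order)
```

With the real dataset, `group_qrels(dataset.qrels_iter())` should work unchanged, since only `query_id`, `doc_id`, and `relevance` are read.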
Official test set, using queries inferred from docstrings.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("codesearchnet/test")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
You can find more details about the Python API here.
ir_datasets export codesearchnet/test queries
[query_id]    [text]
...
Language: multiple/other/unknown
Examples:
import ir_datasets
dataset = ir_datasets.load("codesearchnet/test")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, repo, path, func_name, code, language>
You can find more details about the Python API here.
ir_datasets export codesearchnet/test docs
[doc_id]    [repo]    [path]    [func_name]    [code]    [language]
...
Relevance levels
| Rel. | Definition | 
|---|---|
| 1 | Matches docstring | 
Examples:
import ir_datasets
dataset = ir_datasets.load("codesearchnet/test")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>
You can find more details about the Python API here.
ir_datasets export codesearchnet/test qrels --format tsv
[query_id]    [doc_id]    [relevance]    [iteration]
...
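Since these qrels have a single relevance level (1, "Matches docstring"), a binary metric such as mean reciprocal rank (MRR) is one reasonable way to score a ranking. The sketch below assumes a hypothetical `runs` mapping from query ID to a ranked list of document IDs; the helper names and toy data are illustrative and not part of ir_datasets.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in that mirrors the qrel fields listed above.
Qrel = namedtuple("Qrel", ["query_id", "doc_id", "relevance", "iteration"])


def reciprocal_rank(ranked_doc_ids, relevant_ids):
    """Return 1/rank of the first relevant document, or 0.0 if none is ranked."""
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs, qrels):
    """Average the reciprocal rank over every query present in `runs`."""
    relevant = defaultdict(set)
    for qrel in qrels:
        if qrel.relevance > 0:
            relevant[qrel.query_id].add(qrel.doc_id)
    scores = [reciprocal_rank(ranking, relevant[qid]) for qid, ranking in runs.items()]
    return sum(scores) / len(scores) if scores else 0.0


# Toy judgments and rankings for illustration only.
toy_qrels = [Qrel("q1", "d1", 1, "0"), Qrel("q2", "d5", 1, "0")]
toy_runs = {"q1": ["d2", "d1"], "q2": ["d5", "d9"]}

mrr = mean_reciprocal_rank(toy_runs, toy_qrels)
print(mrr)
```

Queries that appear in `runs` but have no relevant document contribute a score of 0.0 in this sketch; whether to skip them instead is a design choice left to the evaluator.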
Official train set, using queries inferred from docstrings.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("codesearchnet/train")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
You can find more details about the Python API here.
ir_datasets export codesearchnet/train queries
[query_id]    [text]
...
Language: multiple/other/unknown
Examples:
import ir_datasets
dataset = ir_datasets.load("codesearchnet/train")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, repo, path, func_name, code, language>
You can find more details about the Python API here.
ir_datasets export codesearchnet/train docs
[doc_id]    [repo]    [path]    [func_name]    [code]    [language]
...
Relevance levels
| Rel. | Definition | 
|---|---|
| 1 | Matches docstring | 
Examples:
import ir_datasets
dataset = ir_datasets.load("codesearchnet/train")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>
You can find more details about the Python API here.
ir_datasets export codesearchnet/train qrels --format tsv
[query_id]    [doc_id]    [relevance]    [iteration]
...
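For training a retrieval model, the queries, docs, and qrels of this split can be joined into (query text, code) pairs. The following is a hedged sketch of one way to do that join: `build_training_pairs` and the toy records are hypothetical, and the in-memory dictionaries are only suitable for small samples.

```python
from collections import namedtuple

# Hypothetical stand-ins that mirror the field names listed above.
Query = namedtuple("Query", ["query_id", "text"])
Doc = namedtuple("Doc", ["doc_id", "repo", "path", "func_name", "code", "language"])
Qrel = namedtuple("Qrel", ["query_id", "doc_id", "relevance", "iteration"])


def build_training_pairs(queries, docs, qrels):
    """Join qrels with queries and docs into (query text, code) pairs."""
    query_text = {q.query_id: q.text for q in queries}
    doc_code = {d.doc_id: d.code for d in docs}
    pairs = []
    for qrel in qrels:
        # Keep only positive judgments whose query and document are both known.
        if qrel.relevance > 0 and qrel.query_id in query_text and qrel.doc_id in doc_code:
            pairs.append((query_text[qrel.query_id], doc_code[qrel.doc_id]))
    return pairs


# Toy records for illustration only.
toy_queries = [Query("q1", "sort a list")]
toy_docs = [
    Doc("d1", "org/a", "a.py", "sort_list", "def sort_list(xs): return sorted(xs)", "python"),
    Doc("d2", "org/a", "b.py", "noop", "def noop(): pass", "python"),
]
toy_qrels = [Qrel("q1", "d1", 1, "0")]

pairs = build_training_pairs(toy_queries, toy_docs, toy_qrels)
print(pairs)
```

On the full train split, holding every document's code in a dictionary may use a lot of memory; looking documents up on demand (for example through the dataset's `docs_store()`) is likely a better fit there.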
Official validation set, using queries inferred from docstrings.
Language: en
Examples:
import ir_datasets
dataset = ir_datasets.load("codesearchnet/valid")
for query in dataset.queries_iter():
    query # namedtuple<query_id, text>
You can find more details about the Python API here.
ir_datasets export codesearchnet/valid queries
[query_id]    [text]
...
Language: multiple/other/unknown
Examples:
import ir_datasets
dataset = ir_datasets.load("codesearchnet/valid")
for doc in dataset.docs_iter():
    doc # namedtuple<doc_id, repo, path, func_name, code, language>
You can find more details about the Python API here.
ir_datasets export codesearchnet/valid docs
[doc_id]    [repo]    [path]    [func_name]    [code]    [language]
...
Relevance levels
| Rel. | Definition | 
|---|---|
| 1 | Matches docstring | 
Examples:
import ir_datasets
dataset = ir_datasets.load("codesearchnet/valid")
for qrel in dataset.qrels_iter():
    qrel # namedtuple<query_id, doc_id, relevance, iteration>
You can find more details about the Python API here.
ir_datasets export codesearchnet/valid qrels --format tsv
[query_id]    [doc_id]    [relevance]    [iteration]
...