ir_datasets: NFCorpus (NutritionFacts)
"NFCorpus is a full-text English retrieval data set for Medical Information Retrieval. It contains a total of 3,244 natural language queries (written in non-technical English, harvested from the NutritionFacts.org site) with 169,756 automatically extracted relevance judgments for 9,964 medical documents (written in a complex terminology-heavy language), mostly from PubMed."
Official dev set. Queries include both a title and a combined "all" text field (titles, descriptions, topics, transcripts, and comments).
Official dev set, filtered to exclude queries from topic pages.
Official dev set, filtered to only include queries from video pages.
Official test set. Queries include both a title and a combined "all" text field (titles, descriptions, topics, transcripts, and comments).
Official test set, filtered to exclude queries from topic pages.
Official test set, filtered to only include queries from video pages.
Official train set. Queries include both a title and a combined "all" text field (titles, descriptions, topics, transcripts, and comments).
Official train set, filtered to exclude queries from topic pages.
Official train set, filtered to only include queries from video pages.
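The nine subsets above follow a regular split-by-filter pattern, which the sketch below enumerates and loads. The dataset IDs are not given in the text above; the `nfcorpus/<split>[/<filter>]` naming used here is an assumption based on how ir_datasets usually names its subsets, and `nfcorpus_dataset_ids` and `peek_queries` are hypothetical helper names introduced only for illustration.

```python
from itertools import islice

# Assumed ID scheme (not stated in the descriptions above):
#   nfcorpus/<split>           unfiltered split
#   nfcorpus/<split>/nontopic  excludes queries from topic pages
#   nfcorpus/<split>/video     only queries from video pages
SPLITS = ("dev", "test", "train")
FILTERS = ("", "nontopic", "video")  # "" means the unfiltered split


def nfcorpus_dataset_ids():
    """Return the nine assumed subset IDs, in the order listed above."""
    return [
        f"nfcorpus/{split}" + (f"/{flt}" if flt else "")
        for split in SPLITS
        for flt in FILTERS
    ]


def peek_queries(dataset_id, n=3):
    """Load one subset and return its first n query records.

    Needs the third-party ir_datasets package; the data is downloaded
    on first use, so this call requires network access.
    """
    import ir_datasets  # imported lazily so the ID helper works without it

    dataset = ir_datasets.load(dataset_id)
    return list(islice(dataset.queries_iter(), n))


if __name__ == "__main__":
    for dataset_id in nfcorpus_dataset_ids():
        print(dataset_id)
```

Calling `peek_queries("nfcorpus/dev")` would return the first few query records of the dev split if that ID is correct. The record fields may differ between the unfiltered and filtered subsets, so it is worth inspecting a record before relying on a particular attribute.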