Datasets
Standard Dataset
NCBI
- Citation Author(s):
- Submitted by:
- Hong Wang
- Last updated:
- Sat, 04/15/2023 - 04:38
- DOI:
- 10.21227/10jb-1t90
- License:
- Categories:
- Keywords:
Abstract
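The NCBI Disease corpus (Doğan et al., 2014) consists of 793 PubMed abstracts that are fully annotated for disease mentions at both the mention level and the concept level. The public release of the corpus contains 6,892 disease mentions, which are mapped to 790 unique disease concepts. Of these concepts, 88% link to a MeSH identifier, while the rest carry an OMIM identifier. 91% of the mentions map to a single disease concept, while the rest are described as a combination of concepts.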
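The NCBI Disease corpus is distributed in a TAB-separated standoff format with a standard split into training, development, and test subsets: http://www.ncbi.nlm.nih.gov/CBBresearch/Dogan/DISEASE/. We converted the corpus annotations into CoNLL format using the following tool: https://github.com/spyysalo/ncbi-disease. The number of annotations after conversion is 99.84% of the original count, and 99.81% of the strings in the original annotations match the converted data. The differences are mainly due to duplicated documents in the source data.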
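As an illustration of working with the converted data, here is a minimal Python sketch of a CoNLL reader. It assumes a common layout that this page does not itself specify: one token per line, whitespace- or TAB-separated columns with the token first and an IOB2 tag last (for example `B-Disease`, `I-Disease`, `O`), and a blank line between sentences. The function names `read_conll` and `extract_mentions` are hypothetical helpers, not part of the conversion tool; check the actual files before relying on these assumptions.

```python
from typing import Iterable, List, Tuple

Sentence = List[Tuple[str, str]]  # (token, tag) pairs


def read_conll(lines: Iterable[str]) -> List[Sentence]:
    """Group CoNLL-style lines into sentences of (token, tag) pairs.

    Assumed layout (not confirmed by the dataset page): token in the
    first column, tag in the last column, blank line between sentences.
    """
    sentences: List[Sentence] = []
    current: Sentence = []
    for raw in lines:
        line = raw.strip()
        if not line:
            # A blank line closes the current sentence, if any.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("-DOCSTART-"):
            # Optional document separator used by some CoNLL files.
            continue
        fields = line.split()
        current.append((fields[0], fields[-1]))
    if current:
        sentences.append(current)
    return sentences


def extract_mentions(sentence: Sentence) -> List[str]:
    """Collect mention strings from IOB2 tags (B- starts, I- continues)."""
    mentions: List[str] = []
    span: List[str] = []
    for token, tag in sentence:
        if tag.startswith("B-"):
            if span:
                mentions.append(" ".join(span))
            span = [token]
        elif tag.startswith("I-") and span:
            span.append(token)
        else:
            if span:
                mentions.append(" ".join(span))
            span = []
    if span:
        mentions.append(" ".join(span))
    return mentions
```

To use it on a file, pass an open file handle, for example `read_conll(open(path, encoding="utf-8"))`, where `path` points to one of the converted train, development, or test files.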