NCBI

Citation Author(s):
Weixin Li
Submitted by:
Hong Wang
Last updated:
Sat, 04/15/2023 - 04:38
DOI:
10.21227/10jb-1t90

Abstract 

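The NCBI Disease corpus (Doğan et al., 2014) consists of 793 PubMed abstracts that are fully annotated for disease mentions at both the mention level and the concept level. The public release of the corpus contains 6,892 disease mentions, which are mapped to 790 unique disease concepts. Of these concepts, 88% link to a MeSH identifier, while the rest contain an OMIM identifier. 91% of the mentions relate to a single disease concept, while the remainder are described as a combination of concepts.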


Instructions: 

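The NCBI Disease corpus is distributed in a tab-separated standoff format with a standard split into training, development, and test subsets: http://www.ncbi.nlm.nih.gov/CBBresearch/Dogan/DISEASE/

We converted the corpus annotations to CoNLL format using the following tool: https://github.com/spyysalo/ncbi-disease

The number of annotations after conversion is 99.84% of the original number, and 99.81% of the strings in the original annotations match the converted data. The differences are mainly due to duplicate documents in the source data.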
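As an illustration of how CoNLL-format annotations of this kind might be consumed, the following Python sketch groups token lines into sentences and collects disease mentions. It rests on assumptions that this page does not confirm: one token per line with the tag in the last whitespace-separated column, blank lines between sentences, and BIO tags named `B-Disease` / `I-Disease` / `O`. The function names `read_conll` and `extract_mentions` and the sample text are invented for the example and are not taken from the corpus or the conversion tool.

```python
from typing import Iterable, List, Tuple

Sentence = List[Tuple[str, str]]  # (token, tag) pairs


def read_conll(lines: Iterable[str]) -> List[Sentence]:
    """Group 'token ... tag' lines into sentences separated by blank lines."""
    sentences: List[Sentence] = []
    current: Sentence = []
    for raw in lines:
        line = raw.strip()
        if not line:
            # A blank line closes the current sentence, if any.
            if current:
                sentences.append(current)
                current = []
            continue
        fields = line.split()
        # Assumption: first column is the token, last column is the tag.
        current.append((fields[0], fields[-1]))
    if current:
        sentences.append(current)
    return sentences


def extract_mentions(sentence: Sentence) -> List[str]:
    """Join contiguous B-/I- tagged tokens into mention strings."""
    mentions: List[str] = []
    buffer: List[str] = []
    for token, tag in sentence:
        if tag.startswith("B-"):
            if buffer:
                mentions.append(" ".join(buffer))
            buffer = [token]
        elif tag.startswith("I-") and buffer:
            buffer.append(token)
        else:
            # 'O' tags (and stray I- tags with no open mention) end a mention.
            if buffer:
                mentions.append(" ".join(buffer))
            buffer = []
    if buffer:
        mentions.append(" ".join(buffer))
    return mentions


# Invented sample in the assumed layout; not real corpus data.
sample = (
    "Patients\tO\nwith\tO\ncystic\tB-Disease\nfibrosis\tI-Disease\n"
    "or\tO\nasthma\tB-Disease\n.\tO\n\n"
    "No\tO\ndisease\tO\nhere\tO\n.\tO\n"
).splitlines()

sentences = read_conll(sample)
all_mentions = [extract_mentions(s) for s in sentences]
print(all_mentions)
```

Counting the mentions recovered this way across a converted file is one way to compare against the annotation totals quoted above, provided the actual column layout and tag names are checked against the converted data first.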


Dataset Files

    Files have not been uploaded for this dataset