Q&A Text in Chinese Online Medical Community

Citation Author(s):
Yushan Deng
Man Li
Submitted by:
Man Li
Last updated:
Mon, 07/08/2024 - 15:59
DOI:
10.21227/sr2c-k812

Abstract 

Using Python, we crawled a total of 18,793 diabetes-related Q&A records posted between Jun. 1, 2016 and Sept. 1, 2020 on xywy.com, a well-known Chinese online medical community. Each record contains four fields from the question detail page (Title, Problem Description, User ID, and Question Time) and three fields from the doctor's answer page (Doctor ID, Answer Content, and Answer Time). After preprocessing, including cleaning and deduplication, we obtained 18,521 valid records. Because the Problem Description provides the background for the doctor's answer, we merge it into the Answer Content; the combined text is later used as the input text for knowledge graph construction.
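The record layout and preprocessing described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the field names, the cleaning rule (dropping records with an empty answer), the deduplication key (exact match on title, description, and answer after trimming whitespace), and the newline used to join the two texts are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QARecord:
    # Hypothetical field names mirroring the seven parts listed in the abstract.
    title: str
    problem_description: str
    user_id: str
    question_time: str
    doctor_id: str
    answer_content: str
    answer_time: str


def clean_and_deduplicate(records):
    """Drop records with an empty answer and exact duplicates (assumed criteria)."""
    seen = set()
    valid = []
    for record in records:
        # Assumed duplicate key: trimmed title, description, and answer text.
        key = (
            record.title.strip(),
            record.problem_description.strip(),
            record.answer_content.strip(),
        )
        if not key[2] or key in seen:
            continue
        seen.add(key)
        valid.append(record)
    return valid


def build_graph_text(record):
    """Put the Problem Description before the Answer Content as background."""
    description = record.problem_description.strip()
    answer = record.answer_content.strip()
    return f"{description}\n{answer}"
```

Calling `clean_and_deduplicate` on the crawled records and then `build_graph_text` on each survivor would yield one text per Q&A pair, in the spirit of the combined text the abstract describes.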
