artificial intelligence; machine learning; natural language processing; sentiment analysis; named entity recognition; relation extraction
Using Python, we crawled a total of 18,793 diabetes-related Q&A records posted between Jun. 1, 2016 and Sept. 1, 2020 on a well-known Chinese online medical community. Each record contains four fields from the question detail page (Title, Problem Description, User ID, and Question Time) and three fields from the doctor’s answer page (Doctor ID, Answer Content, and Answer Time). After preprocessing steps such as cleaning and deduplication, we obtained 18,521 valid records.
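The record layout and the preprocessing step described above can be sketched in Python as follows. This is a minimal illustration, not the authors' pipeline: the class name `QARecord`, the field names, the helper `normalize`, and the specific cleaning rules (whitespace normalization, dropping records with an empty description or answer, exact-match deduplication) are all assumptions, since the description only states that cleaning and deduplication were applied.

```python
from dataclasses import dataclass, replace
from datetime import datetime


@dataclass(frozen=True)
class QARecord:
    """One crawled Q&A record (hypothetical field names)."""
    # Fields from the question detail page
    title: str
    problem_description: str
    user_id: str
    question_time: datetime
    # Fields from the doctor's answer page
    doctor_id: str
    answer_content: str
    answer_time: datetime


def normalize(text: str) -> str:
    """Collapse runs of whitespace and trim both ends."""
    return " ".join(text.split())


def clean_and_deduplicate(records):
    """Normalize text fields, drop empty records, and remove exact duplicates.

    The rules here are assumed for illustration; the source does not
    specify which cleaning rules were actually used.
    """
    seen = set()
    valid = []
    for record in records:
        cleaned = replace(
            record,
            title=normalize(record.title),
            problem_description=normalize(record.problem_description),
            answer_content=normalize(record.answer_content),
        )
        # Assumed cleaning rule: a record needs both a question and an answer.
        if not cleaned.problem_description or not cleaned.answer_content:
            continue
        # Assumed deduplication key: same asker, question, doctor, and answer.
        key = (
            cleaned.user_id,
            cleaned.problem_description,
            cleaned.doctor_id,
            cleaned.answer_content,
        )
        if key in seen:
            continue
        seen.add(key)
        valid.append(cleaned)
    return valid
```

Keying deduplication on the normalized text rather than the raw strings means that two copies of the same post differing only in spacing are treated as one record; a real pipeline might instead need fuzzier matching, which this sketch does not attempt.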
- Categories: