Search Interests related to Disease X originating from different Geographic Regions

Citation Author(s):
Nirmalya Thakur
Emory University
Kesha A. Patel
Emory University
I. Hall
University of Cincinnati
Yuvraj Nihal Duggal
Emory University
S. Cui
Emory University
Submitted by:
Nirmalya Thakur
Last updated:
Mon, 08/28/2023 - 01:12


Please cite the following paper when using this dataset:

N. Thakur, K. A. Patel, I. Hall, Y. N. Duggal, and S. Cui, “A Dataset of Search Interests related to Disease X originating from different Geographic Regions”, Preprints 2023, 2023081701, DOI:


Disease X is a placeholder name that the World Health Organization (WHO) adopted in February 2018 on its shortlist of blueprint priority diseases to represent a hypothetical, unknown pathogen that could cause a future epidemic [1]. The WHO adopted the placeholder name to ensure that its planning was sufficiently flexible to adapt to an unknown pathogen (e.g., broader vaccines and manufacturing facilities). Anthony Fauci, then Director of the US National Institute of Allergy and Infectious Diseases, stated that the concept of Disease X would encourage WHO projects to focus their research efforts on entire classes of viruses (e.g., flaviviruses) instead of just individual strains (e.g., Zika virus), thus improving the WHO's capability to respond to unforeseen strains [2,3].

Google Trends data has been of significant interest to healthcare researchers [4]. Recent studies have shown that Google Trends data can be analyzed to understand the web behavior of the general public in relation to outbreaks of different diseases, such as influenza [5], Lyme disease [6], various tropical diseases in India [7], syphilis [8], HIV [9], and Zika virus [10].

Therefore, this dataset presents the monthly search interests related to Disease X (as a topic) originating from 94 geographic regions from February 2018 to August 2023. These 94 regions were selected for data mining because each of them recorded a significant level of search interest related to Disease X (as a topic) during this timeframe. This dataset also complies with the FAIR (Findability, Accessibility, Interoperability, and Reusability) principles of scientific data management [11].


[1]“Prioritizing diseases for research and development in emergency contexts,” [Online]. Available: [Accessed: 18-Aug-2023].
[2]P. Daszak, “Opinion,” 27-Feb-2020. [Online]. Available: [Accessed: 18-Aug-2023].
[3]T. Barnes, “World Health Organisation fears new ‘Disease X’ could cause a global pandemic,” 11-Mar-2018. [Online]. Available: [Accessed: 18-Aug-2023].
[4]S. V. Nuti et al., “The use of Google trends in health care research: A systematic review,” PLoS One, vol. 9, no. 10, p. e109583, 2014.
[5]J. Ginsberg, M. H. Mohebbi, R. S. Patel, L. Brammer, M. S. Smolinski, and L. Brilliant, “Detecting influenza epidemics using search engine query data,” Nature, vol. 457, no. 7232, pp. 1012–1014, 2009.
[6]M. Kapitány-Fövény et al., “Can Google Trends data improve forecasting of Lyme disease incidence?,” Zoonoses Public Health, vol. 66, no. 1, pp. 101–107, 2019.
[7]M. Verma, K. Kishore, M. Kumar, A. R. Sondh, G. Aggarwal, and S. Kathirvel, “Google search trends predicting disease outbreaks: An analysis from India,” Healthc. Inform. Res., vol. 24, no. 4, p. 300, 2018.
[8]S. D. Young, E. A. Torrone, J. Urata, and S. O. Aral, “Using search engine data as a tool to predict syphilis,” Epidemiology, vol. 29, no. 4, pp. 574–578, 2018.
[9]S. D. Young and Q. Zhang, “Using search engine big data for predicting new HIV diagnoses,” PLoS One, vol. 13, no. 7, p. e0199527, 2018.
[10]S. Morsy et al., “Prediction of Zika-confirmed cases in Brazil and Colombia using Google Trends,” Epidemiol. Infect., vol. 146, no. 13, pp. 1625–1627, 2018.
[11]M. D. Wilkinson et al., “The FAIR Guiding Principles for scientific data management and stewardship,” Sci. Data, vol. 3, no. 1, pp. 1–9, 2016.


In the dataset file, the search interest related to Disease X (as a topic) for each region during the above-mentioned timeframe is presented in a separate sheet. Please review all the Excel sheets in this dataset file to obtain the search interests related to Disease X (as a topic) that emerged from all 94 regions during this timeframe.
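
Because each region sits in its own sheet, it can be convenient to read all sheets at once and flatten them into a single long table. The following is a minimal sketch of one way to do that in Python, not an official loader for this dataset. It assumes that the first two columns of every sheet hold the month and the search interest value, and that each sheet name identifies the region; neither detail is stated on this page, so please check the file first. The file name in the usage comment and all function names are hypothetical.

```python
def load_region_sheets(path):
    """Read every sheet of the workbook into {sheet_name: DataFrame}.

    pandas is imported lazily so the pure-Python helpers below work without it.
    Reading .xlsx files with pandas also requires the openpyxl package.
    """
    import pandas as pd

    # sheet_name=None asks pandas to return all sheets as a dict.
    return pd.read_excel(path, sheet_name=None)


def frames_to_rows(frames):
    """Convert {sheet: DataFrame} into {sheet: [(month, interest), ...]}.

    Assumption: the first two columns are the month and the interest value.
    """
    return {
        name: list(df.iloc[:, :2].itertuples(index=False, name=None))
        for name, df in frames.items()
    }


def to_long_records(sheets):
    """Flatten {region: [(month, interest), ...]} into rows sorted by month, region."""
    records = []
    for region, rows in sheets.items():
        for month, interest in rows:
            records.append({"region": region, "month": month, "interest": interest})
    return sorted(records, key=lambda r: (r["month"], r["region"]))


def peak_month_by_region(records):
    """Return {region: (month, interest)} for each region's highest value.

    Ties keep the earliest month, because records are visited in sorted order
    and a later value must be strictly greater to replace the current peak.
    """
    peaks = {}
    for rec in records:
        best = peaks.get(rec["region"])
        if best is None or rec["interest"] > best[1]:
            peaks[rec["region"]] = (rec["month"], rec["interest"])
    return peaks


# Hypothetical usage (the file name is a placeholder, not the real one):
# frames = load_region_sheets("disease_x_search_interests.xlsx")
# records = to_long_records(frames_to_rows(frames))
# peaks = peak_month_by_region(records)
```

Keeping the flattening and peak-finding steps independent of pandas makes them easy to check on small hand-written inputs before running them on all 94 sheets.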