Online Learning Global Queries Dataset: A Comprehensive Dataset of What People from Different Countries ask Google about Online Learning

Citation Author(s):
Nirmalya Thakur, Department of Electrical Engineering and Computer Science, University of Cincinnati
Isabella Hall, Department of Electrical Engineering and Computer Science, University of Cincinnati
Chia Y. Han, Department of Electrical Engineering and Computer Science, University of Cincinnati
Submitted by:
Isabella Hall
Last updated:
Mon, 11/01/2021 - 18:43
DOI:
10.21227/xbvs-0198

Any work using this dataset should cite the following paper:

Nirmalya Thakur, Isabella Hall, and Chia Y. Han, “Investigating the Emergence of Online Learning in Different Countries using the 5 W’s and 1 H Approach”, Proceedings of the 7th International Conference on Human Interaction & Emerging Technologies: Artificial Intelligence & Future Applications (IHIET-AI 2022), Lausanne, Switzerland, April 21-23, 2022.

Abstract

The rise of the Internet of Everything lifestyle over the last decade has significantly increased the emergence and adoption of online learning and education in almost all countries. The COVID-19 pandemic, which caused the academic, non-academic, government, and corporate sectors to switch to e-learning, has acted as a catalyst for the growth of the online learning and education sector, which is growing at an unprecedented rate and is expected to reach USD 1 trillion by 2027. As E-learning 3.0 becomes the norm in different regions of the world, users of online learning technologies, such as educators, students, and educational institutions, are spending more time than ever before on the internet to familiarize themselves with emerging online learning technologies. This is generating enormous amounts of Big Data centered on online learning in all the search engines used across the world. The effect is most pronounced on Google, as Google is the most popular search engine in almost all geographic regions. Mining, studying, interpreting, and analyzing such web behavior data from Google, especially the specific queries originating from different geographic regions, holds the potential for a wide range of research tasks related to investigating the emergence of online learning in different countries. These research tasks could include user research, user behavior analysis, topic modeling, sentiment analysis, and aspect-based sentiment analysis, to name a few.

To support such research, this work presents a comprehensive dataset of all the queries related to online learning that were searched on Google by individuals from different countries of the world. Because the adoption of E-learning 3.0 is a crucial aspect of a country's economic growth and development, this dataset presents web behavior data, in the form of Google search queries related to online learning, that originated from all 38 member states of the Organization for Economic Co-operation and Development (OECD). These member states are Austria, Australia, Belgium, Canada, Chile, Colombia, Costa Rica, Czech Republic, Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Iceland, Ireland, Israel, Italy, Japan, Korea, Latvia, Lithuania, Luxembourg, Mexico, the Netherlands, New Zealand, Norway, Poland, Portugal, Slovak Republic, Slovenia, Spain, Sweden, Switzerland, Turkey, the United Kingdom, and the United States.

Data Description

The dataset consists of one MS Excel workbook named “Online_Learning_Global_Queries.xlsx”. This workbook has 38 sheets, each named after the country (an OECD member state) whose data it represents. The data was collected on November 1, 2021. Each sheet in this workbook has the following attributes:

· Modifier Type: It lists the type of query. The categories include questions, prepositions, comparisons, etc.

· Modifier: It lists the specific modifiers used to communicate the query to Google. The categories include the popular 5 W’s and 1 H (Who, What, When, Where, Why, and How), as well as other modifiers.

· Suggestion: It lists the specific query that consists of the specific modifier and represents the associated modifier type.

· Language: It represents the language of the query. The most common value is “en” which stands for English.

· Region: It represents the two-letter country code in ISO 3166-1 alpha-2 format.

· Keyword: It represents the search keyword that is being analyzed. The keyword for all the countries is “online learning”.
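
The workbook layout described above can be explored with a short script. The following is a minimal Python sketch, not part of the dataset or the paper. It assumes that the column headers in each sheet match the attribute names listed above exactly, and that the 5 W’s and 1 H modifiers are stored as single words in any letter case. The function names (load_sheets, missing_columns, count_5w1h) are hypothetical helpers introduced only for illustration. Reading the .xlsx file relies on the third-party pandas and openpyxl packages.

```python
from collections import Counter

# Assumption: sheet headers match the attribute names in the data description.
EXPECTED_COLUMNS = [
    "Modifier Type", "Modifier", "Suggestion", "Language", "Region", "Keyword",
]
FIVE_W_ONE_H = {"who", "what", "when", "where", "why", "how"}


def load_sheets(path="Online_Learning_Global_Queries.xlsx"):
    """Read every sheet into a list of row dicts, keyed by sheet (country) name."""
    import pandas as pd  # third-party; reading .xlsx also needs openpyxl

    sheets = pd.read_excel(path, sheet_name=None)  # None loads all sheets as a dict
    return {name: frame.to_dict("records") for name, frame in sheets.items()}


def missing_columns(rows):
    """Return the expected columns that are absent from the first row."""
    if not rows:
        return list(EXPECTED_COLUMNS)
    return [col for col in EXPECTED_COLUMNS if col not in rows[0]]


def count_5w1h(rows):
    """Count rows whose Modifier is one of the 5 W's and 1 H (case-insensitive)."""
    counts = Counter()
    for row in rows:
        modifier = str(row.get("Modifier", "")).strip().lower()
        if modifier in FIVE_W_ONE_H:
            counts[modifier] += 1
    return counts


# Example use, once the workbook has been downloaded next to this script:
#   sheets = load_sheets()
#   for country, rows in sheets.items():
#       print(country, missing_columns(rows), dict(count_5w1h(rows)))
```

Keeping the per-sheet rows as plain dictionaries means the two checking functions work without pandas, so they can also be applied to rows obtained by other means, such as a CSV export of a single sheet.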

Details on the methodology and procedure followed for the development of this dataset are included in the above-mentioned paper. For any questions related to this dataset or the paper, please contact Nirmalya Thakur at thakurna@mail.uc.edu.

Instructions: 

Refer to the abstract and data description