Twitter Big Data as a Resource for Exoskeleton Research: A Large-Scale Dataset of about 140,000 Tweets and 100 Research Questions

Citation Author(s):
Nirmalya Thakur, Department of Electrical Engineering and Computer Science, University of Cincinnati, Cincinnati, OH 45221-0030, USA
Isabella Hall, Department of Electrical Engineering and Computer Science, University of Cincinnati, Cincinnati, OH 45221-0030, USA
Chia Y Han, Department of Electrical Engineering and Computer Science, University of Cincinnati, Cincinnati, OH 45221-0030, USA
Submitted by: Nirmalya Thakur
Last updated: Sat, 10/22/2022 - 22:40
DOI: 10.21227/r5mv-ax79

Abstract 

Please cite the following paper when using this dataset:

N. Thakur, "Twitter Big Data as a Resource for Exoskeleton Research: A Large-Scale Dataset of about 140,000 Tweets from 2017–2022 and 100 Research Questions", Journal of Analytics, Volume 1, Issue 2, 2022, pp. 72-97, DOI: https://doi.org/10.3390/analytics1020007

The exoskeleton technology has been rapidly advancing in the recent past due to its multitude of applications and diverse use cases in assisted living, military, healthcare, firefighting, and industry 4.0. The exoskeleton market is projected to increase by multiple times its current value within the next two years. Therefore, it is crucial to study the degree and trends of user interest, views, opinions, perspectives, attitudes, acceptance, feedback, engagement, buying behavior, and satisfaction, towards exoskeletons, for which the availability of Big Data of conversations about exoskeletons is necessary. The Internet of Everything style of today’s living, characterized by people spending more time on the internet than ever before, with a specific focus on social media platforms, holds the potential for the development of such a dataset by the mining of relevant social media conversations. Twitter, one such social media platform, is highly popular amongst all age groups, where the topics found in the conversation paradigms include emerging technologies such as exoskeletons. To address this research challenge, this work makes two scientific contributions to this field. First, it presents an open-access dataset of about 140,000 Tweets about exoskeletons that were posted in a 5-year period from 21 May 2017 to 21 May 2022. Second, based on a comprehensive review of the recent works in the fields of Big Data, Natural Language Processing, Information Retrieval, Data Mining, Pattern Recognition, and Artificial Intelligence that may be applied to relevant Twitter data for advancing research, innovation, and discovery in the field of exoskeleton research, a total of 100 Research Questions are presented for researchers to study, analyze, evaluate, ideate, and investigate based on this dataset.

Instructions: 

This dataset contains about 140,000 Tweets related to exoskeletons that were mined over a 5-year period from May 21, 2017, to May 21, 2022. The tweets contain diverse forms of communications and conversations that convey user interests, user perspectives, public opinion, reviews, feedback, suggestions, etc., related to exoskeletons.

 

The dataset contains only tweet identifiers (Tweet IDs), in compliance with Twitter's terms and conditions, which permit the re-distribution of Twitter data only for research purposes. The Tweet IDs need to be hydrated before they can be used. Hydration is the process of retrieving a tweet's complete information (such as the text of the tweet, username, user ID, date and time, etc.) using its ID. To hydrate this dataset, the Hydrator application may be used (a download link and a step-by-step tutorial on how to use Hydrator are available for it).
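Before hydration, the Tweet IDs usually have to be read from the files and grouped into batches. The following is a minimal sketch of that preparation step only; it does not call Twitter or Hydrator. It assumes each .txt file holds one Tweet ID per line, which this page does not state explicitly. The function names `read_tweet_ids` and `batches` are hypothetical, and the default batch size of 100 is an assumption that should be checked against the limits of whichever hydration tool or API is used.

```python
from pathlib import Path
from typing import Iterable, Iterator, List


def read_tweet_ids(paths: Iterable[str]) -> List[str]:
    """Collect unique Tweet IDs from text files, keeping first-seen order.

    Assumption: one Tweet ID per line. Blank and non-numeric lines are skipped.
    """
    seen = set()
    ids: List[str] = []
    for path in paths:
        for line in Path(path).read_text(encoding="utf-8").splitlines():
            tweet_id = line.strip()
            if tweet_id.isdigit() and tweet_id not in seen:
                seen.add(tweet_id)
                ids.append(tweet_id)
    return ids


def batches(ids: List[str], size: int = 100) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` IDs (size is an assumed limit)."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]
```

Each batch can then be handed to the hydration tool of choice. IDs are kept as strings so that they pass through unchanged if they are later written to CSV or JSON.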

 

Data Description

This dataset consists of 7 .txt files. The following shows the number of Tweet IDs and the date range (of the associated tweets) in each of these files. 

Filename: Exoskeleton_TweetIDs_Set1.txt

Number of Tweet IDs: 22945, Date Range of Tweets: July 20, 2021 to May 21, 2022

Filename: Exoskeleton_TweetIDs_Set2.txt

Number of Tweet IDs: 19416, Date Range of Tweets: December 1, 2020 to July 19, 2021

Filename: Exoskeleton_TweetIDs_Set3.txt

Number of Tweet IDs: 16673, Date Range of Tweets: April 29, 2020 to November 30, 2020

Filename: Exoskeleton_TweetIDs_Set4.txt

Number of Tweet IDs: 16208, Date Range of Tweets: October 5, 2019 to April 28, 2020

Filename: Exoskeleton_TweetIDs_Set5.txt

Number of Tweet IDs: 17983, Date Range of Tweets: February 13, 2019 to October 4, 2019

Filename: Exoskeleton_TweetIDs_Set6.txt

Number of Tweet IDs: 34009, Date Range of Tweets: November 9, 2017 to February 12, 2019

Filename: Exoskeleton_TweetIDs_Set7.txt

Number of Tweet IDs: 11351, Date Range of Tweets: May 21, 2017 to November 8, 2017
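The per-file counts and date ranges above can be sanity-checked without hydrating anything, because Tweet IDs of this era are snowflake IDs that embed their creation time. The sketch below records the listed counts and ranges, decodes the timestamp from an ID, and reports which file a given ID would be expected to appear in. The helper names are hypothetical, and treating the listed dates as UTC calendar days is an assumption (the page does not state a time zone), so IDs near a range boundary may land in the neighboring file.

```python
from datetime import date, datetime, timezone
from typing import Optional

# Twitter's snowflake epoch in milliseconds (2010-11-04T01:42:54.657Z).
TWITTER_EPOCH_MS = 1288834974657

# (count, first day, last day) copied from the Data Description above.
FILES = {
    "Exoskeleton_TweetIDs_Set1.txt": (22945, date(2021, 7, 20), date(2022, 5, 21)),
    "Exoskeleton_TweetIDs_Set2.txt": (19416, date(2020, 12, 1), date(2021, 7, 19)),
    "Exoskeleton_TweetIDs_Set3.txt": (16673, date(2020, 4, 29), date(2020, 11, 30)),
    "Exoskeleton_TweetIDs_Set4.txt": (16208, date(2019, 10, 5), date(2020, 4, 28)),
    "Exoskeleton_TweetIDs_Set5.txt": (17983, date(2019, 2, 13), date(2019, 10, 4)),
    "Exoskeleton_TweetIDs_Set6.txt": (34009, date(2017, 11, 9), date(2019, 2, 12)),
    "Exoskeleton_TweetIDs_Set7.txt": (11351, date(2017, 5, 21), date(2017, 11, 8)),
}


def tweet_id_to_datetime(tweet_id: int) -> datetime:
    """Decode the creation time (UTC) stored in the upper bits of a snowflake ID."""
    ms = (tweet_id >> 22) + TWITTER_EPOCH_MS
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


def expected_file(tweet_id: int) -> Optional[str]:
    """Return the file whose listed date range covers this ID, or None."""
    day = tweet_id_to_datetime(tweet_id).date()  # assumption: ranges are UTC days
    for name, (_, start, end) in FILES.items():
        if start <= day <= end:
            return name
    return None


# Sum of the listed per-file counts; the page describes this as "about 140,000".
total_ids = sum(count for count, _, _ in FILES.values())
```

Summing the listed counts gives 138,585 Tweet IDs across the seven files, which is consistent with the "about 140,000" figure used throughout.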

 

For any questions related to the dataset, please contact Nirmalya Thakur at thakurna@mail.uc.edu
