Open Access
Reddit Submissions of Opioid Related Content in Philadelphia Oriented Subreddits
- Submitted by: Glenn Sterner
- Last updated: Thu, 05/21/2020 - 12:10
- DOI: 10.21227/tvn3-jy53
Abstract
Reddit is one of the largest social media websites in the world. It holds valuable data about its users and their perspectives, organized into virtual communities, or subreddits, based on common areas of interest. Substance use issues are particularly salient within this online community because of the growing opioid crisis in the United States and other countries. The Philadelphia, Pennsylvania, USA region is a particularly important location for understanding user perceptions of opioids because of the prevalence of overdose deaths in the region. To collect user sentiment on opioid use within this region, the researchers targeted subreddits related to Philadelphia. Using a predefined list of opioid-related keywords (included in the dataset), they iterated through each subreddit and retrieved every entry containing a keyword. The dataset comprises the submissions and comments that include the keywords. The data were collected directly from the Reddit API via the PRAW library for Python.
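The collection loop described above can be sketched roughly as follows. This is a minimal illustration, not the researchers' actual script: it assumes an already authenticated PRAW client is passed in as `reddit`, it uses PRAW's `subreddit.search` as one plausible way to find keyword matches, and it covers submissions only (comment retrieval is not shown). The function name and the output field names are hypothetical.

```python
def collect_keyword_submissions(reddit, subreddits, keywords, limit=None):
    """Return one dict per unique submission that matched any keyword.

    `reddit` is assumed to behave like an authenticated praw.Reddit
    instance. Only submissions are covered here; comments would need
    a separate pass.
    """
    seen = {}
    for name in subreddits:
        subreddit = reddit.subreddit(name)
        for keyword in keywords:
            # Assumption: keyword matches are found through Reddit search.
            for submission in subreddit.search(keyword, limit=limit):
                record = seen.get(submission.id)
                if record is None:
                    # First time this submission is seen: store its fields.
                    seen[submission.id] = {
                        "id": submission.id,
                        "subreddit": name,
                        "title": submission.title,
                        "text": submission.selftext,
                        "created_utc": submission.created_utc,
                        "keywords": [keyword],
                    }
                elif keyword not in record["keywords"]:
                    # Same submission matched another keyword.
                    record["keywords"].append(keyword)
    return list(seen.values())
```

Deduplicating on the submission ID keeps an entry that matches several keywords from appearing more than once.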
The download includes the dataset as a CSV file, a data dictionary for all variables (column key) as a text file, the keyword list used to query the Reddit API as a text file, and the list of targeted subreddits as a text file. The dataset comprises entries (submissions and comments) that matched a keyword query within the targeted subreddits. Each entry is designated as a submission or a comment, as described in the data dictionary: a submission is a first-order entry within a subreddit, and a comment is an entry posted in response to a submission or another comment. Rows include all matching entries within the targeted subreddits from January 1, 2005 to May 14, 2020.
There are 56,979 rows of data in the CSV file.
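A quick way to check the row count and the split between submissions and comments is sketched below. The column name `type` is a placeholder, since the actual column names are defined in data_dictionary.txt and are not listed on this page; substitute the real name before use.

```python
import csv
from collections import Counter


def summarize_entries(csv_lines, type_column="type"):
    """Count data rows and tally them by entry type.

    `csv_lines` is an open text file or any iterable of CSV lines.
    `type_column` is a hypothetical column name; check
    data_dictionary.txt for the real one.
    """
    total = 0
    by_type = Counter()
    for row in csv.DictReader(csv_lines):
        total += 1
        # Missing or empty values are grouped under "unknown".
        by_type[row.get(type_column) or "unknown"] += 1
    return total, by_type


# Usage (assumes reddit_dataset.csv is in the working directory):
# with open("reddit_dataset.csv", newline="", encoding="utf-8") as f:
#     total, by_type = summarize_entries(f)
```

The `csv` module is used rather than counting lines because post and comment bodies can contain line breaks inside quoted fields, so a raw line count would likely overstate the number of rows.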
Dataset Files
- Reddit Opioid Dataset: reddit_dataset.csv (24.80 MB)
- Data Dictionary: data_dictionary.txt (1.89 kB)
- List of Keywords Searched: keywords.txt (293 bytes)
- List of Subreddits Scraped: subreddits.txt (192 bytes)
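The two list files are small enough to read in one step. The layout of keywords.txt and subreddits.txt is not documented on this page, so the sketch below assumes entries are separated by commas, line breaks, or both; `parse_term_list` is a hypothetical helper name.

```python
import re


def parse_term_list(text):
    """Split a list file's contents into unique, lowercased terms.

    Assumption: terms are separated by commas and/or line breaks.
    Original order is preserved and repeated terms are dropped.
    """
    terms = []
    for piece in re.split(r"[,\n]", text):
        term = piece.strip().lower()
        if term and term not in terms:
            terms.append(term)
    return terms


# Usage (assumes keywords.txt is in the working directory):
# with open("keywords.txt", encoding="utf-8") as f:
#     keywords = parse_term_list(f.read())
```

Lowercasing is harmless for subreddit names, which Reddit treats as case-insensitive, and it makes keyword comparisons consistent.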