BlueTempNet: A Temporal Multi-network Dataset of Social Interactions in Bluesky Social
- Submitted by: Ujun Jeong
- Last updated: Tue, 10/01/2024 - 23:40
- DOI: 10.21227/yrsy-ee91
Abstract
Decentralized social media platforms like Bluesky Social (Bluesky) have made it possible to publicly disclose some user behaviors with millisecond-level precision. Embracing Bluesky's open-source and open-data principles, we present the first collection of the temporal dynamics of user-driven social interactions. BlueTempNet integrates multiple types of networks into a single multi-network, including user-to-user interactions (following and blocking users) and user-to-community interactions (creating and joining communities). Communities are user-formed groups in custom Feeds, where users subscribe to posts aligned with their interests. Following Bluesky's public data policy, we collect existing Bluesky Feeds, including the users who liked and generated these Feeds, and provide tools to gather users' social interactions within a date range. This data-collection strategy captures past user behaviors and supports future collection of user behavior data.
Our networks are saved in GEXF format as follows:
• graph_dimension1.gexf: Feed member interaction network, saved as a DiGraph object. Each node is a Feed member, and each edge has the attributes sign and time.
• graph_dimension2.gexf: Feed creator interaction network, saved as a DiGraph object. Each node is a Feed creator, and each edge has the attributes sign and time.
• graph_dimension3.gexf: Community interaction network, saved as a Graph object. Each node has a node attribute whose value is member, creator, or feed. Each edge has an edge attribute, either join or create, along with a time attribute.
• multi_graph.gexf: A MultiGraph object that integrates the three network dimensions. For ease of use, all undirected edges in the multigraph have been converted to bidirectional edges.
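As a minimal sketch of how the edge attributes in these files might be inspected, the snippet below reads the edges of a GEXF document using only the Python standard library. The DiGraph, Graph, and MultiGraph names suggest the files were written with networkx, in which case `networkx.read_gexf` is likely the more direct route. The helper `read_gexf_edges` and the inline sample are hypothetical: the sample only mimics the sign and time attributes described above, and its values are placeholders, since the actual encoding of sign and time is not specified here.

```python
import io
import xml.etree.ElementTree as ET


def _local(tag):
    """Drop the XML namespace so the code does not depend on the GEXF version."""
    return tag.rsplit("}", 1)[-1]


def read_gexf_edges(source):
    """Return (source, target, attrs) tuples; attribute values stay strings."""
    root = ET.parse(source).getroot()

    # First pass: map edge attribute ids to their titles (e.g. "sign", "time").
    titles = {}
    for elem in root.iter():
        if _local(elem.tag) == "attributes" and elem.get("class") == "edge":
            for attr in elem:
                titles[attr.get("id")] = attr.get("title")

    # Second pass: collect every edge together with its attribute values.
    edges = []
    for elem in root.iter():
        if _local(elem.tag) != "edge":
            continue
        attrs = {}
        for child in elem.iter():
            if _local(child.tag) == "attvalue":
                key = titles.get(child.get("for"), child.get("for"))
                attrs[key] = child.get("value")
        edges.append((elem.get("source"), elem.get("target"), attrs))
    return edges


# Made-up two-node sample for illustration only; not taken from the dataset.
SAMPLE = """<?xml version="1.0" encoding="utf-8"?>
<gexf xmlns="http://www.gexf.net/1.2draft" version="1.2">
  <graph defaultedgetype="directed">
    <attributes class="edge" mode="static">
      <attribute id="0" title="sign" type="string"/>
      <attribute id="1" title="time" type="string"/>
    </attributes>
    <nodes>
      <node id="0" label="0"/>
      <node id="1" label="1"/>
    </nodes>
    <edges>
      <edge id="0" source="0" target="1">
        <attvalues>
          <attvalue for="0" value="placeholder_sign"/>
          <attvalue for="1" value="placeholder_time"/>
        </attvalues>
      </edge>
    </edges>
  </graph>
</gexf>
"""

edges = read_gexf_edges(io.BytesIO(SAMPLE.encode("utf-8")))
for source, target, attrs in edges:
    print(source, target, attrs)
```

To try this on the real data, pass a path such as graph_dimension1.gexf in place of the in-memory sample. For files of this size, a streaming parser or networkx may be more practical than loading the whole XML tree.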
Metadata is saved in CSV as follows:
• user_metadata.csv
- Node Index (consistent across all GEXF files)
- Anonymized ID (decoded after ID request review)
- Number of Followers
- Number of Following
- Number of Posts
• feed_metadata.csv
- Node Index (consistent across all GEXF files)
- Feed URI (a unique identifier for profiles specific to the Bluesky Feed)
- Display Name of Feed
- Description of Feed
- Creator of Feed (given as Anonymized ID)
- Number of Likes on Feed
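Because the Node Index is consistent across all GEXF files, the metadata can serve as a lookup table for graph nodes. The sketch below shows one way to build such a lookup with the standard `csv` module. The header names (`node_index`, `anonymized_id`, and so on) and the sample rows are assumptions made for illustration; the actual column headers in user_metadata.csv may differ and should be checked first.

```python
import csv
import io

# Hypothetical header name; check the first line of user_metadata.csv.
INDEX_COLUMN = "node_index"

# Made-up rows for illustration only; not taken from the dataset.
SAMPLE_CSV = """node_index,anonymized_id,followers,following,posts
0,user_a,10,5,42
1,user_b,3,7,0
"""


def load_metadata(handle, index_column=INDEX_COLUMN):
    """Map each node index (kept as a string) to its metadata row."""
    return {row[index_column]: row for row in csv.DictReader(handle)}


users = load_metadata(io.StringIO(SAMPLE_CSV))

# Look up the metadata of the node with index "1", as it would appear
# as a node id in the GEXF files.
print(users["1"]["followers"])
```

With the real file, `open("user_metadata.csv", newline="", encoding="utf-8")` would replace the in-memory sample. The same pattern should apply to feed_metadata.csv.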
The scripts in scripts_to_share.zip collect data from Bluesky Social.
Dataset Files
- metadata_to_share.zip (23.49 MB)
- networks_to_share.zip (211.05 MB)
- scripts_to_share.zip (4.04 kB)