Multi-label Incremental Learning for State Prediction of Actuator Devices in Smart Homes

- Submitted by: Denis Boaventura
- Last updated: Wed, 03/26/2025 - 14:24
- DOI: 10.21227/twef-1987
Abstract
This ZIP file contains four distinct datasets; the first three were collected or simulated over periods of 13, 14, and 15 days.
- Dataset 1: This scenario involves a single resident who predominantly uses the bedroom. The room is equipped with lights, motion sensors, and electrical plugs, which monitor the use of devices such as the computer. The resident's daily activities include using the computer at different times in the morning, evening, and early morning, with the duration of these activities varying by day. In addition, he performs tasks such as brushing his teeth, preparing meals, drinking water, cleaning the room, and preparing breakfast. He spends much of the day outside the bedroom, returning in the evening to carry out activities there. On weekends, the time spent in the bedroom is longer, especially for using the computer. The routine is varied, focusing on computer use and simple household tasks, with periods of absence from the room throughout the day.
- Dataset 2: In this scenario, two residents share an apartment with six rooms, using various devices. The first resident's routine is centered on using the computer, both for work and leisure. He starts the day by waking up in Room 01, continues with activities in the bathroom and breakfast, and then works outside the house, returning for lunch and then leaving again for work. In the evening, he performs his hygiene activities, has dinner, and gets ready for bed. The second resident divides his time between listening to music, using the computer for office work, and doing household chores. His routine begins in Room 02, followed by using the bathroom and having breakfast. Throughout the day, he alternates between using the computer in the living room and in the bedroom, listening to music at specific times, and doing tasks such as washing clothes and cooking. On weekends, he usually goes out for long walks. (This dataset was generated by the Hestia smart home simulator: https://github.com/hestia-sim/HESTIA-Smart-Home-Simulator).
- Dataset 3: This scenario is centered on just one room of the house, where three residents have distinct routines and specific preferences when using the devices in the living room, which is the main living space. The living room is equipped with a TV, air conditioning, lamps, and a sound system, and each resident interacts with these devices in a personalized way. Resident A prefers to watch TV with the volume up and the room well lit, adjusting the light intensity and keeping the air conditioning at a lower temperature for his comfort. He alternates between watching TV and leaving the room briefly, with long periods of absence during the day. Resident B prefers a more moderate environment, keeping the air conditioning at a higher temperature and using automatic mode. He usually watches TV at a low volume and, at certain times, turns the TV off to listen to music at moderate volumes, adjusting the lights and the climate to create a relaxed environment. Resident C, in turn, prefers a more intense style of interaction with the devices. He keeps the air conditioning in "wind" mode and uses high volumes on both the TV and the sound system. His routine includes watching TV, studying, and taking short breaks outside the living room, but his preference for loud volumes is a defining characteristic. Each resident adjusts the living room environment according to their preferences, reflecting their individual habits and tastes. (This dataset was generated by the Hestia smart home simulator: https://github.com/hestia-sim/HESTIA-Smart-Home-Simulator).
- Dataset 4: This dataset was obtained from the Center of Advanced Studies in Adaptive Systems (CASAS) and contains sensor events collected in the WSU Smart Apartment Testbed from 2010 to 2012. The apartment housed two residents during this period, who went about their normal daily activities. The same two residents lived in the apartment throughout, though on weekdays between 9:00 AM and 5:00 PM they would occasionally leave the apartment. This dataset does not identify which residents were present in the apartment during collection. (original dataset: https://casas.wsu.edu/datasets/kyoto.zip).
Each provided dataset contains data collected or simulated in a home through a residential automation system, and is divided into three main files: tab_casa.csv, tab_grupos.csv, and tab_mensagens.csv.
The tab_casa.csv file is the core of the dataset. Each row represents a moment in time, with information captured by sensors and actuators in the home. The group column identifies the group present at that moment, while the userAction column indicates the user who performed the action at that time. Additionally, there are several columns prefixed with sensor- and actuator-, which provide sensor readings and actuator states, identified by hash codes. The timeStamp field records the exact moment of data collection. Finally, the file also includes derived temporal columns, such as DaySin, DayCos, dayWeek, dayWeekSin, dayWeekCos, dayMonthSin, and dayMonthCos, which are mathematical representations of the date and time intended to capture cyclical and seasonal patterns (such as daily, weekly, and monthly cycles).
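A common way to produce such cyclical columns is to map each period (day, week, month) onto the unit circle with sine/cosine pairs. The sketch below shows this convention; the dataset's exact formulas are not documented here, so the period definitions (86400-second day, 7-day week, 31-day month) are assumptions, and the timestamps are hypothetical examples.

```python
import numpy as np
import pandas as pd

# Hypothetical timestamps standing in for the timeStamp column of tab_casa.csv.
ts = pd.to_datetime(pd.Series(["2025-03-26 14:24:00", "2025-03-26 02:00:00"]))

# Encode each cycle as sin/cos of the fraction of the period elapsed,
# so midnight/Sunday/day-1 and the end of the cycle land at the same point.
day_frac = (ts.dt.hour * 3600 + ts.dt.minute * 60 + ts.dt.second) / 86400.0
features = pd.DataFrame({
    "DaySin": np.sin(2 * np.pi * day_frac),
    "DayCos": np.cos(2 * np.pi * day_frac),
    "dayWeek": ts.dt.dayofweek,                         # 0 = Monday
    "dayWeekSin": np.sin(2 * np.pi * ts.dt.dayofweek / 7),
    "dayWeekCos": np.cos(2 * np.pi * ts.dt.dayofweek / 7),
    "dayMonthSin": np.sin(2 * np.pi * (ts.dt.day - 1) / 31),
    "dayMonthCos": np.cos(2 * np.pi * (ts.dt.day - 1) / 31),
})
print(features)
```

The advantage of the sin/cos pair over a raw hour-of-day number is that 23:59 and 00:01 end up close together in feature space, which matters for models learning daily routines.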
The tab_grupos.csv file complements tab_casa.csv by identifying the groups mentioned in the group column of tab_casa.csv. It contains an index and a list of names, indicating the individuals associated with each group. Meanwhile, the tab_mensagens.csv file serves as a dictionary to interpret the sensors and actuators mentioned in the main file. It has an id column, which corresponds to the columns in tab_casa.csv, and a status column, which is a list of dictionaries. Each dictionary contains a code and value pair, describing the types of messages the device can send. For example, a sensor might send the message {'code': 'presence_state', 'value': 'ON'} indicating the detection of presence.
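Since the status column stores a list of dictionaries, it will typically be read from CSV as a string and must be parsed before use. The sketch below loads a tiny stand-in for tab_mensagens.csv (the device ids shown are hypothetical; real files use hash codes) and builds a lookup from device id to the message codes it can send, assuming the lists are written as Python literals.

```python
import ast
import io
import pandas as pd

# Stand-in for tab_mensagens.csv with hypothetical device ids.
csv_text = '''id,status
sensor-abc123,"[{'code': 'presence_state', 'value': 'ON'}, {'code': 'presence_state', 'value': 'OFF'}]"
actuator-def456,"[{'code': 'switch_led', 'value': True}]"
'''
messages = pd.read_csv(io.StringIO(csv_text))

# The status column arrives as a string; parse it safely into Python objects.
messages["status"] = messages["status"].apply(ast.literal_eval)

# Map each device id (matching a column name in tab_casa.csv)
# to the set of message codes it can emit.
codes_by_id = {
    row["id"]: sorted({msg["code"] for msg in row["status"]})
    for _, row in messages.iterrows()
}
print(codes_by_id)
```

`ast.literal_eval` is preferred over `eval` here because it only accepts Python literals, so a malformed or malicious status string cannot execute code. The same pattern applies to the name lists in tab_grupos.csv if they are stored as string-encoded lists.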