Dataset of SSIOE (Self-Supervised Indoor Occupancy Estimation for Intelligent Building Management)
- Submitted by: chingchun huang
- Last updated: Sun, 05/08/2022 - 23:19
- DOI: 10.21227/j35y-wz56
Abstract
Our data set has 5136 records collected over 214 days. The sensors are sampled once per hour. Each record includes the number of vehicles entering and leaving the parking lot during that hour, the CO2 concentration on every building floor at the recording time, and the power consumption of each floor during that hour.
Our data was collected in a regular office building equipped with a building automation system (BAS) and various environmental sensors. The building belongs to a high-technology company and is subject to strict security regulations: employees are prohibited from carrying personal mobile devices, and the identity records of employees entering and leaving the office are confidential and cannot be used in our research. The building has eight floors, five of which are offices; the other floors house the parking lot, lobby, restaurant, meeting rooms, etc. The area of a single floor is about 37,500 square meters.
Typically, employees start to arrive at 7:00, and the occupancy approaches its peak around 9:00. Employees start to leave at 17:00, and the occupancy reaches its minimum around 23:00. In addition, the company occasionally has special holidays; on these days, more than half of the employees in some departments are off, and the remaining employees may leave work early.
We planned to estimate both the occupancy of the entire building and the occupancy of each office based on the BAS sensor data, including CO2 concentration, power consumption, and the hourly numbers of cars entering and leaving the parking lot. To estimate the occupancy of the entire building, we average the CO2 and power values over all available floors to represent the sensing features of the entire building. To evaluate the correctness of our predictions, the company additionally provided records of employees entering and leaving the company. We accumulate the recorded numbers by hour to obtain the occupancy of the entire building (i.e., the number of occupants in the building within an hour), normalize them to the range [0, 1] by min-max scaling, and treat the normalized data as the ground-truth occupancy ratio.
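The two preprocessing steps described above, averaging a sensor reading over floors and min-max scaling the hourly occupancy, can be sketched as follows. This is only an illustrative sketch: the function names and all sample values are hypothetical and do not come from the data set, and the handling of a constant series is an assumption.

```python
from statistics import mean


def building_feature(per_floor_values):
    """Average one sensor reading (e.g. CO2 or power) over the floors."""
    return mean(per_floor_values)


def min_max_scale(values):
    """Scale a sequence into [0, 1] using its own minimum and maximum."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant series is mapped to all zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical hourly occupant counts for the entire building.
hourly_occupancy = [20, 150, 480, 500, 260, 20]
occupancy_ratio = min_max_scale(hourly_occupancy)

# Hypothetical CO2 readings (ppm) on three floors at one recording time.
co2_by_floor = [420.0, 510.0, 630.0]
building_co2 = building_feature(co2_by_floor)
```

With this scaling, the hour with the fewest occupants maps to 0 and the busiest hour maps to 1, so the ground truth is a ratio relative to the observed range rather than an absolute head count.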
In total, the 5136 records are divided into a training set and a testing set: the training set contains 1344 records (i.e., 8 weeks), and the testing set contains 3792 records (i.e., 22 weeks and 4 days).
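The record counts follow directly from the hourly sampling, as the short check below shows. The split function is a sketch under an assumption: the description does not state how records are assigned to each set, so taking the first 8 weeks as the training set is a guess, and `split_records` is a hypothetical helper.

```python
HOURS_PER_DAY = 24
TOTAL_DAYS = 214
TRAIN_WEEKS = 8

total_records = TOTAL_DAYS * HOURS_PER_DAY      # one record per hour
train_size = TRAIN_WEEKS * 7 * HOURS_PER_DAY    # 8 full weeks of hourly records
test_size = total_records - train_size


def split_records(records, n_train=train_size):
    """Assumed chronological split: first n_train records train, rest test."""
    return records[:n_train], records[n_train:]


# Stand-in records; the real ones hold car counts, CO2, and power values.
records = list(range(total_records))
train, test = split_records(records)

# Express the testing span as whole weeks plus leftover days.
test_weeks, test_days = divmod(len(test) // HOURS_PER_DAY, 7)
```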