Standard Dataset

Microcontroller Program Control Flow as Space-Filling Curves for Anomaly Detection

Citation Author(s):
Sebastian De La Cruz (Florida International University)
Submitted by:
Sebastian De La Cruz
Last updated:
DOI:
10.21227/tpbq-as56
Data Format:

Abstract

This dataset comprises a structured collection of control flow representations derived from microcontroller program execution traces, visualized as space-filling curves. The dataset is organized into eight folders, each containing 1,000 NumPy arrays representing individual image samples. These samples are grouped into four logical categories, each corresponding to a different abstraction level of program trace data: (1) complete execution traces, (2) function-call-only traces, (3) conditional-statement-only traces, and (4) scaled and truncated function-call traces. Within each group, data is further divided into two subsets: benign program behavior and anomalous behavior, enabling supervised learning or anomaly detection research. By embedding control flow information into structured visual formats, this dataset facilitates novel applications of image-based machine learning techniques for embedded systems security and program behavior modeling.
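To illustrate the general idea of laying a one-dimensional trace along a space-filling curve, here is a minimal sketch. The description above does not say which curve or value encoding was used to build the samples, so the Hilbert curve, the zero padding, and the names `hilbert_d2xy` and `trace_to_image` are all assumptions made for illustration, not the author's method.

```python
import numpy as np


def hilbert_d2xy(order, d):
    """Map index d along a Hilbert curve to (x, y) on a 2**order grid.

    Assumption: the Hilbert curve is used here only as a common example
    of a space-filling curve; the dataset may use a different one.
    """
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate/flip the quadrant so sub-curves join end to end.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def trace_to_image(values, order):
    """Place a 1-D sequence of values along the curve in a square image.

    Unused cells stay zero (hypothetical padding choice).
    """
    n = 1 << order
    values = np.asarray(values)
    if values.size > n * n:
        raise ValueError("trace is longer than the number of image cells")
    image = np.zeros((n, n), dtype=values.dtype)
    for d, v in enumerate(values):
        x, y = hilbert_d2xy(order, d)
        image[y, x] = v
    return image
```

Because consecutive curve indices land on neighboring cells, events that are close together in the trace stay close together in the image, which is the property that makes such layouts attractive for image-based models.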

Instructions:

The benign data is intended as training data for a machine learning model, and the anomalous data as testing data for the same model. Each folder name identifies the type of benign or anomalous data it contains. Do not mix benign data from one group with anomalous data from another group. The data consists of NumPy arrays, so the NumPy Python library is required to load and parse it.
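The loading steps can be sketched as follows. This is a minimal sketch under stated assumptions: it assumes each sample is stored as a separate `.npy` file and that all samples in a folder share one shape; the helper names `load_folder` and `load_group` are hypothetical, and the folder names passed in must be replaced with the actual names in the download.

```python
from pathlib import Path

import numpy as np


def load_folder(folder):
    """Load every .npy file in `folder` into one stacked array.

    Assumes one sample per .npy file and identical shapes across files.
    """
    paths = sorted(Path(folder).glob("*.npy"))
    if not paths:
        raise FileNotFoundError(f"no .npy files found in {folder}")
    return np.stack([np.load(p) for p in paths])


def load_group(root, benign_name, anomalous_name):
    """Return (train, test) arrays for ONE group.

    Benign samples become training data and anomalous samples become
    testing data; both folder names must belong to the same group.
    """
    root = Path(root)
    train = load_folder(root / benign_name)
    test = load_folder(root / anomalous_name)
    return train, test
```

A call might look like `train, test = load_group("dataset", "<benign folder>", "<anomalous folder>")`, where both placeholders stand for the two folders of a single group. Keeping the pairing inside one function makes it harder to combine benign data from one group with anomalous data from another by accident.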