Online Vertical Federated Learning Based Commercial Modular Aero-Propulsion System Simulation

Citation Author(s):
Heqiang Wang
Submitted by:
heqiang wang
Last updated:
Sun, 01/12/2025 - 21:28
DOI:
10.21227/zqt7-yw86
License:

Abstract 

With the continuous improvement in the computational capabilities of edge devices such as intelligent sensors in the Industrial Internet of Things, these sensors are no longer limited to data collection and are increasingly capable of performing complex computational tasks. This advancement provides both the motivation and the foundation for adopting distributed learning approaches. This study focuses on an industrial assembly line scenario where multiple sensors, distributed across various locations, sequentially collect real-time data characterized by distinct feature spaces. To leverage the computational potential of these sensors while addressing the communication overhead and privacy concerns inherent in centralized learning, we propose the Denoising and Adaptive Online Vertical Federated Learning (DAO-VFL) algorithm. Tailored to the industrial assembly line scenario, DAO-VFL manages continuous data streams and adapts to shifting learning objectives. It also addresses critical challenges prevalent in industrial environments, such as communication noise and heterogeneous sensor capabilities. To support the proposed algorithm, we provide a theoretical analysis that highlights the effects of noise reduction and adaptive local iteration decisions on the regret bound. Experimental results on two real-world datasets further demonstrate the superior performance of DAO-VFL compared to benchmark algorithms.

Instructions: 

The C-MAPSS (Commercial Modular Aero-Propulsion System Simulation) dataset (Saxena et al., 2008), created by NASA, is widely used in research on Remaining Useful Life (RUL) prediction, particularly for prognostics in aerospace engineering. The dataset models the degradation of aircraft turbofan engines under a range of operational and fault conditions. It contains four subsets that differ in the number of operating and fault conditions, and each subset is divided into a training set and a test set. For our experiments, we use the FD002 subset, consisting of 50,119 data samples for training and 30,365 data samples for testing. Each row in the dataset is a snapshot of a single operating cycle and contains 27 columns: column 1 is the engine ID, column 2 is the current operational cycle number, columns 3-5 are three operational settings that influence engine performance, columns 6-26 are readings from 21 sensors, and column 27 is the actual RUL. In this setup, we assume the data is collected across two sensors.
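
The column layout described above can be sketched in Python as follows. This is only an illustrative sketch, not the authors' preprocessing code: the function name vertical_split, the decision to treat the three operational settings together with the 21 sensor readings as the 24 input features, and the even, contiguous split of those features between the two sensors are all assumptions, since the page does not say how features are assigned to each sensor. The demo rows are synthetic placeholders, not values from the dataset.

```python
N_COLUMNS = 27                      # columns per row, as described above
FEATURE_START, FEATURE_END = 2, 26  # 0-based slice covering columns 3-26
RUL_INDEX = 26                      # 0-based index of column 27 (RUL)


def vertical_split(rows, n_parties=2):
    """Split each row's features into contiguous blocks, one per party.

    Assumption (not stated in the source): features are divided as evenly
    as possible, in column order, among the parties.
    Returns (blocks, labels), where blocks[i] holds party i's feature rows.
    """
    n_features = FEATURE_END - FEATURE_START
    base, extra = divmod(n_features, n_parties)
    # The first `extra` parties receive one additional feature column.
    sizes = [base + (1 if i < extra else 0) for i in range(n_parties)]

    blocks = [[] for _ in range(n_parties)]
    labels = []
    for row in rows:
        if len(row) != N_COLUMNS:
            raise ValueError(f"expected {N_COLUMNS} columns, got {len(row)}")
        features = row[FEATURE_START:FEATURE_END]
        start = 0
        for block, size in zip(blocks, sizes):
            block.append(list(features[start:start + size]))
            start += size
        labels.append(row[RUL_INDEX])
    return blocks, labels


# Synthetic placeholder rows: the value encodes (row, column) for easy checking.
demo = [[float(r * 100 + c) for c in range(N_COLUMNS)] for r in range(3)]
parts, rul = vertical_split(demo)
```

With two parties, each sensor in this sketch holds 12 of the 24 feature columns for every sample, while the RUL label is kept separately, which mirrors the vertical (feature-wise) partitioning that the setup assumes.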