SaPGAN

Citation Author(s):
Kunhao Li
Submitted by:
Kunhao Li
Last updated:
Mon, 02/24/2025 - 03:54
DOI:
10.21227/tk91-8x60
License:

Abstract 

With the rapid advancement of large language models (LLMs), Model-as-a-Service (MaaS) has emerged as a powerful paradigm, enabling providers to deliver pre-trained models, computational resources, and database management within a unified platform.

However, the MaaS pattern raises critical data security concerns, especially the risk of data leakage during transmission. Existing privacy-preserving fine-tuning approaches apply differential privacy (DP) by perturbing text embeddings before transmission. Nevertheless, these approaches rely on a single noise addition, referred to as "rigid perturbation". Such mechanisms often compromise semantic integrity, resulting in suboptimal fine-tuning performance.

To address this limitation, we propose SaPGAN, the first framework that leverages a sequence-to-sequence Generator with a transformer-based Discriminator for adaptive perturbation in LLM privacy preservation. Through adversarial training, the Generator produces perturbed texts that retain high semantic coherence with the original contents. A Sampler further optimizes privacy by selecting which tokens to replace, allowing the framework to balance privacy protection against semantic integrity. By applying such adaptive, semantically aware perturbation, SaPGAN strikes an effective balance between fine-tuning performance and privacy preservation. Experiments demonstrate substantial improvements on text classification and generation tasks, increasing empirical privacy by up to 129.31% at the highest utility accuracies and reducing perturbation time by up to 26.83%.
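To make the division of labor among the three components concrete, the toy sketch below mimics the Sampler/Generator/Discriminator roles with plain Python stand-ins. All names, the substitute lexicon, and the overlap-based similarity score are illustrative assumptions, not the SaPGAN implementation: in the actual framework the Generator is a trained sequence-to-sequence model and the Discriminator is a transformer scoring semantic coherence.

```python
import random

def sampler(tokens, budget, sensitive):
    # Toy Sampler: choose up to `budget` token positions to perturb,
    # preferring tokens flagged as privacy-sensitive (a stand-in for
    # the learned selection policy described in the abstract).
    ranked = sorted(range(len(tokens)),
                    key=lambda i: tokens[i] in sensitive, reverse=True)
    return set(ranked[:budget])

def generator(tokens, positions, substitutes, rng):
    # Toy Generator: replace each selected token with a semantically
    # related substitute drawn from a hypothetical lexicon; the real
    # Generator is a seq2seq model trained adversarially.
    out = list(tokens)
    for i in positions:
        out[i] = rng.choice(substitutes.get(tokens[i], [tokens[i]]))
    return out

def discriminator(original, perturbed):
    # Toy Discriminator: score semantic overlap between the original
    # and perturbed text; SaPGAN uses a transformer for this role.
    same = sum(a == b for a, b in zip(original, perturbed))
    return same / len(original)

rng = random.Random(0)
text = "alice visited berlin last week".split()
subs = {"alice": ["a_patient"], "berlin": ["a_city"]}  # hypothetical lexicon
sel = sampler(text, budget=2, sensitive={"alice", "berlin"})
priv = generator(text, sel, subs, rng)
score = discriminator(text, priv)  # high score = semantics retained
```

During adversarial training, the Generator's substitutions would be pushed to keep the Discriminator's coherence score high while the Sampler maximizes the privacy gained per replaced token.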

Instructions: 

Run main.py (e.g., `python main.py`).