BadSTR: Backdoor Attack on Scene Text Recognition in IoT

- Submitted by: Yi Hu
- Last updated: Thu, 04/17/2025 - 05:01
- DOI: 10.21227/e4dt-ng65
Abstract
Recent research has shown that non-sequential tasks based on deep neural networks (DNNs), such as image classification and object detection, are vulnerable to backdoor attacks that cause incorrect model predictions. Scene Text Recognition (STR), a crucial task in computer vision, is widely used in IoT applications such as intelligent transportation systems and intelligent surveillance, so strong security is needed to ensure that these systems recognize text accurately. However, there are currently no studies on backdoor attacks against STR. We make the first attempt to launch a backdoor attack on STR models and successfully embed a backdoor into them using a patch-based attack. The experimental results, however, indicate that this attack method lacks robustness. To explore a more robust backdoor attack on STR, we further propose BadSTR, a novel backdoor attack method better suited to STR that uses a character sequence in an image as the trigger. We perform extensive experiments on eight benchmark datasets to validate the feasibility of BadSTR. To evaluate the effectiveness and robustness of the proposed attack, we introduce the mean attack success rate and the variance of the attack success rate as metrics. The experimental results show that BadSTR achieves an attack success rate of over 80% in all model-dataset combinations and is more effective and robust than the patch-based backdoor attack.
The experimental results of BadSTR
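
The abstract names the mean and the variance of the attack success rate (ASR) as evaluation metrics but does not give their formulas. The sketch below shows one plausible way to compute them and is not taken from the BadSTR code. It assumes that ASR is the fraction of triggered images whose recognized text exactly matches the attacker's target string, and that the variance is the population variance over the per-run ASR values. The function names `attack_success_rate` and `summarize_asr` are hypothetical.

```python
from statistics import mean, pvariance


def attack_success_rate(predictions, targets):
    """ASR under the assumption stated above: the share of triggered
    samples whose predicted text equals the attacker's target text."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not predictions:
        raise ValueError("at least one triggered sample is required")
    hits = sum(pred == tgt for pred, tgt in zip(predictions, targets))
    return hits / len(predictions)


def summarize_asr(asr_values):
    """Return (mean ASR, variance of ASR) over several runs or settings.

    Population variance is an assumption; the abstract does not say
    whether population or sample variance is used.
    """
    return mean(asr_values), pvariance(asr_values)
```

For example, `attack_success_rate` could be called once per model-dataset combination and the resulting values passed to `summarize_asr`; under this reading, a lower variance would indicate a more robust attack.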