Metrics for Recommending Corrective and Preventive Actions (CAPAs) in Software Development Projects: a Systematic Literature Review

Citation Author(s):
Huawei, Moscow, Russia
Innopolis University, Innopolis, Russia
Innopolis University, Innopolis, Russia
Innopolis University, Innopolis, Russia
Innopolis University, Innopolis, Russia
Innopolis University, Innopolis, Russia
Innopolis University, Innopolis, Russia
University of Alberta, Edmonton, Canada
Innopolis University, Innopolis, Russia
Submitted by:
Firas Jolha
Last updated:
Sat, 03/26/2022 - 13:22


Several works have attempted to establish procedures for identifying bugs, defects, or anomalies during the different phases of software development, especially the implementation phase. Merely detecting anomalies is not sufficient, however; they also need to be fixed. Corrective actions can be formulated to remove anomalies and enhance software quality. Preventive actions are equally important inasmuch as they avoid the emergence and recurrence of anomalies in the future. To know whether an anomaly exists in a given piece of software, one must measure the related software quality attributes using specific software metrics. The main aim of this work was to find out and explain how to meaningfully relate software metrics to useful corrective and preventive actions.

However, to determine proper actions for specific contexts, one needs to know more about the anomalies and their root causes. In this study, we collected three kinds of data (metrics, anomalies, actions), which helped us identify the dimensions of the problem. We found 384 software metrics, which are used to detect 374 anomalies related to 494 corrective and preventive actions. Our findings demonstrate the need to formulate remedial strategies and build tools to automate the process of determining actions from abnormal metric values.


This dataset is part of the systematic literature review (SLR) entitled “Metrics for Recommending Corrective and Preventive Actions (CAPAs) in Software Development Projects: a Systematic Literature Review”. The dataset represents the data collected from the studies we included in our final reading log. Specifically, we collected three types of relevant data: metrics, anomalies, and actions. To capture the relevance and significance of the data for our research, we also specified several attributes for each data type. The attributes we collected are as follows:

Each primary study is represented by its ID.
Study title.
Year of Publication
The year in which the study was published.
How many people participated in the study.
Raw metrics
The raw metric characterising the study.
The metric/s under which we grouped similar raw metrics. This information can be found in the data file attached to the SLR.
Metrics formulas
The mathematical formula describing the metric.
Metric descriptions
The definition and description of the metric used.
Metric notes
Additional information about the symbols used in the formula or terms used in the description.
Possible Metric values
The range of values that the metric can have or the range of metric measurements.
Metric context
The contextual usage of the metric in the study.
Is the metric used for detecting Anomalies?
‘Yes’ if the metric is used for detecting anomalies, otherwise ‘No’. Note that some papers state that specific metrics were used for detecting anomalies without identifying those anomalies. We set ‘Yes’ for such data samples, but the next attribute, ‘Anomaly’, is left unset.
Anomaly
The anomaly obtained from the textual analysis of the study included in the reading log.
Anomaly Descriptions
Additional descriptive information about the anomaly in question.
Root causes
The root causes of the encountered anomaly/ies or of those triggered by actions.
Metric threshold
The value of the metric or its category (low or high), which triggered the relevant anomaly/action.
Is the anomaly handled?
‘Yes’ if the obtained anomaly is fixed, otherwise ‘No’. Note that some papers state that specific anomalies were fixed without identifying the corrective actions. We set ‘Yes’ for such data samples, but the next attribute, ‘Suggested Actions’, is left unset.
Suggested Actions
The action gathered from the study included in the reading log.
Action Category
This attribute specifies the type of the action obtained. The possible values for this attribute are ‘Corrective’, ‘Preventive’ and ‘Enhancement’. ‘Corrective’ actions represent the actions applied to fix or correct the anomaly, while ‘Preventive’ actions are used to avoid the recurrence of the anomalies in the future. ‘Enhancement’ actions are actions adopted to increase software quality.
Action Sources
This attribute captures the origin or source of the action obtained. The possible values for this attribute are ‘Experimental’, ‘Observational’ and ‘Industrial’. ‘Observational’ actions originated from authors’ observations, ‘Experimental’ actions were developed based on experiments conducted in labs whereas ‘Industrial’ actions were suggested based on experiments conducted by companies or in an industrial context.
Handling approaches
The approaches that are suggested and/or used for fixing the anomalies encountered.
Used ML methods
The machine learning algorithms used for detecting the anomalies.
Presence of action impact
‘Yes’ if the actions’ impact was measured and reported in the study, otherwise ‘No’.
Action impact metrics
The metric/s used for measuring actions’ impact. This captures how much improvement is perceived after applying the desired action.
Percentages of actions’ impact
The relative change in the impact metric’s measurements before and after implementing the suggested action.
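To make the attribute list above more concrete, here is a minimal Python sketch of how one row of the dataset might be represented and queried. It is an illustration under stated assumptions, not the dataset's actual schema: the class name `CapaRecord`, its field names, and the helper functions are hypothetical, and only a subset of the attributes is modelled. The sample values in the usage note are invented and are not taken from the dataset. The percentage formula is likewise an assumption, since the description only says "relative change".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CapaRecord:
    """One dataset row. Field names are illustrative, not the file's real column headers."""
    study_id: str
    raw_metric: str
    detects_anomaly: bool                    # "Is the metric used for detecting Anomalies?"
    anomaly: Optional[str] = None            # may be unset even when detects_anomaly is True
    metric_threshold: Optional[str] = None   # a value or a category such as 'low' / 'high'
    anomaly_handled: bool = False            # "Is the anomaly handled?"
    suggested_action: Optional[str] = None   # may be unset even when anomaly_handled is True
    action_category: Optional[str] = None    # 'Corrective', 'Preventive' or 'Enhancement'


def actions_for_metric(records: List[CapaRecord], raw_metric: str,
                       category: Optional[str] = None) -> List[str]:
    """Collect the suggested actions recorded for a metric, optionally filtered by category."""
    return [
        r.suggested_action
        for r in records
        if r.raw_metric == raw_metric
        and r.suggested_action is not None
        and (category is None or r.action_category == category)
    ]


def relative_change_percent(before: float, after: float) -> float:
    """Relative change of an impact metric in percent (assumed formula, see lead-in)."""
    if before == 0:
        raise ValueError("relative change is undefined when the 'before' value is 0")
    return (after - before) / abs(before) * 100.0
```

As a usage example with invented data: given a record for the metric "cyclomatic complexity" whose action "extract method" is categorised as ‘Corrective’, `actions_for_metric(records, "cyclomatic complexity", "Corrective")` returns a list containing that action, and `relative_change_percent(20, 15)` expresses a drop of an impact metric from 20 to 15 as a percentage. Rows where a ‘Yes’ flag is set but the following attribute is missing are handled by keeping the optional fields as `None`.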
Funding Agency: 
Huawei Technologies Co., Ltd.