Abstract

This paper presents an approach for enhancing vulnerability descriptions by predicting the phrase-based concepts they are missing. We combine Large Language Models (LLMs) with 1-hop relationship analysis to mitigate hallucination. The code processes Textual Vulnerability Descriptions (TVDs) and extracts security-related concepts such as vulnerability type, root cause, and impact. The method has three steps: generate candidate predictions, verify them against external knowledge sources, and filter out inaccurate results. By combining data preprocessing, evidence verification, and automated concept completion, the code aims to improve the coherence and reliability of vulnerability information. The goal is to give developers more accurate vulnerability insights, in support of software security and repair.
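
The generate, verify, and filter steps described above can be illustrated with a minimal sketch. It is not the code shipped in code.zip. Every name here (`TVD`, `ASPECTS`, `predict_candidates`, `is_supported`, `complete`, `EVIDENCE`) is hypothetical, the LLM call is replaced by a canned stub, and the external knowledge source is assumed to be a small in-memory graph in which a candidate counts as verified when it is exactly one hop away from a concept already present in the description.

```python
from dataclasses import dataclass, field

# Hypothetical aspect names, taken from the examples in the abstract.
ASPECTS = ("vulnerability_type", "root_cause", "impact")


@dataclass
class TVD:
    """A textual vulnerability description with the concepts found so far."""
    text: str
    concepts: dict = field(default_factory=dict)


def missing_aspects(tvd):
    """Return the aspects for which the description has no concept."""
    return [a for a in ASPECTS if not tvd.concepts.get(a)]


def predict_candidates(tvd, aspect):
    """Stand-in for an LLM call: returns canned candidate phrases.

    A real system would prompt a model with tvd.text here (assumption).
    """
    canned = {"impact": ["data exfiltration", "arbitrary code execution"]}
    return canned.get(aspect, [])


def is_supported(known_concepts, candidate, evidence):
    """1-hop check: the candidate must be a direct neighbour of a known concept."""
    neighbours = evidence.get(candidate, set())
    return any(concept in neighbours for concept in known_concepts)


def complete(tvd, evidence):
    """Fill each missing aspect with the first candidate that passes verification."""
    completed = dict(tvd.concepts)
    known = {c for c in tvd.concepts.values() if c}
    for aspect in missing_aspects(tvd):
        for candidate in predict_candidates(tvd, aspect):
            if is_supported(known, candidate, evidence):
                completed[aspect] = candidate
                break  # unsupported candidates are simply dropped
    return completed


# Toy evidence graph (assumption): concept -> directly related concepts.
EVIDENCE = {"arbitrary code execution": {"buffer overflow"}}

tvd = TVD(
    text="Buffer overflow in the parser due to a missing bounds check.",
    concepts={
        "vulnerability_type": "buffer overflow",
        "root_cause": "missing bounds check",
    },
)
result = complete(tvd, EVIDENCE)
print(result)
```

In this toy run the candidate "data exfiltration" has no edge in the evidence graph and is filtered out, while "arbitrary code execution" is linked to "buffer overflow" and is accepted as the impact. With an empty evidence graph, no prediction is accepted and the missing aspect stays empty, which is the conservative behaviour the filtering step is meant to provide.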

Instructions:

Dataset Files

code.zip (27.29 MB)

Datasets

Standard Dataset

complete concept
