Lihua Wang and 5 more

Security vulnerabilities are constantly reported and must be accurately documented in vulnerability repositories. Each vulnerability description usually covers key aspects such as the vulnerable product, version, component, vulnerability type, root cause, impact, and attack vector. Understanding and managing these key aspects is crucial, but manually analyzing and integrating the growing number of vulnerabilities from heterogeneous databases is impractical, creating a need for automated solutions. This study investigates significant aspect-level discrepancies in vulnerability information across major vulnerability databases such as NVD, IBM X-Force, ExploitDB, and Openwall. It addresses two major challenges: improving the accuracy of extracting critical vulnerability aspects and distinguishing differences in these aspects across databases. The complexity of this task stems from the heterogeneous and often conflicting nature of the data sources, coupled with the lack of effective techniques for accurate aspect extraction and discrepancy resolution. Recent research has shown that advanced natural language processing techniques, particularly large language models (LLMs) such as GPT-3.5 and GPT-4, excel at handling detailed, context-rich textual data. Our approach leverages these LLMs to resolve aspect-level differences in vulnerability information across databases. Through rigorous testing on a variety of datasets, our approach not only substantially outperforms traditional models in extracting and distinguishing vulnerability aspects but also enhances the ability to manage and integrate threat intelligence data effectively.
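To illustrate the kind of pipeline the abstract describes, here is a minimal sketch of prompt-based aspect extraction, assuming the official OpenAI Python SDK; the aspect schema, prompt wording, and the extract_aspects helper are illustrative assumptions, not the authors' actual method.

```python
# Minimal sketch: prompt an LLM to extract key vulnerability aspects from a
# free-text description. Prompt wording and aspect schema are illustrative.
import json

from openai import OpenAI  # assumes the official OpenAI Python SDK

client = OpenAI()  # reads OPENAI_API_KEY from the environment

ASPECTS = ["product", "version", "component", "vulnerability_type",
           "root_cause", "impact", "attack_vector"]

def extract_aspects(description: str, model: str = "gpt-4") -> dict:
    """Ask the model for each aspect as a JSON field ("unknown" if absent)."""
    prompt = (
        "Extract the following aspects from the vulnerability description "
        f"below. Return a JSON object with exactly these keys: {', '.join(ASPECTS)}. "
        'Use "unknown" for any aspect that is not stated.\n\n'
        f"Description: {description}"
    )
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # favor deterministic extraction
    )
    return json.loads(resp.choices[0].message.content)

# Running the extractor on entries for the same CVE from NVD, IBM X-Force,
# ExploitDB, and Openwall and diffing the returned fields surfaces
# aspect-level discrepancies between the databases.
```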

Hassan Ali and 3 more

We have witnessed a continuing arms race between backdoor attacks and the corresponding defense strategies for Deep Neural Networks (DNNs). However, most state-of-the-art defenses rely on statistical sanitization of inputs or latent DNN representations to capture trojan behavior. In this paper, we first challenge the robustness of many recently reported defenses by introducing a novel variant of the targeted backdoor attack, called the low-confidence backdoor attack. The low-confidence attack inserts the backdoor by assigning uniformly distributed probabilistic labels to the poisoned training samples, and is applicable to many practical scenarios such as Federated Learning and model-reuse cases. We evaluate our attack against five state-of-the-art defense methods, viz., STRIP, Gradient-Shaping, Februus, ULP-defense, and ABS-defense, under the same threat model as assumed by the respective defenses, and achieve Attack Success Rates (ASRs) of 99%, 63.73%, 91.2%, 80%, and 100%, respectively. After carefully studying the properties of state-of-the-art attacks, including low-confidence attacks, we present HaS-Nets, a mechanism to securely train DNNs against a number of backdoor attacks under the data-collection scenario. For this purpose, we use a reasonably small healing dataset, approximately 2% to 15% of the size of the training data, to heal the network at each iteration. We evaluate our defense on different datasets (Fashion-MNIST, CIFAR-10, Celebrity Face, Consumer Complaint, and Urban Sound), network architectures (MLPs, 2D-CNNs, and 1D-CNNs), and attack configurations (standard backdoor attacks, invisible backdoor attacks, label-consistent attacks, and the all-trojan backdoor attack), including their low-confidence variants. Our experiments show that HaS-Nets can decrease ASRs from over 90% to below 15%, independent of the dataset, attack configuration, and network architecture.
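As a concrete illustration of the attack and defense described above, the following PyTorch sketch shows soft-label (low-confidence) poisoning and a simplified healing update; the trigger pattern, the target_mass probability split, and the stripped-down healing_step are our own illustrative assumptions, not the paper's exact procedure.

```python
# Sketch of the low-confidence poisoning idea: poisoned samples carry a
# trigger plus a *soft* target that spreads most probability mass uniformly,
# instead of a hard one-hot label on the target class.
import torch
import torch.nn.functional as F

def soft_poison_label(num_classes: int, target_class: int,
                      target_mass: float = 0.4) -> torch.Tensor:
    """Probabilistic label: modest peak on the target class, rest uniform."""
    label = torch.full((num_classes,), (1.0 - target_mass) / (num_classes - 1))
    label[target_class] = target_mass
    return label

def add_trigger(x: torch.Tensor) -> torch.Tensor:
    """Toy trigger: set a small bottom-right patch to maximum intensity."""
    x = x.clone()
    x[..., -3:, -3:] = x.max()
    return x

def soft_cross_entropy(logits: torch.Tensor,
                       soft_targets: torch.Tensor) -> torch.Tensor:
    """Cross-entropy against probabilistic (soft) targets."""
    return -(soft_targets * F.log_softmax(logits, dim=-1)).sum(dim=-1).mean()

def healing_step(model, healing_batch, optimizer):
    """One HaS-Nets-style healing update on a small clean dataset
    (the paper's selection logic is omitted in this simplification)."""
    x, y = healing_batch
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    optimizer.step()
```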