Lihua Wang and 5 more

Security vulnerabilities are constantly reported and must be accurately documented in vulnerability repositories. Each vulnerability description usually covers key aspects such as the vulnerable product, version, component, vulnerability type, root cause, impact, and attack vector. Understanding and managing these key aspects is crucial, but manually analyzing and integrating the growing number of vulnerabilities from heterogeneous databases is impractical, creating a need for automated solutions. This study investigates significant aspect-level discrepancies in vulnerability information across major vulnerability databases such as NVD, IBM X-Force, ExploitDB, and Openwall. It addresses two major challenges: improving the accuracy of extracting critical vulnerability aspects and distinguishing differences in these aspects across databases. The complexity of this task stems from the heterogeneous and often conflicting nature of the data sources, coupled with the lack of effective techniques for accurate aspect extraction and discrepancy resolution. Recent research has shown that advanced natural language processing techniques, particularly large language models (LLMs) such as GPT-3.5 and GPT-4, excel at handling detailed, context-rich textual data. Our approach leverages these LLMs to resolve aspect-level differences in vulnerability information across databases. Through rigorous testing on a variety of datasets, our approach not only substantially outperforms traditional models in extracting and distinguishing vulnerability aspects but also enhances the ability to manage and integrate threat intelligence data effectively.
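To illustrate the kind of pipeline the abstract describes, here is a minimal sketch of prompt-based aspect extraction, assuming the official OpenAI Python SDK; the aspect schema, prompt wording, and the extract_aspects helper are illustrative assumptions, not the authors' actual method.

```python
# Minimal sketch: prompt an LLM to extract key vulnerability aspects from a
# free-text description. Prompt wording and aspect schema are illustrative.
import json

from openai import OpenAI  # assumes the official OpenAI Python SDK

client = OpenAI()  # reads OPENAI_API_KEY from the environment

ASPECTS = ["product", "version", "component", "vulnerability_type",
           "root_cause", "impact", "attack_vector"]

def extract_aspects(description: str, model: str = "gpt-4") -> dict:
    """Ask the model for each aspect as a JSON field ("unknown" if absent)."""
    prompt = (
        "Extract the following aspects from the vulnerability description "
        f"below. Return a JSON object with exactly these keys: {', '.join(ASPECTS)}. "
        'Use "unknown" for any aspect that is not stated.\n\n'
        f"Description: {description}"
    )
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # favor deterministic extraction
    )
    return json.loads(resp.choices[0].message.content)

# Running the extractor on entries for the same CVE from NVD, IBM X-Force,
# ExploitDB, and Openwall and diffing the returned fields surfaces
# aspect-level discrepancies between the databases.
```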

Hassan Ali and 3 more

We have witnessed a continuing arms race between backdoor attacks and the corresponding defense strategies for Deep Neural Networks (DNNs). However, most state-of-the-art defenses rely on statistical sanitization of inputs or latent DNN representations to capture trojan behavior. In this paper, we first challenge the robustness of many recently reported defenses by introducing a novel variant of the targeted backdoor attack, called the low-confidence backdoor attack. The low-confidence attack inserts the backdoor by assigning uniformly distributed probabilistic labels to the poisoned training samples, and is applicable to many practical scenarios such as Federated Learning and model-reuse cases. We evaluate our attack against five state-of-the-art defense methods, viz., STRIP, Gradient-Shaping, Februus, ULP-defense, and ABS-defense, under the same threat model as assumed by the respective defenses, and achieve Attack Success Rates (ASRs) of 99%, 63.73%, 91.2%, 80%, and 100%, respectively. After carefully studying the properties of state-of-the-art attacks, including low-confidence attacks, we present HaS-Nets, a mechanism to securely train DNNs against a number of backdoor attacks under the data-collection scenario. For this purpose, we use a reasonably small healing dataset, approximately 2% to 15% of the size of the training data, to heal the network at each iteration. We evaluate our defense on different datasets (Fashion-MNIST, CIFAR-10, Celebrity Face, Consumer Complaint, and Urban Sound), network architectures (MLPs, 2D-CNNs, and 1D-CNNs), and attack configurations (standard backdoor attacks, invisible backdoor attacks, label-consistent attacks, and the all-trojan backdoor attack), including their low-confidence variants. Our experiments show that HaS-Nets can decrease ASRs from over 90% to below 15%, independent of the dataset, attack configuration, and network architecture.
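As a concrete illustration of the attack and defense described above, the following PyTorch sketch shows soft-label (low-confidence) poisoning and a simplified healing update; the trigger pattern, the target_mass probability split, and the stripped-down healing_step are our own illustrative assumptions, not the paper's exact procedure.

```python
# Sketch of the low-confidence poisoning idea: poisoned samples carry a
# trigger plus a *soft* target that spreads most probability mass uniformly,
# instead of a hard one-hot label on the target class.
import torch
import torch.nn.functional as F

def soft_poison_label(num_classes: int, target_class: int,
                      target_mass: float = 0.4) -> torch.Tensor:
    """Probabilistic label: modest peak on the target class, rest uniform."""
    label = torch.full((num_classes,), (1.0 - target_mass) / (num_classes - 1))
    label[target_class] = target_mass
    return label

def add_trigger(x: torch.Tensor) -> torch.Tensor:
    """Toy trigger: set a small bottom-right patch to maximum intensity."""
    x = x.clone()
    x[..., -3:, -3:] = x.max()
    return x

def soft_cross_entropy(logits: torch.Tensor,
                       soft_targets: torch.Tensor) -> torch.Tensor:
    """Cross-entropy against probabilistic (soft) targets."""
    return -(soft_targets * F.log_softmax(logits, dim=-1)).sum(dim=-1).mean()

def healing_step(model, healing_batch, optimizer):
    """One HaS-Nets-style healing update on a small clean dataset
    (the paper's selection logic is omitted in this simplification)."""
    x, y = healing_batch
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    optimizer.step()
```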