The Center for Education and Research in Information Assurance and Security (CERIAS)

Adversarial Examples against Distributed Machine Learning Algorithms

Research Areas: Autonomous Systems

Principal Investigator: Saurabh Bagchi

Adversarial examples (AEs) are images that can fool classifiers by introducing slight perturbations into original images. Recent work has shown that detecting AEs can be more effective than making the deep neural network (DNN) robust against them. However, state-of-the-art AE detectors exhibit a high false positive rate, thereby rejecting a considerable fraction of normal images, and appear easy to bypass through reverse-engineering attacks. To address this issue, we develop HAWKEYE, a separate classifier that analyzes the output layer of the DNN to detect AEs. Like prior work, HAWKEYE's AE detector uses a quantized version of the input image as a reference. However, instead of merely computing a simple statistic and thresholding it to detect AEs, we train a separate, simple classifier to distinguish the variation characteristics of the difference between the DNN's output on the input image and its output on the quantized reference image. Using a classifier yields a much higher detection rate, which lets us cascade AE detectors trained for different quantization step sizes to drastically reduce the false positive rate while keeping the detection rate high.
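
The following is a minimal PyTorch-style sketch of this detection idea, not the actual HAWKEYE implementation: the input image is quantized, the DNN is run on both versions, and a small classifier judges the difference of the two output vectors; the "flag only if every cascade stage agrees" rule shown here is one plausible way to lower the false positive rate, and names such as quantize, AEDetector, and cascade_detect are illustrative.

# Illustrative sketch (not the actual HAWKEYE code): detect AEs by comparing the
# DNN's output on an image with its output on a quantized reference of the image.
import torch
import torch.nn as nn

def quantize(image, step):
    """Quantize pixel values (assumed to lie in [0, 1]) to a grid with the given step size."""
    return torch.round(image / step) * step

class AEDetector(nn.Module):
    """Small binary classifier over the difference of the two DNN output vectors."""
    def __init__(self, num_classes):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_classes, 64), nn.ReLU(),
            nn.Linear(64, 2),  # class 0 = benign, class 1 = adversarial
        )

    def forward(self, model, image, step):
        out_orig = model(image)                  # DNN output on the input image
        out_ref  = model(quantize(image, step))  # DNN output on the quantized reference
        return self.net(out_orig - out_ref)      # classify the variation between the two

def cascade_detect(model, detectors_and_steps, image):
    """Assumed cascade rule for a single image (batch of 1): flag it as adversarial
    only if every detector, each trained for its own quantization step size, agrees."""
    for detector, step in detectors_and_steps:
        if detector(model, image, step).argmax(dim=1).item() == 0:
            return False  # one stage says benign -> accept the image
    return True           # all stages say adversarial -> reject the image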

Federated machine learning is a distributed learning approach that allows a global model to be trained across multiple decentralized client devices, e.g., IoT and edge devices. This approach offers privacy, security, and economic advantages to the participating clients by allowing training on their local data: the model parameters are computed locally by the client devices, and model parameter updates are shared with a central exchange server for iterative aggregation and consequent update of a global model. However, federated learning is subject to poisoning attacks, abetted by the fact that no training examples are verified by a trustworthy authority. We consider a typical federated learning scenario in which clients train their own local models on disjoint sets of data, periodically upload their local models to a parameter server for aggregation, and download the aggregated (global) model. We show that it is possible for a malicious client to stealthily inflict an untargeted model poisoning attack, in contrast to a more traditional data poisoning attack, to disrupt training. We then present our defense technique, FLAIR, which can identify and remove the malicious clients to completely revive the training process and can be tuned to achieve a zero false positive detection rate over a wide range of usable hyperparameters.
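
Below is a minimal sketch of one such federated round with a placeholder filtering step where a FLAIR-style defense would plug in. It assumes model parameters are flattened into NumPy vectors and that each client object exposes a hypothetical local_train method; the scoring rule shown (distance from the coordinate-wise median update) is only a stand-in, not FLAIR's actual detection criterion.

# Illustrative sketch of one federated learning round with a filtering hook where
# a defense such as FLAIR would identify and remove suspicious client updates.
import numpy as np

def aggregate(global_model, client_updates, reject_fraction=0.2):
    """Average client model updates after discarding the most suspicious ones."""
    updates = np.stack(client_updates)                   # shape: (num_clients, num_params)
    median  = np.median(updates, axis=0)                 # robust reference update
    scores  = np.linalg.norm(updates - median, axis=1)   # stand-in suspicion score per client
    n_keep  = max(1, int(len(updates) * (1 - reject_fraction)))
    keep    = np.argsort(scores)[:n_keep]                # keep the least suspicious clients
    return global_model + updates[keep].mean(axis=0)     # apply the filtered average

def federated_round(global_model, clients):
    """One round: clients train locally on disjoint data, server aggregates their updates."""
    updates = [client.local_train(global_model) - global_model for client in clients]
    return aggregate(global_model, updates)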

Personnel

Other PIs: David Inouye

Keywords: adversarial examples, autonomous system security, distributed machine learning