We are currently in a century of data in which massive amounts of data are collected and processed every day, and machine learning plays a critical role in automatically processing these data and mining useful information from them for decision making. Despite the wide and successful applications of machine learning in different fields, the robustness of such systems cannot always be guaranteed, in the sense that small variations in a certain aspect of them can result in completely wrong decisions. This prevents the application of machine learning systems in many safety-critical scenarios such as autonomous driving. In this thesis, we push forward the research on the robustness of machine learning systems in both the regression and the classification tasks.
In classification tasks, our focus is on how adversarial perturbations affect decision making. Classifiers, especially deep-learning-based classifiers, have been shown to be susceptible to adversarial attacks: minor, well-crafted changes to the input of a classifier can dramatically alter its output while remaining unnoticeable to humans. Our starting point is an analogy between the communication model and the classification model, which offers an information-theoretic viewpoint on the robustness of classifiers. We present a simple hypothesis about a feature compression property of these artificial intelligence (AI) classifiers and give theoretical arguments to show that the feature compression property
can account for the observed fragility or vulnerability of AI classifiers to small adversarial
perturbations. We quantify theoretically the difference in the robustness of well-trained
deep learning classification systems to adversarial perturbations and random perturbations.
The feature compression hypothesis and the practice in communication lead naturally to a
general class of defenses (“Trust, but Verify”) for detecting the existence of adversarial input
perturbations, and we further show theoretical arguments for the detection performance of
the proposed method. We conduct experiments with a speech recognition task and an image
recognition task, and our results demonstrate the effectiveness of the proposed detection
methods. They also show that adversarial perturbations can cause a large decrease in the mutual
information between the estimate of the corresponding adversarial sample and its predicted
candidate label.
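To make the detection idea concrete, the toy sketch below (a minimal illustration, not the procedure developed in the thesis) estimates the mutual information between a scalar feature and the predicted label with a simple plug-in histogram estimator; the synthetic data, binning choices, and thresholding idea are all illustrative assumptions.

    import numpy as np

    def plugin_mutual_information(z, y, n_bins=8):
        # plug-in (histogram) estimate of I(Z; Y) in nats for scalar z and discrete y
        z_bins = np.digitize(z, np.histogram_bin_edges(z, bins=n_bins))
        joint = np.zeros((n_bins + 2, int(y.max()) + 1))
        for zb, yb in zip(z_bins, y):
            joint[zb, yb] += 1
        joint /= joint.sum()
        pz = joint.sum(axis=1, keepdims=True)
        py = joint.sum(axis=0, keepdims=True)
        nz = joint > 0
        return float((joint[nz] * np.log(joint[nz] / (pz @ py)[nz])).sum())

    rng = np.random.default_rng(0)
    y = rng.integers(0, 2, size=2000)
    z_clean = y + 0.3 * rng.standard_normal(2000)      # feature statistic aligned with the label
    z_attacked = rng.standard_normal(2000)             # alignment destroyed by the (assumed) attack

    print("I(Z;Y) clean    ~", round(plugin_mutual_information(z_clean, y), 3), "nats")
    print("I(Z;Y) attacked ~", round(plugin_mutual_information(z_attacked, y), 3), "nats")
    # a hypothetical detector flags inputs whose estimate drops far below the value
    # observed on trusted clean data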
The experimental results from Trust, but Verify (TbV) imply that an adversary can fool a classification system by simply adding an adversarial perturbation to its input to reduce
the mutual information between the adversarial sample and its true label, which motivates
us to study adversarial attacks from an information-theoretic viewpoint. We consider
the problem of designing optimal adversarial attacks on decision systems that maximally
degrade the achievable performance of the system as measured by the mutual information
between the degraded input signal and the label of interest. We establish conditions for
characterizing the optimality of adversarial attacks. Our theoretical results hold for all
machine learning systems, providing theoretical arguments for the transferability of
adversarial samples across different deep learning classifiers. We derive the optimal adversarial
attacks for discrete and continuous signals of interest, and we also show that it is much harder
to achieve adversarial attacks for minimizing mutual information when we use multiple
redundant copies of the input signal. This supports the “feature compression” hypothesis as a
potential explanation for the adversarial vulnerability of deep learning classifiers. We also
present results from computational experiments to illustrate our theoretical results.
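The following toy calculation (under assumed discrete distributions, not the distributions analyzed in the thesis) illustrates the underlying mechanism: a perturbation that pushes the class-conditional distributions of the observed signal toward each other collapses the mutual information, and by the data-processing inequality this limits the accuracy of any classifier operating on the perturbed signal.

    import numpy as np

    def mutual_information(p_y, p_x_given_y):
        # I(X;Y) in nats for a discrete prior p_y and conditional rows p_x_given_y[y]
        p_xy = p_y[:, None] * p_x_given_y
        p_x = p_xy.sum(axis=0, keepdims=True)
        nz = p_xy > 0
        return float((p_xy[nz] * np.log(p_xy[nz] / (p_y[:, None] * p_x)[nz])).sum())

    p_y = np.array([0.5, 0.5])
    clean = np.array([[0.9, 0.1],    # P(X | Y = 0)
                      [0.1, 0.9]])   # P(X | Y = 1)
    # assumed attack channel: flip the observed symbol with probability 0.4,
    # pushing the two class-conditional distributions toward the same mixture
    flip = np.array([[0.6, 0.4],
                     [0.4, 0.6]])
    attacked = clean @ flip

    print("I(X;Y) clean    ~", round(mutual_information(p_y, clean), 3), "nats")
    print("I(X;Y) attacked ~", round(mutual_information(p_y, attacked), 3), "nats")
    # the perturbation is defined by the signal/label statistics alone, so the same
    # attack degrades every classifier that reads the perturbed signal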
Our results show that the design of the optimal adversarial attacks can be independent of the specific classification models, and the optimal adversarial attack designed in our way
for a particular classification system can also fool other classifiers. This implies that the
decision regions of different classification systems can be similar. However, they do not
say much about how the decision regions of different classifiers differ from each other and
whether we can arbitrarily manipulate their output labels by designing perturbations. This
motivates us to study the question of selective fooling, i.e., given multiple machine learning
systems for solving the same classification task, is it possible to construct a perturbation
to the input sample so that the outputs of the multiple machine learning systems can be
arbitrarily manipulated simultaneously? We formulate the problem of “selective fooling”
as a novel optimization problem, and conduct experiments on the MNIST dataset. Our
results show that it is in fact very easy to perform selective adversarial attacks when the
classifiers are identical in their architectures, training algorithms and training datasets except
for random initialization in training. This suggests that multiple machine learning systems
that achieve the same high level of classification accuracy do not in fact “think alike”
at all, and their decision regions can differ greatly from each other.
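A minimal sketch of the selective fooling formulation is given below, written with two hypothetical linear softmax classifiers as stand-ins for independently trained networks; the summed cross-entropy loss, step size, and norm budget are illustrative assumptions rather than the exact optimization used in the thesis.

    import numpy as np

    rng = np.random.default_rng(1)
    d, k = 20, 3                                   # input dimension, number of classes
    models = [(rng.standard_normal((k, d)), rng.standard_normal(k)) for _ in range(2)]
    x = rng.standard_normal(d)
    targets = [0, 2]                               # different chosen target labels per model
    eps, lr = 2.0, 0.5                             # perturbation budget and step size

    def softmax(v):
        e = np.exp(v - v.max())
        return e / e.sum()

    delta = np.zeros(d)
    for _ in range(200):
        grad = np.zeros(d)
        for (W, b), t in zip(models, targets):
            p = softmax(W @ (x + delta) + b)
            grad += W.T @ (p - np.eye(k)[t])       # gradient of cross-entropy to target t
        delta -= lr * grad                         # descend the summed loss
        if np.linalg.norm(delta) > eps:            # project back onto the norm budget
            delta *= eps / np.linalg.norm(delta)

    for i, ((W, b), t) in enumerate(zip(models, targets)):
        pred = int(np.argmax(W @ (x + delta) + b))
        print(f"classifier {i}: target {t}, predicted {pred}")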
In the regression task, we focus on the robustness of linear sparse regression and its variants, which are essentially signal recovery problems such as super-resolution or spectrally sparse signal recovery, outlier detection, low-rank matrix recovery, and error correction coding. Our attention is drawn to the robustness of such systems to random variations or perturbations during the training or learning process. In spectrally sparse signal recovery, we investigate
the robustness of recovering the signal against basis mismatch between the underlying
frequencies of the signal and the on-grid frequencies. We propose a Hankel matrix recovery
approach for solving the problem, and show that the proposed approach can super-resolve
the complex exponentials and identify their frequencies from compressed non-uniform
measurements, regardless of the distance among the underlying frequencies. We propose
a new concept of orthonormal atomic norm minimization (OANM), and argue that the
success of Hankel matrix recovery in separation-free super-resolution and its robustness to
the basis mismatch are due to the fact that the nuclear norm of a Hankel matrix is essentially
the orthonormal atomic norm. We show that the underlying parameter or frequency
values must be well separated for the traditional atomic norm minimization (ANM)
approach to achieve successful signal recovery when the atoms are continuous functions
of the continuously-valued parameter. However, the OANM can still succeed when the
original atoms are arbitrarily close. As a byproduct of this research, we provide a matrix-theoretic
inequality for the nuclear norm, and give its proof using the theory of compressed
sensing. Our experimental results demonstrate the superiority of the proposed method over
the ANM.
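For illustration only, the sketch below performs Hankel-structured recovery of a spectrally sparse signal from partial samples using a simple alternating-projection heuristic; the thesis analyzes a Hankel nuclear norm minimization program, and the rank-truncation loop and parameter choices here are assumptions made for a lightweight demonstration.

    import numpy as np

    rng = np.random.default_rng(2)
    N, r = 64, 2
    freqs = [0.20, 0.213]                      # closely spaced frequencies
    n = np.arange(N)
    x_true = sum(np.exp(2j * np.pi * f * n) for f in freqs)

    mask = rng.random(N) < 0.7                 # observe roughly 70% of the samples
    y = np.where(mask, x_true, 0)

    def hankel(v, p):
        return np.array([v[a:a + v.size - p + 1] for a in range(p)])

    def dehankel(H, N):
        # average the anti-diagonals of H back into a length-N vector
        p, q = H.shape
        v, cnt = np.zeros(N, complex), np.zeros(N)
        for a in range(p):
            for b in range(q):
                v[a + b] += H[a, b]
                cnt[a + b] += 1
        return v / cnt

    x, p = y.copy(), N // 2
    for _ in range(300):
        U, s, Vh = np.linalg.svd(hankel(x, p), full_matrices=False)
        x = dehankel((U[:, :r] * s[:r]) @ Vh[:r], N)   # keep the r dominant components
        x[mask] = x_true[mask]                         # re-impose the observed samples

    print("relative error:", np.linalg.norm(x - x_true) / np.linalg.norm(x_true))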
In outlier detection, we look into the robustness of recovering the ground truth signal against sparse outliers. We propose a generative model approach for recovering the ground truth signal, and design efficient algorithms for this recovery. Different from the traditional L1-L1 minimization approach, where the sparsity of the ground truth signal is required, our approach is completely free of the sparsity assumption. Instead, the generative model is used to learn and extract useful structural information for signal recovery. We
establish the guarantees for the recovery of signals using learned generative models, thus
achieving robustness of signal recovery against outliers. Our results are applicable to
both linear and nonlinear generative models, and we conduct extensive experiments on real
datasets using variational auto-encoder (VAE) and deep convolutional generative adversarial
networks (DCGAN). The experimental results show that the signals can be successfully
recovered under outliers using our approach, and it outperforms the traditional Lasso and L2
minimization approaches.
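The sketch below illustrates the generative-prior recovery idea under sparse outliers: search the latent space of a generator for a code whose output matches the corrupted measurements under an L1 loss, which is what tolerates a few grossly wrong entries. The tiny random network standing in for a trained VAE or DCGAN decoder, and the subgradient schedule, are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(3)
    latent_dim, signal_dim, m = 5, 50, 40
    W1 = rng.standard_normal((30, latent_dim))
    W2 = rng.standard_normal((signal_dim, 30))
    A = rng.standard_normal((m, signal_dim)) / np.sqrt(m)

    def G(z):                                  # stand-in generator (assumed, untrained)
        return W2 @ np.maximum(W1 @ z, 0.0)

    x_true = G(rng.standard_normal(latent_dim))
    y = A @ x_true
    idx = rng.choice(m, size=4, replace=False)
    y[idx] += 20.0 * rng.standard_normal(4)    # a few grossly corrupted measurements

    z = rng.standard_normal(latent_dim)        # random latent initialization
    for t in range(5000):
        h = W1 @ z
        r = A @ (W2 @ np.maximum(h, 0.0)) - y
        # subgradient of ||A G(z) - y||_1 with respect to z
        grad = W1.T @ ((W2.T @ (A.T @ np.sign(r))) * (h > 0))
        z -= (0.5 / np.sqrt(t + 1)) * grad / (np.linalg.norm(grad) + 1e-12)

    print("relative error:", np.linalg.norm(G(z) - x_true) / np.linalg.norm(x_true))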
In the separation-free super-resolution and outlier detection problems, the null space condition plays a critical role in establishing the recovery guarantees, and this motivates us to generalize
the null space condition for signal recovery to that for matrix recovery. Oymak et
al. established a null space condition for successful recovery of a given low-rank matrix
(the weak null space condition) using nuclear norm minimization, and derived the phase
transition for the nuclear norm minimization. We show that the weak null space condition
proposed by Oymak et al. is only a sufficient condition for successful matrix recovery using
nuclear norm minimization, and is not a necessary condition as claimed. We further give a
weak null space condition which is both necessary and sufficient for the success of nuclear
norm minimization.
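As a small numerical illustration of why null space conditions govern nuclear norm minimization (not the weak null space condition itself), the sketch below samples directions from the null space of a random measurement operator and checks whether moving the target matrix along them increases the nuclear norm; random sampling can only reveal a violation, never certify recovery, and the dimensions are chosen purely for illustration.

    import numpy as np

    rng = np.random.default_rng(4)
    n, r, m = 10, 2, 60                        # matrix size, rank, number of measurements
    X0 = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
    A = rng.standard_normal((m, n * n))        # generic linear measurement operator

    _, _, Vh = np.linalg.svd(A)                # rows m: of Vh span the null space of A
    null_basis = Vh[m:]

    nuc = lambda M: np.linalg.norm(M, ord="nuc")
    violations = 0
    for _ in range(2000):
        W = (null_basis.T @ rng.standard_normal(null_basis.shape[0])).reshape(n, n)
        if nuc(X0 + W) <= nuc(X0) + 1e-9:
            violations += 1
    print("sampled null-space directions that fail to increase the nuclear norm:", violations)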
Finally, we consider the error correction coding problem in the virus testing application, and we investigate the robustness of efficient virus testing via pooled testing against noise
in the pooling process. We propose a novel method to increase the reliability and robustness
of COVID-19 virus or antibody tests by using a specially designed pooling strategy. The
ideas of our approach come from compressed sensing and error-correction coding, which
can correct a certain number of errors in the test results. We present simulations and
theoretical arguments to show that our method is significantly more efficient in improving
diagnostic accuracy than individual testing. These results run counter to the traditional belief that “even though pooled testing increases test capacity, pooled tests are less reliable than testing individuals separately.”
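A toy simulation in the spirit of this idea appears below; the random pooling design and the simple threshold decoder are illustrative assumptions, whereas the thesis uses pooling matrices derived from error-correction codes and compressed sensing.

    import numpy as np

    rng = np.random.default_rng(5)
    n_people, n_pools, pools_per_person = 200, 80, 5
    prevalence, test_error = 0.02, 0.05

    status = rng.random(n_people) < prevalence               # true infection status
    membership = np.zeros((n_pools, n_people), dtype=bool)
    for j in range(n_people):
        membership[rng.choice(n_pools, pools_per_person, replace=False), j] = True

    pool_truth = (membership.astype(int) @ status.astype(int)) > 0   # pool positive if it holds a positive sample
    pool_result = pool_truth ^ (rng.random(n_pools) < test_error)    # noisy pool readout

    # threshold decoder: call a person negative if at least 2 of their pools read negative
    negative_votes = ((~pool_result)[:, None] & membership).sum(axis=0)
    decoded = negative_votes < 2

    individual = status ^ (rng.random(n_people) < test_error)        # one noisy test per person

    print("pooled   decoding errors:", int((decoded != status).sum()), "using", n_pools, "tests")
    print("separate testing errors: ", int((individual != status).sum()), "using", n_people, "tests")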
Information Theory; Adversarial robustness; Deep generative model; Deep learning classification; Sparse linear regression; Statistical signal processing; Information science
Details
Title
Towards adversarial and non-adversarial robustness of machine learning and signal processing: Fundamental limits and algorithms
Creators
Jirong Yi
Contributors
Weiyu Xu (Advisor)
Xiaodong Wu (Advisor)
Raghuraman Mudumbai (Committee Member)
Soura Dasgupta (Committee Member)
Qihang Lin (Committee Member)
Resource Type
Dissertation
Degree Awarded
Doctor of Philosophy (PhD), University of Iowa
Degree in
Electrical and Computer Engineering
Date degree season
Spring 2021
DOI
10.17077/etd.006230
Publisher
University of Iowa
Number of pages
xxix, 355 pages
Copyright
Copyright 2021 Jirong Yi
Comment
This thesis has been optimized for improved web viewing. If you require the original version, contact the University Archives at the University of Iowa: https://www.lib.uiowa.edu/sc/contact/.
Language
English
Description illustrations
illustrations (some color)
Description bibliographic
Includes bibliographical references (pages 336-355)
Public Abstract (ETD)
Machine learning has been playing critical roles in many applications in the current century of big data, but robustness issues associated with these learning systems have also been reported, which greatly prevents their application in many safety-critical scenarios. This thesis investigates the robustness of different machine learning systems, i.e., sparse regression systems and multi-class classifiers, to different types of perturbations, i.e., adversarial and random, in the different processes of exploiting them in practice, i.e., the training or learning process and the testing or inference process. We look into these aspects from the perspectives of statistical signal processing and information theory.