Is Deep Learning Secure for Robots?
Han Wu, Dr. Johan Wahlström and Dr. Sareh Rowlands, University of Exeter
Source Code
Hi, I'm Han Wu, a third-year Ph.D. student in Computer Science. Here's my research question: Is Deep Learning Secure for Robots?
It is no longer a secret that deep learning models are vulnerable to adversarial attacks: we can fool them by adding human-imperceptible perturbations to the input image.
We can attack the end-to-end driving model, the object detection model, the image classification cloud service, and embedded systems.
These are the four projects that I would like to introduce today. Let's move on to the first project.
Adversarial Driving
Attacking End-to-End Driving Models
Source Code
Almost all the top 10 teams on the leaderboard use end-to-end driving models.
End-to-End driving models lead to smaller systems and better performance.
For autonomous driving, end-to-end driving models are getting more popular. For example, almost all the top 10 teams in the CARLA autonomous driving challenge use end-to-end driving models.
An end-to-end driving model takes sensor data, such as camera images and lidar, as input and directly outputs driving commands, such as the steering angle and acceleration. Since there is only one end-to-end model, the system is smaller and performs better.
Adversarial Attacks against End-to-End Driving
The NVIDIA End-to-End Driving Model
Researchers from NVIDIA even tested their end-to-end driving models on a real autonomous driving car.
But are we ready to embrace end-to-end driving models in a safety-critical application? Prior research demonstrated that image classification models based on deep neural networks are vulnerable to adversarial attacks, but the driving model is a regression model rather than a classification model.
In our research, we demonstrate that it is possible to attack the end-to-end regression driving model as well, in real time.
Adversarial attacks against image classification [1]
Adversarial attacks against object detection
[1] J. Z. Kolter and A. Madry, Adversarial Robustness - Theory and Practice, NeurIPS 2018 tutorial.
Say for example, on the left side, we have a pig, and it is recognized as a pig by the image classification model. If we apply a small perturbation to the image, we have another pig on the right side.
The perturbation is imperceptible to human eyes; we cannot really tell the difference, right? But the pig on the right side is now recognized as an airliner. Somehow the pig can fly now. So we can attack image classification models by adding a small perturbation to the input image.
Similarly, we can attack object detection models.
Problem Definition
Given an input image $x$ and the end-to-end driving model $ y = f(\theta, x) $.
Our objective is to generate an adversarial image $ x^{'} = x + \eta $ such that:
$$ y^{'}=f(\theta, x^{'}) \neq y $$
To ensure that the perturbation is imperceptible to human eyes:
$$ \Vert x^{'}-x \Vert_2 = \Vert{\ \eta\ }\Vert_2 \leq \xi, \text{where } \xi=0.03 $$
For offline attacks, we can use pre-recorded human drivers' steering angles as the ground truth $y^*$.
$$ \eta = \epsilon\ sign(\nabla_x J(y,\ y^{*} )) $$
For a real-time online attack, we do not have access to the ground truth $y^*$.
$$ \eta = \epsilon\ sign(\nabla_x J(y)) $$
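As a minimal sketch (not the exact implementation from our paper), the perturbation step could look like this in PyTorch, assuming `model` maps a camera frame to a steering angle; the MSE loss for the offline attack is an illustrative choice:

```python
# Minimal sketch of the offline / online perturbation step (PyTorch; names are illustrative).
import torch
import torch.nn.functional as F

def fgsm_perturbation(model, x, y_star=None, epsilon=0.03):
    """Compute eta = epsilon * sign(grad_x J)."""
    x = x.clone().detach().requires_grad_(True)
    y = model(x)                            # predicted steering angle
    if y_star is not None:                  # offline attack: ground truth is available
        loss = F.mse_loss(y, y_star)        # J(y, y*); MSE is an assumed choice here
    else:                                   # online attack: no ground truth
        loss = -y.mean()                    # J_left(y) = -y, pushes the steering to the left
    loss.backward()
    return epsilon * x.grad.sign()          # adversarial perturbation eta

# x_adv = torch.clamp(x + fgsm_perturbation(model, x), 0.0, 1.0)
```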
It is unsafe to attack a real autonomous driving car, so we first tested our attacks in an autonomous driving simulator. Here we have the NVIDIA end-to-end driving model, which was tested on a real autonomous driving car. The driving model takes images from the camera as input and outputs the steering angle directly.
Random Noises
Before applying our attacks, we first apply random noise to the input image as a baseline.
As you can see here, the output steering angles with and without random noise overlap with each other; they are very close. As a result, random noise has little effect on the end-to-end driving model.
Image-Specific Attack
Output: Steering angle $y \in [-1, 1]$
Decrease the output (left):
$$ J_{left}(y)= - y $$
Increase the output (right):
$$ J_{right}(y)= y $$
We propose two online white-box adversarial attacks against the end-to-end driving model. We attack the driving model by applying adversarial perturbations to the input image.
The image-specific attack generates a new adversarial perturbation for each input image. This is a very strong attack. As you can see here, the vehicle gets out of control immediately after applying the adversarial perturbations.
Image-Agnostic Attack
On the other hand, the image-agnostic attack generates a single adversarial perturbation that attacks all input images.
The image-agnostic attack is like an invisible force that makes it difficult for the vehicle to turn at corners, which could cause traffic accidents at critical points, like this one. That was close, wasn't it?
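As a rough illustration of the idea (the exact optimisation in our paper differs), an image-agnostic perturbation can be accumulated over a stream of frames and clipped to the norm bound:

```python
# Illustrative sketch of accumulating an image-agnostic (universal) perturbation (PyTorch).
import torch

def universal_perturbation(model, frames, xi=0.03, alpha=0.002):
    """Accumulate one perturbation that pushes the steering angle to the left for every frame."""
    eta = torch.zeros_like(frames[0])
    for x in frames:                              # stream of camera frames
        x_adv = (x + eta).clone().detach().requires_grad_(True)
        loss = -model(x_adv).mean()               # J_left(y) = -y
        loss.backward()
        eta = (eta + alpha * x_adv.grad.sign()).clamp(-xi, xi)   # small step, keep ||eta|| small
    return eta
```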
Adversarial Detection
Attacking Object Detection in Real Time
Source Code
Adversarial Filter
Adversarial Patch
Adversarial Overlay
How different attacks apply the perturbation $\delta$ using a binary mask $m \in \{0, 1\}^{wh}$
$x^{'}_{filter} = x + \delta$
$x^{'}_{overlay} = x + m \odot \delta$
$x^{'}_{patch} = (1-m) \odot x + m \odot \delta$
Prior research used adversarial filters and adversarial patches to fool object detection models.
The adversarial filter applies the perturbation to the entire input image, and the perturbation is imperceptible to human eyes. The adversarial patch applies the perturbation to a small region of the input image, but the perturbation is perceptible to human eyes. Besides, the adversarial patch can control where we fabricate objects, while the adversarial filter cannot.
By combining the adversarial filter's imperceptibility with the adversarial patch's localizability, we generate adversarial overlays: human-imperceptible perturbations applied to a small region of the input image. Mathematically, we summarize how the different methods apply the perturbation in different ways.
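A minimal NumPy sketch of the three update rules above (array names are illustrative):

```python
# How filter, overlay and patch apply the perturbation delta with a binary mask m.
import numpy as np

def apply_perturbation(x, delta, m, mode="overlay"):
    """x, delta: HxWxC float arrays in [0, 1]; m: HxW binary mask."""
    m = m[..., None]                              # broadcast the mask over channels
    if mode == "filter":                          # whole image, imperceptible
        x_adv = x + delta
    elif mode == "overlay":                       # small region, imperceptible
        x_adv = x + m * delta
    else:                                         # "patch": small region, visible
        x_adv = (1 - m) * x + m * delta
    return np.clip(x_adv, 0.0, 1.0)
```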
Let's see a real-time demo. We tested our attacks in the ROS Gazebo simulator. The object detection model seems to be stable. Now let's generate some traffic signs in the air. Interesting, isn't it?
Given an input image $x$, the object detection model outputs $S \times S$ candidate bounding boxes $o \in \mathcal{O}$ at three different scales.
Each candidate box $o^i$ contains $(b_x^i, b_y^i, b_w^i, b_h^i, c^i, p_1^i, p_2^i, ..., p_K^i)$ for K classes, where $1 \leq i \leq |\mathcal{O}|$.
$$\begin{aligned} \text{One-Targeted}:\ \mathcal{L}_{adv}^{1}(\mathcal{O}) &= \max_{1 \leq i \leq |\mathcal{O}|}\ [\sigma(c^i) * \sigma(p^i_t)] \\
\text{Multi-Targeted}:\ \mathcal{L}_{adv}^{2}(\mathcal{O}) &= \sum^{|\mathcal{O}|}_{i = 1}\ [\sigma(c^i) * \sigma(p^i_t)] \\
\text{Multi-Untargeted}:\ \mathcal{L}_{adv}^{3}(\mathcal{O}) &= \sum^{|\mathcal{O}|}_{i = 1} \sum_{j=1}^{K}\ [\sigma(c^i) *\sigma(p^i_j)] \end{aligned}$$
where $|\mathcal{O}| = \sum_{1 \leq i \leq 3} S_i \times S_i \times B$, and $S_i$ represents the grid size of the $i_{th}$ output layer ($S \in \{13,26,52\}$, $B=3$).
In the research paper, we introduce how we generate adversarial overlays using three adversarial loss functions. You can also test our attacks without using Turtlebot.
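As a sketch of the three loss functions above (assuming the raw objectness logits $c^i$ and class logits $p^i_j$ have been gathered into two tensors; the tensor names are illustrative):

```python
# Sketch of the three adversarial losses over YOLO-style candidate boxes (PyTorch).
import torch

def adv_loss(conf, cls, target=None, mode="multi_untargeted"):
    """conf: (N,) objectness logits c^i; cls: (N, K) class logits p^i_j; target: class index t."""
    obj = torch.sigmoid(conf)                     # sigma(c^i)
    prob = torch.sigmoid(cls)                     # sigma(p^i_j)
    if mode == "one_targeted":
        return (obj * prob[:, target]).max()      # strongest candidate box only
    if mode == "multi_targeted":
        return (obj * prob[:, target]).sum()      # all boxes, one target class
    return (obj.unsqueeze(1) * prob).sum()        # multi-untargeted: all boxes, all classes
```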
Here we demonstrate our attacks using a USB camera. For example, let's fabricate some objects in the top-right corner. Now we have an umbrella and a person. We open-sourced our system on GitHub.
Adversarial Classification
Distributed Black-box Attacks against Image Classification
Source Code
DeepAPI - The Cloud API we attack
We open-source our image classification cloud service for research on black-box attacks.
The cloud API we attack is DeepAPI, an image classification cloud service we open-source for research on black-box attacks.
Here's a quick demo. We can upload images to the cloud server, and receive the classification results.
Besides uploading images from the website, we can also use the API to do image classification so that we can automate the query process to initiate black-box attacks.
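For example, a query from Python could look roughly like this (the endpoint and request format here are illustrative assumptions, not the documented DeepAPI interface):

```python
# Illustrative DeepAPI client; the URL path and JSON fields are assumptions, not the documented API.
import base64
import requests

def classify(image_path, url="http://localhost:8080"):
    with open(image_path, "rb") as f:
        payload = {"file": base64.b64encode(f.read()).decode("utf-8")}   # hypothetical field name
    r = requests.post(url, json=payload, timeout=30)
    r.raise_for_status()
    return r.json()                               # e.g. a list of (label, probability) pairs

# print(classify("cat.jpg"))
```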
DeepAPI Deployment
Using Docker
$ docker run -p 8080:8080 wuhanstudio/deepapi
Serving on port 8080...
Using Pip
$ pip install deepapi
$ python -m deepapi
Serving on port 8080...
To make the deployment of DeepAPI easier, we provide a Docker image as well as a Python package that can be installed via pip install deepapi; the server can then be started with a single command.
Furthermore, we design two general frameworks, horizontal and vertical distribution, that can be applied to existing black-box attacks to reduce the total attack time.
How to accelerate Black-Box attacks?
Cloud APIs are deployed behind a load balancer that distributes the traffic across several servers.
Well, how can we accelerate Black-Box attacks?
Black-box attacks rely on queries, which is time-consuming. Our experimental results demonstrate that sending out 10 queries concurrently takes roughly the same time as sending out 1 query, which means that we can accelerate black-box attacks by sending out queries concurrently. The more queries we send, the less time each query takes on average.
This is because modern cloud APIs are usually deployed behind a load balancer. The load balancer distributes the traffic across several servers, so we can get the results of multiple concurrent requests simultaneously.
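A rough way to see this effect (the URL and request format are placeholders, not the real API):

```python
# Timing sketch: N identical queries sent concurrently to a load-balanced API.
import time
import requests
from concurrent.futures import ThreadPoolExecutor

def one_query(url="http://localhost:8080"):
    return requests.post(url, json={}, timeout=30)    # placeholder request

def time_concurrent(n):
    start = time.time()
    with ThreadPoolExecutor(max_workers=n) as pool:
        list(pool.map(lambda _: one_query(), range(n)))
    return time.time() - start

# print(time_concurrent(1), time_concurrent(10))      # often roughly the same wall-clock time
```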
Before introducing the cloud service we attack, we noticed the following.
Local Models & Cloud APIs
Most prior research used local models to test black-box attacks.
We initiate the black-box attacks directly against cloud services.
Most prior research used local models to test black-box attacks because sending queries to cloud services is slow, while querying a local model with GPU acceleration is much faster.
However, testing black-box attacks against local models could introduce several mistakes in the query process that give their methods an unfair advantage. For example, prior research usually resizes input images to the same shape as the model input and then applies the perturbation, which means they assume access to the input shape of the model. Some methods outperformed the state-of-the-art partially because these mistakes gave them access to information that should not be assumed available in black-box attacks.
As a result, we initiate black-box attacks directly against cloud services to avoid making similar mistakes, and we apply the perturbation directly to the original input image.
Attacking Cloud APIs is more challenging than attacking local models
Attacking cloud APIs achieves a lower success rate than attacking local models.
Attacking cloud APIs requires more queries than attacking local models.
Our experimental results demonstrate that attacking cloud APIs is more challenging than attacking local models. For both local search and gradient estimation methods, attacking cloud APIs achieves a lower success rate than attacking local models. In our experiments, we limit the number of queries for each image to at most 1,000, which is quite challenging. As a result, the baseline method only achieves a success rate of roughly 5%.
Besides, attacking cloud APIs requires more queries than attacking local models. For the baseline method, we do not see an evident increase because the attack success rate is relatively low; most attacks consume the entire query budget.
Horizontal Distribution
Horizontal distribution sends out concurrent queries across images at the same iteration, so we receive the query results for different images simultaneously and then move on to the next iteration.
The benefit of horizontal distribution is that we do not need to redesign the black-box attacks; we only need to replace the original model query with concurrent queries.
Horizontal distribution reduces the total attack time by a factor of five.
After applying horizontal distribution, we can see that the total attack time is reduced by a factor of five. The total time of attacking 100 images was reduced from over 20h to 4h.
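In pseudocode-like Python, horizontal distribution only changes where the queries are sent (`query` and `update_step` are placeholders for the attack-specific cloud query and perturbation update, not functions from our code base):

```python
# Sketch of horizontal distribution: concurrent queries across images at the same iteration.
from concurrent.futures import ThreadPoolExecutor

def horizontal_attack(images, perturbations, query, update_step, n_iters=1000, workers=10):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(n_iters):
            adv = [x + p for x, p in zip(images, perturbations)]
            results = list(pool.map(query, adv))             # query all images concurrently
            perturbations = [update_step(x, p, r)            # then update every perturbation
                             for x, p, r in zip(images, perturbations, results)]
    return perturbations
```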
Vertical Distribution
On the other hand, vertical distribution sends out concurrent queries across iterations for the same image. For each image, we generate multiple adversarial perturbations and send out queries concurrently across iterations.
For vertical distribution, we need to redesign the black-box attacks to decouple the queries across iterations.
In the research paper, we use both local search and gradient estimation methods as examples to illustrate how to re-design the algorithm to apply vertical distribution.
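The general shape of a vertically distributed attack is sketched below (`propose`, `query` and `score` stand in for the redesigned attack-specific steps; this is an illustration, not the exact algorithm from the paper):

```python
# Sketch of vertical distribution: concurrent queries across iterations for the same image.
from concurrent.futures import ThreadPoolExecutor

def vertical_attack(image, propose, query, score, n_candidates=10, n_iters=100):
    best, best_score = None, float("inf")
    with ThreadPoolExecutor(max_workers=n_candidates) as pool:
        for _ in range(n_iters):
            candidates = propose(best, n_candidates)          # proposed without waiting for queries
            results = list(pool.map(lambda p: query(image + p), candidates))
            for p, r in zip(candidates, results):             # keep the best candidate so far
                if score(r) < best_score:
                    best, best_score = p, score(r)
    return best
```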
Vertical distribution achieves successful attacks much earlier.
After applying vertical distribution, besides reducing the attack time, both local search and gradient estimation methods achieve successful attacks earlier. The probability of the originally predicted class drops faster.
Conclusion
In conclusion, our research demonstrates that it is possible to exploit load balancing to accelerate online black-box attacks against cloud services.
We also open-source our image classification cloud service to facilitate future research on distributed black-box attacks and to test whether black-box attacks have become a practical threat against machine learning models deployed on cloud servers.
The Man-in-the-Middle Attack
A Hardware Attack against Object Detection
Source Code
Deep learning models are vulnerable to adversarial attacks.
To achieve real-time adversarial attacks, we need to solve two problems:
How to generate the perturbation? (The PCB Attack)
How to apply the perturbation? (The Man-in-the-Middle Attack)
Now, we have seen that deep learning models are vulnerable to adversarial attacks. But to achieve real-time adversarial attacks, we need to solve two problems: how to generate the perturbation, and how to apply it efficiently.
We propose the PCB attack to generate the perturbation and the Man-in-the-Middle attack to apply the perturbation.
Step 1: Generating the perturbation (The PCB Attack)
Objective:
$$ \min_{\mathcal{W}} \ \mathcal{L}_{train} = f(\mathcal{W}; x, \mathcal{O}) \;\;\;\; \max_{x} \ \mathcal{L}_{adv} = f(x; \mathcal{O}^{\ast}, \mathcal{W}) $$
Adversarial Loss:
$$ \mathcal{L}_{PC}(x) = \sum{[\sigma(c_i) * \sigma(p_i)]} \;\;\;\; \mathcal{L}_{PCB}(x) = \frac{\sum{[\sigma(c_i) * \sigma(p_i)]}}{\sum{[\sigma(w_i) * \sigma(h_i)]^2}} $$
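A minimal PyTorch sketch of the two losses above (we assume $p_i$ here is the logit of the most likely class for box $i$; the tensor names are illustrative):

```python
# Sketch of the PC and PCB adversarial losses (PyTorch).
import torch

def pc_loss(c, p):
    """c: (N,) objectness logits; p: (N,) class logits of the most likely class."""
    return (torch.sigmoid(c) * torch.sigmoid(p)).sum()

def pcb_loss(c, p, w, h):
    """Divide by the sum of squared box sizes, as in the PCB formula above."""
    boxes = ((torch.sigmoid(w) * torch.sigmoid(h)) ** 2).sum()
    return pc_loss(c, p) / boxes
```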
Step 1: Generating the perturbation (The PCB Attack)
Prior Research
Our Method
No learning rate decay
With learning rate decay
Our method generates more bounding boxes and has less variation.
In the first step, we generate a single universal adversarial perturbation to fool all the images from the camera. Our method generates more bounding boxes and has less variation.
Prior research used the Mean Average Precision (mAP) to measure adversarial attacks, but it actually measures the accuracy of the model rather than the strength of the attack. Thus, we propose three evaluation metrics to measure the strength of the attack, and we achieve a stronger attack after applying learning rate decay.
Step 2: Applying the perturbation (The Man-in-the-Middle Attack)
In the next step, to apply the perturbation, prior research introduced digital attacks and physical attacks. A digital attack requires access to the operating system, but hacking into the operating system is not trivial. A physical attack prints the perturbation on a poster, but the poster needs to be placed close to the camera, which is challenging for robotic applications in dynamic environments.
We propose a hardware attack that injects the perturbation at the hardware level. The hardware attack does not assume access to the detection system and can inject the perturbation into all input images captured by the camera.
Here's a quick demo. We have a USB camera and a detection system, and we use a Raspberry Pi 4 to inject the perturbation. The Raspberry Pi reads the image from the USB camera, injects the perturbation, and then presents a virtual camera to the detection system. The detection system has no idea that the image has been manipulated.
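Conceptually, the injection loop on the Raspberry Pi could look like the sketch below (this assumes OpenCV, the pyvirtualcam package and a v4l2loopback virtual-camera device; the perturbation file name is hypothetical, and this is not the exact code from our repository):

```python
# Conceptual man-in-the-middle loop: read from the real camera, add the perturbation,
# and expose the result as a virtual camera to the detection system.
import cv2
import numpy as np
import pyvirtualcam

perturbation = np.load("uap.npy")                     # hypothetical file holding the UAP (HxWx3)

cap = cv2.VideoCapture(0)                             # the physical USB camera
w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

with pyvirtualcam.Camera(width=w, height=h, fps=30) as cam:
    while True:
        ok, frame = cap.read()                        # BGR frame from the physical camera
        if not ok:
            break
        adv = np.clip(frame.astype(np.float32) + perturbation, 0, 255).astype(np.uint8)
        cam.send(cv2.cvtColor(adv, cv2.COLOR_BGR2RGB))   # the detector only ever sees this stream
        cam.sleep_until_next_frame()
```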
After injecting the perturbation, the detection system detects a large number of objects everywhere. Unbelievable! Right?
Is Deep Learning Secure for Robots?