Attacks

VANILA

class torchattacks.attacks.vanila.VANILA(model)[source]

Vanilla version of Attack: it simply returns the input images unchanged.

Parameters:model (nn.Module) – model to attack.
Shape:
  • images: \((N, C, H, W)\) where N = batch size, C = number of channels, H = height and W = width. Each value must be in the range [0, 1].
  • labels: \((N)\) where each value \(y_i\) satisfies \(0 \leq y_i <\) number of labels.
  • output: \((N, C, H, W)\).
Examples::
>>> attack = torchattacks.VANILA(model)
>>> adv_images = attack(images, labels)
forward(images, labels=None)[source]

Overridden.

GN

class torchattacks.attacks.gn.GN(model, std=0.1)[source]

Add Gaussian Noise.

Parameters:
  • model (nn.Module) – model to attack.
  • std (float) – standard deviation of the Gaussian noise. (Default: 0.1)
Shape:
  • images: \((N, C, H, W)\) where N = batch size, C = number of channels, H = height and W = width. Each value must be in the range [0, 1].
  • labels: \((N)\) where each value \(y_i\) satisfies \(0 \leq y_i <\) number of labels.
  • output: \((N, C, H, W)\).
Examples::
>>> attack = torchattacks.GN(model)
>>> adv_images = attack(images, labels)
forward(images, labels=None)[source]

Overridden.
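
Since GN is input-independent, its effect is easy to reproduce. Below is a minimal sketch of the update (illustrative; the helper name is hypothetical, not the library's internal code):

import torch

def gaussian_noise_sketch(images, std=0.1):
    # Illustrative sketch, not torchattacks' internal code.
    # Add zero-mean Gaussian noise with the given std, then clamp to [0, 1].
    return (images + std * torch.randn_like(images)).clamp(0, 1)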

FGSM

class torchattacks.attacks.fgsm.FGSM(model, eps=0.007)[source]

FGSM in the paper ‘Explaining and harnessing adversarial examples’ [https://arxiv.org/abs/1412.6572]

Distance Measure : Linf

Parameters:
  • model (nn.Module) – model to attack.
  • eps (float) – maximum perturbation. (Default: 0.007)
Shape:
  • images: \((N, C, H, W)\) where N = batch size, C = number of channels, H = height and W = width. Each value must be in the range [0, 1].
  • labels: \((N)\) where each value \(y_i\) satisfies \(0 \leq y_i <\) number of labels.
  • output: \((N, C, H, W)\).
Examples::
>>> attack = torchattacks.FGSM(model, eps=0.007)
>>> adv_images = attack(images, labels)
forward(images, labels)[source]

Overridden.
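
For reference, a minimal sketch of the FGSM update (illustrative; it assumes a cross-entropy loss and a hypothetical helper name, not the library's internal code):

import torch
import torch.nn as nn

def fgsm_sketch(model, images, labels, eps=0.007):
    # Illustrative sketch, not torchattacks' internal code.
    images = images.clone().detach().requires_grad_(True)
    loss = nn.CrossEntropyLoss()(model(images), labels)
    grad = torch.autograd.grad(loss, images)[0]
    # Single step in the direction of the gradient sign, clamped to [0, 1].
    return (images + eps * grad.sign()).clamp(0, 1).detach()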

BIM

class torchattacks.attacks.bim.BIM(model, eps=4/255, alpha=1/255, steps=0)[source]

BIM or iterative-FGSM in the paper ‘Adversarial Examples in the Physical World’ [https://arxiv.org/abs/1607.02533]

Distance Measure : Linf

Parameters:
  • model (nn.Module) – model to attack.
  • eps (float) – maximum perturbation. (Default: 4/255)
  • alpha (float) – step size. (Default: 1/255)
  • steps (int) – number of steps. (Default: 0)

Note

If steps is set to 0, the number of steps is chosen automatically following the paper, i.e. steps = int(min(eps*255 + 4, 1.25*eps*255)).

Shape:
  • images: \((N, C, H, W)\) where N = batch size, C = number of channels, H = height and W = width. Each value must be in the range [0, 1].
  • labels: \((N)\) where each value \(y_i\) satisfies \(0 \leq y_i <\) number of labels.
  • output: \((N, C, H, W)\).
Examples::
>>> attack = torchattacks.BIM(model, eps=4/255, alpha=1/255, steps=0)
>>> adv_images = attack(images, labels)
forward(images, labels)[source]

Overridden.
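
Each iteration takes a sign-gradient step and clips the result back into the eps-ball around the original images. A minimal sketch (illustrative; the helper name is hypothetical, not the library's internal code):

import torch
import torch.nn as nn

def bim_sketch(model, images, labels, eps=4/255, alpha=1/255, steps=5):
    # Illustrative sketch, not torchattacks' internal code.
    ori = images.clone().detach()
    adv = images.clone().detach()
    for _ in range(steps):
        adv.requires_grad_(True)
        loss = nn.CrossEntropyLoss()(model(adv), labels)
        grad = torch.autograd.grad(loss, adv)[0]
        adv = adv.detach() + alpha * grad.sign()
        # Clip into the eps-ball around the originals, then into [0, 1].
        adv = torch.clamp(adv, min=ori - eps, max=ori + eps).clamp(0, 1)
    return adv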

CW

class torchattacks.attacks.cw.CW(model, c=0.0001, kappa=0, steps=1000, lr=0.01)[source]

CW in the paper ‘Towards Evaluating the Robustness of Neural Networks’ [https://arxiv.org/abs/1608.04644]

Distance Measure : L2

Parameters:
  • model (nn.Module) – model to attack.
  • c (float) – c in the paper, balancing the perturbation size against the attack objective. (Default: 1e-4) \(\min_w \Vert\frac{1}{2}(\tanh(w)+1)-x\Vert^2_2 + c\cdot f(\frac{1}{2}(\tanh(w)+1))\)
  • kappa (float) – kappa (also written as ‘confidence’) in the paper. (Default: 0) \(f(x') = \max(\max\{Z(x')_i : i \neq t\} - Z(x')_t, -\kappa)\)
  • steps (int) – number of steps. (Default: 1000)
  • lr (float) – learning rate of the Adam optimizer. (Default: 0.01)

Warning

With the default c, the attack can rarely find adversarial images. Set a larger c, such as 1.

Shape:
  • images: \((N, C, H, W)\) where N = batch size, C = number of channels, H = height and W = width. Each value must be in the range [0, 1].
  • labels: \((N)\) where each value \(y_i\) satisfies \(0 \leq y_i <\) number of labels.
  • output: \((N, C, H, W)\).
Examples::
>>> attack = torchattacks.CW(model, c=1e-4, kappa=0, steps=1000, lr=0.01)
>>> adv_images = attack(images, labels)

Note

The binary search over c described in the paper is NOT implemented, as it is time-consuming.

forward(images, labels)[source]

Overridden.
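
A minimal sketch of the objective being minimized, combining the two formulas above (illustrative; the helper name is hypothetical and t denotes the class whose logit f pushes up):

import torch
import torch.nn.functional as F

def cw_objective_sketch(model, w, images, t, c=1e-4, kappa=0.0):
    # Illustrative sketch, not torchattacks' internal code.
    adv = 0.5 * (torch.tanh(w) + 1)                   # box constraint via tanh
    l2 = ((adv - images) ** 2).flatten(1).sum(dim=1)  # squared L2 distance
    logits = model(adv)
    one_hot = F.one_hot(t, logits.size(1)).bool()
    other_max = logits.masked_fill(one_hot, float('-inf')).max(dim=1).values
    target = logits.gather(1, t.view(-1, 1)).squeeze(1)
    f = torch.clamp(other_max - target, min=-kappa)   # f(x') from the formula above
    return (l2 + c * f).sum()                         # minimized by Adam over w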

R+FGSM

class torchattacks.attacks.rfgsm.RFGSM(model, eps=16/255, alpha=8/255, steps=1)[source]

R+FGSM in the paper ‘Ensemble Adversarial Training : Attacks and Defences’ [https://arxiv.org/abs/1705.07204]

Distance Measure : Linf

Parameters:
  • model (nn.Module) – model to attack.
  • eps (float) – strength of the attack or maximum perturbation. (Default: 16/255)
  • alpha (float) – step size. (Default: 8/255)
  • steps (int) – number of steps. (Default: 1)
Shape:
  • images: \((N, C, H, W)\) where N = batch size, C = number of channels, H = height and W = width. Each value must be in the range [0, 1].
  • labels: \((N)\) where each value \(y_i\) satisfies \(0 \leq y_i <\) number of labels.
  • output: \((N, C, H, W)\).
Examples::
>>> attack = torchattacks.RFGSM(model, eps=16/255, alpha=8/255, steps=1)
>>> adv_images = attack(images, labels)
forward(images, labels)[source]

Overridden.
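
A minimal sketch of the single-step form from the paper (illustrative; the helper name is hypothetical): a random sign step of size alpha, then a gradient-sign step with the remaining budget eps - alpha:

import torch
import torch.nn as nn

def rfgsm_sketch(model, images, labels, eps=16/255, alpha=8/255):
    # Illustrative sketch, not torchattacks' internal code.
    adv = (images + alpha * torch.randn_like(images).sign()).clamp(0, 1)
    adv.requires_grad_(True)
    loss = nn.CrossEntropyLoss()(model(adv), labels)
    grad = torch.autograd.grad(loss, adv)[0]
    return (adv.detach() + (eps - alpha) * grad.sign()).clamp(0, 1)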

PGD

class torchattacks.attacks.pgd.PGD(model, eps=0.3, alpha=2/255, steps=40, random_start=True)[source]

PGD in the paper ‘Towards Deep Learning Models Resistant to Adversarial Attacks’ [https://arxiv.org/abs/1706.06083]

Distance Measure : Linf

Parameters:
  • model (nn.Module) – model to attack.
  • eps (float) – maximum perturbation. (Default: 0.3)
  • alpha (float) – step size. (Default: 2/255)
  • steps (int) – number of steps. (Default: 40)
  • random_start (bool) – using random initialization of delta. (Default: True)
Shape:
  • images: \((N, C, H, W)\) where N = batch size, C = number of channels, H = height and W = width. Each value must be in the range [0, 1].
  • labels: \((N)\) where each value \(y_i\) satisfies \(0 \leq y_i <\) number of labels.
  • output: \((N, C, H, W)\).
Examples::
>>> attack = torchattacks.PGD(model, eps=8/255, alpha=1/255, steps=40, random_start=True)
>>> adv_images = attack(images, labels)
forward(images, labels)[source]

Overridden.
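
PGD is essentially the BIM loop sketched above plus an optional uniform random start inside the eps-ball. A sketch of that initialization (illustrative; hypothetical helper name):

import torch

def pgd_random_start_sketch(images, eps=8/255):
    # Illustrative sketch, not torchattacks' internal code.
    # Uniform random point in the eps-ball around the images, clamped to [0, 1];
    # the subsequent loop is the same sign-gradient step and projection as in BIM.
    adv = images + torch.empty_like(images).uniform_(-eps, eps)
    return adv.clamp(0, 1)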

PGDL2

class torchattacks.attacks.pgdl2.PGDL2(model, eps=1.0, alpha=0.2, steps=40, random_start=True, eps_for_division=1e-10)[source]

PGD in the paper ‘Towards Deep Learning Models Resistant to Adversarial Attacks’ [https://arxiv.org/abs/1706.06083]

Distance Measure : L2

Parameters:
  • model (nn.Module) – model to attack.
  • eps (float) – maximum perturbation. (Default: 1.0)
  • alpha (float) – step size. (Default: 0.2)
  • steps (int) – number of steps. (Default: 40)
  • random_start (bool) – using random initialization of delta. (Default: True)
Shape:
  • images: \((N, C, H, W)\) where N = batch size, C = number of channels, H = height and W = width. Each value must be in the range [0, 1].
  • labels: \((N)\) where each value \(y_i\) satisfies \(0 \leq y_i <\) number of labels.
  • output: \((N, C, H, W)\).
Examples::
>>> attack = torchattacks.PGDL2(model, eps=1.0, alpha=0.2, steps=40, random_start=True)
>>> adv_images = attack(images, labels)
forward(images, labels)[source]

Overridden.
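
Here each step follows the L2-normalized gradient, and the perturbation is projected back onto the L2 ball of radius eps. A minimal sketch of one iteration (illustrative; hypothetical helper name); eps_for_division guards against division by zero:

import torch

def pgdl2_step_sketch(images, adv, grad, eps=1.0, alpha=0.2, eps_for_division=1e-10):
    # Illustrative sketch, not torchattacks' internal code.
    # Step along the per-image L2-normalized gradient.
    grad_norm = grad.flatten(1).norm(p=2, dim=1).clamp_min(eps_for_division)
    adv = adv + alpha * grad / grad_norm.view(-1, 1, 1, 1)
    # Project the perturbation back onto the L2 ball of radius eps.
    delta = adv - images
    delta_norm = delta.flatten(1).norm(p=2, dim=1).clamp_min(eps_for_division)
    factor = (eps / delta_norm).clamp(max=1.0).view(-1, 1, 1, 1)
    return (images + delta * factor).clamp(0, 1)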

EOTPGD (EOT + PGD)

class torchattacks.attacks.eotpgd.EOTPGD(model, eps=0.3, alpha=2/255, steps=40, eot_iter=10, random_start=True)[source]

Comment on “Adv-BNN: Improved Adversarial Defense through Robust Bayesian Neural Network” [https://arxiv.org/abs/1907.00895]

Distance Measure : Linf

Parameters:
  • model (nn.Module) – model to attack.
  • eps (float) – maximum perturbation. (Default: 0.3)
  • alpha (float) – step size. (Default: 2/255)
  • steps (int) – number of steps. (Default: 40)
  • eot_iter (int) – number of models to estimate the mean gradient. (Default: 10)
Shape:
  • images: \((N, C, H, W)\) where N = batch size, C = number of channels, H = height and W = width. Each value must be in the range [0, 1].
  • labels: \((N)\) where each value \(y_i\) satisfies \(0 \leq y_i <\) number of labels.
  • output: \((N, C, H, W)\).
Examples::
>>> attack = torchattacks.EOTPGD(model, eps=8/255, alpha=2/255, steps=40, eot_iter=10)
>>> adv_images = attack(images, labels)
forward(images, labels)[source]

Overridden.
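
For stochastic models (e.g. Bayesian neural networks), each PGD step uses the mean gradient over eot_iter sampled forward passes. A sketch of the gradient estimate (illustrative; hypothetical helper name):

import torch
import torch.nn as nn

def eot_gradient_sketch(model, adv, labels, eot_iter=10):
    # Illustrative sketch, not torchattacks' internal code.
    # Average the gradient over eot_iter stochastic forward passes.
    grad = torch.zeros_like(adv)
    for _ in range(eot_iter):
        adv_ = adv.clone().detach().requires_grad_(True)
        loss = nn.CrossEntropyLoss()(model(adv_), labels)
        grad += torch.autograd.grad(loss, adv_)[0]
    return grad / eot_iter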

TPGD (TRADES’ PGD)

class torchattacks.attacks.tpgd.TPGD(model, eps=8/255, alpha=2/255, steps=7)[source]

PGD based on KL-Divergence loss in the paper ‘Theoretically Principled Trade-off between Robustness and Accuracy’ [https://arxiv.org/abs/1901.08573]

Distance Measure : Linf

Parameters:
  • model (nn.Module) – model to attack.
  • eps (float) – strength of the attack or maximum perturbation. (Default: 8/255)
  • alpha (float) – step size. (Default: 2/255)
  • steps (int) – number of steps. (Default: 7)
Shape:
  • images: \((N, C, H, W)\) where N = batch size, C = number of channels, H = height and W = width. Each value must be in the range [0, 1].
  • output: \((N, C, H, W)\).
Examples::
>>> attack = torchattacks.TPGD(model, eps=8/255, alpha=2/255, steps=7)
>>> adv_images = attack(images)
forward(images, labels=None)[source]

Overridden.
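
Instead of cross-entropy on labels, the inner maximization uses the KL divergence between the model's predictions on the perturbed and clean images, which is why forward() ignores labels. A sketch of the loss (illustrative, following the TRADES formulation; hypothetical helper name):

import torch.nn as nn
import torch.nn.functional as F

def trades_kl_loss_sketch(model, adv, images):
    # Illustrative sketch, not torchattacks' internal code.
    # KL divergence between predictions on clean and perturbed inputs.
    return nn.KLDivLoss(reduction='sum')(
        F.log_softmax(model(adv), dim=1),
        F.softmax(model(images), dim=1),
    )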

FFGSM (Fast’s FGSM)

class torchattacks.attacks.ffgsm.FFGSM(model, eps=8/255, alpha=10/255)[source]

New FGSM proposed in ‘Fast is better than free: Revisiting adversarial training’ [https://arxiv.org/abs/2001.03994]

Distance Measure : Linf

Parameters:
  • model (nn.Module) – model to attack.
  • eps (float) – maximum perturbation. (Default: 8/255)
  • alpha (float) – step size. (Default: 10/255)
Shape:
  • images: \((N, C, H, W)\) where N = batch size, C = number of channels, H = height and W = width. Each value must be in the range [0, 1].
  • labels: \((N)\) where each value \(y_i\) satisfies \(0 \leq y_i <\) number of labels.
  • output: \((N, C, H, W)\).
Examples::
>>> attack = torchattacks.FFGSM(model, eps=8/255, alpha=10/255)
>>> adv_images = attack(images, labels)
forward(images, labels)[source]

Overridden.
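
FFGSM is FGSM with a uniform random start and a step size alpha that may exceed eps, followed by projection back into the eps-ball. A minimal sketch (illustrative; hypothetical helper name):

import torch
import torch.nn as nn

def ffgsm_sketch(model, images, labels, eps=8/255, alpha=10/255):
    # Illustrative sketch, not torchattacks' internal code.
    adv = (images + torch.empty_like(images).uniform_(-eps, eps)).clamp(0, 1)
    adv.requires_grad_(True)
    loss = nn.CrossEntropyLoss()(model(adv), labels)
    grad = torch.autograd.grad(loss, adv)[0]
    adv = adv.detach() + alpha * grad.sign()
    # Project the perturbation back into the eps-ball, then clamp to [0, 1].
    delta = torch.clamp(adv - images, min=-eps, max=eps)
    return (images + delta).clamp(0, 1)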

MIFGSM

class torchattacks.attacks.mifgsm.MIFGSM(model, eps=8/255, alpha=2/255, steps=5, decay=1.0)[source]

MI-FGSM in the paper ‘Boosting Adversarial Attacks with Momentum’ [https://arxiv.org/abs/1710.06081]

Distance Measure : Linf

Parameters:
  • model (nn.Module) – model to attack.
  • eps (float) – maximum perturbation. (Default: 8/255)
  • alpha (float) – step size. (Default: 2/255)
  • decay (float) – momentum factor. (Default: 1.0)
  • steps (int) – number of iterations. (Default: 5)
Shape:
  • images: \((N, C, H, W)\) where N = batch size, C = number of channels, H = height and W = width. Each value must be in the range [0, 1].
  • labels: \((N)\) where each value \(y_i\) satisfies \(0 \leq y_i <\) number of labels.
  • output: \((N, C, H, W)\).
Examples::
>>> attack = torchattacks.MIFGSM(model, eps=8/255, steps=5, decay=1.0)
>>> adv_images = attack(images, labels)
forward(images, labels)[source]

Overridden.
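
Each step accumulates the L1-normalized gradient into a momentum buffer before the sign update. A sketch of the update rule (illustrative; hypothetical helper name; eps-ball clipping as in BIM is omitted for brevity):

import torch

def mifgsm_step_sketch(adv, grad, momentum, alpha=2/255, decay=1.0):
    # Illustrative sketch, not torchattacks' internal code.
    # g_{t+1} = decay * g_t + grad / ||grad||_1 (per image), then a sign step.
    grad = grad / grad.abs().flatten(1).sum(dim=1).view(-1, 1, 1, 1)
    momentum = decay * momentum + grad
    return (adv + alpha * momentum.sign()).clamp(0, 1), momentum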

APGD

class torchattacks.attacks.apgd.APGD(model, norm='Linf', eps=8/255, steps=100, n_restarts=1, seed=0, loss='ce', eot_iter=1, rho=0.75, verbose=False)[source]

APGD in the paper ‘Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks’ [https://arxiv.org/abs/2003.01690] [https://github.com/fra31/auto-attack]

Distance Measure : Linf, L2

Parameters:
  • model (nn.Module) – model to attack.
  • norm (str) – Lp-norm of the attack. [‘Linf’, ‘L2’] (Default: ‘Linf’)
  • eps (float) – maximum perturbation. (Default: 8/255)
  • steps (int) – number of steps. (Default: 100)
  • n_restarts (int) – number of random restarts. (Default: 1)
  • seed (int) – random seed for the starting point. (Default: 0)
  • loss (str) – loss function optimized. [‘ce’, ‘dlr’] (Default: ‘ce’)
  • eot_iter (int) – number of iteration for EOT. (Default: 1)
  • rho (float) – parameter for step-size update (Default: 0.75)
  • verbose (bool) – print progress. (Default: False)
Shape:
  • images: \((N, C, H, W)\) where N = batch size, C = number of channels, H = height and W = width. Each value must be in the range [0, 1].
  • labels: \((N)\) where each value \(y_i\) satisfies \(0 \leq y_i <\) number of labels.
  • output: \((N, C, H, W)\).
Examples::
>>> attack = torchattacks.APGD(model, norm='Linf', eps=8/255, steps=100, n_restarts=1, seed=0, loss='ce', eot_iter=1, rho=.75, verbose=False)
>>> adv_images = attack(images, labels)
forward(images, labels)[source]

Overridden.

APGDT

class torchattacks.attacks.apgdt.APGDT(model, norm='Linf', eps=8/255, steps=100, n_restarts=1, seed=0, eot_iter=1, rho=0.75, verbose=False, n_classes=10)[source]

APGD-Targeted in the paper ‘Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks’. Runs the targeted attack against every wrong class. [https://arxiv.org/abs/2003.01690] [https://github.com/fra31/auto-attack]

Distance Measure : Linf, L2

Parameters:
  • model (nn.Module) – model to attack.
  • norm (str) – Lp-norm of the attack. [‘Linf’, ‘L2’] (Default: ‘Linf’)
  • eps (float) – maximum perturbation. (Default: 8/255)
  • steps (int) – number of steps. (Default: 100)
  • n_restarts (int) – number of random restarts. (Default: 1)
  • seed (int) – random seed for the starting point. (Default: 0)
  • eot_iter (int) – number of iteration for EOT. (Default: 1)
  • rho (float) – parameter for step-size update (Default: 0.75)
  • verbose (bool) – print progress. (Default: False)
  • n_classes (int) – number of classes. (Default: 10)
Shape:
  • images: \((N, C, H, W)\) where N = batch size, C = number of channels, H = height and W = width. Each value must be in the range [0, 1].
  • labels: \((N)\) where each value \(y_i\) satisfies \(0 \leq y_i <\) number of labels.
  • output: \((N, C, H, W)\).
Examples::
>>> attack = torchattacks.APGDT(model, norm='Linf', eps=8/255, steps=100, n_restarts=1, seed=0, eot_iter=1, rho=.75, verbose=False, n_classes=10)
>>> adv_images = attack(images, labels)
forward(images, labels)[source]

Overridden.

FAB

class torchattacks.attacks.fab.FAB(model, norm='Linf', eps=None, steps=100, n_restarts=1, alpha_max=0.1, eta=1.05, beta=0.9, verbose=False, seed=0, targeted=False, n_classes=10)[source]

Fast Adaptive Boundary Attack in the paper ‘Minimally distorted Adversarial Examples with a Fast Adaptive Boundary Attack’ [https://arxiv.org/abs/1907.02044] [https://github.com/fra31/auto-attack]

Distance Measure : Linf, L2, L1

Parameters:
  • model (nn.Module) – model to attack.
  • norm (str) – Lp-norm to minimize. [‘Linf’, ‘L2’, ‘L1’] (Default: ‘Linf’)
  • eps (float) – maximum perturbation. (Default: None)
  • steps (int) – number of steps. (Default: 100)
  • n_restarts (int) – number of random restarts. (Default: 1)
  • alpha_max (float) – alpha_max. (Default: 0.1)
  • eta (float) – overshooting. (Default: 1.05)
  • beta (float) – backward step. (Default: 0.9)
  • verbose (bool) – print progress. (Default: False)
  • seed (int) – random seed for the starting point. (Default: 0)
  • targeted (bool) – run the targeted attack against every wrong class. (Default: False)
  • n_classes (int) – number of classes. (Default: 10)
Shape:
  • images: \((N, C, H, W)\) where N = batch size, C = number of channels, H = height and W = width. Each value must be in the range [0, 1].
  • labels: \((N)\) where each value \(y_i\) satisfies \(0 \leq y_i <\) number of labels.
  • output: \((N, C, H, W)\).
Examples::
>>> attack = torchattacks.FAB(model, norm='Linf', steps=100, eps=None, n_restarts=1, alpha_max=0.1, eta=1.05, beta=0.9, verbose=False, seed=0, targeted=False, n_classes=10)
>>> adv_images = attack(images, labels)
attack_single_run(x, y=None, use_rand_start=False)[source]
Parameters:
  • x – clean images
  • y – clean labels; if None, the predicted labels are used
attack_single_run_targeted(x, y=None, use_rand_start=False)[source]
Parameters:
  • x – clean images
  • y – clean labels; if None, the predicted labels are used
forward(images, labels)[source]

Overridden.

Square

class torchattacks.attacks.square.Square(model, norm='Linf', eps=None, n_queries=5000, n_restarts=1, p_init=0.8, loss='margin', resc_schedule=True, seed=0, verbose=False)[source]

Square Attack in the paper ‘Square Attack: a query-efficient black-box adversarial attack via random search’ [https://arxiv.org/abs/1912.00049] [https://github.com/fra31/auto-attack]

Distance Measure : Linf, L2

Parameters:
  • model (nn.Module) – model to attack.
  • norm (str) – Lp-norm of the attack. [‘Linf’, ‘L2’] (Default: ‘Linf’)
  • eps (float) – maximum perturbation. (Default: None)
  • n_queries (int) – max number of queries (each restart). (Default: 5000)
  • n_restarts (int) – number of random restarts. (Default: 1)
  • p_init (float) – parameter to control size of squares. (Default: 0.8)
  • loss (str) – loss function optimized [‘margin’, ‘ce’] (Default: ‘margin’)
  • resc_schedule (bool) – adapt schedule of p to n_queries (Default: True)
  • seed (int) – random seed for the starting point. (Default: 0)
  • verbose (bool) – print progress. (Default: False)
  • targeted (bool) – use a targeted attack. (Default: False)
Shape:
  • images: \((N, C, H, W)\) where N = batch size, C = number of channels, H = height and W = width. Each value must be in the range [0, 1].
  • labels: \((N)\) where each value \(y_i\) satisfies \(0 \leq y_i <\) number of labels.
  • output: \((N, C, H, W)\).
Examples::
>>> attack = torchattacks.Square(model, norm='Linf', eps=None, n_queries=5000, n_restarts=1, p_init=.8, loss='margin', resc_schedule=True, seed=0, verbose=False)
>>> adv_images = attack(images, labels)
forward(images, labels)[source]

Overridden.

margin_and_loss(x, y)[source]
Parameters:y – correct labels if untargeted, else target labels
p_selection(it)[source]

Schedule to decrease the parameter p.

perturb(x, y=None)[source]
Parameters:
  • x – clean images
  • y – for an untargeted attack, clean labels (if None, the predicted labels are used); for a targeted attack, target labels (if None, random classes different from the predicted ones are sampled)

AutoAttack

class torchattacks.attacks.autoattack.AutoAttack(model, norm='Linf', eps=0.3, version='standard', n_classes=10, seed=None, verbose=False)[source]

AutoAttack in the paper ‘Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks’ [https://arxiv.org/abs/2003.01690] [https://github.com/fra31/auto-attack]

Distance Measure : Linf, L2

Parameters:
  • model (nn.Module) – model to attack.
  • norm (str) – Lp-norm to minimize. [‘Linf’, ‘L2’] (Default: ‘Linf’)
  • eps (float) – maximum perturbation. (Default: 0.3)
  • version (str) – version. [‘standard’, ‘plus’, ‘rand’] (Default: ‘standard’)
  • n_classes (int) – number of classes. (Default: 10)
  • seed (int) – random seed for the starting point. (Default: None)
  • verbose (bool) – print progress. (Default: False)
Shape:
  • images: \((N, C, H, W)\) where N = batch size, C = number of channels, H = height and W = width. Each value must be in the range [0, 1].
  • labels: \((N)\) where each value \(y_i\) satisfies \(0 \leq y_i <\) number of labels.
  • output: \((N, C, H, W)\).
Examples::
>>> attack = torchattacks.AutoAttack(model, norm='Linf', eps=.3, version='standard', n_classes=10, seed=None, verbose=False)
>>> adv_images = attack(images, labels)
forward(images, labels)[source]

Overridden.

OnePixel

class torchattacks.attacks.onepixel.OnePixel(model, pixels=1, steps=75, popsize=400, inf_batch=128)[source]

Attack in the paper ‘One pixel attack for fooling deep neural networks’ [https://arxiv.org/abs/1710.08864]

Modified from "https://github.com/DebangLi/one-pixel-attack-pytorch/" and "https://github.com/sarathknv/adversarial-examples-pytorch/blob/master/one_pixel_attack/"

Distance Measure : L0

Parameters:
  • model (nn.Module) – model to attack.
  • pixels (int) – number of pixels to change (Default: 1)
  • steps (int) – number of steps. (Default: 75)
  • popsize (int) – population size, i.e. the number of candidate agents or “parents” in differential evolution (Default: 400)
  • inf_batch (int) – maximum batch size during inference (Default: 128)
Shape:
  • images: \((N, C, H, W)\) where N = batch size, C = number of channels, H = height and W = width. Each value must be in the range [0, 1].
  • labels: \((N)\) where each value \(y_i\) satisfies \(0 \leq y_i <\) number of labels.
  • output: \((N, C, H, W)\).
Examples::
>>> attack = torchattacks.OnePixel(model, pixels=1, steps=75, popsize=400, inf_batch=128)
>>> adv_images = attack(images, labels)
forward(images, labels)[source]

Overridden.

DeepFool

class torchattacks.attacks.deepfool.DeepFool(model, steps=50, overshoot=0.02)[source]

‘DeepFool: A Simple and Accurate Method to Fool Deep Neural Networks’ [https://arxiv.org/abs/1511.04599]

Distance Measure : L2

Parameters:
  • model (nn.Module) – model to attack.
  • steps (int) – number of steps. (Default: 50)
  • overshoot (float) – parameter for enhancing the noise. (Default: 0.02)
Shape:
  • images: \((N, C, H, W)\) where N = batch size, C = number of channels, H = height and W = width. Each value must be in the range [0, 1].
  • labels: \((N)\) where each value \(y_i\) satisfies \(0 \leq y_i <\) number of labels.
  • output: \((N, C, H, W)\).
Examples::
>>> attack = torchattacks.DeepFool(model, steps=50, overshoot=0.02)
>>> adv_images = attack(images, labels)
forward(images, labels, return_target_labels=False)[source]

Overridden.
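
Each iteration linearizes the decision boundaries around the current point and takes the smallest L2 step that crosses the nearest one, scaled by (1 + overshoot). A sketch of one step for a single image (illustrative; hypothetical helper name, not the library's batched code):

import torch

def deepfool_step_sketch(model, x, y, overshoot=0.02):
    # Illustrative sketch, not torchattacks' internal code.
    x = x.clone().detach().requires_grad_(True)
    logits = model(x.unsqueeze(0))[0]
    grads = [torch.autograd.grad(logits[k], x, retain_graph=True)[0]
             for k in range(logits.size(0))]
    best_r, best_dist = None, float('inf')
    for k in range(logits.size(0)):
        if k == y:
            continue
        w = grads[k] - grads[y]             # linearized boundary normal
        f = (logits[k] - logits[y]).item()  # logit gap to class k
        wn = w.flatten().norm().item() + 1e-8
        if abs(f) / wn < best_dist:         # nearest boundary wins
            best_dist = abs(f) / wn
            best_r = (abs(f) / wn ** 2) * w
    return (x + (1 + overshoot) * best_r).clamp(0, 1).detach()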

SparseFool

class torchattacks.attacks.sparsefool.SparseFool(model, steps=20, lam=3, overshoot=0.02)[source]

Attack in the paper ‘SparseFool: a few pixels make a big difference’ [https://arxiv.org/abs/1811.02248]

Modified from "https://github.com/LTS4/SparseFool/"

Distance Measure : L0

Parameters:
  • model (nn.Module) – model to attack.
  • steps (int) – number of steps. (Default: 20)
  • lam (float) – parameter for scaling DeepFool noise. (Default: 3)
  • overshoot (float) – parameter for enhancing the noise. (Default: 0.02)
Shape:
  • images: \((N, C, H, W)\) where N = batch size, C = number of channels, H = height and W = width. Each value must be in the range [0, 1].
  • labels: \((N)\) where each value \(y_i\) satisfies \(0 \leq y_i <\) number of labels.
  • output: \((N, C, H, W)\).
Examples::
>>> attack = torchattacks.SparseFool(model, steps=20, lam=3, overshoot=0.02)
>>> adv_images = attack(images, labels)
forward(images, labels)[source]

Overridden.

DIFGSM

class torchattacks.attacks.difgsm.DIFGSM(model, eps=8/255, alpha=2/255, steps=20, decay=0.0, resize_rate=0.9, diversity_prob=0.5, random_start=False)[source]

DI2-FGSM in the paper ‘Improving Transferability of Adversarial Examples with Input Diversity’ [https://arxiv.org/abs/1803.06978]

Distance Measure : Linf

Parameters:
  • model (nn.Module) – model to attack.
  • eps (float) – maximum perturbation. (Default: 8/255)
  • alpha (float) – step size. (Default: 2/255)
  • decay (float) – momentum factor. (Default: 0.0)
  • steps (int) – number of iterations. (Default: 20)
  • resize_rate (float) – resize factor used in input diversity. (Default: 0.9)
  • diversity_prob (float) – the probability of applying input diversity. (Default: 0.5)
  • random_start (bool) – using random initialization of delta. (Default: False)
Shape:
  • images: \((N, C, H, W)\) where N = batch size, C = number of channels, H = height and W = width. Each value must be in the range [0, 1].
  • labels: \((N)\) where each value \(y_i\) satisfies \(0 \leq y_i <\) number of labels.
  • output: \((N, C, H, W)\).
Examples::
>>> attack = torchattacks.DIFGSM(model, eps=8/255, alpha=2/255, steps=20, decay=0.0, resize_rate=0.9, diversity_prob=0.5, random_start=False)
>>> adv_images = attack(images, labels)
forward(images, labels)[source]

Overridden.
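
The gradient at each step is computed on a randomly transformed copy of the input: with probability diversity_prob, the image is randomly resized (governed by resize_rate) and zero-padded back to a fixed size. A sketch of that transform (illustrative; hypothetical helper name):

import torch
import torch.nn.functional as F

def input_diversity_sketch(x, resize_rate=0.9, diversity_prob=0.5):
    # Illustrative sketch, not torchattacks' internal code.
    if torch.rand(1).item() > diversity_prob:
        return x                                   # keep the input unchanged
    size = x.shape[-1]
    lo, hi = sorted((size, int(size * resize_rate)))
    rnd = int(torch.randint(lo, hi, (1,)).item())  # random intermediate size
    x = F.interpolate(x, size=(rnd, rnd), mode='bilinear', align_corners=False)
    pad = hi - rnd                                 # random zero padding back to hi
    left = int(torch.randint(0, pad + 1, (1,)).item())
    top = int(torch.randint(0, pad + 1, (1,)).item())
    return F.pad(x, [left, pad - left, top, pad - top], value=0)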

UPGD

class torchattacks.attacks.upgd.UPGD(model, eps=8/255, alpha=2/255, steps=40, random_start=False, loss='ce', decay=1.0, eot_iter=1)[source]

Ultimate PGD that supports various options of gradient-based adversarial attacks.

Distance Measure : Linf

Parameters:
  • model (nn.Module) – model to attack.
  • eps (float) – maximum perturbation. (Default: 8/255)
  • alpha (float) – step size. (Default: 2/255)
  • steps (int) – number of steps. (Default: 40)
  • random_start (bool) – using random initialization of delta. (Default: False)
  • loss (str) – loss function. [‘ce’, ‘margin’, ‘dlr’] (Default: ‘ce’)
  • decay (float) – momentum factor. (Default: 1.0)
  • eot_iter (int) – number of models to estimate the mean gradient. (Default: 1)
Shape:
  • images: \((N, C, H, W)\) where N = batch size, C = number of channels, H = height and W = width. Each value must be in the range [0, 1].
  • labels: \((N)\) where each value \(y_i\) satisfies \(0 \leq y_i <\) number of labels.
  • output: \((N, C, H, W)\).
Examples::
>>> attack = torchattacks.UPGD(model, eps=8/255, alpha=1/255, steps=40, random_start=False)
>>> adv_images = attack(images, labels)
forward(images, labels)[source]

Overridden.

TIFGSM

class torchattacks.attacks.tifgsm.TIFGSM(model, eps=8/255, alpha=2/255, steps=20, decay=0.0, kernel_name='gaussian', len_kernel=15, nsig=3, resize_rate=0.9, diversity_prob=0.5, random_start=False)[source]

TIFGSM in the paper ‘Evading Defenses to Transferable Adversarial Examples by Translation-Invariant Attacks’ [https://arxiv.org/abs/1904.02884]

Distance Measure : Linf

Parameters:
  • model (nn.Module) – model to attack.
  • eps (float) – maximum perturbation. (Default: 8/255)
  • alpha (float) – step size. (Default: 2/255)
  • steps (int) – number of iterations. (Default: 20)
  • decay (float) – momentum factor. (Default: 0.0)
  • kernel_name (str) – kernel name. (Default: gaussian)
  • len_kernel (int) – kernel length. (Default: 15, which is the best according to the paper)
  • nsig (int) – radius of gaussian kernel. (Default: 3; see Section 3.2.2 in the paper for explanation)
  • resize_rate (float) – resize factor used in input diversity. (Default: 0.9)
  • diversity_prob (float) – the probability of applying input diversity. (Default: 0.5)
  • random_start (bool) – using random initialization of delta. (Default: False)
Shape:
  • images: \((N, C, H, W)\) where N = batch size, C = number of channels, H = height and W = width. Each value must be in the range [0, 1].
  • labels: \((N)\) where each value \(y_i\) satisfies \(0 \leq y_i <\) number of labels.
  • output: \((N, C, H, W)\).
Examples::
>>> attack = torchattacks.TIFGSM(model, eps=8/255, alpha=2/255, steps=20, decay=1.0, resize_rate=0.9, diversity_prob=0.7, random_start=False)
>>> adv_images = attack(images, labels)
forward(images, labels)[source]

Overridden.

gkern(kernlen=15, nsig=3)[source]

Returns a 2D Gaussian kernel array.
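
The kernel is used to smooth the gradient before the sign step, which is what makes the attack translation-invariant. A plausible sketch of gkern (illustrative; the helper name is hypothetical): the outer product of a 1D Gaussian sampled on [-nsig, nsig], normalized to sum to 1.

import numpy as np
from scipy import stats

def gkern_sketch(kernlen=15, nsig=3):
    # Illustrative sketch, not necessarily the library's exact implementation.
    x = np.linspace(-nsig, nsig, kernlen)
    kern1d = stats.norm.pdf(x)         # 1D Gaussian profile
    kern2d = np.outer(kern1d, kern1d)  # separable 2D kernel
    return kern2d / kern2d.sum()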

Jitter

class torchattacks.attacks.jitter.Jitter(model, eps=0.3, alpha=2/255, steps=40, scale=10, std=0.1, random_start=True)[source]

Jitter in the paper ‘Exploring Misclassifications of Robust Neural Networks to Enhance Adversarial Attacks’ [https://arxiv.org/abs/2105.10304]

Distance Measure : Linf

Parameters:
  • model (nn.Module) – model to attack.
  • eps (float) – maximum perturbation. (Default: 0.3)
  • alpha (float) – step size. (Default: 2/255)
  • steps (int) – number of steps. (Default: 40)
  • scale (float) – scale factor in the Jitter loss. (Default: 10)
  • std (float) – standard deviation of the noise used in the Jitter loss. (Default: 0.1)
  • random_start (bool) – using random initialization of delta. (Default: True)
Shape:
  • images: \((N, C, H, W)\) where N = batch size, C = number of channels, H = height and W = width. Each value must be in the range [0, 1].
  • labels: \((N)\) where each value \(y_i\) satisfies \(0 \leq y_i <\) number of labels.
  • output: \((N, C, H, W)\).
Examples::
>>> attack = torchattacks.Jitter(model, eps=0.3, alpha=2/255, steps=40, scale=10, std=0.1, random_start=True)
>>> adv_images = attack(images, labels)
forward(images, labels)[source]

Overridden.

Pixle

class torchattacks.attacks.pixle.Pixle(model, x_dimensions=(2, 10), y_dimensions=(2, 10), pixel_mapping='random', restarts=20, max_iterations=100, update_each_iteration=False)[source]

‘Pixle: a fast and effective black-box attack based on rearranging pixels’ [https://arxiv.org/abs/2202.02236]

Distance Measure : L0

Parameters:
  • model (nn.Module) – model to attack.
  • x_dimensions (int, float, or a tuple containing a combination of those) – size of the sampled patch along the x axis at each iteration. An integer is interpreted as a fixed size in pixels, a float as a percentage of the image size; a tuple specifies the lower and upper bounds of the size. (Default: (2, 10))
  • y_dimensions (int, float, or a tuple containing a combination of those) – size of the sampled patch along the y axis at each iteration, interpreted as for x_dimensions. (Default: (2, 10))
  • pixel_mapping (str) – the type of mapping used to move the pixels. Can be: ‘random’, ‘similarity’, ‘similarity_random’, ‘distance’, ‘distance_random’ (Default: random)
  • restarts (int) – the number of restarts that the algorithm performs. (Default: 20)
  • max_iterations (int) – number of iterations to perform for each restart. (Default: 100)
  • update_each_iteration (bool) – if the attacked images must be modified after each iteration (True) or after each restart (False). (Default: False)
Shape:
  • images: \((N, C, H, W)\) where N = batch size, C = number of channels, H = height and W = width. Each value must be in the range [0, 1].
  • labels: \((N)\) where each value \(y_i\) satisfies \(0 \leq y_i <\) number of labels.
  • output: \((N, C, H, W)\).
Examples::
>>> attack = torchattacks.Pixle(model, x_dimensions=(0.1, 0.2), restarts=100, max_iterations=50)
>>> adv_images = attack(images, labels)
forward(images, labels)[source]

It defines the computation performed at every call. Should be overridden by all subclasses.