LoRID: A Breakthrough Low-Rank Iterative Diffusion Method for Adversarial Noise Removal

Neural networks are widely adopted across domains because of their ability to model complex patterns and relationships. However, they are critically vulnerable to adversarial attacks: small, malicious input perturbations that cause unpredictable outputs. This vulnerability poses significant challenges to the reliability and security of machine learning systems in practice. While several defenses, such as adversarial training and adversarial purification, have been developed, they often fail to provide robust protection against sophisticated attacks. The rise of diffusion models has enabled diffusion-based adversarial purification, which improves robustness, but these methods still face high computational cost and the risk of new attack strategies crafted specifically to weaken the defense.
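To make the threat concrete, here is a minimal sketch of an FGSM-style attack on a toy linear classifier. The model, names, and step size are illustrative, not from the paper: for a linear score the input gradient is the weight vector itself, so one signed step slightly larger than score/‖w‖₁ is guaranteed to flip the prediction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear "classifier": sign(w . x) decides the class.
w = rng.normal(size=16)
x = rng.normal(size=16)
if w @ x < 0:
    x = -x                      # make the clean prediction class 1

def predict(v):
    return int(w @ v > 0)

# FGSM-style step: for a linear score the input gradient is w itself,
# so stepping each coordinate by -eps * sign(w) lowers the score by
# eps * ||w||_1; choosing eps just past score/||w||_1 flips the label.
eps = 1.1 * (w @ x) / np.abs(w).sum()
x_adv = x - eps * np.sign(w)

print(predict(x), predict(x_adv))   # prints "1 0": the label flips
```

Even though every coordinate moves by at most `eps`, the per-coordinate nudges align with the gradient and accumulate, which is exactly why small perturbations suffice against high-dimensional models.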

Existing approaches to adversarial defense include Denoising Diffusion Probabilistic Models (DDPMs), a class of generative models that add noise to input signals during training and then learn to denoise the resulting noisy signal. Diffusion models have been used as adversarial purifiers in two main flavors: Markov-based (DDPM-based) purification and score-based purification. The latter introduces a guidance term to preserve the semantics of the sample, while DensePure draws multiple reversed samples and uses majority voting for the final prediction. Lastly, Tucker decomposition, a method for analyzing high-dimensional data arrays, has shown potential in feature extraction, suggesting a path for enhancing adversarial purification techniques.
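The DDPM forward process mentioned above has a convenient closed form, which is what makes diffusion-based purification practical: an input can be jumped directly to any noise level without simulating every step. A minimal numpy sketch, using common schedule defaults rather than values from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

# DDPM forward process in closed form:
#   x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * noise,
# where abar_t is the cumulative product of (1 - beta) over the schedule.
T = 1000
betas = np.linspace(1e-4, 0.02, T)      # a common linear noise schedule
alphas_bar = np.cumprod(1.0 - betas)

def q_sample(x0, t, noise):
    """Sample from q(x_t | x_0) at (0-indexed) timestep t."""
    return np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * noise

x0 = rng.normal(size=(8, 8))            # stand-in for an image
noise = rng.normal(size=x0.shape)

x_early = q_sample(x0, 50, noise)       # early step: still mostly signal
x_late = q_sample(x0, T - 1, noise)     # final step: almost pure noise
```

Purification exploits the early-timestep regime, where the signal still dominates: a small adversarial perturbation is drowned out by the injected Gaussian noise, and the learned reverse process then removes both together.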

Researchers from the Theoretical Division and Computational Sciences at Los Alamos National Laboratory, Los Alamos, NM, have proposed LoRID, a novel Low-Rank Iterative Diffusion purification method designed to remove adversarial perturbations with low intrinsic purification error. LoRID overcomes the limitations of current diffusion-based purification methods by providing a theoretical characterization of the purification errors of Markov-based diffusion methods. Moreover, it uses a multistage purification process that combines multiple rounds of diffusion-denoising loops at early time steps of the diffusion model with Tucker decomposition. This combination removes adversarial noise even in high-noise regimes and strengthens robustness against strong adversarial attacks.
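The paper's exact algorithm is not reproduced here, but the looping idea can be sketched on a toy problem. In the sketch below, the clean signal is a low-rank matrix, the "adversarial" input adds a small full-rank perturbation, and a truncated SVD (the matrix case of a Tucker truncation) stands in for both the learned early-timestep denoiser and the Tucker step; the loop count, noise scale, and rank are illustrative assumptions, not the paper's values.

```python
import numpy as np

rng = np.random.default_rng(3)

# Clean "image": a low-rank 32x32 matrix (rank 3).
m, r = 32, 3
x_clean = rng.normal(size=(m, r)) @ rng.normal(size=(r, m))

# Adversarial example: clean signal plus a small full-rank perturbation.
x_adv = x_clean + 0.1 * rng.normal(size=(m, m))

def low_rank_project(x, rank):
    """Truncated SVD: the matrix analogue of a Tucker truncation."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank]

def purify(x, rank=r, loops=3, sigma=0.02):
    """Toy LoRID-style loop: repeatedly diffuse (add a little Gaussian
    noise) and "denoise" (here a low-rank projection, standing in for a
    trained early-timestep diffusion denoiser)."""
    for _ in range(loops):
        x = x + sigma * rng.normal(size=x.shape)   # forward diffusion step
        x = low_rank_project(x, rank)              # denoising step
    return x

x_pure = purify(x_adv)
err_adv = np.linalg.norm(x_adv - x_clean)
err_pure = np.linalg.norm(x_pure - x_clean)
```

Because the clean signal lives in a low-rank subspace while the adversarial perturbation spreads its energy across all directions, each diffuse-then-project round discards most of that energy. This is the intuition behind pairing early-timestep looping with low-rank structure.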

LoRID is evaluated on multiple datasets, including CIFAR-10/100, CelebA-HQ, and ImageNet, against state-of-the-art (SOTA) defense methods, using WideResNet classifiers and reporting both standard and robust accuracy. Performance is tested under two threat models: in the black-box setting, the attacker knows only the classifier, while in the white-box setting, the attacker has full knowledge of both the classifier and the purification scheme. The proposed method is evaluated against AutoAttack on CIFAR-10/100 and BPDA+EOT on CelebA-HQ in black-box settings, and against AutoAttack and PGD+EOT in white-box settings.

The results demonstrate LoRID's superior performance across datasets and attack scenarios. On CIFAR-10, it significantly improves both standard and robust accuracy against AutoAttack in black-box and white-box settings; for example, black-box robust accuracy improves by 23.15% on WideResNet-28-10 and by 4.27% on WideResNet-70-16. On CelebA-HQ, LoRID outperforms the best baseline by 7.17% in robust accuracy against BPDA+EOT attacks while maintaining high standard accuracy. At a high noise level (ϵ = 32/255), its robustness exceeds SOTA performance at the standard noise level (ϵ = 8/255) by 12.8%, demonstrating its potential for handling severe adversarial perturbations.

In conclusion, the researchers have introduced LoRID, a defense against adversarial attacks that runs multiple diffusion-denoising loops in the early stages of a diffusion model to purify adversarial examples, further strengthened by Tucker decomposition in high-noise regimes. LoRID's effectiveness is validated through theoretical analysis and detailed experiments on diverse datasets, including CIFAR-10/100, ImageNet, and CelebA-HQ. These results establish LoRID as a promising advance in adversarial defense, providing stronger protection for neural networks against a wide range of complex attack strategies.


Check out the Paper. All credit for this research goes to the researchers of this project.


Sajjad Ansari is a final-year undergraduate at IIT Kharagpur. As a tech enthusiast, he delves into practical applications of AI, with a focus on understanding the impact of AI technologies and their real-world implications, and aims to articulate complex AI concepts in a clear and accessible manner.