Machine unlearning is driven by the need for data autonomy: individuals should be able to request that the influence of their data be removed from trained machine learning models. The field complements data privacy research, which focuses on preventing models from leaking sensitive information about their training data through attacks such as membership inference or reconstruction. While differential privacy limits those risks, unlearning aims to delete data from an already-trained model so that it behaves as if the data had never been included. Achieving this efficiently, without retraining the entire model, has been a central focus, particularly for complex models such as deep neural networks.
However, unlearning introduces new privacy risks of its own. An adversary who can compare a model's parameters before and after a deletion can exploit the difference to reconstruct the deleted data, even for simple models such as linear regression. The attack relates the parameter change caused by unlearning to the gradient of the deleted sample, using an expected Hessian estimated from public data. This exposes a distinctive vulnerability: the unlearning update itself can leak the very data it was meant to erase. By extending existing gradient-based reconstruction techniques, the research shows how unlearning can facilitate highly accurate, and in some cases near-exact, data reconstruction, emphasizing the importance of safeguards like differential privacy to mitigate these risks.
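To make the mechanism concrete, here is a minimal sketch of that idea for ridge (regularized linear) regression. It is an illustration under simplifying assumptions, not the authors' exact procedure: the synthetic data, the public pool X_pub, and the single-sample deletion are all hypothetical, and the reconstruction uses the first-order relation that the deleted sample's gradient is approximately the Hessian times the observed parameter change.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: synthetic data standing in for the private training set.
n, d, lam = 200, 20, 1e-2
X = rng.normal(size=(n, d))
theta_true = rng.normal(size=d)
y = X @ theta_true + 0.1 * rng.normal(size=n)

def ridge_fit(X, y, lam):
    """Closed-form ridge regression: (X^T X + lam * I)^{-1} X^T y."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# Model released before deletion, and after fully retraining without the last sample.
theta_before = ridge_fit(X, y, lam)
theta_after = ridge_fit(X[:-1], y[:-1], lam)
x_deleted = X[-1]

# Attacker side: estimate E[x x^T] from public samples drawn from the same
# distribution -- the attacker never sees the deleted sample itself.
X_pub = rng.normal(size=(1000, d))            # assumption: such a public pool exists
H_hat = (X_pub.T @ X_pub) / len(X_pub)

# Stationarity of both solutions implies the deleted sample's gradient is roughly
# proportional to Hessian @ (theta_after - theta_before); for squared loss that
# gradient is itself a scalar multiple of the deleted feature vector, so the
# direction of x is recovered up to sign and scale.
x_hat = H_hat @ (theta_after - theta_before)

cos = abs(x_hat @ x_deleted) / (np.linalg.norm(x_hat) * np.linalg.norm(x_deleted))
print(f"cosine similarity with the deleted sample: {cos:.3f}")
```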
Researchers from AWS AI, the University of Pennsylvania, the University of Washington, Carnegie Mellon University, and Jump Trading show that data deletion from machine learning models, even simple ones, exposes individuals to high-accuracy reconstruction attacks. These attacks recover deleted data by exploiting the difference in model parameters before and after deletion. The study demonstrates effective attacks on linear regression models trained with closed-form algorithms and extends these methods to models with pre-trained embeddings and to generic architectures via Newton's method. Experiments on tabular and image datasets highlight the significant privacy risks of unlearning by full retraining when safeguards such as differential privacy are absent.
The researchers present an attack that reconstructs a deleted user's data from a regularized linear regression model by analyzing how the parameters change before and after deletion. The method exploits the relationship between the model parameters and the removed sample, approximating the key statistics (such as the expected Hessian) from public data. The approach carries over to models built on fixed, pre-trained embeddings and extends to non-linear architectures and other loss functions through a Newton-style approximation. Experiments demonstrate its applicability to multiclass classification, where estimating the deleted sample's gradient allows both the input and its label to be reconstructed. This highlights how vulnerable models are to privacy breaches in the absence of safeguards, as the attack remains effective across various architectures and loss functions.
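As an illustration of the multiclass case, the sketch below inverts a per-sample gradient of a linear softmax classifier. It assumes the attacker has already estimated the deleted sample's gradient G (for example, via the Newton-style approximation from the parameter difference); the helper invert_softmax_gradient and the synthetic, noise-free gradient are hypothetical and only demonstrate why such a gradient reveals both the input and its label.

```python
import numpy as np

def invert_softmax_gradient(G):
    """Recover the input direction and label from an estimated per-sample gradient
    G of shape (num_classes, dim) for a linear softmax classifier, where
    G = (softmax(W x) - onehot(y)) x^T. Illustrative helper only."""
    # Every row of G is a scalar multiple of x, so the leading right singular
    # vector recovers x up to sign and scale.
    _, _, vt = np.linalg.svd(G, full_matrices=False)
    x_dir = vt[0]
    # Row coefficients, proportional to the softmax probabilities (minus 1 at the true label).
    coeffs = G @ x_dir
    if (coeffs < 0).sum() > 1:  # with the wrong sign, all but one entry would be negative
        x_dir, coeffs = -x_dir, -coeffs
    label = int(np.argmin(coeffs))  # only the true class has coefficient p_y - 1 < 0
    return x_dir, label

# Demo with a synthetic gradient for a hypothetical deleted sample.
rng = np.random.default_rng(1)
num_classes, dim = 10, 64
x_deleted, y_deleted = rng.normal(size=dim), 3
p = rng.dirichlet(np.ones(num_classes))  # stand-in softmax output
G = (p - np.eye(num_classes)[y_deleted])[:, None] * x_deleted[None, :]

x_hat, y_hat = invert_softmax_gradient(G)
cos = abs(x_hat @ x_deleted) / np.linalg.norm(x_deleted)  # x_hat is unit norm
print(f"cosine similarity: {cos:.3f}, inferred label: {y_hat} (true: {y_deleted})")
```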
The study evaluates the attack across diverse classification and regression tasks on both tabular and image data. Unlearning is performed by full retraining, and the authors compare model parameters before and after a single sample's deletion. The attack uses only public data drawn from the same distribution and requires no knowledge of the deleted sample. Against baselines such as "Avg" (the average of public samples) and "MaxDiff" (maximizing parameter change), the attack consistently achieves higher cosine similarity with the deleted sample. Tested on MNIST, CIFAR-10, and ACS income data, the approach reconstructs deleted samples effectively across a range of models, underscoring vulnerabilities in machine learning systems and the need for privacy safeguards.
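The scoring metric and the simplest baseline are straightforward to sketch. The snippet below is a hypothetical harness that scores a candidate reconstruction against the true deleted sample by cosine similarity and compares it to the "Avg" baseline (the "MaxDiff" baseline is omitted here); all names and data are illustrative, with a synthetic stand-in for the attack's output.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity used to score a reconstruction against the deleted sample."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def avg_baseline(public_samples):
    """'Avg' baseline: simply guess the average of the public samples."""
    return public_samples.mean(axis=0)

# Hypothetical scoring of an attack's output against the baseline.
rng = np.random.default_rng(2)
dim = 100
x_deleted = rng.normal(size=dim)                    # the sample that was unlearned
public_samples = rng.normal(size=(500, dim))        # attacker's public pool
x_attack = x_deleted + 0.1 * rng.normal(size=dim)   # stand-in for the attack's reconstruction

print("attack:", cosine_similarity(x_attack, x_deleted))
print("Avg   :", cosine_similarity(avg_baseline(public_samples), x_deleted))
```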
In conclusion, the work introduces a reconstruction attack capable of recovering deleted data from simple machine learning models with high accuracy. The attack achieves near-perfect results for linear regression and remains effective for models that use embeddings or optimize different loss functions. Highlighting the privacy risks of data deletion and machine unlearning, the findings emphasize the need for techniques such as differential privacy. Counterintuitively, a data deletion update can increase vulnerability to reconstruction attacks, even for basic models, by exposing the very data it removes. Through extensive experiments on diverse datasets, the study underscores the significant privacy risks posed by data deletion requests, even in seemingly low-risk model settings.