
Unlocking High-Accuracy Differentially Private Image Classification through Scale



Authors: Soham De, Leonard Berrada, Jamie Hayes, Samuel L. Smith, Borja Balle

A recent DeepMind paper on the ethical and social risks of language models identified large language models leaking sensitive information about their training data as a potential risk that organisations working on these models have the responsibility to address. Another recent paper shows that similar privacy risks can also arise in standard image classification models: a fingerprint of each individual training image can be found embedded in the model parameters, and malicious parties could exploit such fingerprints to reconstruct the training data from the model.

Privacy-enhancing technologies like differential privacy (DP) can be deployed at training time to mitigate these risks, but they often incur a significant reduction in model performance. In this work, we make substantial progress towards unlocking high-accuracy training of image classification models under differential privacy.

Figure 1: (left) Illustration of training data leakage in GPT-2 [credit: Carlini et al. “Extracting Training Data from Large Language Models”, 2021]. (right) CIFAR-10 training examples reconstructed from a 100K parameter convolutional neural network [credit: Balle et al. “Reconstructing Training Data with Informed Adversaries”, 2022]

Differential privacy was proposed as a mathematical framework to capture the requirement of protecting individual records in the course of statistical data analysis (including the training of machine learning models). DP algorithms protect individuals from any inferences about the features that make them unique (including complete or partial reconstruction) by injecting carefully calibrated noise during the computation of the desired statistic or model. Using DP algorithms provides robust and rigorous privacy guarantees both in theory and in practice, and has become a de facto gold standard adopted by a number of public and private organisations.
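For reference, the standard (ε, δ) formulation of differential privacy (not spelled out in this post) requires that a randomised mechanism M satisfies, for any two datasets D and D′ differing in a single record and any set of outcomes S:

```latex
% (\varepsilon, \delta)-differential privacy: adding or removing any single
% record changes the probability of any outcome by at most a factor
% e^{\varepsilon}, up to an additive slack \delta.
\Pr\left[\mathcal{M}(D) \in S\right] \;\le\; e^{\varepsilon}\,\Pr\left[\mathcal{M}(D') \in S\right] + \delta
```

Smaller values of ε and δ correspond to stronger protection; DP-SGD, described next, is one mechanism that provides such a guarantee for model training.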

The most popular DP algorithm for deep learning is differentially private stochastic gradient descent (DP-SGD), a modification of standard SGD obtained by clipping gradients of individual examples and adding enough noise to mask the contribution of any individual to each model update:

Figure 2: Illustration of how DP-SGD processes gradients of individual examples and adds noise to produce model updates with privatised gradients.
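To make the update concrete, here is a minimal sketch of one DP-SGD step in JAX, in the spirit of the figure above. It is illustrative only and not the paper's open-sourced implementation; `loss_fn` (assumed to compute the loss of a single example), `clip_norm`, `noise_multiplier`, and `learning_rate` are placeholder names.

```python
# Minimal DP-SGD step sketch: per-example clipping + Gaussian noise.
# Assumes batch = (images, labels) and loss_fn(params, example) for one example.
import jax
import jax.numpy as jnp


def dp_sgd_step(params, batch, rng, loss_fn,
                clip_norm=1.0, noise_multiplier=1.1, learning_rate=0.1):
    images, _ = batch                      # assumed batch structure
    batch_size = images.shape[0]

    # 1. Per-example gradients: vmap the single-example gradient over the batch.
    per_example_grads = jax.vmap(
        jax.grad(loss_fn), in_axes=(None, 0))(params, batch)

    # 2. Clip each example's gradient to a global L2 norm of at most clip_norm.
    def clip_one(grads):
        norm = jnp.sqrt(sum(jnp.sum(g ** 2)
                            for g in jax.tree_util.tree_leaves(grads)))
        scale = jnp.minimum(1.0, clip_norm / (norm + 1e-12))
        return jax.tree_util.tree_map(lambda g: scale * g, grads)

    clipped = jax.vmap(clip_one)(per_example_grads)

    # 3. Sum the clipped gradients and add Gaussian noise calibrated to
    #    clip_norm, masking any single example's contribution to the update.
    summed = jax.tree_util.tree_map(lambda g: jnp.sum(g, axis=0), clipped)
    leaves, treedef = jax.tree_util.tree_flatten(summed)
    noise_keys = jax.random.split(rng, len(leaves))
    noisy_leaves = [
        g + noise_multiplier * clip_norm * jax.random.normal(k, g.shape)
        for g, k in zip(leaves, noise_keys)
    ]
    noisy_mean = jax.tree_util.tree_map(
        lambda g: g / batch_size,
        jax.tree_util.tree_unflatten(treedef, noisy_leaves))

    # 4. Plain SGD step on the privatised (noisy, averaged) gradient.
    return jax.tree_util.tree_map(
        lambda p, g: p - learning_rate * g, params, noisy_mean)
```

The two privacy-relevant choices are the clipping norm, which bounds any single example's influence on the update, and the noise multiplier, which (together with the batch size and number of steps) determines the final (ε, δ) guarantee via a privacy accountant.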

Unfortunately, prior works have found that in practice, the privacy protection provided by DP-SGD often comes at the cost of significantly less accurate models, which presents a major obstacle to the widespread adoption of differential privacy in the machine learning community. Empirical evidence from these works also suggests that the utility degradation of DP-SGD becomes more severe on larger neural network models, including the ones regularly used to achieve the best performance on challenging image classification benchmarks.

Our work investigates this phenomenon and proposes a series of simple modifications to both the training procedure and model architecture, yielding a significant improvement in the accuracy of DP training on standard image classification benchmarks. The most striking observation coming out of our research is that DP-SGD can be used to efficiently train much deeper models than previously thought, as long as one ensures the model's gradients are well-behaved. We believe the substantial jump in performance achieved by our research has the potential to unlock practical applications of image classification models trained with formal privacy guarantees.
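As a purely illustrative diagnostic of what "well-behaved" gradients means in this setting (a hypothetical sketch, not a procedure from the paper), one can inspect the distribution of per-example gradient norms before clipping: when most norms are of similar magnitude, a single clipping threshold discards less signal.

```python
# Hypothetical diagnostic: per-example gradient norms before clipping.
# Assumes the same single-example loss_fn(params, example) signature as above.
import jax
import jax.numpy as jnp


def per_example_grad_norms(params, batch, loss_fn):
    # Per-example gradients via vmap over the leading batch dimension.
    grads = jax.vmap(jax.grad(loss_fn), in_axes=(None, 0))(params, batch)
    # Squared global L2 norm of each example's gradient, shape [batch_size].
    sq_norms = jax.tree_util.tree_map(
        lambda g: jnp.sum(g ** 2, axis=tuple(range(1, g.ndim))), grads)
    return jnp.sqrt(sum(jax.tree_util.tree_leaves(sq_norms)))
```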

The figure below summarises two of our main results: an ~10% improvement on CIFAR-10 compared to previous work when privately training without additional data, and a top-1 accuracy of 86.7% on ImageNet when privately fine-tuning a model pre-trained on a different dataset, almost closing the gap with the best non-private performance.

Figure 3: (left) Our best results on training WideResNet models on CIFAR-10 without additional data. (right) Our best results on fine-tuning NFNet models on ImageNet. The best performing model was pre-trained on an internal dataset disjoint from ImageNet.

These results are achieved at ε=8, a standard setting for calibrating the strength of the protection offered by differential privacy in machine learning applications. We refer to the paper for a discussion of this parameter, as well as additional experimental results at other values of ε and on other datasets. Alongside the paper, we are open-sourcing our implementation to enable other researchers to verify our findings and build on them. We hope this contribution will help others interested in making practical DP training a reality.

Download our JAX implementation on GitHub.




