Empirical privacy metrics: The bad, the ugly… and the good, maybe?

In a talk for PEPR '24, Damien Desfontaines lists major issues with empirical privacy metrics for synthetic data generation, and explains how we could fix them.


Synthetic data generation makes for a convincing pitch: create fake data that follows the same statistical distribution as your real data, so you can analyze it, share it, and sell it. Supposedly, privacy and compliance are achieved because this synthetic data is anonymous.

How do synthetic data vendors justify such privacy claims? Their answer often boils down to empirical privacy metrics. Vendors recommend that users run measurements on their synthetic data and empirically determine whether it's safe enough to release. But how do these metrics work? How useful are they? And how much should you rely on them?

In a talk delivered at PEPR '24, Damien Desfontaines takes a critical look at the space of synthetic data generation and empirical privacy metrics, dispels some marketing-fueled myths that are a little too good to be true, and explains what is needed for these tools to be a valuable part of a larger privacy posture.

You can watch the recording of the presentation below, or directly read its transcript.

At Tumult Labs, we’re building synthetic data generation solutions that provide robust privacy guarantees using the proven science of differential privacy. We’re also designing ways to perform empirical privacy evaluation that avoid the pitfalls described in this presentation. If you’d like to learn more or schedule a demo, let us know! We’d love to hear from you.

