These chapters are auto-generated

Intro

0:00

Data Sharing

0:35

Data Release Disasters

2:15

The Promise of Synthetic Data

5:11

Machine Learning as a Service

7:06

What About Generative Models?

10:28

Inference without predictions?

11:26

White-Box Attack

12:34

Black-Box Attack

14:20

White-Box Results

15:35

Black-Box Results

16:23

Defense? Differentially Private GAN?

18:32

Stochastic Gradient Descent (SGD)

19:54

Differential Privacy (Weaker notion)

20:06

Some useful properties for ML

21:31

A Simple Proposal [ICDM'17]

22:01

Experimental Evaluation

23:01

Clustering Accuracy

23:57

Synthetic Samples (MNIST)

24:25

Motivation

25:47

Settings

27:06

Binary Class Size, Precision, Recall

27:51

Multi-Class Size, Precision, and Recall

28:01

Take-Aways

28:05
Privacy and Synthetic Data: The Good, The Bad, and The Ugly
Jun 6, 2023
To make a dataset more privacy-preserving, a generative adversarial network (GAN) trained with differential privacy can generate a synthetic version of the dataset. The synthetic dataset preserves the original dataset's statistical properties while limiting privacy risks. However, this preservation works less well for groups underrepresented in the data: a model trained on the synthetic dataset would be disproportionately inaccurate for those groups. Future generative models should not only reproduce the original data in a privacy-preserving manner but also guarantee fairness across subgroups.

Reference links:
1. Robin Hood and Matthew Effects: Differential Privacy Has Disparate Impact on Synthetic Data. https://proceedings.mlr.press/v162/ga...
2. Exploiting Unintended Feature Leakage in Collaborative Learning. https://arxiv.org/pdf/1805.04049.pdf

To find out more, see the Nokia Bell Labs Responsible AI hub: https://www.bell-labs.com/research-in...
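The "Stochastic Gradient Descent (SGD)" and "Differentially Private GAN" chapters refer to the standard DP-SGD recipe: clip each per-example gradient to a fixed norm, sum, and add Gaussian noise before updating the model. The talk does not include code, so the sketch below is an illustrative, simplified version of that aggregation step (names like `dp_sgd_step`, `clip_norm`, and `noise_multiplier` are this sketch's own choices, and the privacy-accounting side of DP-SGD is omitted):

```python
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """One simplified DP-SGD aggregation step.

    Each per-example gradient is clipped to L2 norm <= clip_norm,
    the clipped gradients are summed, Gaussian noise with standard
    deviation noise_multiplier * clip_norm is added, and the result
    is averaged over the batch.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # Scale down any gradient whose norm exceeds the clipping bound.
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
    total = np.sum(clipped, axis=0)
    # Noise calibrated to the clipping bound masks any single example's contribution.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    return (total + noise) / len(per_example_grads)
```

In a DP-GAN, this noisy update replaces the ordinary gradient step for the discriminator (the only component that touches real data), which is what limits how much any one record can influence the synthetic output, at the cost of the disparate accuracy on small subgroups discussed in the talk.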


Nokia Bell Labs
