23 May 2019 at 14:30

Yosi Rinott, The Hebrew University of Jerusalem

CNR IMATI, Via A. Corti, 12, Milano, Aula A

Privacy in data dissemination, differential privacy, and adaptivity

I will demonstrate privacy issues that arise when an agency, for example ISTAT or a hospital, disseminates data such as a sample from some population or experimental results to the public or to other agencies. I will briefly review various methods used by statisticians to assess the disclosure risk and to reduce it (e.g., Dalenius 1977).
     In general, such methods depend on scenarios regarding potential intruders, such as the intruder’s prior knowledge about the sample or the population. Differential Privacy (Dwork, McSherry, Nissim, and Smith 2006) is an approach that avoids the need to consider such scenarios, and guarantees a well-defined notion of privacy by adding noise to all released data. I will describe some basic results on differential privacy and discuss its application to the release of contingency tables (Dwork and Roth 2014; Rinott, O’Keefe, Shlomo, and Skinner 2018). I will also discuss ongoing work on data analysis that takes the added noise into account.
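     As a rough illustration of the noise-addition idea, the sketch below perturbs each cell of a contingency table with Laplace noise, one standard way to obtain differential privacy for counts. It is not taken from the talk or the cited papers. The function names, the example table, the privacy parameter epsilon, and the sensitivity value of 1 (which presumes that neighbouring datasets differ by adding or removing one individual) are all assumptions made here for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) variate.

    The difference of two independent exponentials with mean `scale`
    follows a Laplace distribution with location 0 and that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def release_noisy_table(counts, epsilon, sensitivity=1.0, seed=None):
    """Return a copy of `counts` with independent Laplace noise in each cell.

    Assumption (not from the source): adding or removing one individual
    changes exactly one cell by 1, so the L1 sensitivity is 1 and the
    noise scale is sensitivity / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = random.Random(seed)
    scale = sensitivity / epsilon
    return [[c + laplace_noise(scale, rng) for c in row] for row in counts]


if __name__ == "__main__":
    # Hypothetical 2x2 table of counts, made up for this example.
    table = [[10, 5], [3, 12]]
    for row in release_noisy_table(table, epsilon=0.5, seed=1):
        print([round(x, 2) for x in row])
```

     The released cells are real-valued and may be negative. A smaller epsilon gives a larger noise scale, so an analysis of the released table has to account for the added noise.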
     I will also describe an attempt to use ideas from differential privacy to control adaptive hypothesis testing or estimation. Here, “adaptive” means choosing the hypotheses to be tested after peeking (snooping) at the data, which is a major problem in data mining, and probably explains in part why so many contradictory findings appear in medical journals.