dg.o2005 Tutorial: "Data Confidentiality and Statistical Disclosure Limitation"

Presenter: Alan F. Karr, Director, National Institute of Statistical Sciences (NISS)

Scheduled: 1:30 p.m., Wednesday, May 18

Description:
Federal statistical agencies must fulfill two nearly contradictory missions. On the one hand, they must extract and disseminate to other agencies, the research community and the public useful information derived from sample surveys and censuses. On the other hand, they must protect the confidentiality of the data and the privacy of data subjects. Protecting confidentiality may be mandated by law, prescribed by agency practices or promised to respondents. Often, confidentiality must be preserved in order to ensure the quality of the data: respondents do not answer truthfully if they believe that their privacy is threatened.

The tutorial is an overview of methods known collectively as statistical disclosure limitation (SDL) that attempt to resolve this contradiction.

Goal:
The tutorial will introduce participants to fundamental problems and methods of SDL, the latter ranging from limiting access to data, to altering data prior to release, to releasing only the results of "safe" statistical analyses of the data.

In particular, the development of computing and statistical technologies and the emergence of the Internet as the principal mode for disseminating federal data both exacerbate the problems and offer new kinds of solutions. The tutorial will describe the problems, especially record linkage to external databases, as well as solutions such as analysis servers that account for the interactions among multiple queries on the same database.
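As a rough illustration of the two problems named above, the following Python sketch shows (1) record linkage, where a release stripped of names is matched to an external database on shared attributes, and (2) query interaction, where two individually harmless totals reveal one person's value when subtracted. All records, names, field names and functions here are hypothetical and invented for illustration; this is not a method taken from the tutorial itself.

```python
# Hypothetical "de-identified" release: names removed, quasi-identifiers kept.
released = [
    {"zip": "27709", "birth_year": 1961, "sex": "F", "diagnosis": "asthma"},
    {"zip": "27709", "birth_year": 1975, "sex": "M", "diagnosis": "diabetes"},
    {"zip": "27514", "birth_year": 1961, "sex": "F", "diagnosis": "hypertension"},
    {"zip": "27514", "birth_year": 1961, "sex": "F", "diagnosis": "asthma"},
]

# Hypothetical external database that carries names and the same attributes.
external = [
    {"name": "Person A", "zip": "27709", "birth_year": 1961, "sex": "F"},
    {"name": "Person B", "zip": "27514", "birth_year": 1961, "sex": "F"},
]

KEYS = ("zip", "birth_year", "sex")  # assumed quasi-identifiers


def link(released, external, keys=KEYS):
    """Map each external name to the released records that match on keys."""
    index = {}
    for rec in released:
        index.setdefault(tuple(rec[k] for k in keys), []).append(rec)
    return {p["name"]: index.get(tuple(p[k] for k in keys), []) for p in external}


def unique_links(matches):
    """Keep only names linked to exactly one released record (re-identified)."""
    return {name: recs[0] for name, recs in matches.items() if len(recs) == 1}


matches = link(released, external)
reidentified = unique_links(matches)  # Person A matches one record; Person B matches two

# Query interaction: each total looks harmless, but their difference is one value.
salaries = {"A": 50, "B": 60, "C": 70, "D": 120}  # hypothetical confidential data


def total(names):
    """Answer a sum query over the named individuals."""
    return sum(salaries[n] for n in names)


leaked = total(["A", "B", "C", "D"]) - total(["A", "B", "C"])  # equals D's salary
```

An analysis server of the kind described above would need to track earlier queries so that it can refuse or perturb the second total rather than answer each query in isolation.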

No deep prior knowledge of data confidentiality, statistics or computer science will be assumed.

Outline:
This tutorial will focus on essential aspects of data confidentiality and SDL in the electronic world.

The tutorial will conclude by addressing the increasingly important problem, arising not only in "traditional" settings but also in the context of homeland security and for proprietary corporate data, of safely conducting informative statistical analyses on distributed databases whose owners cannot or will not allow the data to be integrated.
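To make the distributed setting concrete, here is a minimal Python sketch of one well-known building block, secure summation, in which several database owners learn a grand total without any owner revealing its own local value. It is offered only as an assumed example of the kind of technique this problem calls for, not as the approach presented in the tutorial. The function name, the modulus and the example counts are hypothetical, and the whole protocol is simulated in a single process rather than over a network.

```python
import secrets

# Assumed public modulus, larger than any total the parties could produce.
MODULUS = 2**61 - 1


def secure_sum(local_values, modulus=MODULUS):
    """Simulate ring-based secure summation over non-negative integer values.

    Party 0 adds a secret random mask to its value and passes the result on.
    Each later party adds its own value to the running figure it receives,
    so it sees only a masked partial sum. Party 0 finally removes the mask.
    """
    mask = secrets.randbelow(modulus)          # known only to party 0
    running = (mask + local_values[0]) % modulus
    for value in local_values[1:]:             # passed from party to party
        running = (running + value) % modulus
    return (running - mask) % modulus          # party 0 unmasks the total


# Three hypothetical owners, each holding a local count it will not share.
grand_total = secure_sum([12, 30, 8])
```

Sums of this kind are enough to assemble many aggregate statistics, such as counts and means, although this simple version assumes that the parties follow the protocol and do not collude.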

Biographical information:
Alan F. Karr is Director of the National Institute of Statistical Sciences (NISS), a position he has held since 2000; prior to that he was Associate Director (1992-2000). He is also Professor of Statistics/Operations Research and Biostatistics at the University of North Carolina at Chapel Hill (since 1993), as well as Associate Director of the Statistical and Applied Mathematical Sciences Institute (SAMSI).

His research activities are cross-disciplinary collaborations involving statistics and such other fields as data confidentiality, data integration, data quality, education statistics, software engineering, information technology, transportation, materials science and E-commerce. He is the author of three books and nearly 100 scientific papers, a fellow of the American Statistical Association and the Institute of Mathematical Statistics, a member of the Council of the latter and the Board of Governors of the Interface Foundation of North America, and served as a member of the Army Science Board from 1990 to 1996.

Contact information:
Alan F. Karr, National Institute of Statistical Sciences, PO Box 14006, Research Triangle Park, NC 27709-4006; Tel: 919-685-9300; FAX: 919-685-9310; E-mail: karr@niss.org