FDZ-Arbeitspapier Nr. 9

Anonymising business micro data - results of a German project

Statistical offices in Germany may provide micro data to the scientific community provided that these data are sufficiently anonymised. Anonymised micro data for scientific purposes have to satisfy two conditions: they must preserve analytical validity to the greatest possible extent, and the disclosure risk must be sufficiently low. Anonymising business data is usually seen as a difficult undertaking because of the characteristics of those data. Official statistics in Germany carried out a research project on "Factual Anonymisation of Business Micro Data" together with scientific users of its data. In this project, we considered a large variety of anonymisation approaches. A good part of the procedures discussed in the relevant literature were rejected after a theoretical analysis of their effects on analytical validity. Moreover, we performed test analyses with anonymised real data and compared the results with those obtained from analyses based on the original data. Only a few approaches remained for further consideration: the traditional methods of information suppression, less detailed presentation of values and categorisation, and the data-modifying methods of micro aggregation and additive or multiplicative noise. Over a period of two years, we applied those measures, in many variants and combinations, to real data from statistical offices and analysed the results. In this way, we learned a great deal about the effects of the anonymisation measures on both analytical validity and the protection of micro data against disclosure. The project's final report was issued in summer 2005. In this paper we present the results and recommendations of the project.

Keywords: Statistical data confidentiality, analytical validity of data, anonymisation methods, re-identification risk, scientific use file, disclosure control