As a rule, microdata of official statistics are subject to strict confidentiality. However, special provisions of the Federal Statistics Law (BStatG) allow microdata to be passed on for the purpose of statistical analysis if certain requirements are met.
- If the microdata cannot be allocated to the respondent or the person concerned, i.e. no conclusions can be drawn from the data about the unit or person providing the information, the data are absolutely anonymous. Such microdata may also be used outside the safe premises of official statistics (section 16 (1) item 4 BStatG).
- For the purpose of scientific research, the Federal Statistics Law permits the provision of microdata if allocating them to the respondents or statistical units ‘requires unreasonable effort in terms of time, cost and manpower’. This presupposes de-facto anonymity of the data. ‘Within specially protected areas of the Federal Statistical Office and the statistical offices of the Länder’ access to formally anonymised data may be granted under certain circumstances. In both cases, data may only be provided to ‘institutions of higher education or other institutions tasked with independent scientific research’. In addition, the persons receiving the data have to be officeholders or persons specifically sworn in (section 16 (6) BStatG).
The Research Data Centres of the Federal Statistical Office and the Statistical Offices of the Federal States make anonymised data available on the basis of these provisions. The various degrees of data anonymity are described below.
Absolutely anonymised data are modified by coarsening or by removing individual variables to a degree that makes identification of the respondents impossible. Official statistics offer absolutely anonymised microdata in the form of Public Use Files (PUF). These can be made available to all interested persons or institutions. In addition, there are absolutely anonymous Campus Files for teaching statistical methods.
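The two operations named above, coarsening and removing variables, can be sketched as follows. This is only an illustration: the records, the variable names (`age`, `municipality`, `income`) and the ten-year band width are assumptions made for the example, not rules used by the statistical offices, and applying these steps alone does not guarantee absolute anonymity.

```python
# Hypothetical microdata records; all field names and values are invented.
records = [
    {"age": 34, "municipality": "Bonn", "income": 41200},
    {"age": 58, "municipality": "Kiel", "income": 28750},
    {"age": 71, "municipality": "Jena", "income": 19900},
]


def age_band(age, width=10):
    """Coarsen an exact age into a band such as '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def coarsen(record, drop=("municipality",)):
    """Return a copy with the listed variables removed and the age coarsened."""
    out = {key: value for key, value in record.items() if key not in drop}
    out["age"] = age_band(out["age"])
    return out


coarsened = [coarsen(record) for record in records]
```

Each record keeps its income but loses the municipality, and the exact age is replaced by a band, so fewer records remain unique on these variables.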
Microdata are called de-facto anonymised if deanonymisation cannot be ruled out completely but the allocation of data to the respective statistical unit is only possible with ‘unreasonable effort in terms of time, cost and manpower’ (section 16 (6) BStatG). Pursuant to the Federal Statistics Law, de-facto anonymised data may be made available only to scientific institutions and only for the purpose of scientific projects.
De-facto anonymisation mainly aims at making the correct allocation of the values of a variable to the respective statistical units almost impossible while preserving the statistically relevant informational value. Different anonymisation procedures can be applied to achieve this. Common procedures either reduce information (e.g. aggregation, classification) or modify it (e.g. swapping). To determine whether data are de-facto anonymous, the cost and benefit of a deanonymisation have to be evaluated.
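As an illustration of a procedure that modifies information, the following minimal sketch swaps the values of one variable between records by randomly permuting them. The function name `swap_values`, the sample records and the fixed seed are assumptions made for the example; swapping as applied in practice is usually restricted to similar records, and this sketch does not claim to reproduce the procedure used by the Research Data Centres.

```python
import random


def swap_values(records, variable, seed=0):
    """Return copies of the records with one variable's values permuted.

    The set of values is kept, so the marginal distribution of the variable
    is unchanged, but a value no longer has to belong to its original unit.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    values = [record[variable] for record in records]
    rng.shuffle(values)
    return [{**record, variable: value} for record, value in zip(records, values)]


# Hypothetical records; the field names and values are invented.
records = [
    {"unit": 1, "sector": "A", "turnover": 120},
    {"unit": 2, "sector": "A", "turnover": 450},
    {"unit": 3, "sector": "B", "turnover": 75},
    {"unit": 4, "sector": "B", "turnover": 980},
]

swapped = swap_values(records, "turnover")
```

Totals and frequencies of the swapped variable stay the same, while its relationship to the other variables is distorted. This is the trade-off between protection and informational value described above.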
However, at the Research Data Centres de-facto anonymity depends not only on the informational value remaining in the data but also on the conditions under which the data are used and the resulting possibilities for deanonymisation. Whether a microdata set can be regarded as de-facto anonymised therefore also depends on the access conditions. It is of decisive importance what additional knowledge of the statistical units can be drawn upon and where the data are used. Depending on whether the microdata are used externally or at the statistical offices, de-facto anonymity can be achieved with greater (Scientific Use Files) or smaller (safe centres) losses of information.
De-facto anonymised microdata may be used by foreign scientists only on the secure premises of the statistical offices, i.e. in safe centres. Scientific Use Files cannot be sent to foreign countries for legal reasons.
Especially for analyses with a detailed functional or regional breakdown, the Research Data Centres of official statistics give data users the opportunity to analyse formally anonymised microdata by remote execution or at a safe centre. To achieve formal anonymity, the direct identifiers and auxiliary characteristics are deleted from the data set, while all other characteristics and the functional and regional structures remain unchanged.
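Formal anonymisation as described here amounts to deleting a fixed set of fields and leaving everything else untouched. A minimal sketch follows. Which variables count as direct identifiers or auxiliary characteristics depends on the individual statistic, so the field names below are purely hypothetical.

```python
# Assumed field lists for illustration only; the real lists depend on the
# statistic concerned.
DIRECT_IDENTIFIERS = {"name", "address"}
AUXILIARY_CHARACTERISTICS = {"phone", "interviewer_id"}


def formally_anonymise(record):
    """Drop identifiers and auxiliary fields; keep all other fields as they are."""
    removed = DIRECT_IDENTIFIERS | AUXILIARY_CHARACTERISTICS
    return {key: value for key, value in record.items() if key not in removed}


# Hypothetical record with invented values.
record = {
    "name": "Example GmbH",
    "address": "Musterstrasse 1",
    "phone": "000",
    "interviewer_id": 17,
    "region_code": "05315",
    "sector": "C28",
    "employees": 212,
}

formal = formally_anonymise(record)
```

Because the detailed regional and functional variables (`region_code` and `sector` in this sketch) are kept unchanged, such data remain easier to re-identify than de-facto anonymised data. This is why access is limited to remote execution or a safe centre.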