FDZ-Arbeitspapier Nr. 37

Masking Micro Data with Stochastic Noise

Stochastic noise is a comparatively new method for anonymising micro data. Unlike classical anonymisation methods, it is classified as a data perturbation method. A new scheme has been developed that combines random noise drawn from mixture distributions with the application of the masking method to transformed variables. Adding noise to the log-transformed variables allows correlations to be preserved in the anonymised data. By combining these two ideas, all variables can be masked to a high and uniform relative degree.
The suggested variant of stochastic noise is applied to a data set covering all manufacturing firms in Germany. An assessment of the anonymised data set shows that the anonymisation routine performs well with regard to both utility and security, even though the parameters are chosen to achieve absolute anonymity.
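
The general idea of adding mixture-distributed noise to log-transformed variables might be sketched as follows. This is only an illustrative sketch, not the routine evaluated in the paper: the function name `mask_with_log_mixture_noise`, the two-component normal mixture centred at `+mu` and `-mu`, the parameter values, and the example figures are all assumptions made for the illustration.

```python
import numpy as np


def mask_with_log_mixture_noise(values, mu=0.1, sigma=0.03, seed=None):
    """Mask strictly positive values by adding mixture noise on the log scale.

    Assumption (not taken from the paper): the noise is drawn from an
    equal-weight mixture of N(+mu, sigma^2) and N(-mu, sigma^2).
    """
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    if np.any(values <= 0):
        raise ValueError("log transform requires strictly positive values")

    # Pick a mixture component (+mu or -mu) independently for every cell.
    signs = rng.choice([-1.0, 1.0], size=values.shape)
    noise = rng.normal(loc=signs * mu, scale=sigma, size=values.shape)

    # Additive noise on the log scale is multiplicative on the original
    # scale, so large and small values are perturbed to the same relative degree.
    return np.exp(np.log(values) + noise)


# Hypothetical records: turnover and number of employees for three firms.
firms = np.array([[1.2e6, 35.0], [4.8e7, 410.0], [9.5e5, 12.0]])
masked = mask_with_log_mixture_noise(firms, seed=0)
relative_change = masked / firms - 1.0
```

Because the mixture places little probability mass near zero in this sketch, almost every value is shifted by roughly `mu` in relative terms, regardless of its magnitude. How the noise should be coupled across variables to preserve correlations is not specified here and is left open.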