In recent years, synthetic data has been shown to be a promising direction for privacy-preserving data sharing. Synthetic data is typically generated with deep learning models. These models are prone to overfitting and may memorize training examples, which can then be reproduced during data generation. For example, GitHub’s OpenAI-powered Copilot seems to be able to suggest Ethereum private keys that it found in its training data.
Differential Privacy is a privacy guarantee ensuring that participating in a data set does not substantially increase the privacy risk of a data subject. The privacy parameter ϵ bounds this increase: the smaller ϵ, the stronger the guarantee.
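Stated formally, using the standard definition (the symbols M, D, D' and S are notation introduced here for illustration, not taken from the project description): a randomized mechanism M is ϵ-differentially private if, for all pairs of data sets D and D' that differ in the record of a single data subject, and for every set S of possible outputs,

```latex
\Pr\left[ M(D) \in S \right] \;\le\; e^{\epsilon} \cdot \Pr\left[ M(D') \in S \right]
```

For small ϵ we have e^ϵ ≈ 1 + ϵ, so the distribution of outputs barely changes when one person's record is added or removed.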
HyperNetworks are a novel type of neural network: instead of solving a problem directly, they predict the weights of another neural network that solves the given problem.
In this work, you will combine HyperNetworks and Differential Privacy to predict neural networks that generate synthetic data for a given ϵ.
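To make the idea more concrete, here is a minimal sketch of a HyperNetwork that takes ϵ as input and outputs the weights of a small generator network. Everything in it is an assumption made for illustration: the function names, the layer sizes, the use of log(ϵ) as input, and the single linear layer as generator are hypothetical and not taken from the existing project code. The networks are untrained, and the sketch by itself provides no privacy guarantee; any actual guarantee would have to come from how the models are trained, which is not shown here.

```python
import math
import random


def init_hypernetwork(n_target_params, hidden=8, seed=0):
    """Randomly initialise a one-hidden-layer hypernetwork (illustrative, untrained)."""
    rng = random.Random(seed)
    return {
        # epsilon (scalar) -> hidden units
        "w1": [rng.gauss(0.0, 1.0) for _ in range(hidden)],
        "b1": [rng.gauss(0.0, 1.0) for _ in range(hidden)],
        # hidden units -> one output per parameter of the target network
        "w2": [[rng.gauss(0.0, 0.1) for _ in range(hidden)]
               for _ in range(n_target_params)],
        "b2": [0.0] * n_target_params,
    }


def predict_target_weights(hyper, epsilon):
    """Map a privacy budget epsilon to a flat weight vector for the generator."""
    # Assumption: feed log(epsilon), since budgets span orders of magnitude.
    x = math.log(epsilon)
    h = [math.tanh(w * x + b) for w, b in zip(hyper["w1"], hyper["b1"])]
    return [sum(wi * hi for wi, hi in zip(row, h)) + b
            for row, b in zip(hyper["w2"], hyper["b2"])]


def generate(flat_weights, noise, n_features):
    """Target network: one linear layer turning latent noise into a synthetic record."""
    latent_dim = len(noise)
    record = []
    for j in range(n_features):
        row = flat_weights[j * latent_dim:(j + 1) * latent_dim]
        bias = flat_weights[n_features * latent_dim + j]
        record.append(sum(w * z for w, z in zip(row, noise)) + bias)
    return record


LATENT_DIM, N_FEATURES = 4, 3
N_TARGET_PARAMS = N_FEATURES * LATENT_DIM + N_FEATURES  # weights plus biases

hyper = init_hypernetwork(N_TARGET_PARAMS)
noise_rng = random.Random(1)
noise = [noise_rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]

# The same hypernetwork yields a different generator for each epsilon.
for eps in (0.1, 1.0, 10.0):
    weights = predict_target_weights(hyper, eps)
    print(eps, [round(v, 3) for v in generate(weights, noise, N_FEATURES)])
```

The point of this design is that a single set of HyperNetwork parameters covers a whole range of privacy levels, so no separate generator has to be stored for each ϵ.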
Your task will be to implement a HyperNetwork for generating synthetic data at different levels of ϵ. You won’t have to start from zero: we already have some source code, which you will extend. We would also like a thorough evaluation of the obtained model. If we are successful, we aim to publish the results (paper writing is not part of the thesis, and you will get our full support for it).