Synthetic data allows potentially sensitive data to be studied and shared while reducing the risk of a privacy breach or leak. It is generated by a model that has been trained on real data and has learned its statistical properties. Because the model captures these properties, the same approach can also be adapted to predict missing values or to detect outliers.
Such a model is usually a Generative Adversarial Network (GAN) or a Variational Autoencoder (VAE).
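The fit, generate, and reuse idea can be illustrated with a deliberately small sketch. It does not use a GAN or a VAE: as a stand-in, it fits a single normal distribution to one numeric column using only the Python standard library. The function names (`fit_model`, `generate_synthetic`, `is_outlier`, `impute_missing`), the simulated `real_ages` column, and the threshold of three standard deviations are all assumptions made for illustration.

```python
import random
from statistics import NormalDist


def fit_model(real_values):
    """Learn the mean and standard deviation of the real values."""
    return NormalDist.from_samples(real_values)


def generate_synthetic(model, n, seed=None):
    """Draw n new values from the fitted distribution."""
    return model.samples(n, seed=seed)


def is_outlier(model, value, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    return abs(value - model.mean) / model.stdev > threshold


def impute_missing(model, values):
    """Replace missing entries (None) with the mean the model learned."""
    return [model.mean if v is None else v for v in values]


# Hypothetical "real" data: simulated ages standing in for a sensitive column.
rng = random.Random(0)
real_ages = [rng.gauss(40.0, 10.0) for _ in range(1000)]

model = fit_model(real_ages)

# Synthetic values are sampled from the model, not copied from real_ages.
synthetic_ages = generate_synthetic(model, 1000, seed=1)

# The same fitted model is reused for outlier detection and imputation.
flagged = is_outlier(model, 200.0)
completed = impute_missing(model, [35.0, None, 52.0])
```

A single normal distribution ignores relationships between columns and offers no formal privacy guarantee. A GAN or VAE takes the place of `fit_model` and `generate_synthetic` when the data has many columns with complex dependencies.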