06.10.2022
11:05 - 11:50
Track
Special Topics
Salon 5 - 6
Christian Alexander Graf
Qualitätssicherung und Statistik
Reinventing Test Data - Anonymization, Pseudonymization and Synthetic Data: A State of The Art Review
Test data and test oracles are the two major challenges in testing complex systems. Some approaches focus on de-identification when migrating real data to test data; others generate specific synthetic data tailored to the system under test. The advantages of the first set of approaches are that the data is already available and realistic; the advantages of the latter are problem-specific data generation and data protection that is assured in any case. The disadvantages of the first are privacy risks, data aging and gaps in the existing data that have to be closed afterwards. With synthetic data, realism, appropriateness and accuracy can be at risk.
The talk gives an overview of, and insights into, the different techniques used in both approaches and how they deal with the disadvantages mentioned. It discusses privacy risks and their evaluation, as well as classical de-identification methods.
However they are applied, these methods may still be vulnerable to specific attacks on privacy. To show how one can deal with these risks, the talk introduces the concept of differential privacy together with a thorough privacy risk evaluation and mitigation process.
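To make the residual risk of classical de-identification concrete, here is a minimal Python sketch of keyed pseudonymization. It is not taken from the talk. The record fields, the key and the helper names `pseudonymize` and `pseudonymize_record` are hypothetical and chosen only for illustration. Direct identifiers are replaced by an HMAC-SHA256 pseudonym, while quasi-identifiers such as postal code and birth year pass through unchanged.

```python
import hashlib
import hmac


def pseudonymize(value: str, key: bytes) -> str:
    """Return a stable keyed pseudonym (truncated HMAC-SHA256 hex digest)."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def pseudonymize_record(record: dict, key: bytes, identifiers: tuple) -> dict:
    """Replace the listed direct identifiers; leave all other fields as-is."""
    return {
        field: pseudonymize(str(value), key) if field in identifiers else value
        for field, value in record.items()
    }


# Hypothetical example record and key (assumptions, not from the source).
record = {
    "name": "Erika Mustermann",
    "email": "erika@example.org",
    "zip": "76131",
    "birth_year": 1980,
}
key = b"hypothetical-secret-key"

masked = pseudonymize_record(record, key, identifiers=("name", "email"))
```

In this sketch the same input always maps to the same pseudonym, so relations between test records are preserved. The untouched `zip` and `birth_year` fields, however, can still be combined with outside data to re-identify a person, which is the kind of attack the abstract refers to.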
Differential privacy approaches allow a thorough analysis of privacy risks and the quality of the resulting test data. What is more, state-of-the-art differential privacy allows for the de-identification of statistical data sets and training data for AI systems.
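As an illustration of how differential privacy makes the trade-off between privacy risk and data quality quantifiable, here is a minimal sketch of the Laplace mechanism for a counting query. It is not the speaker's method. The function names `laplace_noise` and `dp_count` and the sample data are assumptions, and the floating-point sampling is not hardened for production use.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential variates."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Release a count with epsilon-differential privacy via the Laplace mechanism.

    A counting query changes by at most 1 when one record is added or
    removed, so its sensitivity is 1 and the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical data: birth years of 1000 test persons.
rng = random.Random(42)
birth_years = [1950 + (i % 50) for i in range(1000)]

noisy = dp_count(birth_years, lambda y: y >= 1980, epsilon=1.0, rng=rng)
```

A smaller `epsilon` gives stronger privacy but a larger noise scale `1 / epsilon`, so the expected error of the released count can be stated up front and checked against what the test needs.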
The talk concludes with a discussion of differential privacy in software test settings, so that test data meets both the test requirements and the privacy requirements.
Christian Alexander Graf, Qualitätssicherung und Statistik
Dipl.-Math. Christian Alexander Graf advises companies on test strategies, data analysis and data security.
He has many years of experience in verification, validation and data analysis from various industrial and scientific fields and lectures in statistics and IT security at cooperative universities.
He is a book author and has written numerous articles on topics related to quality control and quality assurance.