Patent attributes
A system allows the identification and protection of sensitive data in multiple ways, which can be combined for different workflows, data situations or use cases. The system scans datasets to identify sensitive or identifying data, and enables the anonymisation of sensitive or identifying datasets by processing that data to produce a safe copy. Furthermore, the system prevents access to the raw dataset. The system enables privacy-preserving aggregate queries and computations. The system uses differentially private algorithms to reduce or prevent the risk of identification or disclosure of sensitive information. The system scales to big data and is implemented in a way that supports parallel execution on a distributed compute cluster.
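
As an illustration only, a differentially private aggregate query of the kind described above could be sketched as a count with Laplace noise. The abstract does not say which mechanism the system uses, so the Laplace mechanism here is an assumption, and the function name `private_count` and its parameters are hypothetical rather than part of the described system.

```python
import random


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records that satisfy `predicate`.

    Illustrative sketch of the Laplace mechanism; not the patented method.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()

    true_count = sum(1 for record in records if predicate(record))

    # A counting query has sensitivity 1: adding or removing one record
    # changes the result by at most 1, so the Laplace scale is 1 / epsilon.
    scale = 1.0 / epsilon

    # The difference of two independent Exp(1) draws follows Laplace(0, 1).
    noise = scale * (rng.expovariate(1.0) - rng.expovariate(1.0))
    return true_count + noise


# Hypothetical data: the caller sees only the noisy aggregate, not the rows.
records = [{"age": age} for age in (25, 34, 41, 52, 67)]
noisy = private_count(records, lambda r: r["age"] > 50, epsilon=1.0)
```

A smaller `epsilon` adds more noise and gives stronger privacy; a larger one gives a more accurate answer. In a distributed setting the true count would typically be computed per partition and summed before noise is added once to the total.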

