Paper ID: 2311.11717
Can we infer the presence of Differential Privacy in Deep Learning models' weights? Towards more secure Deep Learning
Jiménez-López, Daniel, Rodríguez-Barroso, Nuria, Luzón, M. Victoria, Herrera, Francisco
Differential Privacy (DP) is a key property to protect data and models from integrity attacks. In the Deep Learning (DL) field, it is commonly implemented through Differentially Private Stochastic Gradient Descent (DP-SGD). However, when a model is shared or released, there is no way to check whether it is differentially private, that is, one is required to trust the model provider. This situation poses a problem when data privacy is mandatory, especially under current data regulations, as the presence of DP cannot be consistently certified by any third party. Thus, we face the challenge of determining whether a DL model has been trained with DP, according to the title question: Can we infer the presence of Differential Privacy in Deep Learning models' weights? Since DP-SGD significantly changes the training process of a DL model, we hypothesize that DP leaves an imprint in the weights of a DL model, which can be used to predict whether a model has been trained with DP regardless of its architecture and the training dataset. In this paper, we propose to exploit this imprint in the model weights to infer whether a DL model was trained with DP. To substantiate our hypothesis, we developed an experimental methodology based on two datasets of DL model weights, each containing models trained with and without DP, and a meta-classifier that infers whether DP was used in the training of a DL model by accessing its weights. We thereby both remove the requirement of a trusted model provider and lay a strong foundation for this line of research. Thus, our contribution is an additional layer of security on top of the strict privacy requirements of DP training in DL models, towards more secure DL models.
Submitted: Nov 20, 2023
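
The following is a minimal sketch, not the paper's actual pipeline, of the kind of weight-based meta-classification the abstract describes: given a labeled dataset of flattened model weights (DP-trained vs. not), summary features are extracted and a simple classifier is fit. The feature set, the synthetic data, and the `weight_features` helper are illustrative assumptions; the paper's datasets, architectures, and meta-classifier may differ.

```python
# Sketch only: infer DP training from model weights via a meta-classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def weight_features(flat_weights: np.ndarray) -> np.ndarray:
    """Summary statistics of a flattened weight vector (illustrative features)."""
    return np.array([
        flat_weights.mean(),
        flat_weights.std(),
        np.abs(flat_weights).mean(),
        np.percentile(flat_weights, 1),
        np.percentile(flat_weights, 99),
    ])

# Hypothetical dataset: each row is the flattened weight vector of a trained model;
# the label is 1 if the model was trained with DP-SGD, 0 otherwise.
rng = np.random.default_rng(0)
n_models, n_params = 200, 10_000
weights = rng.normal(size=(n_models, n_params))        # placeholder weights
labels = rng.integers(0, 2, size=n_models)             # placeholder labels

X = np.stack([weight_features(w) for w in weights])
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, random_state=0
)

# Meta-classifier: predicts the presence of DP training from weight features.
meta_clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("meta-classifier accuracy:", accuracy_score(y_test, meta_clf.predict(X_test)))
```

With real weight datasets in place of the random placeholders, the same pipeline would test whether the hypothesized DP imprint is detectable across architectures and training datasets.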