Graduate Thesis Or Dissertation
 

Poisoning machine learning models to increase membership inference risks

Public Deposited

https://ir.library.oregonstate.edu/concern/graduate_thesis_or_dissertations/g732dj09d

Descriptions

Attribute Name / Values
Creator
Abstract
  • Deep learning is now widely used in applications where sensitive data is used for model training, for example, in health care. In this scenario, any data leakage raises privacy concerns for those whose data records are used to train the model. An attacker can actively cause privacy leakage via a membership inference (MI) attack, which aims to infer whether a data point belongs to the training set of a deep learning model. In the past, MI has been studied solely within the class of privacy attacks, without the assumption that the attacker can violate the integrity of the training procedure. However, this oversight poses a significant problem, as in practice training data can come from the Internet, for instance, web-crawled datasets. In this thesis, we study an adversary who aims to violate the integrity of the training data to increase the success of MI. We first introduce a new threat model that combines data poisoning and membership inference attacks. We propose two concrete attack scenarios. (1) In targeted attacks, the attacker aims to increase the MI success rate on a specific target sample by injecting poisoning samples into the training data. (2) In untargeted attacks, the adversary aims to increase MI success over the entire training set by poisoning. In evaluations on standard visual recognition benchmarks, the targeted attacker can increase MI success by 8x at a low false-positive rate (e.g., 0.1%). In the untargeted setting, if the adversary compromises half of the training data, MI success over the entire training set increases by 2x at the same low false-positive rate. Our analysis shows that one cause of this amplification is that the poison increases the vulnerability of seemingly safe data points. Lastly, we discuss a mitigation mechanism against MI attacks that aims to be fast and ad hoc. While future work is needed to fully develop this defense, preliminary results show promising signals: with only a 2% reduction in test accuracy, the TPR at an FPR of 0.1% is halved (see the sketch of this metric after the record fields below).
License
Resource Type
Date Issued
Degree Level
Degree Name
Degree Field
Degree Grantor
Commencement Year
Advisor
Committee Member
Academic Affiliation
Rights Statement
Publisher
Peer Reviewed
Language
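
The abstract reports attack success as the true-positive rate (TPR) at a fixed low false-positive rate (FPR), e.g., 0.1%. As a rough illustration of that metric only, and not of the thesis's own attacks, the sketch below evaluates a simple loss-threshold membership inference test at a fixed FPR; the function name, the loss arrays, and the Gamma-distributed placeholder data are all hypothetical.

```python
# Minimal sketch (illustrative only): a loss-threshold membership inference
# test evaluated at a fixed low false-positive rate. Lower loss on a point is
# taken as evidence that the point was in the training set.
import numpy as np


def tpr_at_fixed_fpr(member_losses, nonmember_losses, target_fpr=0.001):
    """Return (TPR, achieved FPR) of a loss-threshold membership test.

    The score of a point is its negative loss, so members (which the model
    fits well) tend to score higher than non-members.
    """
    member_scores = -np.asarray(member_losses)
    nonmember_scores = -np.asarray(nonmember_losses)

    # Calibrate the threshold on non-members: pick the (1 - target_fpr)
    # quantile so that at most ~target_fpr of non-members exceed it.
    threshold = np.quantile(nonmember_scores, 1.0 - target_fpr)

    tpr = float(np.mean(member_scores > threshold))
    fpr = float(np.mean(nonmember_scores > threshold))
    return tpr, fpr


# Hypothetical usage with synthetic losses (members fit better -> lower loss).
rng = np.random.default_rng(0)
member_losses = rng.gamma(shape=1.0, scale=0.5, size=10_000)
nonmember_losses = rng.gamma(shape=2.0, scale=1.0, size=10_000)
tpr, fpr = tpr_at_fixed_fpr(member_losses, nonmember_losses, target_fpr=0.001)
print(f"TPR at ~0.1% FPR: {tpr:.4f} (achieved FPR: {fpr:.4f})")
```

The design choice here is standard for low-FPR evaluation: the decision threshold is calibrated on non-member scores so that roughly 0.1% of them are falsely flagged as members, and the reported number is the fraction of true members caught at that same threshold.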

Relations

Parents:

This work has no parents.

In Collection:

Articles