Indiscriminate data poisoning attacks are quite effective against supervised learning. However, little is known about their impact on unsupervised contrastive learning (CL). This paper is the first to consider indiscriminate poisoning attacks on contrastive learning. We propose Contrastive Poisoning (CP), the first effective such attack on CL. We empirically show that Contrastive Poisoning not only drastically reduces the performance of CL algorithms, but also attacks supervised learning models, making it the most generalizable indiscriminate poisoning attack. We also show that CL algorithms with a momentum encoder are more robust to indiscriminate poisoning.
Previous studies have shown that indiscriminate data poisoning methods can be highly effective in attacking supervised learning models, reducing CIFAR-10 accuracy from 95% to 6%. However, these methods are inherently fragile: they can be defended against by unsupervised learning techniques such as contrastive learning, which can restore CIFAR-10 accuracy from 6% back to 80%. Our paper therefore addresses the problem of poisoning contrastive learning models, exploring new methods for compromising their integrity in the face of such defenses.
Our idea is to learn poisoning perturbations that act as shortcuts for contrastive learning. Specifically, we co-learn the poisoning perturbations together with a neural network model so as to minimize the contrastive learning loss.
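To make the co-learning loop concrete, below is a minimal PyTorch sketch in the spirit of error-minimizing poisons: a sample-wise perturbation delta (assumed here to be constrained to an L-infinity ball of radius 8/255) is optimized jointly with a toy encoder to drive an InfoNCE loss down. The encoder, augmentations, and helper names (info_nce, random_views, eps) are simplified stand-ins for illustration, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.5):
    # Standard NT-Xent / InfoNCE loss between two batches of embeddings.
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)                        # (2B, d)
    sim = z @ z.t() / temperature                         # pairwise similarities
    mask = torch.eye(z.size(0), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float('-inf'))            # drop self-similarity
    B = z1.size(0)
    targets = torch.cat([torch.arange(B, 2 * B), torch.arange(0, B)]).to(z.device)
    return F.cross_entropy(sim, targets)

def random_views(x):
    # Cheap differentiable stand-ins for CL augmentations (flip + brightness),
    # so poison gradients can flow back through the augmentation step.
    if torch.rand(()) < 0.5:
        x = torch.flip(x, dims=[3])
    scale = 0.8 + 0.4 * torch.rand(x.size(0), 1, 1, 1, device=x.device)
    return torch.clamp(x * scale, 0.0, 1.0)

eps = 8 / 255                                             # assumed L-inf poison budget
images = torch.rand(512, 3, 32, 32)                       # toy stand-in for the training set
delta = torch.zeros_like(images, requires_grad=True)      # one perturbation per image
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))  # toy encoder
opt = torch.optim.SGD(encoder.parameters(), lr=0.1)

for step in range(200):
    idx = torch.randint(0, images.size(0), (128,))
    x = torch.clamp(images[idx] + delta[idx], 0.0, 1.0)   # poisoned inputs
    loss = info_nce(encoder(random_views(x)), encoder(random_views(x)))

    opt.zero_grad()
    delta.grad = None
    loss.backward()                                       # grads for model AND poison
    opt.step()                                            # model mimics the victim's CL training

    with torch.no_grad():                                 # error-minimizing poison update
        delta -= (1 / 255) * delta.grad.sign()
        delta.clamp_(-eps, eps)                           # project back into the budget

In the full attack, the same loop would target the victim's actual contrastive learning algorithm and its complete augmentation pipeline rather than these toy stand-ins.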
We emphasize two technical points that are key to the effectiveness of the learned contrastive poisoning.
@inproceedings{he2023indiscriminate,
  title={Indiscriminate Poisoning Attacks on Unsupervised Contrastive Learning},
  author={Hao He and Kaiwen Zha and Dina Katabi},
  booktitle={The Eleventh International Conference on Learning Representations},
  year={2023},
  url={https://openreview.net/forum?id=f0a_dWEYg-Td}
}