Interpretability-Aware Adversarial Attack and Defense Method for Deep Learnings
Abstract
Embodiments relate to a system, program product, and method to support a convolutional neural network (CNN). A class-specific discriminative image region is localized to interpret a prediction of the CNN, and a class activation map (CAM) function is applied to received input data. First and second attacks are generated on the CNN with respect to the received input data: the first attack generates first perturbed data and a corresponding first CAM, and the second attack generates second perturbed data and a corresponding second CAM. An interpretability discrepancy is measured to quantify one or more differences between the first CAM and the second CAM. In response to an inconsistency between the two CAMs, the measured interpretability discrepancy is applied to the CNN, strengthening it against adversarial attacks.
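The core quantities in the abstract (a CAM per perturbed input and a scalar discrepancy between two CAMs) can be illustrated in a minimal NumPy sketch. Everything here is a simplifying assumption, not the patented method: the "attacks" are simulated as hypothetical random feature perturbations rather than real adversarial optimization, the CAM is the standard weighted sum of feature maps with ReLU and max-normalization, and the discrepancy is taken as the mean absolute difference between the two normalized maps.

```python
import numpy as np

def class_activation_map(features, class_weights):
    """Weighted sum of channel feature maps, ReLU'd and max-normalized to [0, 1].

    features: (C, H, W) activations; class_weights: (C,) per-channel weights.
    """
    cam = np.tensordot(class_weights, features, axes=1)  # -> (H, W)
    cam = np.maximum(cam, 0.0)
    return cam / (cam.max() + 1e-8)

def interpretability_discrepancy(cam_a, cam_b):
    """Mean absolute difference between two normalized CAMs (scalar in [0, 1])."""
    return float(np.abs(cam_a - cam_b).mean())

# Toy feature maps (C=4 channels, 6x6 spatial) standing in for CNN activations.
rng = np.random.default_rng(0)
features = rng.random((4, 6, 6))
weights = rng.random(4)

# Two hypothetical "attacks", simulated as distinct perturbations of the features.
attacked_1 = features + 0.3 * rng.standard_normal(features.shape)
attacked_2 = features + 0.3 * rng.standard_normal(features.shape)

cam_1 = class_activation_map(attacked_1, weights)
cam_2 = class_activation_map(attacked_2, weights)

d = interpretability_discrepancy(cam_1, cam_2)
print(d)  # a positive value signals inconsistent explanations between attacks
```

In a defense along the lines the abstract describes, such a discrepancy term could serve as a training signal (e.g., a regularizer penalizing explanation inconsistency), although the abstract does not specify how it is applied to the CNN.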