Supports:
- AdvDet — PyTorch · TensorFlow · PaddlePaddle · MindSpore
- Cognitive Distillation — PyTorch · TensorFlow · PaddlePaddle · MindSpore
- MD Attack — PyTorch · TensorFlow · PaddlePaddle · MindSpore
- PrivDet — PyTorch · TensorFlow · PaddlePaddle · MindSpore
- BlueSuffix - PyTorch

This repository aggregates several research implementations for adversarial attack detection and defense: AdvDet, Cognitive Distillation, MD Attack, PrivDet, and the newly added BlueSuffix. AdvDet, Cognitive Distillation, MD Attack, and PrivDet each provide matched implementations for PyTorch, TensorFlow, PaddlePaddle, and MindSpore, so you can reproduce results and compare them across hardware or deployment environments. BlueSuffix is listed with a PyTorch implementation only.

- AdvDet (`AdvDet/`): Adversarial contrastive prompt tuning for detecting query-based adversarial attacks. Framework entry points: PyTorch · TensorFlow · PaddlePaddle · MindSpore
- Cognitive Distillation (`CognitiveDistillation/`): Distilling cognitive backdoor patterns to enhance backdoor sample detection. Framework entry points: PyTorch · TensorFlow · PaddlePaddle · MindSpore
- MD Attack (`MDAttack/`): Investigating imbalanced gradients that cause overestimated robustness, with multiple attack/defense pairings. Framework entry points: PyTorch · TensorFlow · PaddlePaddle · MindSpore
- PrivDet (`PrivDet/`): Private dataset origin detection to differentiate images from distinct distributions (e.g., COCO vs. CIFAR-10). Framework entry points: PyTorch · TensorFlow · PaddlePaddle · MindSpore
- BlueSuffix (`BlueSuffix/`): A universal defense method against jailbreak attacks on Vision-Language Models. Framework entry points: PyTorch
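
When planning a cross-framework comparison, it can help to have the support matrix in machine-readable form. The sketch below simply transcribes the project list into a Python dictionary keyed by the top-level directory names; the helper functions `projects_supporting` and `shared_frameworks` are hypothetical conveniences for illustration and are not part of this repository.

```python
# Support matrix transcribed from the project list in this README.
# The helper names below are illustrative assumptions, not repository code.
FRAMEWORKS = ("PyTorch", "TensorFlow", "PaddlePaddle", "MindSpore")

SUPPORT = {
    "AdvDet": set(FRAMEWORKS),
    "CognitiveDistillation": set(FRAMEWORKS),
    "MDAttack": set(FRAMEWORKS),
    "PrivDet": set(FRAMEWORKS),
    "BlueSuffix": {"PyTorch"},  # listed with a PyTorch entry point only
}


def projects_supporting(framework):
    """Return the project directories that list the given framework."""
    return sorted(name for name, fws in SUPPORT.items() if framework in fws)


def shared_frameworks(*projects):
    """Return the frameworks that every named project lists, in README order."""
    common = set(FRAMEWORKS)
    for name in projects:
        common &= SUPPORT[name]
    return [fw for fw in FRAMEWORKS if fw in common]


if __name__ == "__main__":
    print(projects_supporting("MindSpore"))
    print(shared_frameworks("AdvDet", "BlueSuffix"))
```

For example, a comparison that includes BlueSuffix is limited to PyTorch, while any combination of the other four projects can use all four frameworks.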