Log gain of all examples instead of unsuccessful examples.#201
Conversation
dxoigmn left a comment
Why is this necessary? Usually you want to log the actual loss you compute gradients of?
The trend was confusing before: while we try to maximize the gain, the number on the progress bar goes down because it gradually excludes successful examples.
What is confusing about the trend? That it is possible for the loss to go up? But that should be expected if you understand that the loss is only computed on some examples. Perhaps what you want to do instead is zero out the loss for those examples that are already adversarial, or take the sum? You don't get the (potential?) speed-up benefit, though. I would note that the only thing that changes with the first option is the normalization constant (i.e., the total number of samples when averaging the loss across samples is fixed instead of changing).
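For illustration, here is a minimal sketch of the three options discussed above; this is not MART's actual code, and `gain_per_example` / `success_mask` are made-up names standing in for whatever the attack loop tracks.

```python
import torch

# Hypothetical per-example gains and a mask of examples that are already adversarial.
gain_per_example = torch.tensor([0.9, 0.4, 0.7, 0.2])
success_mask = torch.tensor([True, False, False, False])

# Previous behavior: average only over unsuccessful examples. As more examples
# succeed, the denominator shrinks, so the logged number can trend downward
# even while the attack makes progress.
gain_unsuccessful = gain_per_example[~success_mask].mean()

# This PR: log the mean gain over all examples, which should go up as the attack works.
gain_all = gain_per_example.mean()

# Reviewer's alternative: zero out the gain of already-adversarial examples but keep
# the full batch size as the normalization constant; compared to the previous behavior
# only the denominator changes.
gain_masked = (gain_per_example * (~success_mask).float()).sum() / gain_per_example.numel()

print(gain_unsuccessful.item(), gain_all.item(), gain_masked.item())
```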
What does this PR do?
This PR makes `Adversary` log the gain of all examples, instead of the gain of unsuccessful examples only. We should see `gain` increase on the progress bar if the attack works.
Type of change
Please check all relevant options.
Testing
Please describe the tests that you ran to verify your changes. Consider listing any relevant details of your test configuration.
- `pytest`
- `CUDA_VISIBLE_DEVICES=0 python -m mart experiment=CIFAR10_CNN_Adv trainer=gpu trainer.precision=16` reports 70% (21 sec/epoch).
- `CUDA_VISIBLE_DEVICES=0,1 python -m mart experiment=CIFAR10_CNN_Adv trainer=ddp trainer.precision=16 trainer.devices=2 model.optimizer.lr=0.2 trainer.max_steps=2925 datamodule.ims_per_batch=256 datamodule.world_size=2` reports 70% (14 sec/epoch).
Before submitting
- `pre-commit run -a` command without errors
Did you have fun?
Make sure you had fun coding 🙃