Conversation
How did you confirm? I watched memory usage and it seemed fine. I also root caused (by reading internal PL code) why…
Note that RobustBench tests are failing because of some model issue unrelated to these changes... |
mzweilin
left a comment
LGTM.
pytest passes locally. We may need to change the RobustBench test in a separate PR to avoid test failure in CI.
What does this PR do?
This PR merges `*_step_end` into `*_step` in `LitModular`. This means we no longer need to clear outputs. This PR depends upon the following:
- LitModular#169

Type of change
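As background, the `*_step_end` merge described above can be sketched in plain Python (a hypothetical, framework-free illustration; `OldStyleModule`, `MergedModule`, and the toy batch are invented for this example and are not MART code):

```python
# Old Lightning style: per-device work in *_step, aggregation in *_step_end.
# The framework cached *_step outputs to feed *_step_end, so modules had to
# clear those cached outputs to avoid memory growth.
class OldStyleModule:
    def training_step(self, batch):
        loss = sum(batch) / len(batch)  # per-device partial result
        return {"loss": loss}           # cached by the framework

    def training_step_end(self, step_output):
        return step_output["loss"]      # aggregation hook

# New style: everything happens in *_step, nothing is cached between hooks,
# so there are no outputs to clear.
class MergedModule:
    def training_step(self, batch):
        return sum(batch) / len(batch)  # no *_step_end, no cached outputs

batch = [1.0, 2.0, 3.0]
old = OldStyleModule()
merged = MergedModule()
# Both styles produce the same loss; only the hook structure differs.
assert old.training_step_end(old.training_step(batch)) == merged.training_step(batch)
```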
Please check all relevant options.
Testing
Please describe the tests that you ran to verify your changes. Consider listing any relevant details of your test configuration.
- `pytest`
- `CUDA_VISIBLE_DEVICES=0 python -m mart experiment=CIFAR10_CNN_Adv trainer=gpu trainer.precision=16` reports 70% (21 sec/epoch).
- `CUDA_VISIBLE_DEVICES=0,1 python -m mart experiment=CIFAR10_CNN_Adv trainer=ddp trainer.precision=16 trainer.devices=2 model.optimizer.lr=0.2 trainer.max_steps=2925 datamodule.ims_per_batch=256 datamodule.world_size=2` reports 70% (14 sec/epoch).

Before submitting
- Ran the `pre-commit run -a` command without errors

Did you have fun?
Make sure you had fun coding 🙃