Description
Describe the bug
Started training using the command:
python yolo/lazy.py task=train model=v9-s dataset=mock task.epoch=1 cpu_num=0 device=cpu weight=False
Note weight=False. I was trying to do a sanity check by training the model from scratch on a few images.
Ran into the following error:
Traceback (most recent call last):
File "/home/abdul/projects/YOLO/yolo/lazy.py", line 39, in main
solver.solve(dataloader)
File "/home/abdul/projects/YOLO/yolo/tools/solver.py", line 149, in solve
mAPs = self.validator.solve(self.validation_dataloader, epoch_idx=epoch_idx)
File "/home/abdul/projects/YOLO/yolo/tools/solver.py", line 264, in solve
result = calculate_ap(self.coco_gt, predict_json)
File "/home/abdul/projects/YOLO/yolo/utils/solver_utils.py", line 12, in calculate_ap
coco_dt = coco_gt.loadRes(pd_path)
File "/home/abdul/projects/YOLO/.venv/lib/python3.10/site-packages/pycocotools/coco.py", line 329, in loadRes
if 'caption' in anns[0]:
IndexError: list index out of range
Reason:
Because the model is being trained from scratch, it produces no detections, so predict_json is empty. pycocotools' COCO.loadRes then indexes anns[0] on the empty list, and calculate_ap raises the IndexError above.
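A minimal guard in calculate_ap would avoid the crash. The sketch below is not the project's actual code: the compute_fn indirection stands in for the existing pycocotools evaluation path, and the 12-entry zero vector is an assumption modeled on COCOeval's standard stats output.

```python
def calculate_ap_safe(predict_json, compute_fn):
    """Guard pycocotools against an empty prediction list.

    COCO.loadRes inspects anns[0], so an empty list raises IndexError.
    compute_fn stands in for the existing pycocotools evaluation path.
    """
    if not predict_json:
        # COCOeval.stats has 12 entries (AP/AR at various IoU thresholds,
        # object areas, and maxDets); with no detections all of them are 0.
        return [0.0] * 12
    return compute_fn(predict_json)


# With no detections, evaluation is skipped and zeros are returned.
print(calculate_ap_safe([], compute_fn=lambda p: None))
```

This keeps the training loop alive through early epochs where a from-scratch model predicts nothing.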
Proposed solution:
Please consider adopting MeanAveragePrecision from TorchMetrics (PyTorch Lightning), or implementing something similar:
https://lightning.ai/docs/torchmetrics/stable/detection/mean_average_precision.html
The current implementation of calculate_ap is quite buggy; see MR #79. The bug stems from the WIP state of calculate_ap.
If calculate_ap worked like MeanAveragePrecision, the training dataloader would never need to return the image_id/image_name, so MR #79 would not have been necessary either.
Expected behavior
Behavior could match the MeanAveragePrecision implementation in TorchMetrics (PyTorch Lightning):
https://lightning.ai/docs/torchmetrics/stable/detection/mean_average_precision.html