Attempt at transfer training in Colab. Runs briefly and quits. #200

@johnk2hawaii

Describe the bug

I tried transfer training on a simple custom dataset in Colab to test the model. Training runs briefly, then quits without an error message. Note that I git cloned the project into Colab. When I attempt to transfer train, I get the following results.

Command used:
!python yolo/lazy.py task=train task.data.batch_size=10 model=v9-s dataset=coco device=cuda task.epoch=1 use_wandb=False

output.log shows:

📈 Enable Model EMA
🚜 Building YOLO
🏗️ Building backbone
🏗️ Building neck
🏗️ Building head
🏗️ Building detection
🏗️ Building auxiliary
🌐 Weight weights/v9-s.pt not found, try downloading
✅ Download completed.
✅ Success load model & weight
✅ Download completed.
Unzipping val2017.zip...
Removed data/coco/val2017.zip.
✅ Download completed.
Unzipping annotations_trainval2017.zip...
Removed data/coco/annotations_trainval2017.zip.
🏭 Generating val2017 cache
No valid BBox in 000000025593
No valid BBox in 000000041488
No valid BBox in 000000042888
No valid BBox in 000000049091
No valid BBox in 000000058636
No valid BBox in 000000064574
No valid BBox in 000000098497
No valid BBox in 000000101022
.
.
.
No valid BBox in 000000556498
No valid BBox in 000000560371
Recorded 5000/5000 valid inputs
✅ Download completed.
Unzipping train2017.zip...
Removed data/coco/train2017.zip.
✅ Dataset annotations already verified.
🏭 Generating train2017 cache
No valid BBox in 000000000250
No valid BBox in 000000000508
No valid BBox in 000000001111
.
.
.
No valid BBox in 000000208708
No valid BBox in 000000579023
No valid BBox in 000000579247
No valid BBox in 000000581087
Recorded 118287/118287 valid inputs
🧸 Found no stride of model, performed a dummy test for auto-anchor size
✅ Success load loss function
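
The log above ends without a traceback, so the exit status of the process is the next thing worth checking. The following is a minimal sketch for anyone reproducing this; `run_and_report` is a hypothetical helper, not part of the project. It relies on the POSIX convention that `subprocess` reports a negative return code when the child was killed by a signal. A SIGKILL (signal 9) is often, though not always, a sign that the process ran out of host memory; that cause is an assumption here, not something the log confirms.

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run cmd, let its output stream through, and describe how it ended.

    Hypothetical helper for diagnosis only. Returns (returncode, description).
    """
    proc = subprocess.run(cmd)
    code = proc.returncode
    if code == 0:
        desc = "exited normally"
    elif code < 0:
        # On POSIX, a negative value means the child was terminated by a signal.
        desc = f"killed by signal {-code}"
    else:
        desc = f"exited with error code {code}"
    return code, desc


# Example usage from a Colab cell, with the arguments taken from the command above:
# code, desc = run_and_report([
#     sys.executable, "yolo/lazy.py", "task=train", "task.data.batch_size=10",
#     "model=v9-s", "dataset=coco", "device=cuda", "task.epoch=1", "use_wandb=False",
# ])
# print(code, desc)
```

Including the reported status in the issue would help separate a crash inside the training code from a process that was terminated from outside.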

System Info (please complete the following information):

  • Colab
  • T4 GPU
  • Running !nvidia-smi shows CUDA version 12.4
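
The CUDA version printed by `nvidia-smi` is the highest version the driver supports, which may differ from the version PyTorch was built against. Below is a minimal sketch for collecting both in one place; `collect_system_info` is a hypothetical helper, and it assumes only that `torch` may or may not be installed and that `nvidia-smi` may or may not be on the path.

```python
import platform
import shutil
import subprocess


def collect_system_info():
    """Gather Python, PyTorch, and GPU details into a dict (hypothetical helper)."""
    info = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    try:
        import torch

        info["torch"] = torch.__version__
        # CUDA version PyTorch was compiled with (None for CPU-only builds).
        info["torch_cuda_build"] = torch.version.cuda
        info["cuda_available"] = torch.cuda.is_available()
    except ImportError:
        info["torch"] = "not installed"

    if shutil.which("nvidia-smi"):
        result = subprocess.run(
            [
                "nvidia-smi",
                "--query-gpu=name,driver_version,memory.total",
                "--format=csv,noheader",
            ],
            capture_output=True,
            text=True,
        )
        info["gpu"] = result.stdout.strip() or result.stderr.strip()
    else:
        info["gpu"] = "nvidia-smi not found"
    return info


for key, value in collect_system_info().items():
    print(f"{key}: {value}")
```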

Additional context

Other pretrained models fail as well, and so does training on a custom dataset.
