Inconsistent CLI parameters to specify resources requests and limits across training and inference #308

@giuseppeporcelli

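Training and inference use different CLI parameters to specify resource requests and limits. For training, each resource has its own flag: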

  --accelerators INTEGER          Number of accelerators (GPUs/TPUs)
  --vcpu TEXT                     Number of vCPUs
  --memory TEXT                   Amount of memory in GiB
  --accelerators-limit INTEGER    Limit for the number of accelerators (GPUs/TPUs)
  --vcpu-limit TEXT               Limit for the number of vCPUs
  --memory-limit TEXT             Limit for the amount of memory in GiB

For inference (custom endpoints), the CLI parameters are:

  --resources-requests JSON       JSON object of resource requests, e.g. '{"cpu":"1","memory":"2Gi"}'
  --resources-limits JSON         JSON object of resource limits, e.g. '{"cpu":"2","memory":"4Gi"}'

Also, as mentioned in #306, the CLI should offer a consistent way to specify EFA and Neuron devices.
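
To illustrate how the two styles could map onto each other, here is a minimal sketch that converts the flat training-style values into the JSON object the inference flags expect. The function name `flags_to_resources`, the `accelerator_key` default of `nvidia.com/gpu`, and the `extra_devices` parameter are all hypothetical and not part of the current CLI; the sketch also assumes `--memory` is a plain GiB number, as the help text above suggests.

```python
import json
from typing import Dict, Optional


def flags_to_resources(
    accelerators: Optional[int] = None,
    vcpu: Optional[str] = None,
    memory: Optional[str] = None,
    accelerator_key: str = "nvidia.com/gpu",  # assumption: resource name for GPUs
    extra_devices: Optional[Dict[str, int]] = None,  # hypothetical: e.g. EFA or Neuron keys
) -> Dict[str, str]:
    """Build a Kubernetes-style resources dict from flat flag values."""
    resources: Dict[str, str] = {}
    if vcpu is not None:
        resources["cpu"] = str(vcpu)
    if memory is not None:
        # The training flags take memory in GiB; the JSON form uses a "Gi" suffix.
        resources["memory"] = f"{memory}Gi"
    if accelerators is not None:
        resources[accelerator_key] = str(accelerators)
    for key, count in (extra_devices or {}).items():
        resources[key] = str(count)
    return resources


def to_json_flag(resources: Dict[str, str]) -> str:
    """Serialize compactly, matching the style of the --resources-requests example."""
    return json.dumps(resources, separators=(",", ":"))
```

For example, `to_json_flag(flags_to_resources(vcpu="1", memory="2"))` produces the same string as the `--resources-requests` example above. Whichever style the CLI settles on, a single mapping like this would let both commands accept the same inputs.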
