
Runpod SDK versions > 1.7.10 send all requests to the same worker even with 10 active workers and 50 max workers set #432

@ashleykleynhans


Describe the bug
Runpod SDK 1.7.12, which attempts to fix a bug in version 1.7.11, does fix the initial bug in local testing, but it's still broken on Runpod serverless: every single request is sent to the same worker, leading to massive delay times while requests wait for that worker to become available.

To Reproduce
Steps to reproduce the behavior:

  1. Create a serverless endpoint that uses Python SDK version 1.7.12.
  2. Deploy the endpoint with multiple max workers and some active workers.
  3. Send many concurrent requests to the endpoint.
  4. Observe that all requests are being sent to the same worker instead of multiple workers.
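
Steps 3 and 4 could be scripted roughly as follows. This is only a sketch under assumptions not stated in this report: that the endpoint is reachable at `https://api.runpod.ai/v2/<endpoint_id>/runsync`, that the JSON response carries a `workerId` field, and that the handler accepts the hypothetical payload `{"prompt": "ping"}`. The environment variable names `RUNPOD_ENDPOINT_ID` and `RUNPOD_API_KEY` are also made up for the example.

```python
import collections
import concurrent.futures
import json
import os
import urllib.request

# Assumption: base URL of the serverless endpoint API.
API_BASE = "https://api.runpod.ai/v2"


def call_runsync(endpoint_id, api_key, payload):
    """Send one synchronous request and return the decoded JSON response."""
    request = urllib.request.Request(
        f"{API_BASE}/{endpoint_id}/runsync",
        data=json.dumps({"input": payload}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.load(response)


def send_concurrently(send_one, count):
    """Call send_one() `count` times in parallel threads and collect the results."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=count) as pool:
        return list(pool.map(lambda _: send_one(), range(count)))


def tally_workers(responses):
    """Count how many responses each worker handled.

    Assumption: each response dict has a "workerId" key; responses
    without one are grouped under "unknown".
    """
    return collections.Counter(r.get("workerId", "unknown") for r in responses)


if __name__ == "__main__":
    endpoint_id = os.environ["RUNPOD_ENDPOINT_ID"]  # hypothetical variable name
    api_key = os.environ["RUNPOD_API_KEY"]          # hypothetical variable name
    responses = send_concurrently(
        lambda: call_runsync(endpoint_id, api_key, {"prompt": "ping"}),
        count=20,
    )
    for worker_id, handled in tally_workers(responses).most_common():
        print(f"{worker_id}: {handled} request(s)")
```

If the behavior described here is present, the tally should show a single worker ID handling all requests; with 1.7.10 the requests would be expected to spread over several worker IDs.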

Expected behavior
If an endpoint is configured with multiple max workers and active workers, new requests should be spread across workers (depending on the queue configuration) rather than all being sent to the same worker.

Screenshots

Image

Additional context

Reverting to SDK version 1.7.10 (since 1.7.11 is also broken) resolves the issue.
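
Until a fixed release is available, a worker image could guard against accidentally installing an affected version. The check below is a minimal sketch: it treats every version above 1.7.10 as affected, which matches what this report observed for 1.7.11 and 1.7.12 but is an assumption for any later release. The helper names are hypothetical and not part of the SDK.

```python
import itertools
from importlib import metadata

# Last version reported as working in this issue.
LAST_KNOWN_GOOD = (1, 7, 10)


def parse_version(text):
    """Turn "1.7.12" into (1, 7, 12), ignoring suffixes such as "rc1"."""
    numbers = []
    for piece in text.split(".")[:3]:
        leading_digits = "".join(itertools.takewhile(str.isdigit, piece))
        numbers.append(int(leading_digits or 0))
    # Pad short versions such as "1.7" to three components.
    while len(numbers) < 3:
        numbers.append(0)
    return tuple(numbers)


def is_affected(version_text):
    """Assumption: any version newer than 1.7.10 shows the routing problem."""
    return parse_version(version_text) > LAST_KNOWN_GOOD


if __name__ == "__main__":
    try:
        installed = metadata.version("runpod")
    except metadata.PackageNotFoundError:
        print("runpod is not installed")
    else:
        if is_affected(installed):
            print(f"runpod {installed} may route all requests to one worker; "
                  "consider pinning runpod==1.7.10")
        else:
            print(f"runpod {installed} is not newer than 1.7.10")
```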

Image

Labels: bug
