Runpod SDK versions > 1.7.10 send all requests to the same worker, even with 10 active workers and 50 max workers set (#432)

**Describe the bug**
Runpod SDK 1.7.12, which attempts to fix a bug in version 1.7.11, does fix the initial bug in local testing. However, it is still broken on Runpod serverless: every request is sent to the same worker, so requests pile up waiting for that worker to become available, which causes massive delay times.

**To Reproduce**
Steps to reproduce the behavior:
1. Create a serverless endpoint that uses Python SDK version 1.7.12.
2. Deploy the endpoint with multiple max workers and some active workers.
3. Send a batch of concurrent requests to the endpoint.
4. Observe that all requests are being sent to the same worker instead of multiple workers.
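
The steps above can be scripted roughly as follows. This is a minimal sketch, not the exact script used for this report: it assumes the public `https://api.runpod.ai/v2/<endpoint_id>/runsync` route with a Bearer API key, and it assumes each response carries a `workerId` field. The helper names (`run_sync`, `tally_workers`, `fire`) and the `{"i": i}` payload are hypothetical.

```python
import json
import urllib.request
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

API_BASE = "https://api.runpod.ai/v2"  # assumed serverless API base URL


def run_sync(endpoint_id: str, api_key: str, payload: dict) -> dict:
    """Submit one job via /runsync and return the parsed JSON response."""
    req = urllib.request.Request(
        f"{API_BASE}/{endpoint_id}/runsync",
        data=json.dumps({"input": payload}).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=300) as resp:
        return json.load(resp)


def tally_workers(responses: list) -> Counter:
    """Count jobs per worker. Assumes a `workerId` key in each response."""
    return Counter(r.get("workerId", "unknown") for r in responses)


def fire(endpoint_id: str, api_key: str, n: int = 20) -> Counter:
    """Send n concurrent requests and tally which worker served each one."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        responses = list(
            pool.map(lambda i: run_sync(endpoint_id, api_key, {"i": i}), range(n))
        )
    return tally_workers(responses)
```

Calling `fire("<endpoint_id>", "<api_key>")` against an affected endpoint would be expected to return a counter with a single key, whereas a healthy endpoint with several active workers should show the jobs spread over multiple keys.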

**Expected behavior**
If an endpoint is configured with multiple max workers and active workers, new requests should be spread across the available workers (subject to the queue configuration) rather than all being sent to the same worker.

**Screenshots**

<img width="1027" alt="Image" src="https://github.com/user-attachments/assets/a4d043ba-b7f6-402d-9871-f053159bcb51" />

**Additional context**

Reverting to SDK version 1.7.10 (since 1.7.11 is also broken) resolves the issue.

<img width="1250" alt="Image" src="https://github.com/user-attachments/assets/aaf671df-4c2d-46a2-b63e-abed3fe1763b" />
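
As a stopgap, a worker image could check the installed SDK version at startup. The sketch below only flags the two versions this report names as broken (1.7.11 and 1.7.12); whether later releases are affected is not known here. The function names are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions this report identifies as broken; later releases are untested here.
AFFECTED = {"1.7.11", "1.7.12"}


def is_affected(installed: str) -> bool:
    """Return True if the given version string is one of the reported ones."""
    return installed.strip() in AFFECTED


def check_runpod_sdk() -> str:
    """Describe whether the installed `runpod` package is a reported version."""
    try:
        installed = version("runpod")
    except PackageNotFoundError:
        return "runpod is not installed"
    if is_affected(installed):
        return f"runpod {installed} is affected; consider pinning runpod==1.7.10"
    return f"runpod {installed} is not one of the reported versions"
```

Pinning `runpod==1.7.10` in the image's requirements is the workaround described above; the check only makes an accidental upgrade visible in the logs.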

