-
Notifications
You must be signed in to change notification settings - Fork 325
feat(bigquery): Add support for Bigquery reservations in config #5543
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
This looks good once the build is green |
|
we have been running this in prod for several weeks on our BigQuery deployments and every thing is working as expected. Not super sure how to make CI pass. I have done the pre-commit stuff but still get failures. Any tips of docs for getting me up to speed @izeigerman ? I successfully ran |
|
Hey @scottypate 👋 The errors appear to be related to this test: I think the addition of Can you take a stab at fixing it? You can run the specific test using pytest and then do |
|
@georgesittas that was it! Thank you for getting me onboarded on how to run the tests! I updated the test fixture to account for the reservation id. |
georgesittas
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@scottypate a few questions to make sure I understand this properly, LGTM otherwise.
| # Set reservation directly on the job_config object if specified | ||
| reservation_id = self._extra_config.get("reservation_id") | ||
| if reservation_id: | ||
| job_config.reservation = reservation_id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious about three things here:
- Should we be checking
if reservation_id is not Noneinstead of whether it's simply falsey? - How come we're setting a field
reservationwith a property calledreservation_id? Are you sure this is the right attribute in theQueryJobConfiginstance? Seems like it contains a property/getter for accessing thereservation_id. - Is this the recommended way for setting the reservation ID in the config? What about the
QueryJobConfig's constructor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we be checking if reservation_id is not None instead of whether it's simply falsey?
Yes, I made this update to follow the same pattern used for the maximum_bytes_billed property.
How come we're setting a field reservation with a property called reservation_id? Are you sure this is the right attribute in the QueryJobConfig instance? Seems like it [contains a property/getter](https://github.com/googleapis/python-bigquery/blob/46764a59ca7a21ed14ad2c91eb7f98c302736c22/google/cloud/bigquery/job/base.py#L540-L550) for accessing the reservation_id.
Agreed, the better name is reservation instead of reservation_id. I updated this also.
Is this the recommended way for setting the reservation ID in the config? What about the QueryJobConfig's constructor?
I updated the _job_params dictionary, it gets passed to QueryJobConfig(**self._job_params) just like priority, and maximum_bytes_billed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, @scottypate, I dug a bit deeper in bigquery's code myself. I'm not sure if we can rely on _job_params to populate reservation, because the constructor sets attributes, whereas the getter/setter properties look into _properties.
So I wonder if we should just revert to what you did. Have you tested this new code you added end-to-end? Does it work?
Add support for a BigQuery reservation to be specified. This allows users to optionally direct the SQLMesh jobs to a particular BigQuery compute reservation. This is needed for workload management, i.e. you don't want ad-hoc interactive queries blocking data pipelines.