Assigned
Comments
va...@google.com <va...@google.com>
je...@google.com <je...@google.com> #2
Hello,
Thank you for reaching out.
This issue has been forwarded to our product team. Please keep in mind that the product team still has to analyze and consider it, so I can't provide an ETA for delivery. However, you can keep track of the status by following this thread.
Kind Regards
Description
What you would like to accomplish:
How this might work:
Currently, the Vertex AI Python SDK initializes the project and location globally using vertexai.init(project=PROJECT_ID, location=LOCATION), meaning all API calls within a session use the same location.
The Vertex AI REST API allows specifying different regions per request without reinitialization by including the location in the endpoint URL.
In the Python SDK, changing the location requires reinitializing the SDK with vertexai.init() before each request. While this provides flexibility in managing different regions, it may introduce additional latency.
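As a rough illustration of the REST behaviour described above, the location appears both in the hostname and in the resource path, so it can differ from one request to the next. The sketch below only builds the URL; the function name and the model ID argument are hypothetical, and the special-case hostname used for the "global" location is not handled.

```python
def build_predict_url(project_id: str, location: str, model_id: str) -> str:
    """Return a regional Vertex AI predict URL for a Google publisher model.

    Assumption: the common regional pattern
    https://{location}-aiplatform.googleapis.com/v1/projects/.../locations/...
    model_id is a placeholder supplied by the caller.
    """
    return (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/"
        f"publishers/google/models/{model_id}:predict"
    )


# Two requests can target two regions without any shared global state.
for region in ("us-central1", "europe-west4"):
    print(build_predict_url("my-project", region, "some-model"))
```

Sending the request still requires an authenticated HTTP client, which is outside the scope of this sketch.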
If applicable, reasons why alternative solutions are not sufficient:
The REST API offers a direct way to specify the region per request, but the customer prefers to continue using the Python SDK to maintain consistency and avoid significant code refactoring.
The Python SDK does not currently support per-request region selection without reinitialization, making it less efficient for dynamically distributing requests across multiple locations.
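For reference, the reinitialization workaround described above might look roughly like the following. This is a minimal sketch, not an SDK feature: run_in_location and its init_fn parameter are hypothetical names introduced here, and the only SDK call assumed is vertexai.init(project=..., location=...) as quoted in this request. Because that call sets session-wide state, this pattern is presumably unsafe when requests for different regions run concurrently.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def run_in_location(
    project_id: str,
    location: str,
    request_fn: Callable[[], T],
    init_fn: Optional[Callable[..., None]] = None,
) -> T:
    """Reinitialize the SDK for `location`, then run one request.

    init_fn is a hypothetical injection point (useful for testing);
    by default it resolves to vertexai.init.
    """
    if init_fn is None:
        import vertexai  # imported lazily so the helper loads without the SDK

        init_fn = vertexai.init
    # Global reinitialization: every later call in the session uses this location.
    init_fn(project=project_id, location=location)
    return request_fn()
```

A per-request location parameter in the SDK would make this wrapper, and the repeated init call, unnecessary.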
Other information (workarounds you have tried, documentation consulted, etc.):
Recommended transitioning to the REST API for per-request region specification.
Provided official documentation on Vertex AI locations and quota management:
Vertex AI Locations Documentation: https://cloud.google.com/vertex-ai/docs/general/locations
Image Generation using REST API: https://cloud.google.com/vertex-ai/docs/generative-ai/model-reference/image-generation
Suggested tracking usage across multiple locations and dynamically distributing requests to prevent exceeding quota limits.
Shared a relevant Google Cloud Community discussion on region specification in the Python SDK.
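The suggestion above to track usage per location and distribute requests dynamically could be sketched client-side as follows. Everything here is an assumption for illustration: the LocationPicker class, the per-location limits, and the 60-second window are hypothetical and do not reflect actual Vertex AI quota values or how the service enforces them.

```python
import time
from collections import deque
from typing import Callable, Deque, Dict, Mapping


class LocationPicker:
    """Pick the location with the most remaining local budget in a sliding window."""

    def __init__(
        self,
        limits: Mapping[str, int],
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._limits = dict(limits)  # hypothetical per-location request budgets
        self._window = window_seconds
        self._clock = clock
        self._sent: Dict[str, Deque[float]] = {loc: deque() for loc in self._limits}

    def pick(self) -> str:
        now = self._clock()
        best, best_headroom = None, 0
        for loc, limit in self._limits.items():
            sent = self._sent[loc]
            # Drop timestamps that have aged out of the window.
            while sent and now - sent[0] >= self._window:
                sent.popleft()
            headroom = limit - len(sent)
            if headroom > best_headroom:
                best, best_headroom = loc, headroom
        if best is None:
            raise RuntimeError("all locations are at their local limit")
        self._sent[best].append(now)  # record the request we are about to send
        return best
```

The returned location would then be passed to whichever per-request mechanism is available, today the REST endpoint URL or a reinitialized SDK session.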