App Engine Flex Scaling issues [330416823]

Assigned

Feature Request

Status Update

No update yet.

Description

dh...@google.com

created issue #1

Mar 20, 2024 10:48AM


What you would like to accomplish:

App Engine Flex scaling issues

Scale automatically and process all requests in App Engine Flex, even when CPU utilization is high.

Comments

dh...@google.com <dh...@google.com> Mar 20, 2024 10:49AM

Assigned to gc...@google.com.

sa...@vaultedge.com <sa...@vaultedge.com> #2 Mar 21, 2024 02:09PM

I will try to explain the problem we are facing.

We run on the App Engine flexible environment with automatic scaling, which scales on CPU utilization. When CPU utilization falls below the configured threshold, it starts scaling down pods. The application receives a SIGTERM signal, followed by a 3-second wait, after which the pod is forcibly terminated. A pod selected for scale-down may still be serving a request, and 3 seconds is not enough for that request to complete, so the request fails.
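To illustrate the shutdown sequence described above, here is a minimal Python sketch of draining in-flight requests after SIGTERM. It is not code from the reporter's application: the names `InFlightTracker`, `install_sigterm_flag`, and `GRACE_SECONDS` are hypothetical, and the 3-second value is taken from this comment rather than from any documented guarantee. It also shows the limitation at the heart of this request: if a request needs longer than the grace window, `drain` returns `False` and the request is lost anyway.

```python
import signal
import threading

# Assumption: the 3-second window reported in this comment.
GRACE_SECONDS = 3.0


class InFlightTracker:
    """Counts running requests so a shutdown path can wait for them."""

    def __init__(self):
        self._cond = threading.Condition()
        self._active = 0
        self._draining = False

    def start_request(self):
        """Register a request; return False if the instance is draining."""
        with self._cond:
            if self._draining:
                return False
            self._active += 1
            return True

    def finish_request(self):
        """Mark one request as done and wake any waiting drain() call."""
        with self._cond:
            self._active -= 1
            self._cond.notify_all()

    def drain(self, timeout):
        """Refuse new work, then wait up to `timeout` seconds.

        Returns True if every in-flight request finished in time,
        False if the timeout expired with requests still running.
        """
        with self._cond:
            self._draining = True
            return self._cond.wait_for(
                lambda: self._active == 0, timeout=timeout
            )


def install_sigterm_flag():
    """Register a SIGTERM handler that only sets an Event; return the Event.

    Must be called from the main thread. Keeping the handler trivial
    avoids doing lock-heavy work inside a signal handler.
    """
    stop = threading.Event()
    signal.signal(signal.SIGTERM, lambda signum, frame: stop.set())
    return stop
```

A server's main thread could call `stop = install_sigterm_flag()`, block on `stop.wait()`, and then call `tracker.drain(GRACE_SECONDS)` before exiting. Request handlers would wrap their work in `start_request()` and `finish_request()`.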

Why can't we requeue the request during the 3-second wait?
Because the request starts with MFA from the user. To requeue it, we would have to go back to the user for MFA, and as an API provider that is not within our purview.

Ask:
We would like a parameter similar to the "Idle Timeout" setting available for Basic Scaling. We hope this would slow the scale-down process enough to reduce the number of requests that are terminated.
