Assigned
Status Update
Comments
mi...@google.com <mi...@google.com>
at...@google.com <at...@google.com> #2
The Cloud ML Engine engineering team has been notified of the interest in having this feature implemented.
Kindly note there are no ETAs or guarantees for this feature to be available. Future updates will be shared on this thread.
Description
What you would like to accomplish:
In order to reduce the latency of online predictions, the customer considers it necessary to have some customization options for the autoscaling algorithm.
How this might work:
• Option to specify the idle time: Currently, the documentation states that the service scales down to zero after several minutes without a prediction request. The customer wants to specify how many minutes the workers should stay up before the service scales down.
• Option to combine a minimum node count with scaling down to zero: Currently, setting the minNodes option disables scaling down to zero. The customer wants the service, after it has scaled down to zero and then receives a new request, to scale back up to a minimum number of nodes.
• Option to add nodes more aggressively: For example, the customer wants to specify that if utilization rises above 50%, more nodes are added. A sketch of how these options might combine is shown after this list.
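To make the requested behaviour concrete, below is a minimal, hypothetical sketch of the scaling policy described above. The names (idle_timeout_minutes, min_nodes_on_wakeup, scale_up_utilization) are not existing Cloud ML Engine / AI Platform options; they are assumptions used only to illustrate how the three requests could be expressed as a single policy.

```python
import time
from dataclasses import dataclass


@dataclass
class AutoscalingPolicy:
    # Hypothetical knobs illustrating the feature request; these are NOT
    # existing Cloud ML Engine / AI Platform options.
    idle_timeout_minutes: float = 10.0   # keep workers up this long with no traffic
    min_nodes_on_wakeup: int = 2         # nodes to start when waking up from zero
    scale_up_utilization: float = 0.5    # add nodes once utilization exceeds this


class Autoscaler:
    def __init__(self, policy: AutoscalingPolicy):
        self.policy = policy
        self.nodes = 0
        self.last_request_time = None

    def on_request(self, now: float) -> None:
        """Called when a prediction request arrives."""
        self.last_request_time = now
        if self.nodes == 0:
            # Request 2: after scaling to zero, come back up to a minimum
            # node count rather than a single node.
            self.nodes = self.policy.min_nodes_on_wakeup

    def on_metrics(self, now: float, utilization: float) -> None:
        """Called periodically with the current aggregate node utilization."""
        if self.nodes == 0:
            return
        # Request 3: scale up aggressively once utilization crosses a
        # user-defined threshold (e.g. 50%).
        if utilization > self.policy.scale_up_utilization:
            self.nodes += 1
        # Request 1: only scale down to zero after a user-defined idle period.
        idle_seconds = now - (self.last_request_time or now)
        if idle_seconds >= self.policy.idle_timeout_minutes * 60:
            self.nodes = 0


if __name__ == "__main__":
    scaler = Autoscaler(AutoscalingPolicy(idle_timeout_minutes=15,
                                          min_nodes_on_wakeup=2,
                                          scale_up_utilization=0.5))
    t = time.time()
    scaler.on_request(t)                  # wake up from zero -> 2 nodes
    scaler.on_metrics(t + 60, 0.7)        # 70% utilization -> 3 nodes
    scaler.on_metrics(t + 20 * 60, 0.0)   # idle for 20 minutes -> back to 0 nodes
    print(scaler.nodes)
```

This is only a simulation of the desired policy, not a proposal for how the service would implement it internally.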
If applicable, reasons why alternative solutions are not sufficient:
Low latency is business critical for the customer, but they also want to avoid paying for unused resources during periods when their own customers are unlikely to submit requests. The options above would let them guarantee low latency most of the time without keeping idle nodes running.
Other information (workarounds you have tried, documentation consulted, etc):