Assigned
Comments
cl...@rakuten.com <cl...@rakuten.com> #2
Apologies, I intended to create a feature request, not a bug. I will create a feature request instead.
je...@google.com <je...@google.com> #3
Hello,
This issue report has been forwarded to the Cloud Dataproc Product team so that they may investigate it, but there is no ETA for a resolution today. Future updates regarding this issue will be provided here.
Description
Currently, when using managed clusters with workflow templates, the number of workers (primary or secondary) is parameterizable [1], but the autoscaling policy URI is not:
INVALID_ARGUMENT: Invalid field path placement.managed_cluster.configuration.autoscaling_config.policy_uri: Field policy_uri does not exist.
This makes the worker-count parameters unusable when the cluster has autoscaling, because the worker counts in the autoscaling policy need to be updated as well (especially for primary workers, which have a fixed number).
One use case we have is updating the number of instances from Airflow, based on an Airflow variable or on specific criteria such as pipeline delay. To work around this, we currently have to duplicate templates per job for each resource setting, which adds maintenance effort.
[1]
What you expected to happen:
Being able to pass placement.managed_cluster.configuration.autoscaling_config.policy_uri as a parameter in a workflow template, so that templates can be factored out and cluster resources can be set from an Airflow DAG.
Steps to reproduce:
- Create a workflow template with an autoscaling policy URI parameter and import it.
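For illustration, the step above can be sketched as a Python dict that mirrors the shape of such a template. This is only a sketch under assumptions: the field path is copied from the INVALID_ARGUMENT message in the description (the exact spelling used in the original template is not shown), and the cluster name, parameter name, and policy URI are hypothetical placeholders. Job steps are omitted.

```python
import json

# Field path copied verbatim from the error message in this report.
# Assumption: the original template may have spelled it differently.
POLICY_URI_FIELD = (
    "placement.managed_cluster.configuration.autoscaling_config.policy_uri"
)


def build_template(policy_uri: str) -> dict:
    """Return a template dict that declares the policy URI as a parameter."""
    return {
        "placement": {
            "managedCluster": {
                "clusterName": "example-cluster",  # hypothetical name
                "config": {
                    "autoscalingConfig": {"policyUri": policy_uri},
                },
            }
        },
        "jobs": [],  # job steps omitted from this sketch
        "parameters": [
            {
                "name": "AUTOSCALING_POLICY_URI",  # hypothetical name
                "fields": [POLICY_URI_FIELD],
            }
        ],
    }


# Placeholder URI, not a real policy.
template = build_template(
    "projects/example-project/locations/example-region/"
    "autoscalingPolicies/example-policy"
)
print(json.dumps(template, indent=2))
```

Importing a template that declares a parameter on this field is what is reported to fail with the "Field policy_uri does not exist" error.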
Other information (workarounds you have tried, documentation consulted, etc):
The workaround is to duplicate workflow templates for each resource setting.
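A minimal sketch of how that duplication could be scripted rather than maintained by hand, assuming the templates are available as Python dicts shaped like the Dataproc REST representation (placement.managedCluster.config with workerConfig.numInstances and autoscalingConfig.policyUri). The helper name, template IDs, sizes, and policy URIs below are hypothetical.

```python
import copy


def make_variants(base_id: str, base: dict, settings: dict) -> dict:
    """Return one template copy per resource setting, keyed by template ID.

    settings maps a suffix (e.g. "small") to a tuple of
    (primary worker count, autoscaling policy URI).
    """
    variants = {}
    for suffix, (num_workers, policy_uri) in settings.items():
        template = copy.deepcopy(base)  # leave the base template untouched
        config = template["placement"]["managedCluster"]["config"]
        config.setdefault("workerConfig", {})["numInstances"] = num_workers
        config.setdefault("autoscalingConfig", {})["policyUri"] = policy_uri
        variants[f"{base_id}-{suffix}"] = template
    return variants


# Hypothetical base template and resource settings.
base_template = {
    "placement": {"managedCluster": {"clusterName": "example", "config": {}}},
    "jobs": [],  # job steps omitted from this sketch
}
resource_settings = {
    "small": (2, "projects/example-project/locations/example-region/autoscalingPolicies/small"),
    "large": (10, "projects/example-project/locations/example-region/autoscalingPolicies/large"),
}

variants = make_variants("example-job", base_template, resource_settings)
for template_id in sorted(variants):
    print(template_id)
```

An Airflow DAG could then pick one of the generated template IDs based on a variable or on pipeline delay. This still leaves one template per setting, which is the maintenance overhead this request aims to remove.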
Documentation: