Request for new functionality
Description
Spark 2.0 and above supports graceful shutdown of a Spark streaming application by setting the flag `spark.streaming.stopGracefullyOnShutdown` to true. However, this is not honored when terminating the Spark job with the gcloud CLI: `gcloud dataproc jobs kill <job_id>`.
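For context, a minimal sketch of how the flag is typically enabled when the session is created; the property name is the one referenced above, everything else (app name, structure) is illustrative:

```python
from pyspark.sql import SparkSession

# Enable graceful shutdown so a SIGTERM to the driver lets in-flight
# work finish before the streaming application exits.
spark = (
    SparkSession.builder
    .appName("streaming-app")  # illustrative app name
    .config("spark.streaming.stopGracefullyOnShutdown", "true")
    .getOrCreate()
)
```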
What you expected to happen:
`gcloud dataproc jobs kill <job_id>` should have an option for graceful shutdown, which internally sends SIGTERM to the running operating-system process associated with the Spark job,
something similar to the following:
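As an illustration only (the flag name below is hypothetical and does not exist in the current gcloud CLI):

`gcloud dataproc jobs kill <job_id> --graceful-shutdown`

where the agent would send SIGTERM to the Spark driver process and give the streaming application time to drain in-flight batches before force-killing it.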
Steps to reproduce:
1. Create a Spark streaming job.
2. Set `spark.streaming.stopGracefullyOnShutdown` to true in the Spark config.
3. Use a foreachBatch output sink that sleeps for 5 * trigger_interval seconds (see the sketch after the expected result below).
4. Ingest sample data from any data source.
5. Kill the job while foreachBatch is sleeping.
Result: the job is killed immediately.
Expected: the job should wait up to 10 * trigger_interval seconds before being killed (10 * trigger_interval is Spark's default timeout for graceful shutdown).
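A minimal sketch of such a reproduction, assuming a PySpark Structured Streaming job with the built-in `rate` source and an illustrative trigger interval of 10 seconds; all names and values are illustrative, not taken from an actual Dataproc job:

```python
import time

from pyspark.sql import SparkSession

TRIGGER_INTERVAL_SECONDS = 10  # illustrative trigger interval

spark = (
    SparkSession.builder
    .appName("graceful-shutdown-repro")  # illustrative app name
    # Flag from the report: lets SIGTERM trigger a graceful stop.
    .config("spark.streaming.stopGracefullyOnShutdown", "true")
    .getOrCreate()
)

# The built-in rate source stands in for "any data source".
stream = spark.readStream.format("rate").option("rowsPerSecond", 10).load()


def slow_batch(batch_df, batch_id):
    # Sleep for 5 * trigger_interval so the job can be killed while a
    # micro-batch is still in flight.
    time.sleep(5 * TRIGGER_INTERVAL_SECONDS)
    print(f"batch {batch_id}: {batch_df.count()} rows")


query = (
    stream.writeStream
    .foreachBatch(slow_batch)
    .trigger(processingTime=f"{TRIGGER_INTERVAL_SECONDS} seconds")
    .start()
)
query.awaitTermination()
```

Submit this as a Dataproc PySpark job, then run `gcloud dataproc jobs kill <job_id>` while `slow_batch` is sleeping: the job terminates immediately rather than waiting out the graceful-shutdown timeout.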
Other information (workarounds you have tried, documentation consulted, etc):