Background

We are upgrading to targetSdkVersion 34, which introduces the following behavior change for work scheduling:

Since its introduction, JobScheduler expects your app to return from onStartJob or onStopJob within a few seconds. Prior to Android 14, if a job runs too long, it stops and fails silently. If your app targets Android 14 (API level 34) or higher and exceeds the granted time on the main thread, the app triggers an ANR with the error message "No response to onStartJob" or "No response to onStopJob". Consider migrating to WorkManager, which provides support for asynchronous processing or migrating any heavy work into a background thread.

We are observing a significant increase in background (BG) ANRs with the error message "No response to onStopJob", even though we already use WorkManager in all applicable scenarios. I can share a doc over email with more internal details on our JobScheduler usages that could be problematic.

Issue

Below are some of the top stack traces as reported through the Play Store. We also have separate reports through AppExitInfo, which contain only "No response to onStopJob" without any stack traces.

  #00  pc 0x0000000000058fbc  /apex/com.android.runtime/lib64/bionic/libc.so (syscall+28)
  #01  pc 0x000000000023247c  /apex/com.android.art/lib64/libart.so (art::ConditionVariable::WaitHoldingLocks+140)
  #02  pc 0x000000000045b218  /apex/com.android.art/lib64/libart.so (artJniMethodEnd+336)
  #03  pc 0x00000000005bf0fc  /apex/com.android.art/lib64/libart.so (art_jni_method_end+12)
  at android.os.BinderProxy.transactNative (Native method)
  at android.os.BinderProxy.transact (BinderProxy.java:628)
  at android.app.job.IJobCallback$Stub$Proxy.acknowledgeStopMessage (IJobCallback.java:456)
  at android.app.job.JobServiceEngine$JobHandler.ackStopMessage (JobServiceEngine.java:355)
  at android.app.job.JobServiceEngine$JobHandler.handleMessage (JobServiceEngine.java:180)
  at android.os.Handler.dispatchMessage (Handler.java:108)
  at android.os.Looper.loopOnce (Looper.java:226)
  at android.os.Looper.loop (Looper.java:328)
  at android.app.ActivityThread.main (ActivityThread.java:9229)
  at java.lang.reflect.Method.invoke (Native method)
  at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run (RuntimeInit.java:586)
  at com.android.internal.os.ZygoteInit.main (ZygoteInit.java:1099)

and

  at androidx.work.impl.background.systemjob.SystemJobService.onStopJob (SystemJobService.java:62)
  at android.app.job.JobService$1.onStopJob (JobService.java:107)
  at android.app.job.JobServiceEngine$JobHandler.handleMessage (JobServiceEngine.java:179)
  at android.os.Handler.dispatchMessage (Handler.java:111)
  at android.os.Looper.loopOnce (Looper.java:242)
  at android.os.Looper.loop (Looper.java:362)
  at android.app.ActivityThread.main (ActivityThread.java:8448)
  at java.lang.reflect.Method.invoke (Native method)
  at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run (RuntimeInit.java:552)
  at com.android.internal.os.ZygoteInit.main (ZygoteInit.java:992)

Additional Info

We tried rolling out the Target SDK 34 version bump last month, but saw a significant spike in BG ANRs on VIVO devices. We worked with VIVO, who traced the problem to a bad interaction with app freezing; that issue has now been fixed.

The remaining AppExitInfo reports contain the following, which may indicate similar app freezer interaction issues on other OEMs' devices as well.

----- Waiting Channels: pid 26392 at 2024-07-15 14:38:20.900064287-0500 -----
Cmd line: com.snapchat.android

sysTid=26392     do_freezer_trap
sysTid=26401     do_freezer_trap
sysTid=26402     do_freezer_trap
sysTid=26403     do_freezer_trap
sysTid=26404     do_freezer_trap
sysTid=26405     do_freezer_trap
sysTid=26406     do_freezer_trap
sysTid=26407     do_freezer_trap
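
For triage, a dump like the one above can be checked mechanically for the "every thread is parked in the freezer" pattern. The following is only a minimal sketch, assuming the dump keeps the "sysTid=<tid> <wait channel>" line layout shown here; the class and method names (WaitingChannels, parse, countInChannel, looksFrozen) are hypothetical helpers and not part of any Android API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not an Android or WorkManager API.
final class WaitingChannels {
    // Matches lines such as "sysTid=26392     do_freezer_trap".
    // The wait channel is optional because a running thread may have none.
    private static final Pattern LINE = Pattern.compile("sysTid=(\\d+)\\s*(\\S*)");

    /** Returns thread id -> wait channel for every sysTid line in the dump. */
    static Map<Integer, String> parse(String dump) {
        Map<Integer, String> channels = new LinkedHashMap<>();
        for (String line : dump.split("\\R")) {
            Matcher m = LINE.matcher(line.trim());
            if (m.matches()) {
                channels.put(Integer.parseInt(m.group(1)), m.group(2));
            }
        }
        return channels;
    }

    /** Counts the threads whose wait channel equals the given name. */
    static long countInChannel(Map<Integer, String> channels, String channel) {
        return channels.values().stream().filter(channel::equals).count();
    }

    /** True when at least one thread was parsed and all are in do_freezer_trap. */
    static boolean looksFrozen(Map<Integer, String> channels) {
        return !channels.isEmpty()
                && countInChannel(channels, "do_freezer_trap") == channels.size();
    }

    public static void main(String[] args) {
        // Sample lines copied from the report above (shortened to three threads).
        String dump = String.join("\n",
                "Cmd line: com.snapchat.android",
                "",
                "sysTid=26392     do_freezer_trap",
                "sysTid=26401     do_freezer_trap",
                "sysTid=26402     do_freezer_trap");
        Map<Integer, String> channels = parse(dump);
        System.out.println("threads=" + channels.size()
                + " frozen=" + countInChannel(channels, "do_freezer_trap")
                + " allFrozen=" + looksFrozen(channels));
    }
}
```

Running a check like this across the collected reports and grouping the results by device manufacturer could help confirm whether the frozen-process pattern is specific to certain OEMs.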

I've attached a screenshot of the top affected devices. We may need to reach out to other OEMs (e.g. MOTOROLA, GOOGLE, SONY) to address this problem.

As a consequence of this issue, affected devices are seeing more cold starts in place of warm / hot starts.