Fixed
Status Update
Comments
ra...@google.com <ra...@google.com> #2
We introduced a regression in beta02. Can you try using beta01 instead ?
ra...@google.com <ra...@google.com> #3
Also, beta03 is coming with a fix very soon.
d....@googlemail.com <d....@googlemail.com> #4
Thank you for your prompt response. I've checked the functioning of OneTimeWorkRequest in beta01 and it works as expected. Looking forward to beta03 for the fix.
ra...@google.com <ra...@google.com> #5
beta03 is out.
d....@googlemail.com <d....@googlemail.com> #6
In beta03, the bug seems fixed when JobScheduler backend is used.
But the bug seems to remain when AlarmManager backend is used.
The following step-by-step fails for me on emulator "Nexus 7 (2012) API 22", but passes on emulator "Pixel 2 XL API 27".
STEP BY STEP:
1. Create a "Hello world" Android app using Android Studio wizard and add implementation("android.arch.work:work-runtime:1.0.0-beta03") to dependencies
2. Add the following to MainActivity.onCreate():
final String name = Long.toString(System.currentTimeMillis()); // To ensure fresh name each run
final OneTimeWorkRequest workRequest = new OneTimeWorkRequest.Builder(LogPrinter.class)
.setBackoffCriteria(BackoffPolicy.LINEAR, WorkRequest.MIN_BACKOFF_MILLIS, TimeUnit.MILLISECONDS)
.addTag(name)
.build();
Log.e("Repro", "Calling enqueueUniqueWork(" + name + ", ...)");
WorkManager.getInstance().enqueueUniqueWork(name, ExistingWorkPolicy.REPLACE, workRequest);
where LogPrinter.class doWork() is:
@NonNull
@Override
public Result doWork() {
Log.e("Repro", "doWork() called. Now swipe away app in App Switcher. Tags: " + getTags());
SystemClock.sleep(10000);
Log.e("Repro", "Too late, try again. Tags: " + getTags());
return Result.success();
}
3. Run the app and check adb logcat -s Repro
4. When you see "Now swipe away app in App Switcher", do that.
EXPECTED:
You should see "doWork() called ..." again after max 10 seconds (due to specified backoff time), since the WorkManager framework should ensure that the Worker that never was finished gets a chance to run again.
ACTUAL:
On API 22, you never see "doWork() called ..." again (Test fails).
On API 27, you do see "doWork() called ..." again (Test pass).
Let me know if you need more input.
Best regards,
Martin Nordholts
But the bug seems to remain when AlarmManager backend is used.
The following step-by-step fails for me on emulator "Nexus 7 (2012) API 22", but passes on emulator "Pixel 2 XL API 27".
STEP BY STEP:
1. Create a "Hello world" Android app using Android Studio wizard and add implementation("android.arch.work:work-runtime:1.0.0-beta03") to dependencies
2. Add the following to MainActivity.onCreate():
final String name = Long.toString(System.currentTimeMillis()); // To ensure fresh name each run
final OneTimeWorkRequest workRequest = new OneTimeWorkRequest.Builder(LogPrinter.class)
.setBackoffCriteria(BackoffPolicy.LINEAR, WorkRequest.MIN_BACKOFF_MILLIS, TimeUnit.MILLISECONDS)
.addTag(name)
.build();
Log.e("Repro", "Calling enqueueUniqueWork(" + name + ", ...)");
WorkManager.getInstance().enqueueUniqueWork(name, ExistingWorkPolicy.REPLACE, workRequest);
where LogPrinter.class doWork() is:
@NonNull
@Override
public Result doWork() {
Log.e("Repro", "doWork() called. Now swipe away app in App Switcher. Tags: " + getTags());
SystemClock.sleep(10000);
Log.e("Repro", "Too late, try again. Tags: " + getTags());
return Result.success();
}
3. Run the app and check adb logcat -s Repro
4. When you see "Now swipe away app in App Switcher", do that.
EXPECTED:
You should see "doWork() called ..." again after max 10 seconds (due to specified backoff time), since the WorkManager framework should ensure that the Worker that never was finished gets a chance to run again.
ACTUAL:
On API 22, you never see "doWork() called ..." again (Test fails).
On API 27, you do see "doWork() called ..." again (Test pass).
Let me know if you need more input.
Best regards,
Martin Nordholts
ra...@google.com <ra...@google.com> #7
When a swipe to dismiss happens you should see a message in logcat which looks like:
W/ActivityManager( 1532): Scheduling restart of crashed service androidx.work.integration.testapp/androidx.work.impl.background.systemalarm.SystemAlarmService in 47470ms
Service crashes are different from Worker retries or backoffs. So you won't see it in 10 seconds. It might take longer. Look at the logs for more info.
W/ActivityManager( 1532): Scheduling restart of crashed service androidx.work.integration.testapp/androidx.work.impl.background.systemalarm.SystemAlarmService in 47470ms
Service crashes are different from Worker retries or backoffs. So you won't see it in 10 seconds. It might take longer. Look at the logs for more info.
ap...@google.com <ap...@google.com> #8
Thanks for the tip, this helped me debug this further.
Turns out that if you see "Scheduling restart of crashed service", then yes, doWork() will be called after the specified time.
But the problem is that SystemAlarmService is not always scheduled to restart, because SystemAlarmService is not always running when the app is closed.
In fact, it seems to be quite rare for SystemAlarmService to run.
If you increase the sleep from 10000 to 20000 in my step-by-step instructions, and let 10 seconds pass between ""doWork() called .., " being logged and you closing the app, I think you'll be able to reproduce the problem, namely that doWork() never becomes called again (since SystemAlarmService is not scheduled for a restart, since it wasn't running when the app was closed).
Are you able to reproduce the problem with these extra instructions?
Turns out that if you see "Scheduling restart of crashed service", then yes, doWork() will be called after the specified time.
But the problem is that SystemAlarmService is not always scheduled to restart, because SystemAlarmService is not always running when the app is closed.
In fact, it seems to be quite rare for SystemAlarmService to run.
If you increase the sleep from 10000 to 20000 in my step-by-step instructions, and let 10 seconds pass between ""doWork() called .., " being logged and you closing the app, I think you'll be able to reproduce the problem, namely that doWork() never becomes called again (since SystemAlarmService is not scheduled for a restart, since it wasn't running when the app was closed).
Are you able to reproduce the problem with these extra instructions?
Description
Version used: 1.0.1
Devices/Android versions reproduced on: different Samsung Galaxy devices running Android 8 and 9
I have received deadlock reports like this:
"main" prio=5 tid=1 Blocked
| group="main" sCount=1 dsCount=0 flags=1 obj=0x76790a58 self=0x7d81014c00
| sysTid=27874 nice=0 cgrp=default sched=0/0 handle=0x7e07c1a560
| state=S schedstat=( 1835900780 7285545518 5958 ) utm=82 stm=100 core=1 HZ=100
| stack=0x7febcdd000-0x7febcdf000 stackSize=8MB
| held mutexes=
at androidx.work.impl.constraints.trackers.ConstraintTracker.removeListener (ConstraintTracker.java:77)
- waiting to lock <0x08a4f10d> (a java.lang.Object) held by thread 13 // ConstraintTracker.mLock
at androidx.work.impl.constraints.controllers.ConstraintController.replace (ConstraintController.java:98)
at androidx.work.impl.constraints.WorkConstraintsTracker.replace (WorkConstraintsTracker.java:99)
- locked <0x0cbac8c2> (a java.lang.Object) // WorkConstrainsTracker.mLock
at androidx.work.impl.background.greedy.GreedyScheduler.removeConstraintTrackingFor (GreedyScheduler.java:153)
- locked <0x0cbec6d3> (a java.lang.Object) // GreedyScheduler.mLock
at androidx.work.impl.background.greedy.GreedyScheduler.onExecuted (GreedyScheduler.java:141)
at androidx.work.impl.Processor.onExecuted (Processor.java:230)
- locked <0x08012210> (a java.lang.Object)
at androidx.work.impl.Processor$FutureListener.run (Processor.java:263)
at android.os.Handler.handleCallback (Handler.java:873)
at android.os.Handler.dispatchMessage (Handler.java:99)
at android.os.Looper.loop (Looper.java:214)
at android.app.ActivityThread.main (ActivityThread.java:7032)
at java.lang.reflect.Method.invoke (Method.java)
at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run (RuntimeInit.java:494)
at com.android.internal.os.ZygoteInit.main (ZygoteInit.java:965)
"ConnectivityThread" tid=13 Blocked
"ConnectivityThread" prio=5 tid=13 Blocked
| group="main" sCount=1 dsCount=0 flags=1 obj=0x12fc0770 self=0x7d7ab86800
| sysTid=27897 nice=0 cgrp=default sched=0/0 handle=0x7d684114f0
| state=S schedstat=( 247567475 11539483672 1368 ) utm=16 stm=7 core=2 HZ=100
| stack=0x7d6830e000-0x7d68310000 stackSize=1041KB
| held mutexes=
at androidx.work.impl.a.d.c (SourceFile:157) // synchronized (mLock)
- waiting to lock <0x0cbac8c2> (a java.lang.Object) held by thread 1 // trying to lock WorkConstraintsTracker.mLock
at androidx.work.impl.a.a.c.b (SourceFile:133) ConstraintController: mCallback.onConstraintNotMet(mMatchingWorkSpecIds);
at androidx.work.impl.a.a.c.a (SourceFile:142) ConstraintController: updateCallback();
at androidx.work.impl.a.b.d.a (SourceFile:103) listener.onConstraintChanged(mCurrentState);
- locked <0x08a4f10d> (a java.lang.Object) (ConstraintTracker.mLock)
at androidx.work.impl.a.b.e$b.onCapabilitiesChanged (SourceFile:149) (NetworkStateTracker)
at android.net.ConnectivityManager$CallbackHandler.handleMessage (ConnectivityManager.java:3331)
at android.os.Handler.dispatchMessage (Handler.java:106)
at android.os.Looper.loop (Looper.java:214)
at android.os.HandlerThread.run (HandlerThread.java:65)
I have written after the lines what is happening there.
The GreedyScheduler has finished executing and onExecuted is called. This tries to remove the workSpecId from the mConstrainedWorkSpecs list. After removing it it calls mWorkConstraintsTracker.replace to send the new list to the WorkContraintsTracker. This causes a lock of WorkConstraintsTracker.mLock and a call to ConstraintController.replace. This calls ConstraintController.removeListener which tries to lock ConstraintController.mLock but fails/deadlocks. So as you can see it has first locked WorkConstraintsTracker.mLock and then it tries to lock ConstraintController.mLock.
As you can see now the other thread does the opposite, which leads to the deadlock: The WorkManager receives a "onCapabilitiesChanges" message and locks ConstraintTracker.mLock. Then it tries to call onConstraintNotMet but it cannot lock WorkConstraintsTracker.mLock, because it is already locked by the upper thread.
So there is a deadlock because both functions are running at the same time and are locking the locks in opposite order. Can you please fix it? Or did I make a mistake somewhere?
(BTW I am only using WorkManager.getInstance, I am not calling "initialize").
Thank you very much!