Fixed
Status Update
Comments
ru...@gmail.com <ru...@gmail.com> #2
Yigit, do you have time to fix it?
Re-emission of the same LiveData is racy
ru...@gmail.com <ru...@gmail.com> #4
Thanks for the detailed analysis. This may not be an issue anymore since we've started using Main.immediate there, but I'm not sure; I'll try to create a test case.
ru...@gmail.com <ru...@gmail.com> #5
Just emitting the same LiveData twice reproduces the issue:
@Test
fun raceTest() {
    val subLiveData = MutableLiveData(1)
    val subject = liveData(testScope.coroutineContext) {
        emitSource(subLiveData)
        emitSource(subLiveData) // crashes
    }
    subject.addObserver().apply {
        testScope.advanceUntilIdle()
    }
}
su...@google.com <su...@google.com> #6
With 2.2.0-alpha04 (which uses Main.immediate), the issue still seems to be there (I tested it by calling emitSource() twice, like your test case in #5).
ru...@gmail.com <ru...@gmail.com> #7
Yeah, sorry, Main.immediate does not fix it.
I actually have a WIP fix for it:
https://android-review.googlesource.com/c/platform/frameworks/support/+/1112186
If your case is the one I found (emitting the same LiveData multiple times, as shown in #5), you can work around it by adding a dummy transformation:
val subLiveData = MutableLiveData(1)
val subject = liveData(testScope.coroutineContext) {
    emitSource(subLiveData.map { it })
    emitSource(subLiveData.map { it })
}
ru...@gmail.com <ru...@gmail.com> #8
Project: platform/frameworks/support
Branch: androidx-master-dev
commit af12e75e6b4110f48e44ca121466943909de8f06
Author: Yigit Boyar <yboyar@google.com>
Date: Tue Sep 03 12:58:11 2019
Fix coroutine livedata race condition
Fix coroutine livedata race condition
This CL fixes a bug in the liveData builder where emitting the same
LiveData source twice would make it crash, because the second
emission's registration could happen before the first one is
removed as a source.
We fix it by using a suspending dispose function. It does feel
a bit hacky, but we cannot make DisposableHandle.dispose async
and we do not want to block there. This does not mean that there
is a problem if the developer disposes it manually, since our emit
functions take care of making sure it disposes (and there is
no other way to add a source to the underlying MediatorLiveData).
Bug: 140249349
Test: BuildLiveDataTest#raceTest_*
Change-Id: I0b464c242a583da4669af195cf2504e2adc4de40
M lifecycle/lifecycle-livedata-ktx/api/2.2.0-alpha05.txt
M lifecycle/lifecycle-livedata-ktx/api/current.txt
M lifecycle/lifecycle-livedata-ktx/api/public_plus_experimental_2.2.0-alpha05.txt
M lifecycle/lifecycle-livedata-ktx/api/public_plus_experimental_current.txt
M lifecycle/lifecycle-livedata-ktx/api/restricted_2.2.0-alpha05.txt
M lifecycle/lifecycle-livedata-ktx/api/restricted_current.txt
M lifecycle/lifecycle-livedata-ktx/src/main/java/androidx/lifecycle/CoroutineLiveData.kt
M lifecycle/lifecycle-livedata-ktx/src/test/java/androidx/lifecycle/BuildLiveDataTest.kt
https://android-review.googlesource.com/1112186
https://goto.google.com/android-sha1/af12e75e6b4110f48e44ca121466943909de8f06
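The "suspending dispose function" described in the commit message could look roughly like the sketch below. This is illustrative only (the class and function names are made up, not the actual CoroutineLiveData.kt code): removing the previous source suspends until it has run on Main.immediate, so the old registration is guaranteed to be gone before the builder adds the same LiveData again.

import androidx.lifecycle.LiveData
import androidx.lifecycle.MediatorLiveData
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.withContext

// Illustrative sketch of a suspending dispose: wait until removeSource()
// has actually run on the main thread before letting the next emitSource()
// register the same LiveData again.
class EmittedSourceSketch<T>(
    private val source: LiveData<T>,
    private val mediator: MediatorLiveData<T>
) {
    suspend fun disposeNow() = withContext(Dispatchers.Main.immediate) {
        mediator.removeSource(source)
    }
}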
ra...@google.com <ra...@google.com> #9
"And shouldn't the behaviour be consistent across all Android versions?"
The behavior is actually consistent. Once your worker starts running, if we get a call to enqueueUniquePeriodicWork with a new work request and ExistingPeriodicWorkPolicy.REPLACE, we will stop the existing work.
If your Worker was not running, then there is nothing to cancel. This is just a matter of timing.
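To make the policy difference concrete, a small sketch (the worker class and unique-work name below are placeholders, not from the reporter's sample):

import android.content.Context
import androidx.work.ExistingPeriodicWorkPolicy
import androidx.work.PeriodicWorkRequest
import androidx.work.WorkManager
import androidx.work.Worker
import androidx.work.WorkerParameters
import java.util.concurrent.TimeUnit

// Placeholder worker so the snippet is self-contained.
class PlaceholderWorker(ctx: Context, params: WorkerParameters) : Worker(ctx, params) {
    override fun doWork(): Result = Result.success()
}

fun enqueue(policy: ExistingPeriodicWorkPolicy) {
    val request = PeriodicWorkRequest.Builder(PlaceholderWorker::class.java, 1, TimeUnit.HOURS)
        .build()
    // REPLACE cancels pending or running work enqueued under this unique name
    // before enqueuing the new request; KEEP ignores the new request when work
    // with this name already exists, so nothing gets cancelled.
    WorkManager.getInstance().enqueueUniquePeriodicWork("placeholder-sync", policy, request)
}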
ru...@gmail.com <ru...@gmail.com> #10
I'm not arguing against the expected behavior of ExistingPeriodicWorkPolicy.REPLACE, since that's working as intended. But this is happening only when setRequiredNetworkType(NetworkType.UNMETERED) is used together with ExistingPeriodicWorkPolicy.KEEP.
ra...@google.com <ra...@google.com> #11
Can you send us an updated sample?
ra...@google.com <ra...@google.com> #12
Looks like you have updated the sample, let me take another look.
ru...@gmail.com <ru...@gmail.com> #13
Yes, I've updated the sample on github. Thanks for taking another look.
ra...@google.com <ra...@google.com> #14
I looked at your code. It looks like the constraints are becoming unmet after the work starts in some cases.
So essentially, your worker starts, and mid-way the constraints become unmet. At this point, because your Worker may not recover gracefully, we cancel and reschedule the work. This actually happens for all your workers (if you turn on verbose logging you can see that).
This too is expected.
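For illustration, a sketch of what a Worker that handles this cancellation gracefully might look like (all helper names here are made up):

import android.content.Context
import androidx.work.Worker
import androidx.work.WorkerParameters

// Illustrative only: when constraints become unmet mid-run, WorkManager stops
// the worker (onStopped() is called and isStopped becomes true). A worker that
// checks isStopped can bail out cleanly instead of being cancelled mid-request.
class CooperativeSyncWorker(ctx: Context, params: WorkerParameters) : Worker(ctx, params) {

    override fun doWork(): Result {
        for (item in loadPendingItems()) {
            if (isStopped) return Result.retry() // stop early; try again later
            uploadItem(item)
        }
        return Result.success()
    }

    override fun onStopped() {
        // Release resources (close connections, cancel in-flight calls) here.
    }

    // Made-up helpers, stubbed so the sketch compiles.
    private fun loadPendingItems(): List<String> = emptyList()
    private fun uploadItem(item: String) { /* no-op */ }
}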
to...@gmail.com <to...@gmail.com> #15
We have exactly the same issue in a major app, where SDK versions < 23 alone account for a six-figure number of installs. We use WorkManager to synchronize data from an API; this should only happen every couple of minutes, but due to this issue it leads to multiple API calls per second on those older Android versions, generating an "expensive" load on our cloud-based infrastructure.
su...@google.com <su...@google.com> #16
Re: #13/14: We're looking at it some more.
#15: File a separate bug, with sample code or a sample app. The issue is vague enough that there could be something completely different happening.
ru...@gmail.com <ru...@gmail.com> #17
#14: Even with verbose logging selected, I only see TestWorkerA being cancelled. The other workers are still successful.
ra...@google.com <ra...@google.com>
to...@gmail.com <to...@gmail.com> #18
I can confirm the root cause is exactly the same here as in our app:
java.util.concurrent.CancellationException: Task was cancelled.
at androidx.work.impl.utils.futures.AbstractFuture.cancellationExceptionWithCause(AbstractFuture.java:1184)
at androidx.work.impl.utils.futures.AbstractFuture.getDoneValue(AbstractFuture.java:514)
at androidx.work.impl.utils.futures.AbstractFuture.get(AbstractFuture.java:475)
at androidx.work.impl.WorkerWrapper$2.run(WorkerWrapper.java:245)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1112)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:587)
at java.lang.Thread.run(Thread.java:841)
In our case:
fun scheduleHighlightsSync() {
    val request = PeriodicWorkRequest.Builder(SyncHighlightsWorker::class.java, 1, TimeUnit.HOURS)
        .setConstraints(
            Constraints.Builder()
                .setRequiredNetworkType(NetworkType.CONNECTED)
                .setRequiresBatteryNotLow(true)
                .build()
        )
        .build()
    WorkManager.getInstance().enqueueUniquePeriodicWork(
        SYNC_HIGHLIGHTS,
        ExistingPeriodicWorkPolicy.KEEP,
        request
    )
}
According to the logs, TimeUnit.HOURS is ignored and the worker runs several times per second on SDK < 23.
ru...@gmail.com <ru...@gmail.com> #19
01-09 18:36:22.440 12508-12528/com.github.rubensousa.workkitkatbug D/TestWorkerA: I did some work
01-09 18:36:22.450 12508-12529/com.github.rubensousa.workkitkatbug D/TestWorkerB: I did some work
01-09 18:36:22.740 12508-12528/com.github.rubensousa.workkitkatbug D/TestWorkerC: I did some work
01-09 18:36:22.750 12508-12529/com.github.rubensousa.workkitkatbug D/TestWorkerD: I did some work
01-09 18:36:23.050 12508-12528/com.github.rubensousa.workkitkatbug D/TestWorkerE: I did some work
01-09 18:36:52.350 12508-12529/com.github.rubensousa.workkitkatbug D/TestWorkerA: I did some work
01-09 18:36:52.360 12508-12528/com.github.rubensousa.workkitkatbug D/TestWorkerB: I did some work
01-09 18:36:52.650 12508-12529/com.github.rubensousa.workkitkatbug D/TestWorkerC: I did some work
01-09 18:36:52.650 12508-12528/com.github.rubensousa.workkitkatbug D/TestWorkerD: I did some work
01-09 18:36:52.950 12508-12529/com.github.rubensousa.workkitkatbug D/TestWorkerE: I did some work
01-09 18:37:22.350 12508-12528/com.github.rubensousa.workkitkatbug D/TestWorkerA: I did some work
01-09 18:37:22.380 12508-12529/com.github.rubensousa.workkitkatbug D/TestWorkerB: I did some work
01-09 18:37:22.660 12508-12528/com.github.rubensousa.workkitkatbug D/TestWorkerC: I did some work
01-09 18:37:22.680 12508-12529/com.github.rubensousa.workkitkatbug D/TestWorkerD: I did some work
01-09 18:37:22.960 12508-12528/com.github.rubensousa.workkitkatbug D/TestWorkerE: I did some work
01-09 18:38:22.350 12508-12529/com.github.rubensousa.workkitkatbug D/TestWorkerA: I did some work
01-09 18:38:22.390 12508-12528/com.github.rubensousa.workkitkatbug D/TestWorkerB: I did some work
01-09 18:38:22.660 12508-12529/com.github.rubensousa.workkitkatbug D/TestWorkerC: I did some work
01-09 18:38:22.690 12508-12528/com.github.rubensousa.workkitkatbug D/TestWorkerD: I did some work
01-09 18:38:22.950 12508-12529/com.github.rubensousa.workkitkatbug D/TestWorkerE: I did some work
Logs when all workers are affected. Even though Result.success() was returned, due to this issue WorkManager keeps retrying them. You can see the exponential backoff starting at 30 sec and going to 1 min, 2 min, 4 min, etc.
to...@gmail.com <to...@gmail.com> #20
Yes, definitely not 15 minutes as in your GitHub example project.
to...@gmail.com <to...@gmail.com> #21
I think if this issue cannot be solved quickly, https://developer.android.com/topic/libraries/architecture/workmanager/ needs a big warning message for SDK < 23, at least for periodic jobs.
to...@gmail.com <to...@gmail.com> #24
Thanks!!! Is there a time estimate for the next beta?
ru...@gmail.com <ru...@gmail.com> #25
#23: Thank you so much for taking another look and solving the issue :)
su...@google.com <su...@google.com> #26
#24: Most likely next week.
Description
Version used: 1.0.0-beta1
Devices/Android versions reproduced on: API 19
This seems to be happening on API 19 only.
To reproduce, just add a Thread.sleep(1000) in any worker before returning Result.SUCCESS. The worker is stopped due to cancellation instead of being treated as successful, which causes the worker to be retried.
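A minimal worker matching these reproduction steps might look like this (the class name is arbitrary, and the Result.success() form is used here):

import android.content.Context
import androidx.work.Worker
import androidx.work.WorkerParameters

// Sleeping ~1s before returning success is enough, on API 19, for the run
// to be reported as cancelled and then retried with backoff.
class SleepyWorker(ctx: Context, params: WorkerParameters) : Worker(ctx, params) {
    override fun doWork(): Result {
        Thread.sleep(1000) // simulate blocking work, e.g. a network call
        return Result.success()
    }
}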
In our case, the behavior is the same. We're using RxJava and blocking observable operators to perform network requests. This issue causes API 19 devices to keep running the same worker using the backoff strategy instead of marking it as successful.
Sample project: