Status Update
Comments
jp...@samsung.corp-partner.google.com <jp...@samsung.corp-partner.google.com> #3
Is there an ETA for the next release?
jp...@samsung.corp-partner.google.com <jp...@samsung.corp-partner.google.com> #4
se...@google.com <se...@google.com> #5
Branch: androidx-master-dev
commit a1957df3709a06f4e6482fb0e4d39ded4f230a70
Author: Rahul Ravikumar <rahulrav@google.com>
Date: Mon Jul 29 09:48:05 2019
Workaround NPE in PersistableBundle.getExtras().
Test: Existing unit tests pass. Ran integration test app.
Fixes:
Change-Id: I0b48e0009a7d83c343a3d26112b94c057470c281
M work/workmanager/src/main/java/androidx/work/impl/background/systemjob/SystemJobService.java
jp...@gmail.com <jp...@gmail.com> #6
jp...@gmail.com <jp...@gmail.com> #7
I found possible scenrio for this issue.
From the below log,
TIME_SET and BOOT_COMPLETED intent were received in a row before ForceStopRunnable is done.
In this situation,
there is a possibility that work does not call PendingResult.finish() correctly for the first intent.
Because second intent overwrites PendingResult for first one which is a field in WorkManagerImpl.
Could you please check with this?
Thanks
@RestrictTo(RestrictTo.Scope.LIBRARY_GROUP)
public void setReschedulePendingResult(
@NonNull BroadcastReceiver.PendingResult rescheduleReceiverResult) {
synchronized (sLock) {
mRescheduleReceiverResult = rescheduleReceiverResult;
if (mForceStopRunnableCompleted) {
mRescheduleReceiverResult.finish();
mRescheduleReceiverResult = null;
}
}
}
07-14 06:35:33.732 10098 12038 12125 D WM-ForceStopRunnable: Performing cleanup operations.
07-14 06:35:33.771 10098 12038 12038 D WM-RescheduleReceiver: Received intent Intent { act=android.intent.action.TIME_SET flg=0x25200010
07-14 06:35:33.774 10098 12038 12038 D WM-RescheduleReceiver: Received intent Intent { act=android.intent.action.BOOT_COMPLETED flg=0x89000010 cmp=com.samsung.android.scs/androidx.work.impl.background.systemalarm.RescheduleReceiver (has extras) } cmp=com.samsung.android.scs/androidx.work.impl.background.systemalarm.RescheduleReceiver }
07-14 06:35:33.804 10098 12038 12125 D WM-SystemJobScheduler: Reconciling jobs
07-14 06:35:33.861 10098 12038 12125 D WM-ForceStopRunnable: Found unfinished work, scheduling it.
07-14 06:35:33.884 10098 12038 12125 D WM-SystemJobScheduler: Scheduling work ID 5be13e13-9808-4c91-ab1a-8742f6336fff Job ID 2
07-14 06:35:33.903 10098 12038 12125 D WM-GreedyScheduler: Ignoring WorkSpec {WorkSpec: 0fc1a2cc-38a2-4b7f-87ae-d418f3c19b50}, Requires ContentUri triggers.
07-14 06:35:33.903 10098 12038 12125 D WM-GreedyScheduler: Ignoring WorkSpec {WorkSpec: 0d4e5f1c-4a79-4999-be6c-ae006f02cbb8}, Requires ContentUri triggers.
jp...@gmail.com <jp...@gmail.com> #8
I think you can fix with changing mRescheduleReceiverResult's type to List<> not to overwrite existing.
Or just call finish() before overwrite if there is exisitng.
Thanks
se...@google.com <se...@google.com> #10
Ack, yeah there is a bug in this scenario. We'll try to address it in next release.
li...@samsung.corp-partner.google.com <li...@samsung.corp-partner.google.com> #11
Thanks for your information (#10 comment). Can we know the next release plan and schedule?
se...@google.com <se...@google.com> #12
it will be in 2.9.0-alpha01 release which likely be early next year.
ya...@gmail.com <ya...@gmail.com> #13
We are facing high number of ANR related to androidx.work.impl.background.systemalarm.RescheduleReceiver too, after we migrate from API 31 to API 33.
The characteristic of such ANR are
- Only happen in devices with Android 13.
- Only happen in background, not foreground.
Based on comment from
We also suspect Firebase libraries are the real cause based on some reports -
Even we have upgraded from com.google.firebase:firebase-bom:31.1.0 to com.google.firebase:firebase-bom:31.2.2 (latest to-date), we still observe a high number of ANR in Android 13, background.
We also upgraded to use 2.8.0 WorkManager.
May I know, will 2.9.0-alpha01 able to resolve such ANR? -
Thank you.
ya...@gmail.com <ya...@gmail.com> #14
Hi,
Wondering when will 2.9.0-alpha01 be available for us to try out? Thank you very much.
li...@gmail.com <li...@gmail.com> #15
I think the fix being mentioned above by you to be integrated/included into "2.9.0-alpha01 release" should be the following patch, right?
>>>
Fix ANR in RescheduleReceiver
bug: 242248365
Test: manual (shame)
Change-Id: Ia68ef2e9caf7c7daea94c401724396189071b9a1
>>>
However, it's just reverted last week, and would you pls share some idea about this change? Any new solution, or workaround... to fix this issue in the near future?
As you can noticed, more and more people are sufferring from this bug recently. Thanks.
>>>
Revert "Fix ANR in RescheduleReceiver"
This reverts commit 6eb1c8154551b81094bc6892441cf53edce43294.
Reason for revert: there is a spike in ANRs on T that was likely to be triggered by this change.
Change-Id: I39dbf3e5201f9210f2579bc7c166e31de44d03eb
bug: 237962896
>>>
ya...@gmail.com <ya...@gmail.com> #16
Hi Googlers,
Thank you for taking effort to address this issue. Allow me to provide more info, so that we can track down to the root cause.
Here's my app configuration, with healthy ANR number
API 31
def work_version = "2.7.1"
implementation "androidx.work:work-runtime:$work_version"
implementation "androidx.work:work-gcm:$work_version"
After we upgrade to API 33, we saw a huge spike in ANR (In Android 33 T, all happens in background)
API 33
def work_version = "2.7.1"
implementation "androidx.work:work-runtime:$work_version"
implementation "androidx.work:work-gcm:$work_version"
Here's the ANR spike we are seeing.
Native method - android.os.MessageQueue.nativePollOnce Broadcast of Intent { act=android.intent.action.TIME_SET cmp=com.xxx.yyy/androidx.work.impl.background.systemalarm.RescheduleReceiver }
android.os.MessageQueue.nativePollOnce Broadcast of Intent { act=android.intent.action.BOOT_COMPLETED cmp=com.xxx.yyy/androidx.work.impl.background.systemalarm.RescheduleReceiver }
Occurrence of TIME_SET ANR is having 3 to 4 times more than BOOT_COMPLETED ANR. A device will either hit TIME_SET ANR, or BOOT_COMPLETED ANR, but not both.
We make sure we upgrade every libraries to latest, including WorkManager.
API 33
def work_version = "2.8.0"
implementation "androidx.work:work-runtime:$work_version"
implementation "androidx.work:work-gcm:$work_version"
However, we are still facing high ANR in Android T.
Interestingly, I after placing my app in background, and trying to change the emulator time manually. Hopefully able to trigger RescheduleReceiver with TIME_SET. But, I didn't observe any logging from RescheduleReceiver.
I do have some questions regarding RescheduleReceiver call on goAsync
-
I expect when RescheduleReceiver call goAsync on main thread, it will start to schedule to run some "busy" task on user thread. But, I do not see such logic in RescheduleReceiver. Am I missing something?
-
I was pondering, if 1st goAsync occur, and then 2nd goAsync occur again before the mRescheduleReceiverResult.finish() associated with 1st goAsysnc is executed, can 2nd goAsync causes ANR?
Thank you for taking effort for tackling this issue. I attach some screenshots for your reference, hoping able to help in tracking down the root cause.
This week, I will roll out a remote config change, to disable WorkManager enqueuing in all Android 13 device (For few days). My purpose is to monitor how WM affects ANR rate. I will update more on my findings.
Please let me know, if you need more info from my side.
ya...@gmail.com <ya...@gmail.com> #17
A missing attachment for above thread.
se...@google.com <se...@google.com> #18
But, I do not see such logic in RescheduleReceiver. Am I missing something? I was pondering, if 1st goAsync occur, and then 2nd goAsync occur again before the mRescheduleReceiverResult.finish() associated with 1st goAsync is executed, can 2nd goAsync causes ANR?
We think that it is highly likely that second goAsync
triggers ANR. We actually even tried to address it in aosp/2401352. However apps that use nightly build of androidx (you can use it too,
ya...@gmail.com <ya...@gmail.com> #19
Hi Googlers,
It isn't clear on why the initial proposed fix (which is now reverted) can cause spike in ANR.
It isn't totally clear too, on why current 2.8.0 WM is causing ANR. Hypothesis provided by Samsung engineer might hold true.
I am not able to test BOOT_COMPLETED as it is not easy to test -
I however try to do a "fast" manual time adjustment, to fire 2 TIME_SET.
Using the following code
public class MyBroadcastReceiver extends BroadcastReceiver {
private static final Executor dateTimeReminderExecutor = Executors.newSingleThreadExecutor();
@Override
public void onReceive(Context context, Intent intent) {
Log.i("CHEOK", intent.getAction() + " : onReceive -> " + this);
final PendingResult result = goAsync();
dateTimeReminderExecutor.execute(() -> {
Log.i("CHEOK", "Begin sleep -> " + this);
try {
Thread.sleep(20000);
} catch (InterruptedException e) {
e.printStackTrace();
}
Log.i("CHEOK", "Finish sleep -> " + this);
try {
} finally {
if (result != null) {
result.finish();
Log.i("CHEOK", "result.finish() la -> " + this);
}
}
});
}
}
I obtain the following outcome, in both API 33 and API 32.
android.intent.action.TIME_SET : onReceive -> com.yocto.myapplication.MyBroadcastReceiver@9140719
Begin sleep -> com.yocto.myapplication.MyBroadcastReceiver@9140719
Finish sleep -> com.yocto.myapplication.MyBroadcastReceiver@9140719
result.finish() la -> com.yocto.myapplication.MyBroadcastReceiver@9140719
android.intent.action.TIME_SET : onReceive -> com.yocto.myapplication.MyBroadcastReceiver@92a09bf
Begin sleep -> com.yocto.myapplication.MyBroadcastReceiver@92a09bf
Finish sleep -> com.yocto.myapplication.MyBroadcastReceiver@92a09bf
result.finish() la -> com.yocto.myapplication.MyBroadcastReceiver@92a09bf
Based on the observation
- 1st TIME_SET will trigger BroadcastReceiver's onReceive.
- 2nd TIME_SET will not trigger BroadcastReceiver's onReceive, after 1st TIME_SET has called goAsync and before calling result.finish. 2nd TIME_SET will only trigger BroadcastReceiver's onReceive, after 1st TIME_SET call result.finish.
- Different instances of BroadcastReceiver are used to serve 1st and 2nd TIME_SET
If that is the case, 2nd action are still "waiting" to enter RescheduleReceiver's onReceive.
2nd action might not have chance to call WorkManagerImpl's setReschedulePendingResult to "accidentally" override member variable mRescheduleReceiverResult, because the 1st action has not called mRescheduleReceiverResult.finish()
This is my very rough initial observation, and they might not hold true.
I am not able to reproduce any ANR even I have "Enable background ANR dialogs". Can you?
Thank you for your efforts.
li...@gmail.com <li...@gmail.com> #20
Owner
Sergey Vasilinetc
Revert "Revert "Fix ANR in RescheduleReceiver""
This reverts commit 097b615030603a3c77c13e5384fca58172456f31.
Reason for revert: wrong reason to revert, spike in ANRs was wrongly attributed.
bug: 237962896
Change-Id: I622f6ae485abd87498a5c44c4a77e69b32c739e6
ya...@gmail.com <ya...@gmail.com> #21
May I know when will be released for testing? Thank you.
se...@google.com <se...@google.com> #22
It will be released as 2.8.1 in upcoming days.
ya...@gmail.com <ya...@gmail.com> #23
Thank you. I am looking forward the release.
ya...@gmail.com <ya...@gmail.com> #24
Hi, after releasing 2.8.1 to production for more than 1 week, I can confirm that this ANR is being reduced significantly.
Thank you for being responsive in this issue.
am...@gmail.com <am...@gmail.com> #25
I am facing these anr Kindly help me.
az...@gmail.com <az...@gmail.com> #26
The fix doesn't seem to work. I can see see ANRs happening, mostly on Samsung devices.
Edit: screenshot from Play Console "Crashes and ANRs" tab.
Description
Version used: 2.7
Devices/Android versions reproduced on: Gallery S22 S OS
Hi. I am a sw engieer and my project is using worker.
I got ANR issue below which comes from RescheduleReceiver.
06-07 10:23:49.655 1000 2195 14874 E ActivityManager: ANR in XXX
06-07 10:23:49.655 1000 2195 14874 E ActivityManager: PID: 12162
06-07 10:23:49.655 1000 2195 14874 E ActivityManager: Reason: Broadcast of Intent { act=android.intent.action.TIME_SET flg=0x25200010 cmp=com.samsung.android.scs/androidx.work.impl.background.systemalarm.RescheduleReceiver }
From my debugging,
worker register static RescheduleReceiver in AndroidManifest.
and this receiver use BroadcastReceiver.goAsync() method.
@Override
public void onReceive(Context context, Intent intent) {
Logger.get().debug(TAG, String.format("Received intent %s", intent));
if (Build.VERSION.SDK_INT >= WorkManagerImpl.MIN_JOB_SCHEDULER_API_LEVEL) {
try {
WorkManagerImpl workManager = WorkManagerImpl.getInstance(context);
final PendingResult pendingResult = goAsync();
workManager.setReschedulePendingResult(pendingResult);
As described developer site,
when you use goAync() method, you must call PendingResult.finish() after work.
But in workManager.setReschedulePendingResult(pendingResult);
there is a posibility not to call PendingResult.finish() when ForceStopRunnable is not compeleted.
public void setReschedulePendingResult(
@NonNull BroadcastReceiver.PendingResult rescheduleReceiverResult) {
synchronized (sLock) {
mRescheduleReceiverResult = rescheduleReceiverResult;
if (mForceStopRunnableCompleted) {
mRescheduleReceiverResult.finish();
mRescheduleReceiverResult = null;
}
}
}
This ANR is reproducible only when the process is launched by RescheduleReceiver.
You may can test with these intent.
<action android:name="android.intent.action.BOOT_COMPLETED" />
<action android:name="android.intent.action.TIME_SET" />
<action android:name="android.intent.action.TIMEZONE_CHANGED" />
This is not a hundred percent issue but it's reproducible sometimes as timing issue.
Also Similar issues were already reported before
It means there is a problem.
ForceStopRunnable is triggered on background. It cannot guarantee that it's completed when BR handles this intent.
Please check this call stack.
WorkManagerImpl's constructor schedule ForceStopRunnable with executor and return directly.
WorkManagerImp.java
Constructor --> internalInit() --> mWorkTaskExecutor.executeOnBackgroundThread(new ForceStopRunnable(context, this));
And in here RescheduleReceiver .onReceive()
There is no logic to hold until ForceStopRunnable is completed.
This issue is not related to the main looper, it's mWorkTaskExecutor.
If an executor is busy, there is a possibility that ForceStopRunnable is delayed.
When it is delayed, mForceStopRunnableCompleted field can be false and returns onRecieve() with not finishing PendingResult.
Could you please check with this? Thanks
RescheduleReceiver.java
@Override
public void onReceive(Context context, Intent intent) {
...
WorkManagerImpl workManager = WorkManagerImpl.getInstance(context);
final PendingResult pendingResult = goAsync();
workManager.setReschedulePendingResult(pendingResult);
...
}
WorkManagerImp.java
public void setReschedulePendingResult(
@NonNull BroadcastReceiver.PendingResult rescheduleReceiverResult) {
synchronized (sLock) {
mRescheduleReceiverResult = rescheduleReceiverResult;
if (mForceStopRunnableCompleted) {
mRescheduleReceiverResult.finish();
mRescheduleReceiverResult = null;
}
}
}