Status Update
Comments
mi...@perplexity.ai <mi...@perplexity.ai> #2
I am able to also reproduce issue #1 now; I've committed a change to the sample project that sets enableMultiInstanceInvalidation() on the DB and adjusts the timing a bit more. After running it for 15 minutes or so I am seeing multiple skipped updates.
My production app uses enableMultiInstanceInvalidation() as well, so this may be a significant factor.
ti...@google.com <ti...@google.com> #3
Any thoughts on this one? I mean, it makes JournalMode.TRUNCATE largely useless, unless one has a penchant for random app misbehaviors. I'd like to keep using TRUNCATE, as it avoids certain headaches from WRITE_AHEAD_LOGGING.
mi...@perplexity.ai <mi...@perplexity.ai> #4
To be clear, "batching" does not take care of the missed invalidation notifications. Updates for a table may be spaced hours apart, and in my production app about 5% of notifications are lost, meaning that the user does not see the update until the next one comes in, which may be hours later. It's a chatroom feature, where reliable, timely updates are important.
Also, not setting enableMultiInstanceInvalidation() does not fix the issue; it only changes the timing. Notifications go missing regardless of whether one process or multiple processes are involved.
ti...@google.com <ti...@google.com> #5
Hi - Sorry, we haven't had the chance to investigate this. I know this might be a lot to ask, and thanks for giving us a sample app, but have you tried adding a transaction in the invalidation tracker? You can check out Room's source code here:
Also what headaches are you trying to avoid from WAL mode?
mi...@perplexity.ai <mi...@perplexity.ai> #6
Thanks for the pointer. I've built Room from source now as you suggested and do see that the problem goes away once I use the transactionality block in InvalidationTracker.mRefreshRunnable for all JournalModes. Don't know why that wasn't done in the first place; it is certainly a misconception to assume that TRUNCATE does not need it.
Regarding headaches from WAL mode:
- Android's SQLiteDatabase runs TRUNCATE transactions effectively serializable due to grabbing exclusive locks right at the start of a transaction, so I can worry less about transactions failing, at the cost of reduced concurrency. My app isn't that heavy on DB ops, so favoring reliability over concurrency makes lots of sense. I'd actually assume that this is true for most Android apps, and that TRUNCATE would make a better default, not fancy WAL.
- WAL has certain other disadvantages, as pointed out here: https://sqlite.org/wal.html
  In particular, the risk of extended failure from SQLITE_BUSY is high for my app, as it has DB connections from two processes (https://sqlite.org/wal.html#busy); of particular concern is the recovery case after a crash, where one process may hold an exclusive lock for an extended period of time at startup, essentially causing the other process to error out without much recourse.
Hope this can be fixed soon in an official build; seems a trivial change.
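For readers following along, the kind of race discussed in this comment can be illustrated with a deliberately simplified, single-threaded model. It assumes the refresh reads the "invalidated" flags and clears them as two separate steps; the class and method names are hypothetical and are not Room's API, and the "transaction" is modelled simply by making the competing write wait until both steps are done.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of a "read flags, then clear flags" refresh. Hypothetical names, not Room's code. */
class InvalidationRaceSketch {
    // Tables flagged as modified since the last refresh.
    private final Set<String> invalidated = new HashSet<>();
    // Notifications that were actually handed to observers.
    private final List<String> delivered = new ArrayList<>();

    /** A writer flags the table it changed. */
    void write(String table) {
        invalidated.add(table);
    }

    /**
     * One refresh pass. concurrentWrite stands in for another connection that
     * commits while the refresh is running.
     */
    void refresh(boolean inTransaction, Runnable concurrentWrite) {
        Set<String> seen = new HashSet<>(invalidated); // step 1: read the flags
        if (!inTransaction) {
            concurrentWrite.run();                     // write lands between the two steps
        }
        invalidated.clear();                           // step 2: clear all flags
        if (inTransaction) {
            concurrentWrite.run();                     // write had to wait for the transaction
        }
        delivered.addAll(seen);
    }

    /** Runs the same interleaving with or without the modelled transaction. */
    static List<String> simulate(boolean inTransaction) {
        InvalidationRaceSketch tracker = new InvalidationRaceSketch();
        tracker.write("users");
        tracker.refresh(inTransaction, () -> tracker.write("messages"));
        tracker.refresh(inTransaction, () -> { });
        return tracker.delivered;
    }

    public static void main(String[] args) {
        System.out.println("no transaction:   " + simulate(false)); // [users]
        System.out.println("with transaction: " + simulate(true));  // [users, messages]
    }
}
```

In this model the "messages" flag is set after the read but wiped by the clear, so its notification never arrives unless the two steps are made atomic. Whether this matches the exact interleaving in the real tracker is an assumption.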
ti...@google.com <ti...@google.com> #7
mi...@perplexity.ai <mi...@perplexity.ai> #8
Yes please! It would be ideal if you can write a test for this though, in the test app specifically:
Similar to your sample app, open a DB in TRUNCATE mode, have one thread that inserts a lot and another one reading notifications, and at the end make sure the number of notifications received matches the number of items inserted.
ti...@google.com <ti...@google.com>
mi...@perplexity.ai <mi...@perplexity.ai> #9
mi...@perplexity.ai <mi...@perplexity.ai> #10
Just to clarify on #8: we cannot count the number of invalidation events, as the database might combine them. Instead, we need to make sure the latest value is always eventually dispatched.
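A rough sketch of what such an "eventually dispatched" check could look like, written in plain Java without Room or the Android test runner. Everything here is a stand-in: the atomic variables play the role of the table and the coalesced invalidation flag, and the class and method names are hypothetical. The point is only the shape of the assertion: wait, with a generous timeout, until the observer has seen the last written value, without counting notifications.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Stand-in for a stress test: checks the latest value is eventually observed. */
class EventualDispatchSketch {

    static boolean latestValueEventuallyObserved(int writes, long timeoutMs) {
        AtomicInteger stored = new AtomicInteger();    // plays the role of the table content
        AtomicBoolean dirty = new AtomicBoolean();     // coalesced "invalidated" flag
        AtomicInteger observed = new AtomicInteger();  // last value the observer has read
        AtomicBoolean running = new AtomicBoolean(true);

        // Observer: several writes may collapse into one notification, which is fine.
        Thread observer = new Thread(() -> {
            while (running.get()) {
                if (dirty.getAndSet(false)) {
                    observed.set(stored.get());
                } else {
                    Thread.yield();
                }
            }
        });
        // Writer: store the value first, then raise the flag.
        Thread writer = new Thread(() -> {
            for (int i = 1; i <= writes; i++) {
                stored.set(i);
                dirty.set(true);
            }
        });

        try {
            observer.start();
            writer.start();
            writer.join();
            // Poll until the last written value shows up or the timeout expires.
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
            while (observed.get() != writes && System.nanoTime() - deadline < 0) {
                Thread.sleep(1);
            }
            running.set(false);
            observer.join();
            return observed.get() == writes;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            running.set(false); // make sure the observer stops on every path
        }
    }

    public static void main(String[] args) {
        System.out.println(latestValueEventuallyObserved(10_000, 10_000));
    }
}
```

A real test would replace the atomics with inserts into a TRUNCATE-mode database and an invalidation observer; the timeout only bounds how long a failure takes to report.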
ti...@google.com <ti...@google.com> #11
As it's an intermittent problem (triggered by specific transaction timing), it's difficult to surface in the first place, so this type of instrumented test cannot reliably find the problem; it was hard enough trying to surface it with manually run code in a controlled environment. Also, the longer the time interval we choose, the harder it gets to reproduce within a finite amount of time.
I think the main concern with fixing the issue is not actually the PR's ability to address the issue, but the potential to introduce regressions. I have no idea why there was no transactionality for the TRUNCATE path; was it really just because the thinking was that it was superfluous? This can only be addressed by existing test coverage and maybe insight from whoever wrote the original code and other knowledgeable team members. I traced its history and found no helpful pointers; I remember finding that it has always looked like this since it was introduced in the early days of Room.
Do you guys really want the sketchy black box test, or maybe just stick with some extra code documentation, existing regression tests, plus the repro case? It almost seems not worth the trouble.
mi...@perplexity.ai <mi...@perplexity.ai> #12
Seems like it was added here:
For testing, I agree it is not great, as it only shows up as a flake, but we have a couple of stress tests similar to that and we usually use 10 seconds (to account for cloud devices). Under normal circumstances (the device is not slowed down and the DB is not locked), it shouldn't take more than ~100 ms to get the update after the write (usually much faster). But with virtual device testing we cannot rely on timing, hence we pick large enough numbers (if a virtual device idles for 10 seconds, then that is an infra problem).
It is at least better than nothing, and if it flakes, that'll keep bugging us until we find a better solution.
Btw, if you can detect the exact ordering of events that will cause the problem, then we can add package-private restricted APIs to the InvalidationTracker to have fine-tuned control over them in tests.
ti...@google.com <ti...@google.com> #13
mi...@perplexity.ai <mi...@perplexity.ai> #14
Oh-oh, I'm sorry to hear that; we definitely don't require you to enter any SSH key password. If anything, only the built-in Git integration in Android Studio will ask you for GitHub credentials if you use it to push changes to GitHub. Maybe you have some additional plugin that is asking for it? The GitHub-based setup downloads the correct Android Studio version needed for the project, but some plugins are installed separately and can persist across IDE installations.
ap...@google.com <ap...@google.com> #15
#13 I'm curious what went wrong with the GitHub setup. I don't want to hijack this bug but if you can either file an issue at
pr...@google.com <pr...@google.com> #16
I don't think I will add the test; ultimately I am unfamiliar with AOSP tooling and in particular the test setup for this, so it's an open-ended time commitment, and I already spent too much time on this tiny code problem.
I can still make a PR for the code change that fixes the bug (based on my testing via repro & actual app), but it's so trivial that you might just as well change it yourselves. As I mentioned in #10, the primary concern with the code change would be regressions, which are not captured by the envisioned test anyway, so I'd consider it defensible to go without a test. The alternative, no fix, leaves TRUNCATE in a poor state.
Description
Jetpack Compose component used: Modifier.sharedElement
Hey! We added a shared element transition to the app, using Compose 1.7.3. We are seeing this crash in Firebase, but I could not reproduce the issue locally.
Our layout hierarchy looks like this:
NavHost -> ... -> AnimatedVisibility -> ... -> BoxWithConstraints -> First layout with sharedElement
\> ... -> AnimatedVisibility -> ... -> BoxWithConstraints -> Second layout with sharedElement
Each "..." stands for a series of "normal" layouts.
The crash is due to this precondition failing:
It seems somehow one of the elements gets drawn without having had layout before.
Any idea if I am doing something wrong, or any workarounds? Thank you!!
Stack trace:
SharedElementInternalState.drawInOverlay
java.lang.IllegalArgumentException - Error: current bounds not set yet.
androidx.compose.animation.SharedElementInternalState.drawInOverlay (SharedElementInternalState.java:196)
androidx.compose.animation.SharedTransitionScopeImpl.drawInOverlay$animation_release (SharedTransitionScopeImpl.java:1086)
androidx.compose.animation.SharedTransitionScopeKt$SharedTransitionScope$1$2$1.invoke (SharedTransitionScope.kt:161)
androidx.compose.animation.SharedTransitionScopeKt$SharedTransitionScope$1$2$1.invoke (SharedTransitionScope.kt:159)
androidx.compose.ui.draw.DrawWithContentModifier.draw (DrawModifier.kt:422)
androidx.compose.ui.node.LayoutNodeDrawScope.drawDirect-eZhPAX0$ui_release (LayoutNodeDrawScope.kt:110)
androidx.compose.ui.node.LayoutNodeDrawScope.draw-eZhPAX0$ui_release (LayoutNodeDrawScope.kt:89)
androidx.compose.ui.node.NodeCoordinator.drawContainedDrawModifiers (NodeCoordinator.kt:450)
androidx.compose.ui.node.NodeCoordinator.draw (NodeCoordinator.kt:439)
androidx.compose.ui.node.LayoutModifierNodeCoordinator.performDraw (LayoutModifierNodeCoordinator.kt:280)
androidx.compose.ui.node.NodeCoordinator.drawContainedDrawModifiers (NodeCoordinator.kt:447)
androidx.compose.ui.node.NodeCoordinator.draw (NodeCoordinator.kt:439)
androidx.compose.ui.node.LayoutNode.draw$ui_release (LayoutNode.kt:1000)
androidx.compose.ui.node.InnerNodeCoordinator.performDraw (InnerNodeCoordinator.kt:196)
androidx.compose.ui.node.LayoutNodeDrawScope.drawContent (LayoutNodeDrawScope.kt:68)
androidx.compose.foundation.BackgroundNode.draw (Background.kt:163)
androidx.compose.ui.node.LayoutNodeDrawScope.drawDirect-eZhPAX0$ui_release (LayoutNodeDrawScope.kt:110)
androidx.compose.ui.node.LayoutNodeDrawScope.draw-eZhPAX0$ui_release (LayoutNodeDrawScope.kt:89)
androidx.compose.ui.node.NodeCoordinator.drawContainedDrawModifiers (NodeCoordinator.kt:450)
androidx.compose.ui.node.NodeCoordinator.access$drawContainedDrawModifiers (NodeCoordinator.kt:58)
androidx.compose.ui.node.NodeCoordinator$drawBlock$1$1.invoke (NodeCoordinator.java:469)
androidx.compose.ui.node.NodeCoordinator$drawBlock$1$1.invoke (NodeCoordinator.java:468)
androidx.compose.runtime.snapshots.Snapshot$Companion.observe (Snapshot.java:2441)
androidx.compose.runtime.snapshots.SnapshotStateObserver$ObservedScopeMap.observe (SnapshotStateObserver.kt:502)
androidx.compose.runtime.snapshots.SnapshotStateObserver.observeReads (SnapshotStateObserver.kt:258)
androidx.compose.ui.node.OwnerSnapshotObserver.observeReads$ui_release (OwnerSnapshotObserver.kt:133)
androidx.compose.ui.node.NodeCoordinator$drawBlock$1.invoke (NodeCoordinator.java:468)
androidx.compose.ui.node.NodeCoordinator$drawBlock$1.invoke (NodeCoordinator.java:466)
androidx.compose.ui.platform.GraphicsLayerOwnerLayer$recordLambda$1.invoke (GraphicsLayerOwnerLayer.java:291)
androidx.compose.ui.platform.GraphicsLayerOwnerLayer$recordLambda$1.invoke (GraphicsLayerOwnerLayer.java:289)
androidx.compose.ui.graphics.layer.GraphicsLayerV29.record (GraphicsLayerV29.android.kt:245)
androidx.compose.ui.graphics.layer.GraphicsLayer.recordInternal (AndroidGraphicsLayer.android.kt:430)
androidx.compose.ui.graphics.layer.GraphicsLayer.record-mL-hObY (AndroidGraphicsLayer.android.kt:423)
androidx.compose.ui.platform.GraphicsLayerOwnerLayer.updateDisplayList (GraphicsLayerOwnerLayer.android.kt:284)
androidx.compose.ui.platform.GraphicsLayerOwnerLayer.drawLayer (GraphicsLayerOwnerLayer.android.kt:229)
androidx.compose.ui.node.NodeCoordinator.draw (NodeCoordinator.kt:434)
androidx.compose.ui.node.LayoutNode.draw$ui_release (LayoutNode.kt:1000)
androidx.compose.ui.node.InnerNodeCoordinator.performDraw (InnerNodeCoordinator.kt:196)
androidx.compose.ui.node.NodeCoordinator.drawContainedDrawModifiers (NodeCoordinator.kt:447)
androidx.compose.ui.node.NodeCoordinator.draw (NodeCoordinator.kt:439)
androidx.compose.ui.node.LayoutNode.draw$ui_release (LayoutNode.kt:1000)
androidx.compose.ui.platform.AndroidComposeView.dispatchDraw (AndroidComposeView.android.kt:1564)
android.view.View.draw (View.java:25180)
android.view.View.updateDisplayListIfDirty (View.java:24036)
android.view.ViewGroup.recreateChildDisplayList (ViewGroup.java:4764)
...