Status Update
Comments
uc...@google.com <uc...@google.com>
[Deleted User] <[Deleted User]> #2
Our builds do contain native libs from 3rd party libs. Sorry about that. They do not contain native code from us though.
r....@gmail.com <r....@gmail.com> #3
We also tested the same benchmarks with AGP 4.2.0-beta01 and see the same issue.
ga...@google.com <ga...@google.com>
[Deleted User] <[Deleted User]> #4
It looks like --rerun-tasks cannot be used to reproduce the issue: with it, the problem goes away.
To reproduce, I've used ./gradlew clean <task> --no-build-cache. But the issue also showed up with normal cached executions.
ga...@google.com <ga...@google.com> #5
A quick question about the diff shown in the image Thu Dec 10 2020 15:49:18 GMT-0800 (Pacific Standard Time).png: what changed in the build? Was it built without any changes?
Also, just to clarify, do you expect mergeDebugNativeLibs to be FROM-CACHE rather than up-to-date? Because you are running the clean task, its outputs will be removed and it cannot be up-to-date.
r....@gmail.com <r....@gmail.com> #6
If you manually invoke the tasks and you clean, then we'd expect the result to be FROM-CACHE. However, in the benchmarks we don't clean anything, and there I'd expect the task to be UP-TO-DATE, and in some instances it is. The benchmark scenario looks like this:
incremental_build_abi_change_in_account_module_spos {
    tasks = [":..:assembleDebug"]
    apply-abi-change-to = "common/account-service/public/src/main/java/com../CreateBody.java"
    apply-abi-change-to = "common/account-service/public/src/main/java/com..CreateResponse.java"
    apply-abi-change-to = "common/account-service/public/src/main/java/com/..AccountStatusStandardResponse.kt"
    show-build-cache-size = true
    warm-ups = 4
    gradle-args = ["--offline", "--no-build-cache"]
}
The screenshot is from the benchmark scenario.
sp...@google.com <sp...@google.com> #7
> Note 2: when running with --rerun-tasks we can see that the task always runs fast.
Roughly how long does the task take with --rerun-tasks?
> We can see that the task mergeDebugNativeLibs has a random up-to-date status and when it runs, it takes around 160 seconds.
Roughly how long does the task take when it's UP-TO-DATE?
My suspicion is that the random UP-TO-DATE status is because the externalNativeLibs task input is annotated with @Classpath, which is sensitive to the order of files, and perhaps the order isn't deterministic. In Thu Dec 10 2020 15:49:18 GMT-0800 (Pacific Standard Time).png, it says some files were removed and added, but I suspect that all of those files were both removed and added (looks like Gradle only lists the first 3 reasons), because that is how Gradle reports changes in the order of files (see
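To illustrate the suspicion above, here is a minimal toy model of order-sensitive versus order-insensitive input fingerprinting. It is not Gradle's actual fingerprinting code; the class name, method names, and AAR file names are hypothetical and chosen only for this sketch. The same set of entries in a different order produces a different ordered fingerprint, which is how a shuffled classpath could make a task look out of date even though no file changed.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy model only: NOT Gradle's implementation. All names are hypothetical.
public class OrderSensitivity {

    // Order-sensitive fingerprint: entries are hashed in the order given.
    public static String orderedFingerprint(List<String> entries) {
        return digest(entries);
    }

    // Order-insensitive fingerprint: entries are sorted before hashing.
    public static String unorderedFingerprint(List<String> entries) {
        List<String> sorted = new ArrayList<>(entries);
        Collections.sort(sorted);
        return digest(sorted);
    }

    private static String digest(List<String> entries) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            for (String entry : entries) {
                md.update(entry.getBytes(StandardCharsets.UTF_8));
                md.update((byte) 0); // separator between entries
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Same files, different order (hypothetical file names).
        List<String> first = List.of("libA.aar", "libB.aar", "libC.aar");
        List<String> second = List.of("libC.aar", "libA.aar", "libB.aar");

        // Ordered fingerprints differ, so the input would look "changed".
        System.out.println(orderedFingerprint(first).equals(orderedFingerprint(second)));
        // Unordered fingerprints match, so the input would look unchanged.
        System.out.println(unorderedFingerprint(first).equals(unorderedFingerprint(second)));
    }
}
```

In this model the first comparison prints false and the second prints true.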
r....@gmail.com <r....@gmail.com> #8
> Roughly how long does the task take with --rerun-tasks?
Usually only a few seconds, around 3-5 I think.
> Roughly how long does the task take when it's UP-TO-DATE?
It's very fast. The last build scan reported 0.032s.
ga...@google.com <ga...@google.com> #9
I think we have 2 separate issues here:
- if classpath input for mergeDebugNativeLibs changes order, this task may take too long. We should look into that. I've asked Stéphane to share some CPU snapshots if possible.
- classpath order changes between builds. This impacts mergeDebugNativeLibs, but Stéphane shared that this also impacts other tasks, e.g. kaptGenerateStubsDebugKotlin (classpath file input property), desugarDebugFileDependencies (classpath property), compileDebugJavaWithJavac (classpath input property), dexBuilderDebug (dexParam.desugarClasspath input property).
sp...@google.com <sp...@google.com> #10
> I've asked Stéphane to share some CPU snapshots if possible.
Yes, some CPU snapshots would be very helpful.
[Deleted User] <[Deleted User]> #11
Here is a snapshot obtained via Gradle Profiler for the same scenario as the screenshots (incremental_build_abi_change_in_account_module_spos). I will provide the whole folder to @gavra via Slack.
sp...@google.com <sp...@google.com> #12
Thanks! Looks like the issue is FileUtils.isFileInDirectory. I have a WIP change from a couple of months ago (ChangeId Ia88494c0b353545bf1ea3035ef8f165c1f5923e4) to improve the performance of this method, but I never merged it :/
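The thread does not show the implementation of FileUtils.isFileInDirectory or the contents of the WIP change, so the following is only a hypothetical sketch of one cheap way to test whether a file lies under a directory: compare normalized absolute paths component by component, without resolving symlinks or checking that the files exist. The class and method names are invented for this example and are not AGP's API.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; NOT AGP's FileUtils.isFileInDirectory.
public class ContainmentCheck {

    // Returns true if 'file' lies strictly inside 'directory'.
    // Paths are made absolute and normalized, then compared by name components.
    public static boolean isInDirectory(Path file, Path directory) {
        Path normalizedFile = file.toAbsolutePath().normalize();
        Path normalizedDir = directory.toAbsolutePath().normalize();
        // Path.startsWith compares whole name components, so "/libs-extra"
        // is not treated as being inside "/libs".
        return !normalizedFile.equals(normalizedDir)
                && normalizedFile.startsWith(normalizedDir);
    }

    public static void main(String[] args) {
        Path dir = Paths.get("/tmp/libs"); // hypothetical directory
        System.out.println(isInDirectory(Paths.get("/tmp/libs/arm64/libfoo.so"), dir));
        System.out.println(isInDirectory(Paths.get("/tmp/libs-extra/libfoo.so"), dir));
    }
}
```

A purely lexical check like this trades symlink awareness for speed, so whether it is appropriate depends on how the real method is used.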
[Deleted User] <[Deleted User]> #13
Is there any chance this could be ported to a 4.1 fix release or 4.2? I would be happy to test a branch or build of AGP for this (through Ivan?)
je...@google.com <je...@google.com> #14
sp...@google.com <sp...@google.com> #15
I was able to repro this issue by creating an app project with dependencies on 500 different AARs with .so files.
When I shuffled the order of the AAR dependencies in the build.gradle file to simulate the nondeterministic ordering of the externalLibNativeLibs task input, the mergeDebugNativeLibs task took about 50 seconds.
One possible workaround for this issue would be to use @InputFiles instead of @Classpath in MergeNativeLibsTask.
Re #9, is there a bug filed for the 2nd issue, Ivan?
[Deleted User] <[Deleted User]> #16
Note that a bug report has been opened on the Gradle repo as well, as it seems that the cache key computation issue is more general than the mergeDebugNativeLibs task:
To summarize: mergeDebugNativeLibs seems to have a performance issue in incremental mode, and this is the focus of this report, while the Gradle ticket focuses on the reproducibility of the cache key computation for a few classes whose input order gets shuffled.
ga...@google.com <ga...@google.com> #17
Scott, can we try to optimize the incremental scenario with many changes, and if the cost is still too high, fall back to a clean run when the number of changed files is greater than some threshold? WRT the other issue (classpath ordering), I've just filed
sp...@google.com <sp...@google.com> #18
Discussed offline. Current plan is the following:
- In 4.2, run the task non-incrementally if the number of changed inputs is above a certain threshold
- In 7.0, fix the performance issue for incremental runs
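The 4.2 part of the plan can be sketched roughly as follows. This is a hypothetical illustration of the decision only: the thread does not state the threshold value or show the actual AGP change, so the class name, enum, and the threshold of 100 are assumptions made for this example.

```java
// Hypothetical sketch of the fallback decision; NOT the actual AGP change.
public class MergeStrategy {

    public enum Mode { INCREMENTAL, FULL }

    // Assumed value for illustration; the real threshold is not given in the thread.
    public static final int CHANGED_INPUT_THRESHOLD = 100;

    // Run non-incrementally when the number of changed inputs is above the threshold.
    public static Mode choose(int changedInputs, int threshold) {
        return changedInputs > threshold ? Mode.FULL : Mode.INCREMENTAL;
    }

    public static void main(String[] args) {
        // A handful of changed inputs: keep the incremental path.
        System.out.println(choose(3, CHANGED_INPUT_THRESHOLD));
        // Many changed inputs (e.g. a reordered classpath): do a full run instead.
        System.out.println(choose(500, CHANGED_INPUT_THRESHOLD));
    }
}
```

The idea is that when nearly every input is reported as changed, a full run costs only a few seconds (as reported for --rerun-tasks above), while the incremental path is the slow one.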
sp...@google.com <sp...@google.com> #19
4.2 beta 4 has the fix (workaround) for this. It's slated for release on Monday, Jan 11.
ga...@gmail.com <ga...@gmail.com> #20
[Deleted User] <[Deleted User]> #21
Thank you very much for the update @sp, we're eager to try it out.
sp...@google.com <sp...@google.com>
[Deleted User] <[Deleted User]> #22
Confirmed, it seems to be fixed in AGP 4.2 beta 04. Thank you very much for all your work!
ha...@gmail.com <ha...@gmail.com> #23
```
android {
    // your existing code
    packagingOptions {
        pickFirst '**/libc++_shared.so'
        pickFirst '**/libfbjni.so'
    }
}
```
Description
Studio Build:
Version of Gradle Plugin: 4.1
Version of Gradle: 6.7
Version of Java: JDK 11.0+
Version of Kotlin Gradle Plugin: 1.4.20
OS: Mac 10.15.7
Steps to Reproduce: launch a build from the command line. The problem is intermittent. It randomly impacts our developers and our benchmarks (using Gradle Profiler, 6 measured builds, 50% are faulty). The build does not contain native libraries.
We can see that the task mergeDebugNativeLibs has a random up-to-date status, and when it runs, it takes around 160 seconds. This has a large impact on our builds.
When we compare builds with the faulty mergeDebugNativeLibs vs builds that go well, we can see (via Gradle Enterprise build scans) that the main difference is the up-to-date status of this task.
Here attached are a few snapshots of the Gradle Enterprise scans.
To summarize:
Note that the way we run our benchmarks is 100% reproducible: ./gradlew clean <target> --no-build-cache, but the up-to-date status of this task is random.
Note 2: when running with --rerun-tasks we can see that the task always runs fast.
Note 3: we are aware that Gradle Enterprise has some issues with worker-based tasks and can show a task duration much larger than it actually is because it waits for dependent tasks, but due to the inconsistency of the repro, we could not get a more accurate diagnostic (cleanTask, re-run task). And we do see that the task duration adds up to the build total duration, so we don't think this is just a display issue here.