Status Update
Comments
sg...@google.com <sg...@google.com> #2
Thank you for the report.
When the compile time fluctuates this much, it can be caused by the JVM getting close to max memory, so the GC portion of the compile time goes up.
As you have a dump, you can easily reproduce the same build with different settings using tools/compiledump.py. To use a released R8 version, download it with curl -O https://storage.googleapis.com/r8-releases/raw/8.3.35/r8lib.jar (for version 8.3.35), and then add --r8-jar r8lib.jar to the commands below. The commands all pass -da, which disables assertions. tools/compiledump.py enables assertions by default, but that makes compilation even slower, and AGP does not do that.
- Are you seeing the same long compile time locally when running tools/compiledump.py -da -d <dump>?
- If so, you can try passing -verbose:gc to the JVM, which should be doable like this: tools/compiledump.py -da -J=-verbose:gc -d <dump>. It seems the output is not streamed, so you will have to wait for the command to finish.
- If that shows massive GC, you can try different heap sizes to see if changing that affects compilation time, e.g. like this: tools/compiledump.py -da --xmx 8G -J=-verbose:gc -d <dump>
If possible you can share the dump privately with
be...@gmail.com <be...@gmail.com> #3
The CI calculates the values for xmx etc. depending on the current machine. The one used here should have enough resources to build the app without issues.
This is what it calculated for this particular machine, which has 32 GB of RAM (properties written inside ~/.gradle/gradle.properties):
org.gradle.jvmargs = -Dfile.encoding=UTF-8 -XX:+HeapDumpOnOutOfMemoryError -XX:MaxMetaspaceSize=1g -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC -Xmx16854m -Xms16854m
kotlin.daemon.jvmargs = -Dfile.encoding=UTF-8 -XX:+HeapDumpOnOutOfMemoryError -XX:MaxMetaspaceSize=1g -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC -Xmx7223m -Xms7223m
I let Gradle and Kotlin use 75% of the total memory; 70% of that goes to Gradle, the remaining 30% to Kotlin.
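That 75% / 70-30 split can be sanity-checked with a quick sketch (the exact total the CI measures and its rounding are assumptions on my part):

```python
# Sketch of the heap-size calculation described above: give Gradle and
# Kotlin 75% of total RAM, split 70/30. The exact total the CI measures
# and its rounding are assumptions; a "32 GB" machine rarely reports
# exactly 32 * 1024 MB to user space.

def heap_split(total_mb: int) -> tuple[int, int]:
    budget = int(total_mb * 0.75)
    gradle_xmx = int(budget * 0.70)
    kotlin_xmx = budget - gradle_xmx
    return gradle_xmx, kotlin_xmx

print(heap_split(32 * 1024))  # (17203, 7373) for an exact 32768 MB total
```

The values actually written (16854m and 7223m) sum to 24077 MB, and 16854 / 24077 ≈ 70%, so they are consistent with this split applied to a slightly smaller measured total.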
I once ran the build with -Dcom.android.tools.r8.printmemory
, but the CI stopped after it reached its limit of 3.5 hours. This is what I saw:
R8 is running with total memory:17683185664
R8 is running with free memory:14891000672
R8 is running with max memory:17683185664
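For what it's worth, those byte counts can be converted to MiB and compared against the configured -Xmx (a quick check, nothing more):

```python
# Convert the printmemory byte counts above to MiB. The JVM's reported
# max memory usually lands near, but not exactly at, the -Xmx value.
MIB = 1024 * 1024

total = 17_683_185_664  # "total memory" / "max memory" from the log
free = 14_891_000_672   # "free memory"

print(total // MIB)           # 16864 MiB, i.e. roughly the -Xmx16854m heap
print((total - free) // MIB)  # 2662 MiB in use when the line was printed
```

So R8 did get the full ~16.5 GiB heap, and at that point most of it was still free.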
I have multiple zips inside the directory of dumpinputtodirectory, and compiledump.py takes roughly 10 minutes to process the largest one, even when I pass --xmx 8G.
I'll try to play a bit with the memory settings and see if that's the problem, as you suspect. I could also try to run R8 in a separate process as mentioned.
I'll try to figure this out, and I'll see if I can share the dump in case I can't find anything.
Thank you for the prompt response.
sg...@google.com <sg...@google.com> #4
I think you were right about this being a memory issue. I added the following and even after multiple tries, the minification step took 10 minutes consistently.
android {
    execution {
        profiles {
            standard {
                r8 {
                    jvmOptions += ["-Xms2048m", "-Xmx8192m", "-XX:+HeapDumpOnOutOfMemoryError"]
                    runInSeparateProcess = true
                }
            }
        }
        defaultProfile = "standard"
    }
}
I don't know if something has changed; I thought 16 GB was enough for Gradle and R8 combined, considering that a build without R8 requires less than 8 GB.
Anyway, thank you for the pointer, I think you can close this now.
di...@gmail.com <di...@gmail.com> #5
Thank you for the feedback, good to hear that you found a solution.
I had already written what is below, so sharing this anyway.
Each dump has the file build.properties in the root. The one which has tool=R8 is the R8 invocation.
One other tip is to unzip the dump with unzip dump.zip -d /tmp/dump and add --temp /tmp/dump when running tools/compiledump.py. Then the commands printed by tools/compiledump.py (should be like the ones below) can easily be re-run with different JVM options without going through the tools/compiledump.py wrapper.
/usr/local/google/home/sgjesse/prj/r8/ws1/third_party/openjdk/jdk-11/linux/bin/javac /usr/local/google/home/sgjesse/prj/r8/ws1/src/main/java/com/android/tools/r8/utils/CompileDumpCompatR8.java /usr/local/google/home/sgjesse/prj/r8/ws1/src/main/java/com/android/tools/r8/utils/CompileDumpBase.java -d /tmp/dump -cp r8lib.jar
Running: /usr/local/google/home/sgjesse/prj/r8/ws1/third_party/openjdk/jdk-11/linux/bin/java -verbose:gc -cp /tmp/dump:r8lib.jar com.android.tools.r8.utils.CompileDumpCompatR8 --release /tmp/dump/program.jar --output /tmp/dump/out.jar --lib /tmp/dump/library.jar --classpath /tmp/dump/classpath.jar --desugared-lib /tmp/dump/desugared-library.json --desugared-lib-pg-conf-output /tmp/dump/desugared-library-keep-rules.config --pg-conf /tmp/dump/proguard.config --pg-map-output /tmp/dump/out.jar.map --min-api 23 --enable-missing-library-api-modeling
sg...@google.com <sg...@google.com> #6
I am not exactly sure if there are any restrictions on the number of R8 tasks that AGP will run in parallel. There have been issues with this, as the Gradle daemon could run out of memory if too many R8 tasks were running in parallel. In AGP 8.0
Adding
ch...@google.com <ch...@google.com> #7
I moved the issue (
yy...@gmail.com <yy...@gmail.com> #8
Is there any progress?
Description
Detailed time consumption is as follows:
21:26:14 R8: 777620ms
21:26:14 - (6%) Unaccounted: 51281ms
21:26:14 - (38%) Strip unused code: 302333ms
21:26:14 - (2%) Unaccounted: 15949ms
21:26:14 - (34%) Grow the tree.: 268155ms
21:26:14 - (30%) Unaccounted: 238018ms
21:26:14 - (3%) Find consequent items for -if rules...: 30136ms
21:26:14 - (13%) Phase 2: promoteToPublic: 108266ms
21:26:14 - (7%) Create IR: 60500ms
21:26:14 - (2%) Build primary method processor: 17448ms
21:26:14 - (4%) IR conversion phase 1: 38719ms
21:26:14 - MERGE primary-processor: 2971275ms, tasks: 1362541, threads: 128, utilization: 59%
21:26:14 - (2%) Lens rewrite: 63104ms
21:26:14 - (5%) Insert assume instructions: 173443ms
21:26:14 - (2%) Part 1: Compute assumed values: 72802ms
21:26:14 - (2%) Part 3: Compute dominated users: 68523ms
21:26:14 - (2%) Propogate sparse conditionals: 75496ms
21:26:14 - (4%) Remove field loads: 122187ms
21:26:14 - (18%) Remove dead code: 543474ms
21:26:14 - (2%) Canonicalize constants: 84839ms
21:26:14 - (4%) Shorten live ranges: 134336ms
21:26:14 - (35%) Finalize IR: 1062929ms
21:26:14 - (6%) Unaccounted: 195741ms
21:26:14 - (21%) Allocate registers: 647859ms
21:26:14 - (7%) Peephole optimize: 213565ms
21:26:14 - SLOWEST (0%) void com.bytedance.ies.abmock.ConfigContainer.addConfigStub3(): 10695ms
21:26:14 - (18%) Post optimization code stripping: 145686ms
21:26:14 - (17%) Grow the tree.: 133269ms
21:26:14 - (16%) Unaccounted: 127897ms
21:26:14 - (4%) Minification: 38497ms
21:26:14 - (3%) MinifyMethods: 29094ms
21:26:14 - (2%) Phase 3: 21206ms
21:26:14 - (2%) Interface minification: 21206ms
21:26:14 - (2%) Rename in groups: 18178ms
21:26:14 - (6%) DexApplication.write: 48318ms
21:26:14 - (2%) Distribute: 23268ms
21:26:14 - MERGE Pre-write phase: 16685ms, tasks: 55, threads: 128, utilization: 0%
21:26:14 - (96%) Compute object offset mapping: 16057ms
21:26:14 - (41%) Sort strings: 6987ms
21:26:14 - (12%) Sort types: 2107ms
21:26:14 - (9%) Sort classes: 1546ms
21:26:14 - (5%) Sort protos: 927ms
21:26:14 - (18%) Sort methods: 3072ms
21:26:14 - (7%) Sort fields: 1325ms
21:26:14 - SLOWEST (3%) VirtualFile 22: 615ms
21:26:14 - (3%) Compute object offset mapping: 615ms
21:26:14 - (2%) Line number remapping: 16013ms
21:26:14 - MERGE Write files: 43226ms, tasks: 55, threads: 128, utilization: 0%
21:26:14 - (4%) Reindex for lazy strings: 2156ms
21:26:14 - (87%) Write bytes: 37844ms
21:26:14 - (35%) collect: 15210ms
21:26:14 - (51%) generate: 22427ms
21:26:14 - (3%) Pass bytes to consumer: 1567ms
21:26:14 - SLOWEST (3%) VirtualFile 6: 1655ms
21:26:14 - (3%) Write bytes: 1496ms
21:26:14 - (2%) generate: 880ms
As you can see, the most time-consuming part is Grow the tree.: it takes 34% + 17% = 51% of the total time.
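Summing the raw millisecond values confirms this (a quick parsing sketch; the line format is inferred from the output above, and the printed percentages are rounded, which is why the labels read 34% + 17% = 51% while the raw numbers give about 52%):

```python
import re

# Sum the "Grow the tree." timings from the R8 timing output above.
# Line format inferred from the log: "21:26:14 - (38%) <label>: <ms>ms"
LINE = re.compile(r"\((\d+)%\)\s+(.*?):\s+(\d+)ms")

log = """\
21:26:14 - (34%) Grow the tree.: 268155ms
21:26:14 - (18%) Post optimization code stripping: 145686ms
21:26:14 - (17%) Grow the tree.: 133269ms
"""

total_ms = 777620  # from the "R8: 777620ms" top line
grow_ms = sum(int(m.group(3)) for m in LINE.finditer(log)
              if "Grow the tree" in m.group(2))
print(grow_ms, round(100 * grow_ms / total_ms))  # 401424 52
```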
After reviewing the R8 code, I found that it points to the code in trace(ExecutorService executorService, Timing timing), as follows:
private void trace(ExecutorService executorService, Timing timing) throws ExecutionException {
  timing.begin("Grow the tree.");
  try {
    while (true) {
      long numberOfLiveItems = getNumberOfLiveItems();
      while (!workList.isEmpty()) {
        EnqueuerAction action = workList.poll();
        action.run(this);
      }
      ...
    }
  }
  ...
}
So I'm wondering: why not execute these actions in parallel? Are there some special reasons? Do you have a roadmap for this?
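For illustration only, the kind of parallel drain being asked about is easy to sketch in isolation (Python stands in for the Java here; this toy version sidesteps the real obstacle, namely that each EnqueuerAction mutates shared tracing state and may enqueue further actions, so running them concurrently would need synchronization on those structures):

```python
from concurrent.futures import ThreadPoolExecutor

# Toy sketch (not R8 code): drain a worklist in parallel "waves". Each
# action returns the follow-up actions it generates; a wave runs the
# current batch concurrently, then the next wave runs what was produced.

def drain_in_waves(initial_actions, threads=8):
    processed = 0
    wave = list(initial_actions)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        while wave:
            results = list(pool.map(lambda action: action(), wave))
            processed += len(wave)
            # Flatten the follow-up actions produced by this wave.
            wave = [follow for produced in results for follow in produced]
    return processed

# Toy action: "tracing" a node discovers two more, down to a fixed depth.
def make_action(depth):
    def action():
        return [make_action(depth - 1)] * 2 if depth > 0 else []
    return action

print(drain_in_waves([make_action(3)]))  # 1 + 2 + 4 + 8 = 15
```

The real Enqueuer's actions are not pure functions like these, which is presumably a large part of why the loop is sequential.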