Status Update
Comments
cc...@google.com <cc...@google.com>
cc...@google.com <cc...@google.com> #2
Made some progress, but haven't gotten it to work:
Permission error is because we declare the external storage permission in androidTest, which doesn't get pulled into main manifest - main manifest needs to declare the permission. Can workaround by redeclaring, or moving to benchmark-junit to a implementation
dependency.
Failure to launch activity is because, similar to above, activities need to be declared in main app. AFAICT, you can't have app/src/androidTest/.../MyActivity
- Android will refuse to resolve it. (you can't redeclare the manifest entry, Android won't find it, won't be able to am start it)
Had to move compose to be a regular implementation
dependency, because otherwise the plugin complains that it can't detect your compose runtime version.
Downside though is that to do this we have to split the code of the benchmark between app and test apks, which doesn't seem to work with minification. Keeping a test class in the test apk doesn't preserve that class's dependencies, if they're defined in the app. That means manual keep rules for basically everything, which... doesn't scale.
We can't have all the code in the test app, since Compose needs that activity, and plugin barfs with androidTestImplementation (and 2 things we can workaround in benchmark - permission and activity).
We may be able to force all the code into the app, but using an additional srcSet didn't seem to work.
cc...@google.com <cc...@google.com> #3
Good news - got it running, code is fairly elegant, and R8 performance differences are clearly measurable:
Bad news - can't run tests from studio anymore with this configuration. I tried a lot of hacky things to get it to work, wasn't able to - Studio always tries to run the tests host-side.
Sample perf diff for DeepTreeBenchmark:
Time Alloc TimeR8 AllocR8 R8speedup BenchName
12384910 4384 9159792 4346 35.21% benchmark_deep_tree_01_depth1_breadth100_wrap2
3546867 868 2692396 827 31.74% benchmark_deep_tree_02_depth7_breadth3_wrap2
4342214 1127 3238601 1086 34.08% benchmark_deep_tree_03_depth2_breadth10_wrap2
4346025 1127 3204551 1086 35.62% benchmark_deep_tree_04_depth2_breadth10_wrap6
Chuck / Filip, do you think it's useful enough to use this CL to evaluate performance locally? If not, is having these R8 numbers in CI worth breaking running via Studio temporarily until studio support is added (or alternately the work to manually split the compose benchmarks between app / test)? That split would need to look something like this:
// src/main:
@Keep
fun thingToMeasure() { /* code to be measured here */ }
// src/androidTest:
@Test
fun benchmarkThing = benchmarkRule.measureRepeated { thingToMeasure() }
Dustin/Lukhnos - If you're interested in replicating perf measurements with R8 in multiplat, I'd recommend going with the main/androidTest split approach above. Happy to help if you have questions.
cc...@google.com <cc...@google.com> #4
Yigit pointed out that you can run gradle tasks via studio, which allows you to run Benchmarks in this configuration from Studio (see attached screenshot). However, it doesn't display the output in Studio, but that may change in Android Studio Arctic Fox.
ch...@google.com <ch...@google.com> #5
Wow, the performance improvement was way more than I was expecting. I was expecting more along the lines of 5% not 20%-30%. Do other benchmarks improve in a similar way?
If so, I will take a background task of investigating which of the R8 optimizations have the biggest impact on compose. We might consider "pre-optimizing" our code to lessen the impact of R8 if we can get some of these improvements without R8. We want to perform well without R8 so R8 doesn't become a defacto requirement of compose.
As for studio, I don't run benchmarks in studio. I always use the command-line and a custom script to interpret the results so not being able to run them from Studio would not affect me.
However, I periodically will need to debug the benchmarks. Having a split, such as the above, would be very nice if the src/main
version was easy to debug (e.g. if the |>
worked).
du...@google.com <du...@google.com> #6
I had a go on this on my end and couldn't figure out how to get connectedCheck to recognize that there are test classes defined in src/main
- it's probably best try the split method, but we have some setup before hand so its not as clean in terms of being able to copy the classes back directly when minifyEnabled gets support on library modules.
du...@google.com <du...@google.com> #7
(This is on collections).
cc...@google.com <cc...@google.com> #8
I was expecting more along the lines of 5% not 20%-30%. Do other benchmarks improve in a similar way?
I don't know, I only looked at the one to get it up and running - you can cherry pick the linked CL to try it out on other benchmarks.
I always use the command-line and a custom script to interpret the results so not being able to run them from Studio would not affect me
Depending on which benchmarks we want to modify, it may hit others who are accustomed to running from Studio. If we want to commit this, keeping it scoped to runtime only may make sense.
couldn't figure out how to get connectedCheck to recognize
I don't think I did anything special, though I did play around with sourcesets a lot. Are you able to run this CL locally? It may have been only working for me due to some cached gradle/studio state.
cm...@google.com <cm...@google.com> #9
Just checking I follow your logic in #3 - the challenge is that you want all the test entry points to be roots for R8 to keep - which currently is not supported for app tests (and is potentially a bit odd for 'normal' app users as having the production shrunk build depending on tests might be suprising) and that moving them to the main sourceset breaks studio being able to run them.
So possibilities look like androidx only
- Create an app with the tests with different build types for shrunk/not shrunk
- For studio support split the @Test bodies in to a class in src/main and add keep rules for that class, or manually curate keep rules. agp/studio
- Fix studio running tests that are in main in an app
- Support tracing @Tests in androidTest when shrinking the app in AGP
- Add special support to AGP for shrinking library tests
- Shrinker support in library is
- Add test only module support for shrinking and library targets.
cc...@google.com <cc...@google.com> #10
Yes, (not sure what you mean by androidx only, since some of them are general solutions) though I'd add another possibility is the AGP support for macrobenchmarks source directory within existing modules, as we discussed a while back.
Also, the more general bug for R8 + Microbenchmark is here:
cc...@google.com <cc...@google.com>
cc...@google.com <cc...@google.com> #11
This is now possible with AGP changes, I was able to get simple tests running with minification on in a standalone project.
Once the required version of AGP lands in AndroidX, I'll add an example for local experimentation with R8 in microbenchmarks, and we can consider options for how we want to use this in CI.
These are the rules I used to test the integration in a standalone project:
# basic protection against junit/androidx.test reflection, shouldn't affect library/test code
-keepclasseswithmembers class androidx.test.** { *; }
-keepclasseswithmembers class org.junit.** { *; }
-dontwarn com.google.errorprone.annotations.MustBeClosed
## keep test classes
-keepclasseswithmembers @org.junit.runner.RunWith class * { *; }
## needed for org.junit.Test annotation to be discoverable by reflection
-keepattributes *Annotation*
ap...@google.com <ap...@google.com> #12
Branch: androidx-main
commit cecc57fb5899034f12f886a2784129098d059a8d
Author: Chris Craik <ccraik@google.com>
Date: Tue Mar 05 16:46:59 2024
Experimental R8 support in microbenchmarks
Off by default in the AndroidX repo, but can be experimentally used by
external. As this isn't run in presubmit/postsubmit, it may require
adding additional rules for any specific test.
```
(androidTest.enableMinification = false)
840,772 ns 389 allocs Trace LazyListScrollingBenchmark.scrollProgrammatically_newItemComposed[LazyColumn]
837,786 ns 389 allocs Trace LazyListScrollingBenchmark.scrollProgrammatically_newItemComposed[LazyRow]
(androidTest.enableMinification = true)
658,050 ns 390 allocs Trace LazyListScrollingBenchmark.scrollProgrammatically_newItemComposed[LazyColumn]
651,393 ns 390 allocs Trace LazyListScrollingBenchmark.scrollProgrammatically_newItemComposed[LazyRow]
```
(Mokey, locked clocks)
Test: LazyListScrollingBenchmark.scrollProgrammatically_newItemComposed
Fixes: 184378053
Relnote: """Experimental R8 support in microbench via embedded proguard
rules. Note that this support is experimental, and requires AGP 8.3
for minification of library module tests. Use the following to enable
in your benchmark module's `build.gradle`:
```
android {
buildTypes.release.androidTest.enableMinification = true
}
```
"""
Change-Id: I738a3294c5ded7b336ed0f49d0615eb9231cce51
M benchmark/benchmark-junit4/build.gradle
A benchmark/benchmark-junit4/proguard-rules.pro
M compose/benchmark-utils/build.gradle
A compose/benchmark-utils/proguard-rules.pro
M compose/foundation/foundation/benchmark/build.gradle
cc...@google.com <cc...@google.com> #13
With the above CL, this can be enabled locally with the following in your build.gradle:
android {
buildTypes.release.androidTest.enableMinification = true
}
No further proguard customization needed in most cases.
pr...@google.com <pr...@google.com> #14
The following release(s) address this bug.It is possible this bug has only been partially addressed:
androidx.benchmark:benchmark-junit4:1.3.0-alpha02
androidx.compose.foundation:foundation:1.7.0-alpha05
androidx.compose.foundation:foundation-android:1.7.0-alpha05
androidx.compose.foundation:foundation-desktop:1.7.0-alpha05
Description
Ideally we can track improvements for minified code (post r8 transforms) as compose sometimes make optimizations depending on these, e.g., changes to enable aot to inline.
It's currently impossible to apply:
to a benchmark module, even if has been converted to an app module because it fails with either:
1.
2.