Initial benchmark runs after warmup are slower [142058671]

Fixed

Bug

Description

cc...@google.com

created issue #1

Oct 3, 2019 04:47PM

Example data (trimmed json) showing the first run slowdown, affecting the maximum:

"timeNs": {
"minimum": 10,
"maximum": 28,
"median": 10,
"runs": [
28,
11,
10,
10,
10,
10,
10,
...

"name": "nothing",
"timeNs": {
"minimum": 9,
"maximum": 25,
"median": 9,
"runs": [
25,
9,
9,
9,
9,
9,
9,
...
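
The first-run slowdown in the data above can be checked mechanically. What follows is a minimal sketch in Python, not part of the benchmark library: it takes only the run values visible in the trimmed JSON (the elided remainder is not reconstructed), recomputes the summary statistics, and reports how much slower the first run is than the median. The helper name `summarize_runs` and the label `(first entry)` are hypothetical, chosen for illustration.

```python
from statistics import median


def summarize_runs(runs):
    """Return min, max, median, and first-run/median ratio for timings in ns."""
    med = median(runs)
    return {
        "minimum": min(runs),
        "maximum": max(runs),
        "median": med,
        # A ratio well above 1.0 means the first run is the outlier.
        "first_over_median": runs[0] / med,
    }


# Only the values visible in the trimmed JSON; the "..." is not filled in.
first_entry_runs = [28, 11, 10, 10, 10, 10, 10]
nothing_runs = [25, 9, 9, 9, 9, 9, 9]

for name, runs in [("(first entry)", first_entry_runs), ("nothing", nothing_runs)]:
    print(name, summarize_runs(runs))
```

In both visible prefixes the maximum is the first run itself, which is why the reported "maximum" is skewed while "minimum" and "median" are not.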

Comments

ap...@google.com <ap...@google.com> #2 Oct 8, 2019 04:37PM

Project: platform/frameworks/support
Branch: androidx-master-dev

commit 87230c0d52d3d534692a31c8ffbb378fc3508111
Author: Chris Craik <ccraik@google.com>
Date: Tue Oct 08 08:28:15 2019

Remove warmup log and array alloc from critical path

Bug:142058671

Test: ./gradlew benchmark:benchmark-benchmark:cC # with systrace / method tracing

When transitioning out of warmup, saw array allocation in method
trace, and log work in systrace. Still more improvements to make here.

Change-Id: Ibddbc3dd90e7802d3777f690be1a684ced4fc339

M benchmark/common/src/main/java/androidx/benchmark/BenchmarkState.kt
M benchmark/common/src/main/java/androidx/benchmark/WarmupManager.kt

https://android-review.googlesource.com/1134877

https://goto.google.com/android-sha1/87230c0d52d3d534692a31c8ffbb378fc3508111

ap...@google.com <ap...@google.com> #3 Oct 8, 2019 11:27PM

Project: platform/frameworks/support
Branch: androidx-master-dev

commit b15ef41ce16efc5c3b7d3dfe0d04157ffa474ff2
Author: Chris Craik <ccraik@google.com>
Date: Tue Oct 08 15:06:18 2019

Add tracing to benchmark

Bug:142058671
Test: tests in benchmark-benchmark, with systrace

Change-Id: Ic7c35ad8c8b18e478ad2a19627df92e3241d8a0a

M benchmark/common/api/1.0.0-rc01.txt
M benchmark/common/api/current.txt
M benchmark/common/api/public_plus_experimental_1.0.0-rc01.txt
M benchmark/common/api/public_plus_experimental_current.txt
M benchmark/common/api/restricted_1.0.0-rc01.txt
M benchmark/common/api/restricted_current.txt
M benchmark/common/src/main/java/androidx/benchmark/BenchmarkState.kt
A benchmark/common/src/main/java/androidx/benchmark/TraceCompat.kt
M benchmark/junit4/src/main/java/androidx/benchmark/junit4/BenchmarkRule.kt

https://android-review.googlesource.com/1136816

https://goto.google.com/android-sha1/b15ef41ce16efc5c3b7d3dfe0d04157ffa474ff2

ap...@google.com <ap...@google.com> #4 Oct 11, 2019 07:22PM

Project: platform/frameworks/support
Branch: androidx-master-dev

commit 345f86d34f7e373f464c0d8185e392b067a2de4a
Author: Chris Craik <ccraik@google.com>
Date: Wed Oct 09 17:37:17 2019

Bump thread priority of benchmarks and JIT during benchmarks

The JIT thread is so low priority that other parallel tasks can starve
it, especially for the first few benchmarks when a process runs.

The system can spin up significant background work right after install
and/or instrumentation start, and on locked devices with only two big
cores, there aren't enough CPUs to go around - warmup and benchmark
both complete before relevant JIT is complete.

Now, we bump the priority of both the benchmark and JIT thread.
Tracing benchmarks show that the JIT thread goes much faster, which
should significantly reduce the chance we capture results on unjitted
code.

This may also motivate us to use CPU affinity + locked small cores in
the future, we can keep monitoring.

Test: ./gradlew benchmark:b-c:cC
Test: ./gradlew benchmark:b-b:cC
Test: ./gradlew recyclerview:r-b:cC

This CL also adds more logging, and unifies all logging under
"benchmark" tag. This logging was very useful in discovering and
diagnosing the priority problem, since it showed the edge cases where
jit finished *during* the measure pass.

Bug: 140773023
Bug: 142058671

Change-Id: If542e3cb8867165cf7b4688090ee534e68a23562

M benchmark/common/src/androidTest/java/androidx/benchmark/BenchmarkStateTest.kt
M benchmark/common/src/main/java/androidx/benchmark/BenchmarkState.kt
A benchmark/common/src/main/java/androidx/benchmark/ThreadPriority.kt
M benchmark/common/src/main/java/androidx/benchmark/WarmupManager.kt
M benchmark/junit4/src/main/java/androidx/benchmark/junit4/BenchmarkRule.kt

https://android-review.googlesource.com/1138018

https://goto.google.com/android-sha1/345f86d34f7e373f464c0d8185e392b067a2de4a

cc...@google.com <cc...@google.com> #5 Apr 13, 2020 08:27PM

Reassigned to ow...@google.com.

While trying to reland Owen's looping arch change, I verified that it significantly improves this problem; see the attached .json files, specifically the measured numbers.

E.g. Parameterized benchmark, before (both variants):
"runs": [
18,
12,
12,
12,
12,
"runs": [
17,
5,
5,
6,
5,

After:
"runs": [
14,
11,
11,
11,
11,
"runs": [
4,
4,
4,
4,
4,

Or TrivialJavaBenchmark, Before:
"runs": [
24,
12,
12,
12,
12,
After:
"runs": [
11,
11,
11,
11,
11,

Let's mark this fixed once the warmup rearch lands again.
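
To put a number on the improvement quoted above, here is a minimal Python sketch (an illustration, not the tooling used to produce the attached files). It takes the truncated "runs" arrays exactly as quoted in this comment and compares the first run with the median of the runs that follow. The helper name `first_run_ratio` and the "variant 1" / "variant 2" labels are hypothetical.

```python
from statistics import median


def first_run_ratio(runs):
    """Ratio of the first measured run to the median of the remaining runs."""
    return runs[0] / median(runs[1:])


# Values copied from the truncated "runs" arrays quoted in this comment.
before = {
    "Parameterized, variant 1": [18, 12, 12, 12, 12],
    "Parameterized, variant 2": [17, 5, 5, 6, 5],
    "TrivialJavaBenchmark": [24, 12, 12, 12, 12],
}
after = {
    "Parameterized, variant 1": [14, 11, 11, 11, 11],
    "Parameterized, variant 2": [4, 4, 4, 4, 4],
    "TrivialJavaBenchmark": [11, 11, 11, 11, 11],
}

for name in before:
    b = first_run_ratio(before[name])
    a = first_run_ratio(after[name])
    print(f"{name}: first-run ratio before={b:.2f}, after={a:.2f}")
```

A ratio of 1.0 means the first run is indistinguishable from the rest. On the quoted values, only the first Parameterized variant still shows a first-run penalty after the change.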

Attachments: after.json (16 KB), before.json (16 KB)

cc...@google.com <cc...@google.com> #6 Apr 14, 2020 03:41PM

Marked as fixed.

In addition, the measurements for our smallest (noop) benchmarks have come back down. It looks like making warmup and measurement more similar means a lot more code is now fully jitted. Since tiny benchmarks measure quickly, we were likely not giving the code in measurement codepaths time to jit.

I'd guess much of the remaining cost for the first loop is branch mispredictions for the loop's early return itself (used only during measurement), but that seems to significantly affect only the first benchmark (the first 'after' number above in ParameterizedBenchmark).

Attachment: Tue Apr 14 2020 08:36:36 GMT-0700 (Pacific Daylight Time).png (183 KB)
