Comments
cm...@google.com <cm...@google.com>
je...@google.com <je...@google.com>
ga...@google.com <ga...@google.com> #2
While I think this is an interesting approach, I believe it is addressing an actual problem at the wrong level of abstraction.
IMHO Gradle should allow for build operations (tasks/artifact transforms/worker actions) to specify CPU/memory requirements, and because it has a holistic view of the system it should be able to solve this problem in a much nicer way. Build operations can provide CPU/memory requirements dynamically, based on the inputs they need to process. Can you please file an issue on the Gradle issue tracker -
mc...@ebay.com <mc...@ebay.com> #3
That's a good call-out.
I agree that AGP is probably too high level to be able to address this alone. I fear that the Gradle team may not feel like they have enough knowledge about the work being executed to be able to effectively design an API at their level, especially with concepts such as compiler daemons in the mix. I suppose we'll let that issue simmer for a while and see what they come up with...
ga...@google.com <ga...@google.com> #4
Thanks for filing that bug with Gradle.
Compiler daemons, or any long-running processes, are indeed interesting. If Gradle provides us with APIs, we can notify it about the system-wide memory usage when these daemons (e.g. Kotlin/AAPT) are launched, and we can release those resources when the daemons are stopped. Typically, these daemons are launched with predefined memory settings, so getting those should not be too hard. For tasks/artifact transforms, we can probably develop some per-task-type heuristics: e.g. Lint needs 1GB of heap per ~1000 source files (I'm just making this up), or R8 needs 500MB per 1000 program classes.
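To make the idea concrete, here is a minimal sketch of how such heuristics and a reserve/release budget could fit together. It is not a Gradle or AGP API: `MemoryHeuristics` and `MemoryBudget` are hypothetical names, and the ratios are the made-up numbers from this comment, not measurements.

```java
import java.util.concurrent.Semaphore;

/** Hypothetical per-task-type heap estimates; ratios are the placeholder numbers from the comment. */
final class MemoryHeuristics {
    static final int LINT_MB_PER_1000_SOURCE_FILES = 1024; // "1GB per ~1000 source files" (made up)
    static final int R8_MB_PER_1000_PROGRAM_CLASSES = 500; // "500MB per 1000 program classes" (made up)

    static int lintHeapMb(int sourceFiles) {
        return scale(sourceFiles, LINT_MB_PER_1000_SOURCE_FILES);
    }

    static int r8HeapMb(int programClasses) {
        return scale(programClasses, R8_MB_PER_1000_PROGRAM_CLASSES);
    }

    /** Proportional estimate, rounded up to a whole megabyte. */
    private static int scale(int count, int mbPerThousand) {
        if (count < 0) {
            throw new IllegalArgumentException("count must be non-negative: " + count);
        }
        return Math.toIntExact((count * (long) mbPerThousand + 999L) / 1000L);
    }
}

/** Hypothetical system-wide budget: daemons and tasks reserve memory and release it when done. */
final class MemoryBudget {
    private final int totalMb;
    private final Semaphore megabytes; // one permit per megabyte

    MemoryBudget(int totalMb) {
        this.totalMb = totalMb;
        this.megabytes = new Semaphore(totalMb);
    }

    /** Returns true if the requested amount was reserved; never blocks. */
    boolean tryReserve(int requiredMb) {
        if (requiredMb <= 0 || requiredMb > totalMb) {
            throw new IllegalArgumentException("requirement out of range: " + requiredMb);
        }
        return megabytes.tryAcquire(requiredMb);
    }

    /** Hands back memory reserved earlier, e.g. when a daemon is stopped. */
    void release(int reservedMb) {
        megabytes.release(reservedMb);
    }

    int availableMb() {
        return megabytes.availablePermits();
    }
}
```

A daemon launched with a 2GB heap would call `tryReserve(2048)` at startup and `release(2048)` on shutdown. A Lint task would reserve `MemoryHeuristics.lintHeapMb(...)` for its input size before running.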
Until Gradle adds this API, I'm a bit reluctant to add parallelization limits for many task types, as their memory usage heavily depends on the inputs. Also, we can add this for AGP tasks only, but then what happens when there are other Gradle plugins with CPU/memory-intensive tasks? You'd still end up with some custom code to handle Kotlin, spotless, Ktlint etc.
TBH your approach in #1 does not seem too bad :) As long as AGP does not start minifying its classes (not any time soon), updating the task type list is trivial:
1. I'd maintain a list of "unconstrained" and "constrained" task types, and use those to set up the build
2. your plugin can add an operation listener to get the type of every task that runs in the build
3. during execution, you can assert that the task type is either constrained or unconstrained => that way, any task type changes in AGP/KGP/other plugins will fail your build, and you can adjust your task type list in 1)
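As a rough illustration of the bookkeeping in these steps, the sketch below classifies a task type name against two lists and fails on anything unknown. `TaskTypePolicy` and the type names in the usage note are hypothetical. In a real plugin the name would come from the operation listener mentioned above; here it is simply passed in as a string.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical registry of known task types; unknown types fail fast so the lists get updated. */
final class TaskTypePolicy {
    enum Kind { CONSTRAINED, UNCONSTRAINED }

    private final Set<String> constrained;
    private final Set<String> unconstrained;

    TaskTypePolicy(Collection<String> constrained, Collection<String> unconstrained) {
        this.constrained = Collections.unmodifiableSet(new HashSet<>(constrained));
        this.unconstrained = Collections.unmodifiableSet(new HashSet<>(unconstrained));
        // A type listed on both sides is a configuration mistake; reject it early.
        for (String type : this.constrained) {
            if (this.unconstrained.contains(type)) {
                throw new IllegalArgumentException("Task type listed twice: " + type);
            }
        }
    }

    /** Returns the kind for a known type, or throws so that the build fails on a new type. */
    Kind classify(String taskTypeName) {
        if (constrained.contains(taskTypeName)) {
            return Kind.CONSTRAINED;
        }
        if (unconstrained.contains(taskTypeName)) {
            return Kind.UNCONSTRAINED;
        }
        throw new IllegalStateException(
                "Unknown task type " + taskTypeName + "; add it to one of the lists");
    }
}
```

For example, a policy built with `example.HeavyTask` as constrained and `example.CopyTask` as unconstrained throws on `example.NewTask`, which is the signal to revisit the lists after a plugin upgrade.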
For some task types, the decision is easier. E.g. there is android.r8.maxWorkers=1 to allow at most one R8 task to run in parallel (this is the default value; this does not prevent e.g. Lint from running in parallel to R8 though). Also, there is the android.experimental.runLintInProcess flag to run Lint out of process. In general, we are adding options to move memory-intensive operations (R8, Lint) out of the Gradle daemon, but that does not help if the entire system is resource constrained.
mc...@ebay.com <mc...@ebay.com> #5
Thanks for providing these additional details and context. Having exposed controls such as the properties you reference gives us control points which we can leverage in our dynamic tuning. It has given me a couple ideas for additional scaling possibilities. :)
The process you describe is similar to what we are doing at this point in response to AGP and Kotlin changes. We don't track unconstrained task types, mainly due to their sheer volume; it would also require us to reference additional AGP-internal task types. The approach is a bit coarse and we haven't been able to fully get memory utilization under control, but it has certainly helped.
Description
Context:
We have a very large Android project with many project modules, many of which are independent of each other until integration at the application module level. We also have build/developer machines which vary widely in terms of system specs/capabilities. It has become difficult for us to scale the build. The most significant system resource from the perspective of build success/failure has tended to be memory, as running out of other resources tends only to slow the build (disk space notwithstanding), whereas running out of memory crashes the build.
Our current approach is predicated upon knowing the concrete task types defined by AGP. Given AGP's roadmap to hide implementation details in favor of moving to an API artifact, this means that our ability to successfully implement our current solution will be reduced/removed.
Request:
Add an optional mechanism into AGP to allow memory-intensive tasks to have limited concurrency within a parallel build execution. The concurrency should be able to be determined by the project or command line invocation such that more powerful machines can more fully take advantage of their expanded system resources, whereas weaker machines can constrain the build more.
Our current approach:
We have authored build plugins that perform simple system probes and bucket the systems into the simple categories CONSTRAINED, NORMAL, and UNCONSTRAINED. The definition of this part is very project specific.
The resulting category is then used in conjunction with a build service which is being used as a simple semaphore.
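The original snippets are not reproduced here, so the following is only a sketch of what the bucketing step might look like. `MachineProbe`, the memory and core thresholds, and the permit counts are all assumptions chosen for illustration. Looking up physical memory is platform specific and left to the caller.

```java
/** Hypothetical machine bucketing; every threshold below is an illustrative assumption. */
final class MachineProbe {
    enum MachineClass { CONSTRAINED, NORMAL, UNCONSTRAINED }

    /** Buckets a machine from its total memory (in GB) and CPU core count. */
    static MachineClass classify(long totalMemoryGb, int cpuCores) {
        if (totalMemoryGb < 16 || cpuCores < 4) {
            return MachineClass.CONSTRAINED;
        }
        if (totalMemoryGb >= 64 && cpuCores >= 16) {
            return MachineClass.UNCONSTRAINED;
        }
        return MachineClass.NORMAL;
    }

    /** Number of memory-intensive tasks allowed to run at once, i.e. the semaphore size. */
    static int maxHeavyTasks(MachineClass machineClass) {
        switch (machineClass) {
            case CONSTRAINED:
                return 1;
            case NORMAL:
                return 2;
            default:
                return 4;
        }
    }

    /** Core count as seen by the JVM; memory has to be supplied by the caller. */
    static int probeCores() {
        return Runtime.getRuntime().availableProcessors();
    }
}
```

The value returned by `maxHeavyTasks` is what would size the semaphore, so a CONSTRAINED machine runs expensive tasks one at a time while an UNCONSTRAINED one allows several.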
This build service is then associated with the most intensive tasks by type.
This lets the cheap tasks continue in massively parallel execution (something that cannot be achieved by simply reducing the worker count) whilst moderately serializing the expensive tasks, which increases the success rate of our builds on lower-end hardware.
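The semaphore behaviour can be sketched outside Gradle with plain JDK classes. This is an illustration under assumptions, not the plugin code from this report. `HeavyTaskGate` is a hypothetical name, and a `java.util.concurrent.Semaphore` stands in for the build service. Within Gradle, the limit would presumably be expressed as the maximum parallel usages of a shared build service. Cheap tasks never call `runHeavy`, so they are not throttled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical gate: only expensive work takes a permit, cheap work bypasses it. */
final class HeavyTaskGate {
    private final Semaphore permits;
    private final AtomicInteger running = new AtomicInteger();
    private final AtomicInteger peak = new AtomicInteger();

    HeavyTaskGate(int maxParallelHeavy) {
        this.permits = new Semaphore(maxParallelHeavy);
    }

    /** Blocks until a permit is free, runs the work, then returns the permit. */
    void runHeavy(Runnable work) throws InterruptedException {
        permits.acquire();
        try {
            // Track the highest number of heavy tasks observed running at once.
            peak.accumulateAndGet(running.incrementAndGet(), Math::max);
            work.run();
        } finally {
            running.decrementAndGet();
            permits.release();
        }
    }

    int peakConcurrency() {
        return peak.get();
    }

    /** Runs heavyTasks sleeping jobs on a pool of workers and reports the observed peak. */
    static int simulate(int workers, int heavyTasks, int maxParallelHeavy) {
        HeavyTaskGate gate = new HeavyTaskGate(maxParallelHeavy);
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 0; i < heavyTasks; i++) {
                futures.add(pool.submit(() -> {
                    gate.runHeavy(() -> sleepQuietly(20));
                    return null;
                }));
            }
            for (Future<?> future : futures) {
                future.get();
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return gate.peakConcurrency();
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

With eight workers and a limit of two, `HeavyTaskGate.simulate(8, 16, 2)` never reports more than two heavy jobs in flight, while the remaining worker threads stay available for unthrottled work.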