Bug P3
Status Update
Comments
jo...@google.com <jo...@google.com> #2
I have also tested config 4 using --disable-sandbox and it made no difference.
/usr/bin/crosvm run --cpus 56 --mem 127721 --rwdisk /mnt/stateful_partition/debian-10-2-0.qcow2 --disable-sandbox -p "root=/dev/vda1" /run/imageloader/cros-termina/13729.82.0/vm_kernel 2>/dev/null
jo...@google.com <jo...@google.com> #3
For reference, concierge spawns crosvm with args:
/usr/bin/crosvm run --cpus 56 --mem 127721 --tap-fd 18 --cid 33 --socket /run/vm/vm.zRnunN/crosvm.sock --wayland-sock /run/chrome/wayland-0 --serial hardware=serial,num=1,earlycon=true,type=unix,path=/run/daemon-store/crosvm/ccc8a25c8f59055eaddfda422b9cccf68d873f7d/log/dGVybWluYQ==.lsock --serial hardware=virtio-console,num=1,console=true,type=unix,path=/run/daemon-store/crosvm/ccc8a25c8f59055eaddfda422b9cccf68d873f7d/log/dGVybWluYQ==.lsock --syslog-tag VM(33) --no-smt --pmem-device /run/imageloader/cros-termina/13729.82.0/vm_rootfs.img --params root=/dev/pmem0 ro --ac97 backend=cras --disk /run/imageloader/cros-termina/13729.82.0/vm_tools.img --rwdisk /run/daemon-store/crosvm/ccc8a25c8f59055eaddfda422b9cccf68d873f7d/dGVybWluYQ==.qcow2,sparse=true /run/imageloader/cros-termina/13729.82.0/vm_kernel
dv...@google.com <dv...@google.com> #4
joelhockey@ confirmed that performance is better when not enabling core scheduling for vCPU threads in crosvm (matching non-Chrome OS host kernels that don't yet have the core scheduling ioctl).
joelaf@ - I wanted to check my understanding of our current core scheduling configuration and see if we could tweak it for improved performance without sacrificing the security/privacy/... guarantees we want from it.
As far as I can tell, we configure each vCPU thread with a separate core scheduling cookie here: https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/platform/crosvm/sys_util/src/sched.rs;l=89;drc=026f72f9fb6b47f4f45a42e791f6faf7785105db - based on the documentation, this makes it sound like no vCPU threads can run simultaneously on the same core/HT pair, even two vCPUs from the same VM.
I may be missing some details, but it seems like we should be able to create a single core scheduling cookie and set all vCPU threads for the same VM to use that same cookie, which would still prevent other threads (including vCPUs from other VMs) from running simultaneously on the same core, but should allow multiple vCPUs from the same VM to run on the same core.
Would that be acceptable, or do we need the stronger guarantee that even two vCPUs from the same guest VM can't be scheduled simultaneously on the same core?
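For illustration, here is a minimal sketch of the two configurations using the upstream Linux (5.14+) prctl(PR_SCHED_CORE, ...) interface. The crosvm code linked above may use a Chrome OS specific mechanism instead, so the constants and call flow below are assumptions, not the actual implementation:

// Sketch only: per-vCPU vs. per-VM cookies via the upstream core scheduling prctl.
use std::io;

const PR_SCHED_CORE: libc::c_int = 62;
const PR_SCHED_CORE_CREATE: libc::c_ulong = 1;     // allocate a new unique cookie
const PR_SCHED_CORE_SHARE_FROM: libc::c_ulong = 3; // copy another task's cookie
const PIDTYPE_PID: libc::c_ulong = 0;              // target is a single thread

fn sched_core(cmd: libc::c_ulong, tid: libc::pid_t) -> io::Result<()> {
    let ret = unsafe {
        libc::prctl(PR_SCHED_CORE, cmd, tid as libc::c_ulong, PIDTYPE_PID, 0u64)
    };
    if ret == 0 { Ok(()) } else { Err(io::Error::last_os_error()) }
}

// Current behavior (per-vCPU): every vCPU thread calls this on itself, so each
// vCPU gets its own cookie and no two vCPUs ever share a core/HT pair.
fn set_unique_cookie_for_current_thread() -> io::Result<()> {
    sched_core(PR_SCHED_CORE_CREATE, 0) // tid 0 == calling thread
}

// Proposed behavior (per-VM): one vCPU creates a cookie and every other vCPU
// thread of the same VM copies it, so vCPUs of one VM may share a core with
// each other while still excluding all other tasks on the host.
fn share_cookie_from(first_vcpu_tid: libc::pid_t) -> io::Result<()> {
    sched_core(PR_SCHED_CORE_SHARE_FROM, first_vcpu_tid)
}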
jo...@google.com <jo...@google.com> #5
Hi Daniel,
Your understanding is correct. Yes, I think setting the cookie per-VM instead of per-vCPU should be fine, but to be frank, the security team has to make this call. I believe most people in the industry do what you're suggesting - but in the industry most people run VMs in the cloud. In that situation, the host doesn't care whether attacks happen within the VM, because the VM belongs to the user (who can do core scheduling inside the VM if they choose). In our case, it really depends on whether we are comfortable with attacker code inside a VM, running on one vCPU, sniffing data from another vCPU in the same VM through microarchitectural buffers. Since, unlike the cloud, the VM itself is ours, we may or may not be OK with this.
jo...@google.com <jo...@google.com> #6
+mnissler, +jorgelo from security team:
Mattias and Jorge, would this change be OK?
si...@google.com <si...@google.com> #7
You should file a bug about this per go/chromeos-security-consultation rather than just CCing people.
jo...@google.com <jo...@google.com> #8
Thanks Fergus - good idea. b/194022819
ab...@google.com <ab...@google.com> #9
We plan to work on this and re-evaluate it in the future.
gi...@appspot.gserviceaccount.com <gi...@appspot.gserviceaccount.com> #10
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src/+/a4f1dd19fa6f7fee0cc7c69cfa23be577cdd815f
commit a4f1dd19fa6f7fee0cc7c69cfa23be577cdd815f
Author: Joel Hockey <joelhockey@chromium.org>
Date: Thu Aug 12 22:22:42 2021
Add Core Scheduling helper functions
Adds IsCoreSchedulingAvailable() and
NumberOfProcessorsForCoreScheduling(). These functions will be used to
set the --cpu count for crosvm.
http://go/crosvm-cpu-count
These functions were partially implemented early for arcvm, and are
now moved to a central place where they can be shared with crostini
or other VMs.
Bug: 1228565
Change-Id: I2c5793aac2b0079b7af964dcf9bcf1a207db6950
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3063740
Reviewed-by: Lei Zhang <thestig@chromium.org>
Reviewed-by: Steven Bennetts <stevenjb@chromium.org>
Reviewed-by: Brian Geffon <bgeffon@chromium.org>
Reviewed-by: Yusuke Sato <yusukes@chromium.org>
Commit-Queue: Joel Hockey <joelhockey@chromium.org>
Cr-Commit-Position: refs/heads/master@{#911518}
[modify] https://crrev.com/a4f1dd19fa6f7fee0cc7c69cfa23be577cdd815f/base/threading/thread_restrictions.h
[modify] https://crrev.com/a4f1dd19fa6f7fee0cc7c69cfa23be577cdd815f/chromeos/system/core_scheduling.cc
[modify] https://crrev.com/a4f1dd19fa6f7fee0cc7c69cfa23be577cdd815f/chromeos/system/core_scheduling.h
[modify] https://crrev.com/a4f1dd19fa6f7fee0cc7c69cfa23be577cdd815f/components/arc/arc_util.cc
[modify] https://crrev.com/a4f1dd19fa6f7fee0cc7c69cfa23be577cdd815f/components/arc/arc_util.h
[modify] https://crrev.com/a4f1dd19fa6f7fee0cc7c69cfa23be577cdd815f/components/arc/session/arc_vm_client_adapter.cc
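The commit message above only names the helpers; as a rough sketch of the kind of logic such helpers might contain (assuming the goal is to give the VM one vCPU per physical core when core scheduling is active on an SMT host), something like the following could compute the --cpus value. It is written against the upstream prctl and sysfs interfaces and is not the actual Chromium implementation in core_scheduling.cc:

// Illustrative sketch only; names mirror the helpers in the commit but the
// bodies are guesses, not the real code.
use std::fs;

// Probe whether the host kernel exposes core scheduling (approximated here by
// the upstream PR_SCHED_CORE prctl; a Chrome OS kernel may expose it differently).
fn is_core_scheduling_available() -> bool {
    const PR_SCHED_CORE: libc::c_int = 62;
    const PR_SCHED_CORE_GET: libc::c_ulong = 0;
    let mut cookie: u64 = 0;
    let ret = unsafe {
        libc::prctl(PR_SCHED_CORE, PR_SCHED_CORE_GET, 0u64, 0u64, &mut cookie as *mut u64)
    };
    ret == 0
}

fn num_online_cpus() -> usize {
    unsafe { libc::sysconf(libc::_SC_NPROCESSORS_ONLN) as usize }
}

// Number of vCPUs to pass to `crosvm run --cpus`.
fn processors_for_core_scheduling() -> usize {
    let logical = num_online_cpus();
    let smt_active = fs::read_to_string("/sys/devices/system/cpu/smt/active")
        .map(|s| s.trim() == "1")
        .unwrap_or(false);
    if is_core_scheduling_available() && smt_active {
        // One vCPU per physical core (assumes two hyperthreads per core), since
        // per-vCPU cookies prevent sibling hyperthreads from both running vCPUs.
        logical / 2
    } else {
        logical
    }
}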
[Deleted User] <[Deleted User]> #11
This issue has been Available for over a year. If it's no longer important or seems unlikely to be fixed, please consider closing it out. If it is important, please re-triage the issue.
Sorry for the inconvenience if the bug really should have been left as Available.
For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
dt...@google.com <dt...@google.com> #12
When looking at the numbers, I think we need to remember that there are only 28 cores on the host; 56 is the number of threads with hyperthreading. Also, IIRC the Z840 is a 2-socket system, but the guest is not aware of this. Maybe the guest scheduler migrates tasks sub-optimally even if we do not consider the additional host load.
Description
The multi-CPU performance of crosvm is slow when running on a cros host.
I have been testing performance of crostini for compiling chromium, and there is a 2.5x slowdown on my 56-core Z840 when running crostini (Z840-cloudready89-crosvm-tatl) vs using the same hardware and compiling with ubuntu (Z840-ubuntu).
In order to isolate CPU performance, I have been running 7zip CPU benchmarking: `7z b -md22 -mmt=*`. This test runs using a fixed-size compression dictionary (2^22 bytes = 4MB), and then uses 3 different numbers of cores:
* single core (1)
* half available cores (n/2 = 28)
* all cores (n = 56)
I see good results for all configs except the last (crostini running on cros).
1/ Z840-ubuntu
2/ Z840-ubuntu-crosvm-tatl
3/ Z840-cloudready89
4/ Z840-cloudready89-crosvm-tatl (crostini)
All configs show ~4K MIPS for a single core.
Configs 1-3 show a reasonable increase when increasing cores (num-cores => K-MIPS (is this BIPS?)):
1:28:56 => 4:85:115
When running crosvm-tatl on a cros host, the performance is more like:
1:28:56 => 4:50:45
The performance at 28 cores is about 60% of the other configs. But when using all 56 cores, it drops to about 40% of the other configs, with lower overall MIPS than at n/2 cores.
I have also tested configs similar to 1 and 2 using P920-glinux. I test config 1 with (sudo apt install p7zip-full):
7z b -md22 -mmt=*
I test config 2 using go/crosvm-linux-setup as a starting guide, and then use the command below to start the VM:
sudo ~/chromiumos/src/platform/crosvm/target/debug/crosvm run --cpus 72 --mem 127721 --rwdisk ~/work/debian-10-2-0.qcow2 -p "root=/dev/vda1" --disable-sandbox ~/chromiumos/chroot/home/joelhockey/tatl/vm_kernel
I launched debian-10-2-0.qcow2 with virt-manager to install p7zip-full (sudo apt install p7zip-full) since I was too confused about how to set up network access with crosvm.
I test config 4 either by starting crostini and running commands via vsh, or by launching crosvm directly on cros using a different rwdisk:
/usr/bin/crosvm run --cpus 56 --mem 127721 --rwdisk /mnt/stateful_partition/debian-10-2-0.qcow2 -p "root=/dev/vda1" /run/imageloader/cros-termina/13729.82.0/vm_kernel
I have also run some tests on my 4-CPU eve device. I see that VM performance is worse than running directly on the host, but it is more like a 1.3x difference (rather than the 2.5x difference I get for 56 CPUs), and I'm not seeing performance at n cores worse than at n/2 cores.