Comments
ki...@google.com <ki...@google.com> #2
Hey, sorry for coming back to this so late. We've never seen this. Without any repro steps it's going to be hard to do anything about it.
We've seen some cases where the Gradle daemon gets started but Studio cannot connect to it, and the Studio-side Gradle code that handles the daemon connection keeps spawning new daemons as long as the connection is not established. In this instance, though, the fact that the build window shows all these daemons running means that this is a different problem.
be...@google.com <be...@google.com> #3
Hello, the issue reproduced for me on the first project open after upgrading Android Studio from JetBrains Toolbox. Note that the project has a buildSrc.
I think you can reproduce it with the same steps if nothing changed since Arctic Fox alpha 08.
br...@google.com <br...@google.com> #4
When you upgraded from the toolbox, can you indicate which version you started from and what you upgraded to?
br...@google.com <br...@google.com> #5
Yes, actually it's in the title of the issue: Canary 7 to 8 of Arctic Fox.
ki...@google.com <ki...@google.com> #6
oops, thanks :)
br...@google.com <br...@google.com> #7
I just had that again when upgrading from AS BB Canary 2 to AS BB Canary 6. I am attaching screenshots and a screencast so you can see how it manifests.
Note that, for some reason, Gradle sync is complaining about the JDK location, although it was working perfectly on Canary 2 (and I only touched the AGP version). I don't know why and I don't plan to investigate that now; I'll just try to fix it and move on.
Description
network_WiFi_CSADisconnect is a good illustration for this bug.
The driver complains about stuck tx queues after issuing a CSA followed by disconnect. We still pass the test.
21:31:49 INFO | autoserv| 2018-04-28T21:31:26.455496-07:00 ERR kernel: [ 1756.996598] iwlwifi 0000:01:00.0: fail to flush all tx fifo queues Q 5
21:31:49 INFO | autoserv| 2018-04-28T21:31:26.455523-07:00 ERR kernel: [ 1756.996812] iwlwifi 0000:01:00.0: Queue 5 is active on fifo 3 and stuck for 10000 ms. SW [46, 47] HW [46, 47] FH TRB=0x08030502e
21:31:49 INFO | autoserv| 2018-04-28T21:31:38.705365-07:00 ERR kernel: [ 1769.247049] iwlwifi 0000:01:00.0: fail to flush all tx fifo queues Q 5
21:31:49 INFO | autoserv| 2018-04-28T21:31:38.705456-07:00 ERR kernel: [ 1769.247266] iwlwifi 0000:01:00.0: Queue 5 is active on fifo 3 and stuck for 10000 ms. SW [51, 52] HW [51, 52] FH TRB=0x080305033
The goal is to look for any and all kernel crashes in the wireless stack and wifi firmware crashes that happened during the test.
Fail the test if any are found.
While network_WiFi_CSADisconnect is an easy example to point to, another goal here is to catch kernel warnings that are reported in large numbers on the crash server but are hard to reproduce at will, e.g. the iwl_trans_pcie_grab_nic_access / iwl_trans_pcie_reclaim warnings when wifi drops off the PCI bus.
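As a rough illustration of the check described above, here is a minimal Python sketch that scans log lines for unwanted driver messages and fails if any are found. The pattern list is an assumption built from the two iwlwifi messages quoted in this report; the function names (find_unwanted_lines, check_log) are hypothetical and not part of autotest.

```python
import re

# Hypothetical pattern list, derived only from the iwlwifi messages quoted in
# this report. A real list would also cover kernel and firmware crash markers.
UNWANTED_PATTERNS = [
    re.compile(r"iwlwifi .*fail to flush all tx fifo queues"),
    re.compile(r"iwlwifi .*Queue \d+ is active on fifo \d+ and stuck for \d+ ms"),
]


def find_unwanted_lines(log_lines, patterns=UNWANTED_PATTERNS):
    """Return every log line that matches at least one unwanted pattern."""
    return [
        line for line in log_lines
        if any(pattern.search(line) for pattern in patterns)
    ]


def check_log(log_lines):
    """Raise AssertionError if the log contains any unwanted message."""
    hits = find_unwanted_lines(log_lines)
    if hits:
        raise AssertionError("unwanted kernel messages:\n" + "\n".join(hits))
```

A test would call check_log() on the log lines collected during the run, so that a run like the one quoted above fails instead of passing.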
The closest example I could find was the use of client/cros/cros_logging.py, which reads dmesg on the client into a buffer and greps for a desired pattern.
[snip]
def verify_lvds_downclock(self):
    """On systems which support LVDS downclock, checks the kernel log for
    a message that an LVDS downclock mode has been added."""
    board = utils.get_board()
    if not (board == 'alex' or board == 'lumpy' or board == 'stout'):
        return ''
    # Get the downclock message from the logs.
    reader = cros_logging.LogReader()
    reader.set_start_by_reboot(-1)
    if not reader.can_find('Adding LVDS downclock mode'):
        return self.handle_error('LVDS downclock quirk not applied. ')
    return ''
[snip]
However, cros_logging.py is client-side only; it needs to be modified so that it can be used from a server-side test.
Another approach is in CL:577064 ("Add a wifi stress test"), which calls unix utilities (cat) from within the test. I've chosen to improve cros_logging.py rather than do what CL:577064 does, because cros_logging.py can then be used by all tests.
Proposed solution:
Step 1: Extend cros_logging.py to work with remote hosts (i.e. clients) as well, instead of just locally. Follow the example of iw_runner.py, which does things in two possible ways, depending on whether self._host is None (local run) or not (calling from the server, want to run on the client).
Step 2: Move cros_logging.py to client/common_lib (since anyone can run it now).
Step 3: cros_logging.py allows setting a "start_line" in dmesg, so you only scan dmesg from the point where the test starts. Add a hook to wifi_client to set the start_line.
Step 4: Add the run_before_once and run_after_once hooks to wifi_cell_test_base. run_before_once lets wifi_client set the start_line and run_after_once is what checks for unwanted spew, and fails the test if any is found.
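The steps above could be sketched roughly as follows. This is a self-contained Python illustration, not the actual cros_logging.py or wifi_cell_test_base code. The class names, the "dmesg" command, and the assumption that host.run(command) returns an object with a .stdout string are all hypothetical stand-ins.

```python
import re
import subprocess


class LogReader:
    """Hypothetical reader: local when host is None, remote otherwise."""

    def __init__(self, host=None, command="dmesg"):
        self._host = host
        self._command = command  # assumed way of dumping the kernel log
        self._start_line = 0

    def _read_lines(self):
        if self._host is None:
            # Local run: execute the command on this machine.
            output = subprocess.run(
                self._command, shell=True, capture_output=True, text=True
            ).stdout
        else:
            # Server-side run: assumes host.run() returns an object
            # exposing the command output as .stdout.
            output = self._host.run(self._command).stdout
        return output.splitlines()

    def set_start_by_current(self):
        """Record the current end of the log; earlier lines are skipped."""
        self._start_line = len(self._read_lines())

    def find_all(self, pattern):
        """Return lines after the start line that match the pattern."""
        regex = re.compile(pattern)
        lines = self._read_lines()[self._start_line:]
        return [line for line in lines if regex.search(line)]


class WiFiTestSketch:
    """Hypothetical stand-in for the hooks proposed in Step 4."""

    def __init__(self, host, unwanted_pattern):
        self._reader = LogReader(host=host)
        self._unwanted_pattern = unwanted_pattern

    def run_before_once(self):
        # Step 3: mark where the test starts in the log.
        self._reader.set_start_by_current()

    def run_after_once(self):
        # Step 4: fail if unwanted messages appeared during the test.
        hits = self._reader.find_all(self._unwanted_pattern)
        if hits:
            raise AssertionError("unwanted log lines:\n" + "\n".join(hits))
```

Counting lines is only a simple way to model the start_line here. Because dmesg is a ring buffer, a real implementation would probably need a sturdier marker, such as a timestamp.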