Fixed
Comments
cm...@google.com <cm...@google.com> #2
SDK Revision 5
hu...@google.com <hu...@google.com> #3
Successful workaround: Kill the ADB process if it does not return within 120 seconds.
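A minimal sketch of this workaround in Java, assuming the build tool launches adb as a child process. The class and method names are hypothetical, and the 120-second limit is simply the value mentioned above:

```java
import java.io.IOException;
import java.util.List;
import java.util.OptionalInt;
import java.util.concurrent.TimeUnit;

final class CommandWithTimeout {

    /**
     * Runs the command and waits up to timeoutSeconds for it to exit.
     * Returns the exit code, or an empty OptionalInt if the process
     * was still running at the deadline and had to be killed.
     */
    static OptionalInt run(List<String> command, long timeoutSeconds)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .inheritIO() // pass the child's stdout/stderr through unchanged
                .start();
        if (process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            return OptionalInt.of(process.exitValue());
        }
        // Still running after the deadline: kill it, as in the workaround.
        process.destroyForcibly();
        process.waitFor(); // reap the killed child
        return OptionalInt.empty();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: CommandWithTimeout <command> [args...]");
            return;
        }
        OptionalInt exit = run(List.of(args), 120);
        System.out.println(exit.isPresent()
                ? "exit code " + exit.getAsInt()
                : "killed after timeout");
    }
}
```

For example, passing `adb -s localhost:52384 install -r MyApp-debug.apk` as the arguments would give the install 120 seconds before it is killed. Whether a killed install should count as a success is left to the caller.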
vm...@google.com <vm...@google.com> #4
I've seen the same problem on rare occasion -- I believe on both 32-bit and 64-bit machines (all running Ubuntu).
This scenario is from an Amazon EC2 instance running Ubuntu 10.04.
== Long-running processes involved:
* PID 4440 adb (daemon)
* PID 5396 adb -s localhost:52384 install -r MyApp-debug.apk
* PID 5216 emulator -ports 38007,52384 -no-boot-anim -prop persist.sys.language=de -prop persist.sys.country=DE -avd hudson_de-DE_240_WVGA_android-8 -no-window
== Commands run:
* adb daemon was (presumably) already running
* emulator is started as above
* `adb connect localhost:52384`
* `adb logcat -v time` is left running in the background
* `adb install` below is run once emulator is ready
adb -s localhost:52384 install -r MyApp-debug.apk
116 KB/s (57642 bytes in 0.482s)
pkg: /data/local/tmp/MyApp-debug.apk
Success
** hangs here indefinitely without returning (despite "Success") **
== Investigation
* `ps` shows that the hung `adb install` process has PID 5396
root@ip-10-228-211-159:~# strace -p 5396
Process 5396 attached - interrupt to quit
read(4, ^C <unfinished ...>
Process 5396 detached
* adb is blocked reading from FD 4 -- what is that?
root@ip-10-228-211-159:~# ls -l /proc/5396/fd/
total 0
lr-x------ 1 root root 64 2010-09-12 12:26 0 -> pipe:[135089]
l-wx------ 1 root root 64 2010-09-12 12:26 1 -> pipe:[135090]
l-wx------ 1 root root 64 2010-09-12 12:26 2 -> pipe:[135090]
lrwx------ 1 root root 64 2010-09-12 12:26 3 -> socket:[135142]
lrwx------ 1 root root 64 2010-09-12 12:26 4 -> socket:[136177]
root@ip-10-228-211-159:~# lsof | grep 136177
adb 5396 root 4u IPv4 136177 0t0 TCP localhost:51221->localhost:5037 (ESTABLISHED)
root@ip-10-228-211-159:~# netstat -antp | grep LISTEN
tcp 0 0 127.0.0.1:5037 0.0.0.0:* LISTEN 4440/adb
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 406/sshd
tcp 0 0 127.0.0.1:38007 0.0.0.0:* LISTEN 5216/emulator
tcp 0 0 127.0.0.1:52384 0.0.0.0:* LISTEN 5216/emulator
* So we're waiting to read from port 5037 -- the adb daemon (PID 4440)?
root@ip-10-228-211-159:~# strace -tt -T -p 4440
Process 4440 attached - interrupt to quit
12:28:55.909470 select(26, [4 5 14 18 25], [], [], NULL^C <unfinished ...>
Process 4440 detached
* It's blocking in select(), waiting for FD 26 to become ready. What's that?
root@ip-10-228-211-159:~# ls -l /proc/4440/fd/
total 0
lr-x------ 1 root root 64 2010-09-12 12:27 0 -> /dev/null
l-wx------ 1 root root 64 2010-09-12 12:27 1 -> /tmp/adb.log
lrwx------ 1 root root 64 2010-09-12 12:27 14 -> socket:[133791]
lrwx------ 1 root root 64 2010-09-12 12:27 15 -> socket:[133790]
lrwx------ 1 root root 64 2010-09-12 12:27 16 -> socket:[133792]
lrwx------ 1 root root 64 2010-09-12 12:27 18 -> socket:[133949]
l-wx------ 1 root root 64 2010-09-12 12:27 2 -> /tmp/adb.log
lrwx------ 1 root root 64 2010-09-12 12:27 25 -> socket:[136178]
lrwx------ 1 root root 64 2010-09-12 12:27 3 -> socket:[127504]
lrwx------ 1 root root 64 2010-09-12 12:27 4 -> socket:[127505]
lrwx------ 1 root root 64 2010-09-12 12:27 5 -> socket:[127506]
lr-x------ 1 root root 64 2010-09-12 12:27 6 -> /dev/null
l-wx------ 1 root root 64 2010-09-12 12:27 7 -> /tmp/adb.log
root@ip-10-228-211-159:~# ls -l /proc/4440/fd/26
ls: cannot access /proc/4440/fd/26: No such file or directory
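* Hmm.. it doesn't exist? (That is expected: the first argument to select() is nfds, the highest-numbered FD to check plus one, not an FD itself. The daemon is actually waiting on FDs 4, 5, 14, 18 and 25.)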
* Ok, let's see what the server does when we manually kill the `adb install` process (PID 5396), at time 12:32:50:
root@ip-10-228-211-159:~# strace -tt -T -p 4440
Process 4440 attached - interrupt to quit
12:32:43.366720 select(26, [4 5 14 18 25], [], [], NULL) = 1 (in [25]) <7.274112>
12:32:50.641019 read(25, "", 4096) = 0 <0.000015>
12:32:50.641596 write(14, "\20R\6\t", 4) = 4 <0.000061>
12:32:50.641711 close(25) = 0 <0.000058>
12:32:50.641805 select(26, [4 5 14 18], [], [], NULL) = 1 (in [14]) <0.002506>
12:32:50.644385 read(14, "\20r\6\t", 4) = 4 <0.000025>
12:32:50.644470 select(26, [4 5 14 18], [], [], NULL) = 1 (in [5]) <0.281615>
12:32:50.926206 accept(5, {sa_family=AF_INET, sin_port=htons(44907), sin_addr=inet_addr("127.0.0.1")}, [16]) = 8 <0.000043>
12:32:50.926370 setsockopt(8, SOL_SOCKET, SO_RCVBUF, [65536], 4) = 0 <0.000030>
12:32:50.926477 fcntl64(8, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 <0.000053>
12:32:50.926651 select(26, [4 5 8 14 18], [], [], NULL) = 1 (in [8]) <0.000015>
12:32:50.926727 read(8, "000chost:version", 4096) = 16 <0.000020>
12:32:50.926787 read(8, 0x9067244, 4080) = -1 EAGAIN (Resource temporarily unavailable) <0.000013>
12:32:50.926860 write(8, "OKAY0004001a", 12) = 12 <0.000037>
12:32:50.926943 close(8) = 0 <0.000053>
12:32:50.927031 select(19, [4 5 14 18], [], [], NULL) = 1 (in [5]) <0.000037>
12:32:50.927144 accept(5, {sa_family=AF_INET, sin_port=htons(44908), sin_addr=inet_addr("127.0.0.1")}, [16]) = 8 <0.000035>
12:32:50.927251 setsockopt(8, SOL_SOCKET, SO_RCVBUF, [65536], 4) = 0 <0.000029>
12:32:50.927334 fcntl64(8, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 <0.000029>
12:32:50.927433 select(19, [4 5 8 14 18], [], [], NULL) = 1 (in [8]) <0.000030>
12:32:50.927526 read(8, "001fhost:disconnect:localhost:52"..., 4096) = 35 <0.000033>
12:32:50.927623 read(8, 0x9067257, 4061) = -1 EAGAIN (Resource temporarily unavailable) <0.000028>
12:32:50.927729 shutdown(15, 2 /* send and receive */) = 0 <0.000053>
12:32:50.927825 close(15) = 0 <0.000028>
12:32:50.934829 close(16) = 0 <0.000062>
12:32:50.934959 write(3, "\260\237\5\t\0\0\0\0", 8) = 8 <0.000036>
12:32:50.935064 write(8, "OKAY0000", 8) = 8 <0.000048>
12:32:50.935154 close(8) = 0 <0.000032>
12:32:50.935233 select(19, [4 5 14 18], [], [], NULL) = 2 (in [4 14]) <0.000015>
12:32:50.935290 read(4, "\260\237\5\t\0\0\0\0", 8) = 8 <0.000141>
12:32:50.935470 close(14PANIC: attached pid 4440 exited with 255
<unfinished ... exit status 255>
* Not so good.
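The client requests visible in the trace ("000chost:version", "001fhost:disconnect:localhost:52"...) appear to be framed as a four-digit hex length followed by the payload, and the replies as "OKAY" plus the same kind of length-prefixed payload. A small Java sketch of that framing, inferred only from the trace above; the class and method names are hypothetical:

```java
import java.nio.charset.StandardCharsets;

final class AdbHostFraming {

    /** Prefixes the payload with its byte length as four lowercase hex digits. */
    static String encodeRequest(String payload) {
        int length = payload.getBytes(StandardCharsets.US_ASCII).length;
        if (length > 0xffff) {
            throw new IllegalArgumentException("payload too long for a 4-digit hex prefix");
        }
        return String.format("%04x", length) + payload;
    }

    /** Extracts the payload from a reply such as "OKAY0004001a". */
    static String decodeOkayReply(String reply) {
        if (!reply.startsWith("OKAY") || reply.length() < 8) {
            throw new IllegalArgumentException("not an OKAY reply: " + reply);
        }
        int length = Integer.parseInt(reply.substring(4, 8), 16);
        return reply.substring(8, 8 + length);
    }

    public static void main(String[] args) {
        System.out.println(encodeRequest("host:version"));
        System.out.println(decodeOkayReply("OKAY0004001a"));
    }
}
```

This matches the byte counts strace reports: "host:version" is 12 (0x0c) bytes for a 16-byte read, and the disconnect request is 31 (0x1f) bytes for a 35-byte read.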
Description
-----------
It is common to see code written in this form: SomeClass.class.getResourceAsStream(someResource) (see
What's wrong with Class.getResourceAsStream()?
-------------------------------------------------------------------
Calling Class.getResourceAsStream() is roughly equivalent to calling Class.getClassLoader().getResourceAsStream().
The implementation of ClassLoader.getResourceAsStream() is probably safe, but when the class loader is a URLClassLoader, its overridden version, URLClassLoader.getResourceAsStream(), has a bug:
This bug can manifest when two subprojects are building in parallel in separate class loaders, and both call URLClassLoader.getResourceAsStream(). One subproject finishes first, and its class loader is closed. However, closing that class loader also closes the underlying jar file, which the other subproject may still be using while reading the stream returned by URLClassLoader.getResourceAsStream(). (This is what happened in
Note: Calling Class.getResourceAsStream() is okay if Class.getClassLoader() is not a URLClassLoader, and therefore does not suffer from this bug. However, because we usually don't check the ClassLoader type, it is better to simply avoid calling Class.getResourceAsStream().
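The scenario described above can be probed with a small Java sketch. It is an illustration under assumptions, not the original reproduction: the class, method, and file names are hypothetical, the two loaders are used sequentially rather than from parallel builds, and the outcome may differ between JDK versions, so the code only reports what it observes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

final class SharedJarProbe {

    /** Writes a one-entry jar to a temporary file (hypothetical fixture). */
    static Path writeTempJar() throws IOException {
        Path jar = Files.createTempFile("shared-jar-probe", ".jar");
        try (JarOutputStream out = new JarOutputStream(Files.newOutputStream(jar))) {
            out.putNextEntry(new JarEntry("hello.txt"));
            out.write("hello".getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        }
        return jar;
    }

    /**
     * Opens the same jar entry through two URLClassLoaders, closes the first
     * loader, then reports whether reading the second loader's stream fails.
     */
    static boolean readFailsAfterOtherLoaderCloses(Path jar) throws IOException {
        URL[] urls = { jar.toUri().toURL() };
        URLClassLoader first = new URLClassLoader(urls, null);
        try (URLClassLoader second = new URLClassLoader(urls, null)) {
            InputStream fromFirst = first.getResourceAsStream("hello.txt");
            InputStream fromSecond = second.getResourceAsStream("hello.txt");
            fromFirst.read();  // the first "subproject" uses its stream
            first.close();     // ... and finishes, closing its class loader
            try {
                fromSecond.read(); // the second "subproject" is still reading
                return false;
            } catch (IOException | IllegalStateException e) {
                return true;       // the underlying jar was closed under it
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path jar = writeTempJar();
        System.out.println("read failed after other loader closed: "
                + readFailsAfterOtherLoaderCloses(jar));
        Files.deleteIfExists(jar);
    }
}
```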
What's the alternative?
------------------------------
One alternative is to use the default implementation of ClassLoader.getResourceAsStream(), which amounts to Class.getResource(someResource).openStream().
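As a minimal sketch, the alternative could be wrapped in a helper like the one below. The class and method names are hypothetical, and the null check is an addition of the sketch: Class.getResource() returns null for a missing resource, where getResourceAsStream() would have returned a null stream.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;

final class ResourceStreams {

    /**
     * Opens a resource via Class.getResource(name).openStream() instead of
     * Class.getResourceAsStream(name). The caller must close the stream.
     */
    static InputStream openResource(Class<?> anchor, String name) throws IOException {
        URL url = anchor.getResource(name);
        if (url == null) {
            throw new FileNotFoundException("Resource not found: " + name);
        }
        return url.openStream();
    }

    public static void main(String[] args) throws IOException {
        // Any class and resource would do; Object.class is used only
        // because it is always available.
        try (InputStream in = openResource(Object.class, "Object.class")) {
            System.out.println("first byte: 0x" + Integer.toHexString(in.read()));
        }
    }
}
```

A call site such as SomeClass.class.getResourceAsStream(someResource) would then become ResourceStreams.openResource(SomeClass.class, someResource), with the stream closed in a try-with-resources block.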
I've verified locally that using this alternative, closing the class loader doesn't close the underlying jar file, and it should be thread safe. (It fixes
I'm opening this bug to revise the code in AGP, data binding, and Jetifier with this new understanding.