Fixed
Status Update
Comments
ap...@google.com <ap...@google.com> #2
SDK Revision 5
ch...@google.com <ch...@google.com> #3
Successful workaround: Kill the ADB process if it does not return within 120 seconds.
na...@google.com <na...@google.com> #4
I've seen the same problem on rare occasion -- I believe on both 32-bit and 64-bit machines (all running Ubuntu).
This scenario is from an Amazon EC2 instance running Ubuntu 10.04.
== Long-running processes involved:
* PID 4440 adb (daemon)
* PID 5396 adb -s localhost:52384 install -r MyApp-debug.apk
* PID 5216 emulator -ports 38007,52384 -no-boot-anim -prop persist.sys.language=de -prop persist.sys.country=DE -avd hudson_de-DE_240_WVGA_android-8 -no-window
== Commands run:
* adb daemon was (presumably) already running
* emulator is started as above
* `adb connect localhost:52384`
* `adb logcat -v time` is left running in the background
* `adb install` below is run once emulator is ready
adb -s localhost:52384 install -r MyApp-debug.apk
116 KB/s (57642 bytes in 0.482s)
pkg: /data/local/tmp/MyApp-debug.apk
Success
** hangs here indefinitely without returning (despite "Success") **
== Investigation
* `ps` shows that the hung `adb install` process has PID 5396
root@ip-10-228-211-159:~# strace -p 5396
Process 5396 attached - interrupt to quit
read(4, ^C <unfinished ...>
Process 5396 detached
* adb is blocked reading from FD 4 -- what is that?
root@ip-10-228-211-159:~# ls -l /proc/5396/fd/
total 0
lr-x------ 1 root root 64 2010-09-12 12:26 0 -> pipe:[135089]
l-wx------ 1 root root 64 2010-09-12 12:26 1 -> pipe:[135090]
l-wx------ 1 root root 64 2010-09-12 12:26 2 -> pipe:[135090]
lrwx------ 1 root root 64 2010-09-12 12:26 3 -> socket:[135142]
lrwx------ 1 root root 64 2010-09-12 12:26 4 -> socket:[136177]
root@ip-10-228-211-159:~# lsof | grep 136177
adb 5396 root 4u IPv4 136177 0t0 TCP localhost:51221->localhost:5037 (ESTABLISHED)
root@ip-10-228-211-159:~# netstat -antp | grep LISTEN
tcp 0 0127.0.0.1:5037 0.0.0.0:* LISTEN 4440/adb
tcp 0 00.0.0.0:22 0.0.0.0:* LISTEN 406/sshd
tcp 0 0127.0.0.1:38007 0.0.0.0:* LISTEN 5216/emulator
tcp 0 0127.0.0.1:52384 0.0.0.0:* LISTEN 5216/emulator
* So we're waiting to read from port 5037 -- the adb daemon (PID 4440)?
root@ip-10-228-211-159:~# strace -tt -T -p 4440
Process 4440 attached - interrupt to quit
12:28:55.909470 select(26, [4 5 14 18 25], [], [], NULL^C <unfinished ...>
Process 4440 detached
* It's blocking in select(), waiting for FD 26 to become ready. What's that?
root@ip-10-228-211-159:~# ls -l /proc/4440/fd/
total 0
lr-x------ 1 root root 64 2010-09-12 12:27 0 -> /dev/null
l-wx------ 1 root root 64 2010-09-12 12:27 1 -> /tmp/adb.log
lrwx------ 1 root root 64 2010-09-12 12:27 14 -> socket:[133791]
lrwx------ 1 root root 64 2010-09-12 12:27 15 -> socket:[133790]
lrwx------ 1 root root 64 2010-09-12 12:27 16 -> socket:[133792]
lrwx------ 1 root root 64 2010-09-12 12:27 18 -> socket:[133949]
l-wx------ 1 root root 64 2010-09-12 12:27 2 -> /tmp/adb.log
lrwx------ 1 root root 64 2010-09-12 12:27 25 -> socket:[136178]
lrwx------ 1 root root 64 2010-09-12 12:27 3 -> socket:[127504]
lrwx------ 1 root root 64 2010-09-12 12:27 4 -> socket:[127505]
lrwx------ 1 root root 64 2010-09-12 12:27 5 -> socket:[127506]
lr-x------ 1 root root 64 2010-09-12 12:27 6 -> /dev/null
l-wx------ 1 root root 64 2010-09-12 12:27 7 -> /tmp/adb.log
root@ip-10-228-211-159:~# ls -l /proc/4440/fd/26
ls: cannot access /proc/4440/fd/26: No such file or directory
* Hmm.. it doesn't exist?
* Ok, let's see what the server does when we manually kill the `adb install` process (PID 5396), at time 12:32:50:
root@ip-10-228-211-159:~# strace -tt -T -p 4440
Process 4440 attached - interrupt to quit
12:32:43.366720 select(26, [4 5 14 18 25], [], [], NULL) = 1 (in [25]) <7.274112>
12:32:50.641019 read(25, "", 4096) = 0 <0.000015>
12:32:50.641596 write(14, "\20R\6\t", 4) = 4 <0.000061>
12:32:50.641711 close(25) = 0 <0.000058>
12:32:50.641805 select(26, [4 5 14 18], [], [], NULL) = 1 (in [14]) <0.002506>
12:32:50.644385 read(14, "\20r\6\t", 4) = 4 <0.000025>
12:32:50.644470 select(26, [4 5 14 18], [], [], NULL) = 1 (in [5]) <0.281615>
12:32:50.926206 accept(5, {sa_family=AF_INET, sin_port=htons(44907), sin_addr=inet_addr("127.0.0.1")}, [16]) = 8 <0.000043>
12:32:50.926370 setsockopt(8, SOL_SOCKET, SO_RCVBUF, [65536], 4) = 0 <0.000030>
12:32:50.926477 fcntl64(8, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 <0.000053>
12:32:50.926651 select(26, [4 5 8 14 18], [], [], NULL) = 1 (in [8]) <0.000015>
12:32:50.926727 read(8, "000chost:version", 4096) = 16 <0.000020>
12:32:50.926787 read(8, 0x9067244, 4080) = -1 EAGAIN (Resource temporarily unavailable) <0.000013>
12:32:50.926860 write(8, "OKAY0004001a", 12) = 12 <0.000037>
12:32:50.926943 close(8) = 0 <0.000053>
12:32:50.927031 select(19, [4 5 14 18], [], [], NULL) = 1 (in [5]) <0.000037>
12:32:50.927144 accept(5, {sa_family=AF_INET, sin_port=htons(44908), sin_addr=inet_addr("127.0.0.1")}, [16]) = 8 <0.000035>
12:32:50.927251 setsockopt(8, SOL_SOCKET, SO_RCVBUF, [65536], 4) = 0 <0.000029>
12:32:50.927334 fcntl64(8, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 <0.000029>
12:32:50.927433 select(19, [4 5 8 14 18], [], [], NULL) = 1 (in [8]) <0.000030>
12:32:50.927526 read(8, "001fhost:disconnect:localhost:52"..., 4096) = 35 <0.000033>
12:32:50.927623 read(8, 0x9067257, 4061) = -1 EAGAIN (Resource temporarily unavailable) <0.000028>
12:32:50.927729 shutdown(15, 2 /* send and receive */) = 0 <0.000053>
12:32:50.927825 close(15) = 0 <0.000028>
12:32:50.934829 close(16) = 0 <0.000062>
12:32:50.934959 write(3, "\260\237\5\t\0\0\0\0", 8) = 8 <0.000036>
12:32:50.935064 write(8, "OKAY0000", 8) = 8 <0.000048>
12:32:50.935154 close(8) = 0 <0.000032>
12:32:50.935233 select(19, [4 5 14 18], [], [], NULL) = 2 (in [4 14]) <0.000015>
12:32:50.935290 read(4, "\260\237\5\t\0\0\0\0", 8) = 8 <0.000141>
12:32:50.935470 close(14PANIC: attached pid 4440 exited with 255
<unfinished ... exit status 255>
* Not so good.
This scenario is from an Amazon EC2 instance running Ubuntu 10.04.
== Long-running processes involved:
* PID 4440 adb (daemon)
* PID 5396 adb -s localhost:52384 install -r MyApp-debug.apk
* PID 5216 emulator -ports 38007,52384 -no-boot-anim -prop persist.sys.language=de -prop persist.sys.country=DE -avd hudson_de-DE_240_WVGA_android-8 -no-window
== Commands run:
* adb daemon was (presumably) already running
* emulator is started as above
* `adb connect localhost:52384`
* `adb logcat -v time` is left running in the background
* `adb install` below is run once emulator is ready
adb -s localhost:52384 install -r MyApp-debug.apk
116 KB/s (57642 bytes in 0.482s)
pkg: /data/local/tmp/MyApp-debug.apk
Success
** hangs here indefinitely without returning (despite "Success") **
== Investigation
* `ps` shows that the hung `adb install` process has PID 5396
root@ip-10-228-211-159:~# strace -p 5396
Process 5396 attached - interrupt to quit
read(4, ^C <unfinished ...>
Process 5396 detached
* adb is blocked reading from FD 4 -- what is that?
root@ip-10-228-211-159:~# ls -l /proc/5396/fd/
total 0
lr-x------ 1 root root 64 2010-09-12 12:26 0 -> pipe:[135089]
l-wx------ 1 root root 64 2010-09-12 12:26 1 -> pipe:[135090]
l-wx------ 1 root root 64 2010-09-12 12:26 2 -> pipe:[135090]
lrwx------ 1 root root 64 2010-09-12 12:26 3 -> socket:[135142]
lrwx------ 1 root root 64 2010-09-12 12:26 4 -> socket:[136177]
root@ip-10-228-211-159:~# lsof | grep 136177
adb 5396 root 4u IPv4 136177 0t0 TCP localhost:51221->localhost:5037 (ESTABLISHED)
root@ip-10-228-211-159:~# netstat -antp | grep LISTEN
tcp 0 0
tcp 0 0
tcp 0 0
tcp 0 0
* So we're waiting to read from port 5037 -- the adb daemon (PID 4440)?
root@ip-10-228-211-159:~# strace -tt -T -p 4440
Process 4440 attached - interrupt to quit
12:28:55.909470 select(26, [4 5 14 18 25], [], [], NULL^C <unfinished ...>
Process 4440 detached
* It's blocking in select(), waiting for FD 26 to become ready. What's that?
root@ip-10-228-211-159:~# ls -l /proc/4440/fd/
total 0
lr-x------ 1 root root 64 2010-09-12 12:27 0 -> /dev/null
l-wx------ 1 root root 64 2010-09-12 12:27 1 -> /tmp/adb.log
lrwx------ 1 root root 64 2010-09-12 12:27 14 -> socket:[133791]
lrwx------ 1 root root 64 2010-09-12 12:27 15 -> socket:[133790]
lrwx------ 1 root root 64 2010-09-12 12:27 16 -> socket:[133792]
lrwx------ 1 root root 64 2010-09-12 12:27 18 -> socket:[133949]
l-wx------ 1 root root 64 2010-09-12 12:27 2 -> /tmp/adb.log
lrwx------ 1 root root 64 2010-09-12 12:27 25 -> socket:[136178]
lrwx------ 1 root root 64 2010-09-12 12:27 3 -> socket:[127504]
lrwx------ 1 root root 64 2010-09-12 12:27 4 -> socket:[127505]
lrwx------ 1 root root 64 2010-09-12 12:27 5 -> socket:[127506]
lr-x------ 1 root root 64 2010-09-12 12:27 6 -> /dev/null
l-wx------ 1 root root 64 2010-09-12 12:27 7 -> /tmp/adb.log
root@ip-10-228-211-159:~# ls -l /proc/4440/fd/26
ls: cannot access /proc/4440/fd/26: No such file or directory
* Hmm.. it doesn't exist?
* Ok, let's see what the server does when we manually kill the `adb install` process (PID 5396), at time 12:32:50:
root@ip-10-228-211-159:~# strace -tt -T -p 4440
Process 4440 attached - interrupt to quit
12:32:43.366720 select(26, [4 5 14 18 25], [], [], NULL) = 1 (in [25]) <7.274112>
12:32:50.641019 read(25, "", 4096) = 0 <0.000015>
12:32:50.641596 write(14, "\20R\6\t", 4) = 4 <0.000061>
12:32:50.641711 close(25) = 0 <0.000058>
12:32:50.641805 select(26, [4 5 14 18], [], [], NULL) = 1 (in [14]) <0.002506>
12:32:50.644385 read(14, "\20r\6\t", 4) = 4 <0.000025>
12:32:50.644470 select(26, [4 5 14 18], [], [], NULL) = 1 (in [5]) <0.281615>
12:32:50.926206 accept(5, {sa_family=AF_INET, sin_port=htons(44907), sin_addr=inet_addr("127.0.0.1")}, [16]) = 8 <0.000043>
12:32:50.926370 setsockopt(8, SOL_SOCKET, SO_RCVBUF, [65536], 4) = 0 <0.000030>
12:32:50.926477 fcntl64(8, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 <0.000053>
12:32:50.926651 select(26, [4 5 8 14 18], [], [], NULL) = 1 (in [8]) <0.000015>
12:32:50.926727 read(8, "000chost:version", 4096) = 16 <0.000020>
12:32:50.926787 read(8, 0x9067244, 4080) = -1 EAGAIN (Resource temporarily unavailable) <0.000013>
12:32:50.926860 write(8, "OKAY0004001a", 12) = 12 <0.000037>
12:32:50.926943 close(8) = 0 <0.000053>
12:32:50.927031 select(19, [4 5 14 18], [], [], NULL) = 1 (in [5]) <0.000037>
12:32:50.927144 accept(5, {sa_family=AF_INET, sin_port=htons(44908), sin_addr=inet_addr("127.0.0.1")}, [16]) = 8 <0.000035>
12:32:50.927251 setsockopt(8, SOL_SOCKET, SO_RCVBUF, [65536], 4) = 0 <0.000029>
12:32:50.927334 fcntl64(8, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 <0.000029>
12:32:50.927433 select(19, [4 5 8 14 18], [], [], NULL) = 1 (in [8]) <0.000030>
12:32:50.927526 read(8, "001fhost:disconnect:localhost:52"..., 4096) = 35 <0.000033>
12:32:50.927623 read(8, 0x9067257, 4061) = -1 EAGAIN (Resource temporarily unavailable) <0.000028>
12:32:50.927729 shutdown(15, 2 /* send and receive */) = 0 <0.000053>
12:32:50.927825 close(15) = 0 <0.000028>
12:32:50.934829 close(16) = 0 <0.000062>
12:32:50.934959 write(3, "\260\237\5\t\0\0\0\0", 8) = 8 <0.000036>
12:32:50.935064 write(8, "OKAY0000", 8) = 8 <0.000048>
12:32:50.935154 close(8) = 0 <0.000032>
12:32:50.935233 select(19, [4 5 14 18], [], [], NULL) = 2 (in [4 14]) <0.000015>
12:32:50.935290 read(4, "\260\237\5\t\0\0\0\0", 8) = 8 <0.000141>
12:32:50.935470 close(14PANIC: attached pid 4440 exited with 255
<unfinished ... exit status 255>
* Not so good.
Description
We should look into the various places in the APIs where these surface to ensure that this is the API we want before going to beta.
For example, Cubic is constructed with PointF objects, which it exposes as vals. This is worth considering in several ways:
- Having the points available for inspection is good, but we don't want the data to be changed in the underlying cubics (which is possible since PointF is mutable)
- It might not make sense to even have a public constructor; Cubics are necessary for callers to be able to inspect a Polygon (and maybe render debug information for it). But the object is mostly an implementation detail of this API, so it seems unnecessary to expose the constructor(s).
- We might want to use Float internally, regardless of what is used to construct a Cubic. This might help with both performance/allocations internally and avoiding mutability from the PointFs that callers can see
- If we use Float internally, and opt to avoid public constructors, then we might as well use Floats to construct Cubics as well, to avoid allocating PointF objects just to construct a Cubic
- If PointF is only used as a way to return data from a Cubic to a caller, then we could lazily allocate those objects