Fixed
Status Update
Comments
la...@google.com <la...@google.com> #2
Ok, so r17 now ships gold as the default linker. That explains why things broke all of a sudden (I've confirmed that r17 ld.bfd still works ok). It also means there is just one place that needs fixing..
ll...@google.com <ll...@google.com> #3
so, it is ok for you to use bfd until this issue is fixed?
la...@google.com <la...@google.com> #4
well... I don't really mind sticking to bfd, but I think the question is what's the communication towards ndk users going to be? Are we going to stick a warning label on the r17 that it might degrade debug experience?
I think we should re-evaluate making gold the default if this cannot be fixed until r17 goes stable...
I think we should re-evaluate making gold the default if this cannot be fixed until r17 goes stable...
da...@google.com <da...@google.com> #5
Agreed, we shouldn't ship with this broken. https://android-review.googlesource.com/#/c/platform/ndk/+/584809 switches us back to bfd.
Luis: We do want to get all of our ABIs over to the same linker soon to cut down on variations between architectures as much as possible, and aarch64 is the only one still on bfd. I don't think we need to do this with r17 (Q1), but it would be nice to have a fix for r18 (Q2).
Luis: We do want to get all of our ABIs over to the same linker soon to cut down on variations between architectures as much as possible, and aarch64 is the only one still on bfd. I don't think we need to do this with r17 (Q1), but it would be nice to have a fix for r18 (Q2).
ll...@google.com <ll...@google.com> #6
ok. thanks for clarifying the timeline for this.
I will assing to Rahul to triage.
I will assing to Rahul to triage.
da...@google.com <da...@google.com>
da...@google.com <da...@google.com> #7
Any update on this?
ra...@google.com <ra...@google.com> #8
You need to pass '--apply-dynamic-relocs' flag to gold. i.e.
$ .../aarch64-linux-android/bin/ld -m aarch64linux --apply-dynamic-relocs -shared -o liba.so a.o
Looks like this flag was added to gold specifically for Android:https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=0eccf19f96d6218dd7c2f8d13f0546c2b942cc08
Upstream has '--apply-dynamic-relocs' as default, so it does the right thing.
In Android, we set '--no-apply-dynamic-relocs' as the default:https://android-review.googlesource.com/368019
This explains why the debug info is getting dropped.
I don't know the history behind this default in Android.
What is 'broken Android dynamic linkers' referring to in the upstream commit?
Is that still relevant? Can we switch the default to '--apply-dynamic-relocs' now?
$ .../aarch64-linux-android/bin/ld -m aarch64linux --apply-dynamic-relocs -shared -o liba.so a.o
Looks like this flag was added to gold specifically for Android:
Upstream has '--apply-dynamic-relocs' as default, so it does the right thing.
In Android, we set '--no-apply-dynamic-relocs' as the default:
This explains why the debug info is getting dropped.
I don't know the history behind this default in Android.
What is 'broken Android dynamic linkers' referring to in the upstream commit?
Is that still relevant? Can we switch the default to '--apply-dynamic-relocs' now?
en...@google.com <en...@google.com> #9
> Is that still relevant? Can we switch the default to '--apply-dynamic-relocs' now?
for the platform, definitely. the NDK might be more tricky though.
i suspect this was the fix? it's in the right part of the code at around the right time:
Fix R_AARCH64_ABS/PREL relocations
According to specification arm64 relocations
should not use *reloc value.
Seehttp://infocenter.arm.com/help/topic/com.arm.doc.ihi0056b/IHI0056B_aaelf64.pdf
section 4.6.5
Bug:http://b/24977219
Bug:http://b/24527155
Change-Id: I3813255771f408ba957963c6ad56ed08e5110d83
https://android-build.googleplex.com/builds/where-is-my-cl/search/77f91c6d99c25fce4fbf9704aa6f7232fb624ff4 suggests that was broken in L but fixed in M. (i don't know how many of the 25% of users who're on L have 64-bit devices. probably very few, but...)
for the platform, definitely. the NDK might be more tricky though.
i suspect this was the fix? it's in the right part of the code at around the right time:
Fix R_AARCH64_ABS/PREL relocations
According to specification arm64 relocations
should not use *reloc value.
See
section 4.6.5
Bug:
Bug:
Change-Id: I3813255771f408ba957963c6ad56ed08e5110d83
ra...@google.com <ra...@google.com> #10
Yes, that seems like the relevant fix in the Android linker. It looks to me that:
1. Platform can use '--apply-dynamic-relocs' as the default, since the bionic linker is fixed now.
2. NDK can also use '--apply-dynamic-relocs' as the default, since it currently has ld.bfd as the default linker, which doesn't support '--no-apply-dynamic-relocs'.
In other words, making '--apply-dynamic-relocs' default and switching to ld.gold as default linker would result in no change in behavior.
Only problematic case is if someone has been using 'aarch64-linux-android-ld.gold' explicitly from the NDK, and relying on a default of '--no-apply-dynamic-relocs'.
1. Platform can use '--apply-dynamic-relocs' as the default, since the bionic linker is fixed now.
2. NDK can also use '--apply-dynamic-relocs' as the default, since it currently has ld.bfd as the default linker, which doesn't support '--no-apply-dynamic-relocs'.
In other words, making '--apply-dynamic-relocs' default and switching to ld.gold as default linker would result in no change in behavior.
Only problematic case is if someone has been using 'aarch64-linux-android-ld.gold' explicitly from the NDK, and relying on a default of '--no-apply-dynamic-relocs'.
la...@google.com <la...@google.com> #11
Great stuff, thanks for getting to the bottom of this.
pc...@google.com <pc...@google.com> #12
Why does --apply-dynamic-relocs affect the behaviour relating to debug info sections, which should after all not have dynamic relocations? It seems that gold should have the --apply-dynamic-relocs behaviour for all non-SHF_ALLOC sections, regardless of the flag value.
ra...@google.com <ra...@google.com> #13
You're right. Passing '--no-apply-dynamic-relocs' should not change anything for debug info sections.
Although at this point, it is probably preferable to remove this flag instead of fixing all corner cases.
Although at this point, it is probably preferable to remove this flag instead of fixing all corner cases.
pc...@google.com <pc...@google.com> #14
Are you sure you want the default for this flag to be on? It means that the addends will be copied into the data, making the .so file less compressible. (I guess this would be less of an issue once most apps have switched over to RELR, but that won't happen for a long time.)
ra...@google.com <ra...@google.com> #15
I think what you're describing is how ld.lld / ld.bfd implement this flag, and it makes sense.
ld.gold however, seems to have always been copying the addends to the data (at least for R_AARCH64_RELATIVE relocations).
I have been doing testing for RELR on aarch64 and I've never once passed '--apply-dynamic-relocs' to get it to work.
.relr.dyn section processing in the bionic/glibc loaders has been working fine without needing it!
Also, seehttps://b.corp.google.com/issues/67108677#comment29 where I was investigating a kernel image size increase after upgrade to binutils-2.27.
I discovered the same thing there: with binutils-2.25, both bfd and gold linkers stored the addends at the relocation offset for x86_64, but only the gold linker did this for aarch64.
This upstream commit changes the default for ld.bfd, and makes the resulting image less compressible as you suggested:https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=1f56df9d0d5ad89806c24e71f296576d82344613
On the other hand, as evidenced by this bug, ld.gold skips applying dynamic relocations into debug info sections with '--no-apply-dynamic-relocs'.
This does not make sense to me.
Maybe I'm missing something, but it seems like gold implementation of '--no-apply-dynamic-relocs' has never worked as intended (or in the same way as ld.bfd / ld.lld interpret this flag).
ld.gold however, seems to have always been copying the addends to the data (at least for R_AARCH64_RELATIVE relocations).
I have been doing testing for RELR on aarch64 and I've never once passed '--apply-dynamic-relocs' to get it to work.
.relr.dyn section processing in the bionic/glibc loaders has been working fine without needing it!
Also, see
I discovered the same thing there: with binutils-2.25, both bfd and gold linkers stored the addends at the relocation offset for x86_64, but only the gold linker did this for aarch64.
This upstream commit changes the default for ld.bfd, and makes the resulting image less compressible as you suggested:
On the other hand, as evidenced by this bug, ld.gold skips applying dynamic relocations into debug info sections with '--no-apply-dynamic-relocs'.
This does not make sense to me.
Maybe I'm missing something, but it seems like gold implementation of '--no-apply-dynamic-relocs' has never worked as intended (or in the same way as ld.bfd / ld.lld interpret this flag).
da...@google.com <da...@google.com> #16
So what's the fix here? Should we revert the patch that makes --no-apply-dynamic-relocs the default? We're really like to get aarch64 onto gold so we're consistent across ABIs.
ra...@google.com <ra...@google.com> #17
Sounds good to me. We've just done that on Chrome OS side: https://chromium-review.googlesource.com/999176
Elliott has concerns ( #comment9 ) about changing the default for the NDK.
Elliott has concerns (
ll...@google.com <ll...@google.com> #18
enh? Dan ? what should be done here? (assigning to enh for now)
ll...@google.com <ll...@google.com> #19
(I cannot change assignee)
sr...@google.com <sr...@google.com>
en...@google.com <en...@google.com> #20
on the public bug (https://github.com/android-ndk/ndk/issues/556 ), danalbert said "Given that we're now shipping lld, I'm guessing this won't ever happen and we'll just go straight to LLD.". NDK r17 already shipped with ld.bfd rather than ld.gold, so i guess we may as well just call aarch64 gold dead at this point...
ll...@google.com <ll...@google.com> #21
SGTM.
ap...@google.com <ap...@google.com> #22
Project: toolchain/binutils
Branch: master
commit 49e1641cfc154895c5c5f394bb23c9518df2cb49
Author: Yunlian Jiang <yunlian@google.com>
Date: Thu Jul 12 15:54:35 2018
Fix aarch64 --no-apply-dynamic-relocs option.
This option is intended to make gold not apply link-time values for
absolution relocations which have dynamic relocations emitted for
them, in order to workaround an android dynamic loader bug in old
versions of android.
Unfortunately, it also had the side-effect of breaking debug data,
because the dynamic relocations are not used for non-ALLOC sections,
but the flag was also suppressing the static relocation.
This fix was proposed by jyknight@ and it fixed the problem by filtering
out non-ALLOC sections with the --no-apply-dynamic-relocs option.
BUG: 70838247
TEST: Debug info appears with --no-apply-dynamic-relocs
M binutils-2.27/gold/aarch64.cc
https://android-review.googlesource.com/716724
https://goto.google.com/android-sha1/49e1641cfc154895c5c5f394bb23c9518df2cb49
Branch: master
commit 49e1641cfc154895c5c5f394bb23c9518df2cb49
Author: Yunlian Jiang <yunlian@google.com>
Date: Thu Jul 12 15:54:35 2018
Fix aarch64 --no-apply-dynamic-relocs option.
This option is intended to make gold not apply link-time values for
absolution relocations which have dynamic relocations emitted for
them, in order to workaround an android dynamic loader bug in old
versions of android.
Unfortunately, it also had the side-effect of breaking debug data,
because the dynamic relocations are not used for non-ALLOC sections,
but the flag was also suppressing the static relocation.
This fix was proposed by jyknight@ and it fixed the problem by filtering
out non-ALLOC sections with the --no-apply-dynamic-relocs option.
BUG: 70838247
TEST: Debug info appears with --no-apply-dynamic-relocs
M binutils-2.27/gold/aarch64.cc
rp...@google.com <rp...@google.com> #23
I confirmed that r19 and r19c ld.gold arm64 has the aosp/716724 fix, which fixes the test case in
Description
int g_a = 123;
$ ~/ll/build/opt/bin/clang -target aarch64-unknown-linux-android a.c -o a.o -c -g # This is ToT clang, but it doesn't matter
$ ~/local/android-ndk-r16-beta2/toolchains/aarch64-linux-android-4.9/prebuilt/linux-x86_64/aarch64-linux-android/bin/ld -m aarch64linux -shared -o liba.so a.o # link with r16 ld
$ ~/ll/build/opt/bin/llvm-dwarfdump liba.so
...
0x0000001e: DW_TAG_variable
DW_AT_name ("g_a")
DW_AT_type (cu + 0x0033 "int")
DW_AT_external (true)
DW_AT_decl_file ("/tmp/r17-test/a.c")
DW_AT_decl_line (1)
DW_AT_location (DW_OP_addr 0x10360) <=== variable at address 0x10360, points straight into the data section
$ readelf -S liba.so
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
...
[ 7] .data PROGBITS 0000000000010360 00000360
0000000000000004 0000000000000000 WA 0 0 4
$ ~/local/android-ndk-r17-canary/toolchains/aarch64-linux-android-4.9/prebuilt/linux-x86_64/aarch64-linux-android/bin/ld -m aarch64linux -shared -o liba.so a.o # now link with r17 ld
$ ~/ll/build/opt/bin/llvm-dwarfdump liba.so
....
0x0000001e: DW_TAG_variable
DW_AT_name ("g_a")
DW_AT_type (cu + 0x0033 "int")
DW_AT_external (true)
DW_AT_decl_file ("/tmp/r17-test/a.c")
DW_AT_decl_line (1)
DW_AT_location (DW_OP_addr 0x0) <==== Address is gone
$ readelf -r liba.so
There are no relocations in this file.
^^^ The linker did not just forget to apply the relocation. It ate it completely.
Some interesting notes:
- If I drop the "-shared" (i.e., generate a main executable), the debug info comes out alright
- I tried the same thing with ld.gold, and I've learned that gold was broken at least since r15 (I haven't tried going further back)
- non-aarch64 targets don't seem to be affected