Status Update
Comments
pa...@google.com <pa...@google.com>
ga...@gmail.com <ga...@gmail.com> #2
Could you CC the reviewers to this bug so that they can access it - just so we can get the relevant people involved.
Also is it possible to attach the patch to this bug report so we can see what is involved on the clang side?
Also is it possible to attach the patch to this bug report so we can see what is involved on the clang side?
cr...@gmail.com <cr...@gmail.com> #3
adding Chandler, who will have useful opinions on landing mitigations for embargoed vulnerabilities in the LLVM tree
In the past (e.g. when Intel disclosed LVI) our approach was to have patches ready to go and pre-reviewed by appropriate (and appropriately read-in) code owners but not post the patches "for real" until the embargo lifted. Usually this was part of a larger comms strategy where announcements were made in other forums.
Is there an entity beyond security@kernel.org that's coordinating this disclosure?
In the past (e.g. when Intel disclosed LVI) our approach was to have patches ready to go and pre-reviewed by appropriate (and appropriately read-in) code owners but not post the patches "for real" until the embargo lifted. Usually this was part of a larger comms strategy where announcements were made in other forums.
Is there an entity beyond security@kernel.org that's coordinating this disclosure?
fe...@gmail.com <fe...@gmail.com> #4
(re-reading, my second paragraph just repeats what you already laid out. Good independent verification but apologies for not reading the dates more carefully.)
ma...@gmail.com <ma...@gmail.com> #5
I don't have permission to CC folks it seems, but it would be worth adding:
* Aaron Ballman <aaron@aaronballman.com> (a code owner for clang who has already reviewed my patch off list)
* Craig Topper <craig.topper@gmail.com> (a code owner for the x86 backend who has already reviewed my patch off list)
x86 is the only confirmed architecture affected. ARM has confirmed they're not affected for their micro-architectures. Not all x86 micro-architectures are affected. Not seeing one particular x86 vendor represented in the LLVM Security Group is...concerning.
> Also is it possible to attach the patch to this bug report so we can see what is involved on the clang side?
Yes, but just note that the LLVM code is somewhat a giveaway of what's going on here. GCC has had a similar feature for years; they did not publish the patch under embargo but rather did so publicly back when various spectre and meltdown mitigations were being tested before the development of retpoline. The very first comment on the GCC mailing list was along the lines of "What is this?" I'd like to avoid that here if possible; the name of the game is to not spill the beans to too wide an audience prematurely. Doing so would lose LLVM their (lone) seat at the table for Linux kernel vulnerability disclosures such as this. So we need to be super careful about keeping this need to know.
> Is there an entity beyond security@kernel.org that's coordinating this disclosure?
The encrypted kernel mailing list I'm referring to that is coordinating is not that specific email address. It's more controlled than even that list and members of that list (security@kernel.org) aren't even considered need to know for this one. I don't know whether linux-distros@vs.openwall.org has even been contacted yet; the kernel patch set is up to ~45 patches and growing, and backports will be painful.
* Aaron Ballman <aaron@aaronballman.com> (a code owner for clang who has already reviewed my patch off list)
* Craig Topper <craig.topper@gmail.com> (a code owner for the x86 backend who has already reviewed my patch off list)
x86 is the only confirmed architecture affected. ARM has confirmed they're not affected for their micro-architectures. Not all x86 micro-architectures are affected. Not seeing one particular x86 vendor represented in the LLVM Security Group is...concerning.
> Also is it possible to attach the patch to this bug report so we can see what is involved on the clang side?
Yes, but just note that the LLVM code is somewhat a giveaway of what's going on here. GCC has had a similar feature for years; they did not publish the patch under embargo but rather did so publicly back when various spectre and meltdown mitigations were being tested before the development of retpoline. The very first comment on the GCC mailing list was along the lines of "What is this?" I'd like to avoid that here if possible; the name of the game is to not spill the beans to too wide an audience prematurely. Doing so would lose LLVM their (lone) seat at the table for Linux kernel vulnerability disclosures such as this. So we need to be super careful about keeping this need to know.
> Is there an entity beyond security@kernel.org that's coordinating this disclosure?
The encrypted kernel mailing list I'm referring to that is coordinating is not that specific email address. It's more controlled than even that list and members of that list (security@kernel.org) aren't even considered need to know for this one. I don't know whether linux-distros@vs.openwall.org has even been contacted yet; the kernel patch set is up to ~45 patches and growing, and backports will be painful.
75...@gmail.com <75...@gmail.com> #6
I'm curious what you want from us, if the vulnerability is too sensitive to talk about with (or show a mitigation patch to) the people who have signed up to be responsible about security vulnerabilities.
pi...@gmail.com <pi...@gmail.com> #7
[Empty comment from Monorail migration]
mg...@gmail.com <mg...@gmail.com> #8
It seems like the question is: what is the best way for Nick to prepare for this patch to be available in LLVM, so it can benefit LLVM users as soon as possible after the vulnerability is disclosed?
If that's the question, I think the answer is already reached in the discussion above: to the extent possible, get the patch reviewed early by code owners and resolve any concerns there; then post the patch to Phabricator when the embargo lifts. (Absent some tricky KAISER-style subterfuge, posting the patch should be considered equivalent to a public disclosure of the vulnerability.)
Open to others on the list refining or rebutting that message, but otherwise -- Nick, is there anything else you think you need from us?
If that's the question, I think the answer is already reached in the discussion above: to the extent possible, get the patch reviewed early by code owners and resolve any concerns there; then post the patch to Phabricator when the embargo lifts. (Absent some tricky KAISER-style subterfuge, posting the patch should be considered equivalent to a public disclosure of the vulnerability.)
Open to others on the list refining or rebutting that message, but otherwise -- Nick, is there anything else you think you need from us?
vi...@gmail.com <vi...@gmail.com> #9
> I'm curious what you want from us
> Nick, is there anything else you think you need from us?
Maybe if Aaron and Craig can confirm here that they've reviewed the patch, and if they'll be available when the embargo is scheduled to lift to publicly Accept the patch?
Otherwise, is there anything else I should be doing as part of the process?
> posting the patch should be considered equivalent to a public disclosure of the vulnerability.
Correct. I don't mind posting it here, but it MUST NOT be posted to phab until the embargo lift (scheduled for Tuesday July 12 2022 9am PDT). I'm more than happy to spend the rest of the week iterating on feedback of the initial design, but we need to ship a mitigation ASAP in a few open source repositories. The code has been reviewed by trusted reviewers and tested by trusted kernel developers, so I'm not looking to re-architect the patch one week out from embargo lift.
Patch attached.
> Nick, is there anything else you think you need from us?
Maybe if Aaron and Craig can confirm here that they've reviewed the patch, and if they'll be available when the embargo is scheduled to lift to publicly Accept the patch?
Otherwise, is there anything else I should be doing as part of the process?
> posting the patch should be considered equivalent to a public disclosure of the vulnerability.
Correct. I don't mind posting it here, but it MUST NOT be posted to phab until the embargo lift (scheduled for Tuesday July 12 2022 9am PDT). I'm more than happy to spend the rest of the week iterating on feedback of the initial design, but we need to ship a mitigation ASAP in a few open source repositories. The code has been reviewed by trusted reviewers and tested by trusted kernel developers, so I'm not looking to re-architect the patch one week out from embargo lift.
Patch attached.
gk...@gmail.com <gk...@gmail.com> #10
Couple comments on the patch
"be replace with" -> "be replaced with" in the LangRef.rst change
The include list in X86ReturnThunks.cpp seems to have more files than needed. StringSwitch.h was an obvious extra.
nit: "Modified |= true;" it's pretty uncommon for "|= true" to appear in the tree. Nearly everywhere does Modified = true.
"be replace with" -> "be replaced with" in the LangRef.rst change
The include list in X86ReturnThunks.cpp seems to have more files than needed. StringSwitch.h was an obvious extra.
nit: "Modified |= true;" it's pretty uncommon for "|= true" to appear in the tree. Nearly everywhere does Modified = true.
[Deleted User] <[Deleted User]> #11
I obviously don't know the exact details of the exploit but is replacing ret with a jmp absolutely necessary - my experience in other domains suggests that messing with the x86 call stack cache does pretty terrible things to performance.
If it is, do we know if this is actually x86 specific? zero-daying every other architecture is also not a super great outcome.
If it is, do we know if this is actually x86 specific? zero-daying every other architecture is also not a super great outcome.
su...@gmail.com <su...@gmail.com> #12
@oliver: my understanding of the patch is that it is not messing up with the call stack, just using a trampoline. So I assume it would be nice to have the trampoline target in a hot code section as it's likely to be shared by a lot of functions, but that's it, right?
gi...@gmail.com <gi...@gmail.com> #13
My specific experience was JIT and interpreter insanity on x86_64, where there was a measurable impact because x86 (at least intel) include a call stack predictor and we did have a measurable cost from do some messing with the return address of functions.
To be clear: I am not saying this is a show stopper or anything like that, we were benchmarking JS so the type of code that is running in such is obviously different from pretty much anything else, this is more that in my experience this kind of manipulation *can* have 2nd order(?) perf impact, so I'd like to know someone has checked to make sure nothing unusable happens.
Pessimistic me would have a benchmark that was measuring deep recursion and iterative calling of an empty function. Not because it's representative, but just to get what is the impact in the worst possible case we can come up with. If it's not significant in that case there's no reason to spend any time doing the more complicated task of seeing if it impacts anything in real world code.
If there is a non-negligible impact we may be able to change the exact code gen or similar to compensate.
Or we may just accept the cost as being outweighed by the impact (as happened with SPECTRE, etc)
To be clear: I am not saying this is a show stopper or anything like that, we were benchmarking JS so the type of code that is running in such is obviously different from pretty much anything else, this is more that in my experience this kind of manipulation *can* have 2nd order(?) perf impact, so I'd like to know someone has checked to make sure nothing unusable happens.
Pessimistic me would have a benchmark that was measuring deep recursion and iterative calling of an empty function. Not because it's representative, but just to get what is the impact in the worst possible case we can come up with. If it's not significant in that case there's no reason to spend any time doing the more complicated task of seeing if it impacts anything in real world code.
If there is a non-negligible impact we may be able to change the exact code gen or similar to compensate.
Or we may just accept the cost as being outweighed by the impact (as happened with SPECTRE, etc)
sh...@vitalum.ac.in <sh...@vitalum.ac.in> #14
> Maybe if Aaron and Craig can confirm here that they've reviewed the patch, and if they'll be available when the embargo is scheduled to lift to publicly Accept the patch?
I can't speak for Craig, but I reviewed the Clang frontend bits and am happy with them. I will be around on Jul 12 and should have no trouble accepting the patch pretty quickly.
I can't speak for Craig, but I reviewed the Clang frontend bits and am happy with them. I will be around on Jul 12 and should have no trouble accepting the patch pretty quickly.
ib...@gmail.com <ib...@gmail.com> #15
Re https://crbug.com/llvm/33#c12 , yes, I think it's very safe to say (with no special knowledge about this particular vulnerability) that the impact on performance was evaluated and judged to be worth it given the security implications. As you point out, similar tradeoffs have been made in mitigating other transient execution vulnerabilities.
Aaron, thanks for the confirmation inhttps://crbug.com/llvm/33#c13 that you're ready to review+approve when the embargo lifts. I think having those commitments here is the best outcome for this bug.
Aaron, thanks for the confirmation in
ma...@gmail.com <ma...@gmail.com> #16
I have reviewed the backend portion and you can see some comments in https://crbug.com/llvm/33#c9 . Those comments were very minor. I'm happy with the backend portion. I will be around on Jul 12 and can approve quickly as well.
Description