more on Spectre v2

Spectre v2 is really complicated to deal with. Just read this recent thread in the LKML
<https://lkml.org/lkml/2018/1/20/158>
I'm impressed how well written those messages are.

Intel Skylake CPUs are particularly problematic. Those are most of the "6th generation Core" processors and some of the "7th generation Core".
<https://en.wikipedia.org/wiki/Skylake_(microarchitecture)>

The indirect branch predictor is a big problem. The retpoline deals with most cases.

On Skylake, this predictor is used in another case: for return instructions that cause underflow in the call/return predictor stack. That means that EVERY return instruction is suspect, and the cost of being suspect is high.

OK, if you can prove that the call/return predictor stack has not underflowed, you can just do a return. But how?

The best fix is not on the table: it would be great if Intel could patch microcode so that the return predictor did not fall back to the indirect branch predictor. It sounds easy, but I infer that it is not technically possible.
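To make the underflow case concrete, here is a toy Python model of a call/return predictor stack. It is only a sketch: the class and function names are made up for illustration, the depth of 16 is an assumption (real depths vary by microarchitecture), and nothing here models timing or microcode. It shows that once a call chain is deeper than the stack, the outermost returns find the stack empty and, on a Skylake-like design, would be predicted by the indirect branch predictor instead.

```python
from collections import deque

RSB_DEPTH = 16  # assumed depth for illustration; real hardware varies


class ReturnPredictor:
    """Toy model of a call/return predictor stack (not real hardware)."""

    def __init__(self, depth=RSB_DEPTH, falls_back_to_indirect=True):
        # A bounded deque: when full, pushing drops the oldest entry.
        self.stack = deque(maxlen=depth)
        # Hypothetical switch for the Skylake-like fallback behaviour.
        self.falls_back_to_indirect = falls_back_to_indirect
        self.suspect_returns = 0

    def call(self, return_address):
        """Record the return address of a call."""
        self.stack.append(return_address)

    def ret(self):
        """Report which predictor would supply the target of this return."""
        if self.stack:
            self.stack.pop()
            return "return-stack"
        # Underflow: the stack has no entry for this return.
        if self.falls_back_to_indirect:
            self.suspect_returns += 1
            return "indirect-predictor"
        return "no-prediction"


def run_call_chain(predictor, depth):
    """Make `depth` nested calls, then unwind with `depth` returns."""
    for addr in range(depth):
        predictor.call(addr)
    return [predictor.ret() for _ in range(depth)]


if __name__ == "__main__":
    deep = ReturnPredictor()
    sources = run_call_chain(deep, 20)
    print(sources.count("indirect-predictor"), "suspect returns out of 20")
```

In this model a 20-deep call chain leaves 4 returns to the indirect predictor, while a 10-deep chain leaves none. Proving that a given return is in the second category is the hard part the message is asking about.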

On 04/02/18 01:29 PM, D. Hugh Redelmeier via talk wrote:
| Spectre v2 is really complicated to deal with. Just read this recent thread in the LKML
| <https://lkml.org/lkml/2018/1/20/158>
| I'm impressed how well written those messages are.
|
| Intel Skylake CPUs are particularly problematic. Those are most of the "6th generation Core" processors and some of the "7th generation Core".
| <https://en.wikipedia.org/wiki/Skylake_(microarchitecture)>
|
| The indirect branch predictor is a big problem. The retpoline deals with most cases. On Skylake, this predictor is used in another case: for return instructions that cause underflow in the call/return predictor stack.

There's also a race condition: the 6130 and DPS8m processors checked the permissions of the fetch before they fetched to the (then small) cache. Simulating the same thing in the Multics emulator took some extra work but it was implemented well before the spectre attacks showed up.

-- 
David Collier-Brown,         | Always do right. This will gratify
System Programmer and Author | some people and astonish the rest
davecb@spamcop.net           |                      -- Mark Twain

| From: David Collier-Brown via talk <talk@gtalug.org>

| There's also a race condition: the 6130 and DPS8m processors checked the
| permissions of the fetch before they fetched to the (then small) cache.

[These are Honeywell mainframe computers.]

Spectre V2 is about subverting branch target prediction. Something quite different.

BUT this subversion only matters because speculation of memory fetches leaves a trace. So what you mention is relevant, indirectly.

Checking permissions can be quite expensive: going through multi-level page tables can involve several memory fetches (which you'd like to speculate past).

Obvious cure: keep permissions in each cache line. The trouble is that permissions can be bulky, forcing a reduction in the effective size of the cache. AMD may have done more of this than Intel
<https://www.amd.com/en/corporate/speculative-execution>

    GPZ Variant 3 (Rogue Data Cache Load or Meltdown) is not applicable to AMD processors.

    We believe AMD processors are not susceptible due to our use of privilege level protections within paging architecture and no mitigation is required.

Intel seems to have added PCID to each cache line. It is too small to encode a process number but you could assign a PCID number to each active process/thread and when you run out, do hard work. It would be kind of analogous to the way we allocate page frames (real memory) to pages of processes (virtual memory).

Linux did not use PCID but it seems that Spectre and Meltdown are stimulating interest in it.

| Simulating the same thing in the Multics emulator took some extra work but
| it was implemented well before the spectre attacks showed up.

I assume that we're not talking about a cycle-accurate emulator nor an emulator done in hardware.

How would it be extra work in an emulator? An emulator doesn't normally do speculative execution. For normal path execution, of course you must emulate the permissions model of the hardware.
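The PCID idea above (a small pool of numbers handed out to active processes, with "hard work" when the pool runs dry) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any kernel does it: the names `TagAllocator`, `tag_for` and `flushes` are hypothetical, the eviction policy is plain least-recently-used, and the "hard work" is reduced to a counter standing in for flushing whatever hardware state carried the stolen tag.

```python
from collections import OrderedDict


class TagAllocator:
    """Toy allocator handing a small pool of tags to active processes."""

    def __init__(self, num_tags):
        self.free_tags = list(range(num_tags))
        # pid -> tag, ordered from least to most recently used.
        self.tag_of = OrderedDict()
        # Counts the "hard work" events (a stand-in for a real flush).
        self.flushes = 0

    def tag_for(self, pid):
        """Return the tag for `pid`, assigning or stealing one if needed."""
        if pid in self.tag_of:
            # Already tagged: just mark it as recently used.
            self.tag_of.move_to_end(pid)
            return self.tag_of[pid]
        if self.free_tags:
            tag = self.free_tags.pop(0)
        else:
            # Pool exhausted: steal the least recently used tag and
            # account for flushing the state that carried it.
            _victim_pid, tag = self.tag_of.popitem(last=False)
            self.flushes += 1
        self.tag_of[pid] = tag
        return tag


if __name__ == "__main__":
    alloc = TagAllocator(num_tags=2)
    for pid in (100, 200, 100, 300, 200):
        print(pid, "->", alloc.tag_for(pid))
    print("flushes:", alloc.flushes)
```

The structure is the same as page-frame allocation: a cheap path while free slots remain, a cheap path for a process that still holds its slot, and an expensive eviction path otherwise.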

On 05/02/18 11:50 AM, D. Hugh Redelmeier via talk wrote:
| | From: David Collier-Brown via talk <talk@gtalug.org>
|
| | There's also a race condition: the 6130 and DPS8m processors checked the
| | permissions of the fetch before they fetched to the (then small) cache.
|
| [These are Honeywell mainframe computers.]
|
| Spectre V2 is about subverting branch target prediction. Something quite different.
|
| BUT this subversion only matters because speculation of memory fetches leaves a trace. So what you mention is relevant, indirectly.
|
| Checking permissions can be quite expensive: going through multi-level page tables can involve several memory fetches (which you'd like to speculate past).
|
| Obvious cure: keep permissions in each cache line. The trouble is that permissions can be bulky, forcing a reduction in the effective size of the cache. AMD may have done more of this than Intel
| <https://www.amd.com/en/corporate/speculative-execution>

That was the reason I earlier mentioned the (vague!) Oracle/Fujitsu SPARC scheme for marking cache lines in different colors. It was suspiciously correlated with the covert channel that the Spectre attacks were using.

|     GPZ Variant 3 (Rogue Data Cache Load or Meltdown) is not applicable to AMD processors.
|
|     We believe AMD processors are not susceptible due to our use of privilege level protections within paging architecture and no mitigation is required.
|
| Intel seems to have added PCID to each cache line. It is too small to encode a process number but you could assign a PCID number to each active process/thread and when you run out, do hard work. It would be kind of analogous to the way we allocate page frames (real memory) to pages of processes (virtual memory).
|
| Linux did not use PCID but it seems that Spectre and Meltdown are stimulating interest in it.
|
| | Simulating the same thing in the Multics emulator took some extra work but
| | it was implemented well before the spectre attacks showed up.
|
| I assume that we're not talking about a cycle-accurate emulator nor an emulator done in hardware.
| How would it be extra work in an emulator? An emulator doesn't normally do speculative execution. For normal path execution, of course you must emulate the permissions model of the hardware.

Yes. The emulator authors noted that emulating the permissions on a fetch was surprisingly complex, in part because the hardware they were emulating was designed in the middle of the period when covert channels were a major concern and had the mandatory-access-control people scratching their heads over ways to avoid them.
--dave