mask more /proc/cpuinfo output in KVM
Closed, Resolved, Public

Description

Currently, a lot of CPU information can be read from inside a compromised workstation (or by a fancy application that reads it and reports it somewhere for some statistical purpose):
https://www.whonix.org/wiki/Protocol-Leak-Protection_and_Fingerprinting-Protection#this_is_from_KVM_.2B_whonix_12_.28cat_.2Fproc.2Fcpuinfo_inside_WS.29

Seems like CPU features can be reduced:
https://www.berrange.com/posts/2010/02/15/guest-cpu-model-configuration-in-libvirt-with-qemukvm/

Add new 'kvm' domain feature and ability to hide KVM signature:
https://www.redhat.com/archives/libvir-list/2014-August/msg00744.html

Maybe more can be masked such as model and clock frequency.

As I understand, these features have been added to ease CPU migration in heterogeneous CPU environments. We can reuse these features to hide more hardware identifiers.

Needs research into whether there would be a performance penalty or whether anything else speaks against this.

Details

Impact
Normal
Patrick created this task. Dec 7 2015, 4:48 PM
Patrick updated the task description.
Patrick raised the priority of this task to Normal.
Patrick added projects: KVM, research, security.
Patrick set Impact to Normal.
Patrick added subscribers: Patrick, HulaHoop.

Some exposed features may even pose security issues. In Whonix KVM, the flags line of cat /proc/cpuinfo includes tsc, which might allow the VM to access the host CPU's Time Stamp Counter (TSC). (This matters because of clock correlation attacks.)
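A minimal sketch of how one might check for a flag such as tsc from inside the guest. The helper name cpu_flags and the SAMPLE text are made up for illustration; SAMPLE is not real output from any Whonix VM.

```python
def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the flags from the first 'flags' line of /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no 'flags' line found


# Illustrative sample only (assumption), shortened to a few flags.
SAMPLE = "processor\t: 0\nflags\t\t: fpu de pse tsc msr pae\n"

if __name__ == "__main__":
    # On a Linux guest, the real file could be read instead of SAMPLE:
    #   with open("/proc/cpuinfo") as f: text = f.read()
    print("tsc visible:", "tsc" in cpu_flags(SAMPLE))
```

Running the same check before and after a change to the domain XML would show whether a flag actually disappeared from the guest's view.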

After reading https://libvirt.org/formatdomain.html#elementsCPU I suggest starting to experiment with something like the following (untested):

<cpu match='minimum'>
  <model>486</model>
</cpu>

For CPU models, see also:
/usr/share/libvirt/cpu_map.xml

If we can afford it performance-wise (and otherwise, depending on what it might break), we should allow only the required, "secure" CPU features that we understand at least at a very superficial level.
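The allowlist idea can be sketched roughly as follows. This is only an illustration: the function names and the example flag sets are hypothetical, and which features really belong on the allowlist is exactly the open research question.

```python
def features_to_disable(visible: set[str], allowed: set[str]) -> list[str]:
    """Features the guest currently sees that are not on the allowlist, sorted."""
    return sorted(visible - allowed)


def disable_lines(features: list[str]) -> list[str]:
    """Render one libvirt <feature policy='disable'> line per feature name."""
    return [f"<feature policy='disable' name='{name}'/>" for name in features]


if __name__ == "__main__":
    # Hypothetical inputs (assumption): flags seen in the guest and a tiny allowlist.
    visible = {"fpu", "sse2", "tsc", "clflush"}
    allowed = {"fpu", "sse2"}
    for line in disable_lines(features_to_disable(visible, allowed)):
        print(line)
```

Whether libvirt accepts every generated line on every host would still have to be tested on different hardware.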

I experimented with the cpu masking of individual features on different machines to test portability and reached the conclusion that it will cause a support nightmare.

Blocking some features like tsc_deadline_timer and constant_tsc will cause the vm to fail on other hardware that doesn't have it.

Forcing a specific CPU model will cause the VM to fail on hardware made by another vendor. Using a virtual CPU such as kvm64 to solve this masks a few more features in total than the guest can access now, including critical ones like hardware virtualization, which kills nested virtualization.

Even if all this was to work, an attacker in the VM can have a pretty good idea what cpu they are sitting on if they carry out some benchmarks.

What threat model are we considering for hiding the KVM signature? I am not opposed to it but I don't want security theater.

Research has shown that it is impossible to hide the fact that code is running in a virtual environment. It's described as useful as a debugging feature.

https://archive.is/J2pbv

tsc gives the tick rate of the CPU, which is what a PC calibrates its timers to when trying to keep the clock set by the system accurate. Nothing about an actual reading of the time is leaked.

https://lwn.net/Articles/209101/
https://en.wikipedia.org/wiki/Time_Stamp_Counter

In T449#7518, @HulaHoop wrote:

I experimented with the cpu masking of individual features on different machines to test portability and reached the conclusion that it will cause a support nightmare.

Blocking some features like tsc_deadline_timer and constant_tsc will cause the vm to fail on other hardware that doesn't have it.

Aren't there settings to make this lenient? libvirt documentation implies there are.

  • feature require is obviously bad.
  • But why should feature disable "The feature will not be supported by virtual CPU." be bad? (As long as the VM operating system starts and runs fine.)

Also cpu match,

  • exact and strict sound bad.
  • But cpu match minimum seems lenient. (Make little to no cpu flags a requirement.)

Also model fallback allow seems lenient.

Blocking some features like tsc_deadline_timer and constant_tsc will cause the vm to fail on other hardware that doesn't have it.

Isn't this a contradiction? How can you block something that does not exist? If the cpu feature is unwanted in the VM, then the hypervisor should be fine with the cpu feature not existing on the host, right?

Even if all this was to work, an attacker in the VM can have a pretty good idea what cpu they are sitting on if they carry out some benchmarks.

I also worry about [future] applications, not compromised ones, leaking CPU info, that are as fancy as webrtc (which leaks local IP addresses).

In T449#7519, @HulaHoop wrote:

What threat model are we considering for hiding the KVM signature?

I found that link only worth further research, because it discussed disabling cpu features.

In T449#7520, @HulaHoop wrote:

tsc gives the tick rate of the CPU, which is what a PC calibrates its timers to when trying to keep the clock set by the system accurate. Nothing about an actual reading of the time is leaked.

https://lwn.net/Articles/209101/
https://en.wikipedia.org/wiki/Time_Stamp_Counter

I think leaking anything related to time from the host into the VM [or from VM to VM] risks cryptographic side channel attacks and should therefore be avoided.

Search term:
time stamp counter tsc side channel

Leads to:
http://blog.cr0.org/2009/05/time-stamp-counter-disabling-oddities.html

TSC leaks may be similar to TCP sequence numbers. (https://trac.torproject.org/projects/tor/ticket/16659#comment:10)

TCP sequence numbers also do not directly leak time stamps.

(Will add notes here: https://www.whonix.org/wiki/Dev/TimeSync#TSC)

Yes, all my tests were done with "feature disable".

If the cpu feature is unwanted in the VM, then the hypervisor should be fine with the cpu feature not existing on the host, right?

That's what I thought too, except it looks like the feature has to exist on the host for the mask to be applied, or else it fails hard.

I also worry about [future] applications, not compromised ones, leaking CPU info, that are as fancy as webrtc (which leaks local IP addresses).

Can you give examples? At the moment all sensitive data like model name and microcode version is masked by default.

http://blog.cr0.org/2009/05/time-stamp-counter-disabling-oddities.html

Side channel attacks are a whole field that cryptographers have to deal with when designing and applying cryptography in the real world. They have to account for the fact that hardware is imperfect and can leak information about the key to attackers. It's not something we can solve here at such a simple level, but something cryptographers account for when hacking on OpenSSL.

TSC leaks may be similar to TCP sequence numbers.

Nothing is leaked to the network by TSC. It's also heavily affected by system load, which caused the instruction to miss ticks, so hardware manufacturers implemented constant_tsc to keep timers from skewing.

HulaHoop added a comment. Edited Dec 8 2015, 2:31 AM

I'll read more about cpu minimum and report.

EDIT:

No go.

In T449#7525, @HulaHoop wrote:

I also worry about [future] applications, not compromised ones, leaking CPU info, that are as fancy as webrtc (which leaks local IP addresses).

Can you give examples? At the moment all sensitive data like model name and microcode version is masked by default.

There are no examples as of now, but I would not be surprised after webrtc leaking local client IP.

http://blog.cr0.org/2009/05/time-stamp-counter-disabling-oddities.html

Side channel attacks are a whole field that cryptographers have to deal with when designing and applying cryptography in the real world. They have to account for the fact that hardware is imperfect and can leak information about the key to attackers. It's not something we can solve here at such a simple level, but something cryptographers account for when hacking on OpenSSL.

The goal should be to make the VM as autonomous and isolated as possible. A compromised browser VM should not be able to interfere with any other cryptographic operations on the host or in other VMs. If we can get rid of TSC before cryptographers come up with clever defenses, and before new clever attacks to these clever defenses have been published, that would be awesome.

TSC leaks may be similar to TCP sequence numbers.

Nothing is leaked to the network by TSC.

Right. Normally not. Local compromise was assumed. Then the TSC could be analyzed by malware and/or sent over the network. I see I am generating confusion by switching threat models.

I tested with both cpu minimum and feature disable but the settings cause the VM to never boot for the same reason.

If the hardware doesn't support it I can't mask it out.

I did some more research and there is good news to follow.

HulaHoop added a comment. Edited Dec 12 2015, 4:32 AM

KVM CPUs support a baseline of features by default. You can mask out the problematic ones and don't have to worry about the extra ones the baseline doesn't support, because they will be masked out anyhow (they were never supported in the first place).

The only bad instructions we should filter out are a subset of the features listed under the virtual CPU model in cpu_map.xml:

cat /usr/share/libvirt/cpu_map.xml

I figured out safe defaults and will do a pull request. NB: clflush was abused to carry out the rowhammer attack, so it's blacklisted. aes will be passed through for crypto performance; it doesn't mess with random number generation.

<cpu mode='custom' match='exact'>
  <model fallback='forbid'>qemu64</model>
  <topology sockets='1' cores='2' threads='1'/>
  <feature policy='disable' name='tsc'/>
  <feature policy='disable' name='clflush'/>
  <feature policy='optional' name='aes'/>
</cpu>
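For experimenting with different feature lists, an equivalent element could be generated instead of edited by hand. This is a rough sketch using Python's standard xml.etree.ElementTree; the function name build_cpu_element is made up, and the attribute values simply mirror the snippet above.

```python
import xml.etree.ElementTree as ET


def build_cpu_element(model: str, disabled: list[str], optional: list[str]) -> ET.Element:
    """Build a libvirt <cpu> element mirroring the hand-written defaults above."""
    cpu = ET.Element("cpu", {"mode": "custom", "match": "exact"})
    model_el = ET.SubElement(cpu, "model", {"fallback": "forbid"})
    model_el.text = model
    ET.SubElement(cpu, "topology", {"sockets": "1", "cores": "2", "threads": "1"})
    for name in disabled:
        # Feature is hidden from the guest.
        ET.SubElement(cpu, "feature", {"policy": "disable", "name": name})
    for name in optional:
        # Feature is exposed only if the host supports it.
        ET.SubElement(cpu, "feature", {"policy": "optional", "name": name})
    return cpu


if __name__ == "__main__":
    cpu = build_cpu_element("qemu64", ["tsc", "clflush"], ["aes"])
    ET.indent(cpu)  # pretty-printing helper, requires Python 3.9+
    print(ET.tostring(cpu, encoding="unicode"))
```

The output still has to be placed inside the domain XML by hand or with virsh edit; the sketch does not touch libvirt itself.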

https://libvirt.org/formatdomain.html#elementsCPU

Informative link about cpu flag functionality:
https://unix.stackexchange.com/questions/43539/what-do-the-flags-in-proc-cpuinfo-mean

https://www.berrange.com/posts/2010/02/15/guest-cpu-model-configuration-in-libvirt-with-qemukvm/

"Every hypervisor has its own policies for what a guest will see for its CPUs by default, Xen just passes through the host CPU, with QEMU/KVM the guest sees a generic model called “qemu32” or “qemu64”."

cat output:

 <model name='qemu64'>
  <!-- These are supported only by TCG.  KVM supports them only if the
       host does.  So we leave them out:

       <feature name='abm'/>
       <feature name='lahf_lm'/>
       <feature name='popcnt'/>
       <feature name='sse4a'/>
  -->
  <feature name='apic'/>
  <feature name='clflush'/>
  <feature name='cmov'/>
  <feature name='cx16'/>
  <feature name='cx8'/>
  <feature name='de'/>
  <feature name='fpu'/>
  <feature name='fxsr'/>
  <feature name='lm'/>
  <feature name='mca'/>
  <feature name='mce'/>
  <feature name='mmx'/>
  <feature name='msr'/>
  <feature name='mtrr'/>
  <feature name='nx'/>
  <feature name='pae'/>
  <feature name='pat'/>
  <feature name='pge'/>
  <feature name='pni'/>
  <feature name='pse'/>
  <feature name='pse36'/>
  <feature name='sep'/>
  <feature name='sse'/>
  <feature name='sse2'/>
  <feature name='svm'/>
  <feature name='syscall'/>
  <feature name='tsc'/>
</model>
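To see which of the unwanted features a model actually lists, a fragment like the one above can be parsed with Python's standard library. A small sketch: MODEL_XML is an abbreviated copy of the qemu64 entry, and the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the qemu64 entry shown above (assumption: only four features kept).
MODEL_XML = """<model name='qemu64'>
  <!-- comments are skipped by the default parser -->
  <feature name='apic'/>
  <feature name='clflush'/>
  <feature name='msr'/>
  <feature name='tsc'/>
</model>"""


def model_features(model_xml: str) -> list[str]:
    """Return the feature names listed in a single <model> fragment, in document order."""
    root = ET.fromstring(model_xml)
    return [feature.get("name") for feature in root.iter("feature")]


def present_unwanted(model_xml: str, unwanted: list[str]) -> list[str]:
    """Return the unwanted features that the model actually lists."""
    listed = set(model_features(model_xml))
    return [name for name in unwanted if name in listed]


if __name__ == "__main__":
    print(present_unwanted(MODEL_XML, ["tsc", "clflush", "rdrand"]))
```

Only features returned by present_unwanted would need an explicit disable policy for that model; the rest are not part of the model in the first place.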

Could you add an improved [redacted] /proc/cpuinfo output here please?
https://www.whonix.org/wiki/Protocol-Leak-Protection_and_Fingerprinting-Protection#.2Fproc.2Fcpuinfo_output

A note on typography:
Writing cpus is probably confusing, because it closely resembles the name of a Linux printing service (CUPS). Better to write CPUs.

Once TNT posts the results I'll update the wiki. Should I replace the older logs or paste under them?

I also enabled the hidden KVM feature.

HulaHoop (HulaHoop):

Should I replace the older logs or paste under them?

I guess it's better to keep for hysteric comparison. Perhaps we should
use expand buttons for all the logs.

Historic comparison.

UPDATE:

When disabling MSR, the systemd-detect-virt output changes from "kvm" to "qemu". This breaks the automatic shared folder mounting, which relies on seeing "kvm" from systemd-detect-virt, and also whonixcheck, which recognizes the hypervisor only that way.

It's not worth the trouble, so the commits are rolled back.
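For reference, the breakage comes from matching the detection string exactly. A tolerant check could look roughly like the sketch below. This is purely hypothetical: is_qemu_family is not part of whonixcheck or of the shared folder scripts, and only the two strings quoted above are taken from this task.

```python
# The two values reported in this task; treating both as one family is an assumption.
QEMU_FAMILY = {"kvm", "qemu"}


def is_qemu_family(detect_virt_output: str) -> bool:
    """True if the systemd-detect-virt output names either 'kvm' or 'qemu'."""
    return detect_virt_output.strip().lower() in QEMU_FAMILY


if __name__ == "__main__":
    # In a real script the string would come from running systemd-detect-virt.
    for sample in ("kvm\n", "qemu\n", "oracle\n"):
        print(repr(sample), is_qemu_family(sample))
```

Accepting "qemu" would also match plain emulated guests without KVM acceleration, which may or may not be acceptable for the callers.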

HulaHoop closed this task as Resolved. Jun 7 2016, 3:54 PM
HulaHoop claimed this task.

Done. New output added; it confirms that the problematic CPU instructions are masked out.