
virtualizer: enforce maximum system resources a virtual machine may use
Open, Normal, Public

Description

An adversary could stress the CPU, HDD, RAM, network connection, or any combination of these, so that other Whonix-Workstations and perhaps also the host would suffer. This is bad:

  • attacks on anonymity when using multiple workstations (whether behind same gateway or not)
  • host DoS

Virtual machines (VMs) can use an unlimited amount of resources, i.e.:

  • CPU load
  • network load
  • I/O (hdd) load
  • graphic calculation load
  • (RAM load?)

This might happen because some application inside a VM has a bug and starts draining resources or because a VM has been compromised.

Ideally the virtualizer on the host would enforce maximum system resources the VM may use.

This ticket is a reminder to implement this protection for all virtualizers supported by Whonix some day.

If someone wants to implement this feature for a particular virtualizer, please create a sub task to keep things separated.

Related:
T530

Details

Impact
Normal

Event Timeline

Patrick created this task. Nov 20 2014, 7:38 PM
Patrick updated the task description.
Patrick raised the priority of this task from to Normal.
Patrick added a subscriber: Patrick.
Patrick changed the visibility from "All Users" to "Public (No Login Required)". Nov 20 2014, 7:49 PM
Patrick changed the edit policy from "All Users" to "Administrators".
Patrick set Impact to Needs Triage. May 23 2015, 1:39 AM
Patrick added subscribers: troubadour, HulaHoop, nrgaway.
HulaHoop added a comment. Edited May 25 2015, 3:14 AM

For CPU, running the VM process at a lower scheduling priority (a higher nice value) can limit the effect of a CPU DoS. Using cgroups, virtual machines can also be restricted to CPU limits, which helps prevent denial of service.
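As a rough illustration of the nice-level idea, here is a minimal Python sketch. It assumes the PID of the VM process on the host is already known; the helper name lower_priority is hypothetical and not part of any virtualizer. It does not cover the cgroups approach.

```python
import os


def lower_priority(pid: int, niceness: int = 10) -> int:
    """Set the nice value of process `pid` and return the value now in effect.

    A higher nice value means a lower scheduling priority. Unprivileged users
    can normally only increase the nice value of their own processes.
    """
    os.setpriority(os.PRIO_PROCESS, pid, niceness)
    return os.getpriority(os.PRIO_PROCESS, pid)
```

Calling lower_priority(vm_pid, 10) on the host would make the scheduler favour other processes over the VM when the CPU is contended. It does not impose a hard cap on CPU usage.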

For memory, reducing or disabling swap puts limits on memory DoS.

Even without any changes to the host, DoS can be effectively mitigated:

  • Limiting the number of logical CPUs for a VM in its settings limits the number of host cores it can use.
  • Linux's OOM-killer will identify and kill the offending VM process if it causes the host to run out of memory.

Patrick updated the task description. Aug 3 2016, 4:09 PM
Patrick changed Impact from Needs Triage to Normal.

blkiotune and iotune can restrict I/O (KVM only).

https://libvirt.org/formatdomain.html#elementsBlockTuning
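To make the two elements more concrete, the following Python sketch generates the corresponding libvirt domain XML fragments. The element names (blkiotune, weight, device, path, iotune, total_bytes_sec) come from the libvirt documentation linked above. The helper names and all numeric values are illustrative assumptions, not settings taken from this ticket.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def build_blkiotune(weight: int, device: Optional[str] = None,
                    device_weight: Optional[int] = None) -> ET.Element:
    """Build a <blkiotune> element (a direct child of <domain>)."""
    root = ET.Element("blkiotune")
    ET.SubElement(root, "weight").text = str(weight)
    # Optional per-device override, keyed by the host block device path.
    if device is not None and device_weight is not None:
        dev = ET.SubElement(root, "device")
        ET.SubElement(dev, "path").text = device
        ET.SubElement(dev, "weight").text = str(device_weight)
    return root


def build_iotune(total_bytes_sec: int) -> ET.Element:
    """Build an <iotune> element (a child of a <disk> element)."""
    root = ET.Element("iotune")
    ET.SubElement(root, "total_bytes_sec").text = str(total_bytes_sec)
    return root


# Illustrative values only.
print(ET.tostring(build_blkiotune(250, "/dev/sda", 250), encoding="unicode"))
print(ET.tostring(build_iotune(50 * 1024 * 1024), encoding="unicode"))
```

blkiotune expresses a relative weight for the whole domain, while iotune sets absolute throughput limits per virtual disk, so the two can be combined.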

Patrick updated the task description. Nov 11 2016, 3:54 PM
Patrick updated the task description.

There's a problem with setting this. SSD vs HDD io throughput is very different. What is reasonable for one will be excessive or too low for the other.

HulaHoop (HulaHoop):

HulaHoop added a comment.

There's a problem with setting this. SSD vs HDD io throughput is very different. What is reasonable for one will be excessive or too low for the other.

Set the limit to protect against excessive use of an SSD then. SSD users
are better off then and HDD users are not any worse off. (HDD users would
have to manually adjust that value.) (Or auto detect somehow?)
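One possible way to auto-detect this on a Linux host is the kernel's queue/rotational attribute in sysfs, which reads 1 for rotational disks and 0 otherwise. The sketch below is an assumption about how such detection could look, not something implemented in this ticket. The function names and limit values are hypothetical, and the flag can be misleading behind some RAID controllers or inside nested virtualization.

```python
from pathlib import Path


def is_rotational(device: str, sysfs_root: str = "/sys/block") -> bool:
    """Return True if the kernel reports `device` (e.g. "sda") as rotational."""
    flag = Path(sysfs_root, device, "queue", "rotational").read_text().strip()
    return flag == "1"


def pick_io_limit(device: str, ssd_limit: int, hdd_limit: int,
                  sysfs_root: str = "/sys/block") -> int:
    """Choose the HDD limit for rotational disks and the SSD limit otherwise."""
    return hdd_limit if is_rotational(device, sysfs_root) else ssd_limit
```

For example, pick_io_limit("sda", ssd_limit=100_000_000, hdd_limit=25_000_000) would return the lower value on a host whose sda is a spinning disk. Both limits here are placeholders.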

Though I agree with anonym's argument that resource exhaustion goes against the purpose of advanced malware that wants to hide, I still looked at I/O limits in case you still think it's valuable to set.

It turns out blkiotune can partition resources using "weighting" (i.e. proportional shares), in addition to setting absolute limits.

I think by "host block device" they refer to the physical device /dev/sda rather than getting into partition numbers like sda2. The latter would have made this option very messy.
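To show what "weighting" means in practice, this small sketch converts a set of weights into the fraction of I/O each party would receive under full contention. The names and weights are made up for illustration. In general, proportional weights only take effect when the device is contended, and they depend on host I/O scheduler support.

```python
from typing import Dict


def io_shares(weights: Dict[str, int]) -> Dict[str, float]:
    """Map each name to its fraction of the total weight."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical example: one VM competing with everything else on the host.
print(io_shares({"workstation": 250, "everything-else": 750}))
```

Unlike an absolute limit, a weight lets a VM use idle I/O capacity and only constrains it when other consumers compete for the same device.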

https://www.debian.org/releases/stable/i386/apcs04.html.en

https://libvirt.org/formatdomain.html#elementsBlockTuning

HulaHoop (HulaHoop):

HulaHoop added a comment.

Though I agree with anonym's argument that resource exhaustion goes
against the purpose of advanced malware that wants to hide

Maybe. However, I wouldn't assume that no type of malware ever tries
resource exhaustion.

I still looked at I/O limits in case you still think it's valuable to set.

Yes, very much worthwhile. This should always have been a standard feature
of any virtualizer. One can reasonably expect that no VM (or application)
is able to exhaust resources so that everything else suffers. Even if the
cause is not malware but just a bug, it is still a huge usability issue
to run into cases where one thing can lock up the whole computer.

Should limits be enforced for GW too?

Done. Added I/O limit commits to open pull requests. Each VM can use at most 25% of the host's I/O resources.