[Revised] Clock Drift Correction Proposal
Open, NormalPublic

Description

Here is what we know:
*You don't need the QEMU guest agent to suspend or resume guests. This is possible with virsh.
*This works because of ACPI S3 events. This means I will have to reinstate ACPI once more, but that seems a small tradeoff for completing this task. What do you think?
*The QEMU guest agent is safe to use if we decide to go that route. There are some interesting things we can do with the agent if we ever need it in the future. This proposal assumes we are NOT using it.

The host package has two parts, one for manipulating GWs and the other for WSs. It must enumerate all VMs and then act in one of two ways according to whether they are designated as GW or WS. This is crucial for scaling to Multi Whonix setups. This can be designed to happen during boot of a new session instead of before suspend. Process: enumerate all Whonix VMs recognized by libvirt, using a regex to determine whether they have "Whonix-Gateway" or "Whonix-Workstation" in their names.
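
A minimal sketch of the enumeration step, assuming the host package is written in Python and that `virsh list --all --name` is an acceptable way to get the domain names. The function names `classify_domains` and `list_domain_names` are hypothetical; only the two name patterns come from the proposal.

```python
import re
import subprocess

# Name patterns taken from the proposal; domains matching neither are ignored.
GATEWAY_RE = re.compile(r"Whonix-Gateway")
WORKSTATION_RE = re.compile(r"Whonix-Workstation")


def classify_domains(names):
    """Split libvirt domain names into gateways (GW) and workstations (WS)."""
    groups = {"gateway": [], "workstation": []}
    for name in names:
        if GATEWAY_RE.search(name):
            groups["gateway"].append(name)
        elif WORKSTATION_RE.search(name):
            groups["workstation"].append(name)
    return groups


def list_domain_names():
    """Return the names of all domains known to libvirt, running or not."""
    result = subprocess.run(
        ["virsh", "list", "--all", "--name"],
        check=True, capture_output=True, text=True,
    )
    # virsh prints one name per line, followed by a trailing blank line.
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]
```

On the host, `classify_domains(list_domain_names())` would give the two groups the rest of the package acts on. Matching on substrings rather than exact names is what lets cloned VMs in a Multi Whonix setup be picked up, provided the clones keep the original prefix.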

virsh is the tool of choice. That should work for Qubes too because they use libvirt. Theoretically this works for VBox on Linux if using libvirt, but on unsupported platforms like vanilla VBox or VBox on Windows this design won't break usability, since the guest doesn't experience suspend/resume ACPI events.

-System suspend:

*Upon host system suspend it just sends a suspend-to-ram signal for all VMs via guest ACPI.
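
The suspend step could look roughly like the sketch below. It assumes `virsh dompmsuspend <domain> mem` is the subcommand used to request suspend-to-RAM; the helper names are hypothetical. Note that the virsh manual describes `dompmsuspend` as relying on a guest agent, so whether this particular subcommand fits the no-agent assumption of this proposal would need to be verified.

```python
import subprocess


def suspend_command(domain):
    """Build the virsh call that requests suspend-to-RAM (S3) for one guest."""
    # Assumption: dompmsuspend with target "mem" is the intended mechanism.
    return ["virsh", "dompmsuspend", domain, "mem"]


def suspend_all(domains, run=subprocess.run):
    """Request S3 for every domain and return the names whose request failed."""
    failed = []
    for domain in domains:
        result = run(suspend_command(domain))
        if result.returncode != 0:
            failed.append(domain)
    return failed
```

The `run` parameter exists only so the loop can be exercised without libvirt installed; in normal use it defaults to `subprocess.run`. Collecting failures instead of aborting means one misbehaving VM does not stop the remaining guests from being suspended.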


-System resume:

*GW resumed first (to prevent race conditions). monit checks logs and triggers an iptables lockdown that lasts long enough for the WS iptables to take over; WS timesync traffic is exempt. timesync is started for the GW.

*WSs resumed 5-10s later, start iptables lockdown, WS timesync initiated.
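
The host-side ordering of the resume step can be sketched as follows, assuming `virsh dompmwakeup <domain>` is used to wake each guest. The function names and the `ws_delay` parameter are hypothetical; the iptables lockdown and timesync steps run inside the guests and are not covered here.

```python
import subprocess
import time


def resume_plan(gateways, workstations, ws_delay=5.0):
    """Return (seconds_to_wait_first, argv) steps: gateways, then workstations."""
    plan = [(0.0, ["virsh", "dompmwakeup", gw]) for gw in gateways]
    for index, ws in enumerate(workstations):
        # Only the first workstation waits; the others follow immediately.
        delay = ws_delay if index == 0 else 0.0
        plan.append((delay, ["virsh", "dompmwakeup", ws]))
    return plan


def run_plan(plan, run=subprocess.run, sleep=time.sleep):
    """Execute the plan in order, pausing before a step when it asks for a delay."""
    for delay, argv in plan:
        if delay > 0:
            sleep(delay)
        run(argv, check=True)
```

For example, `run_plan(resume_plan(["Whonix-Gateway"], ["Whonix-Workstation"], ws_delay=7.0))` would wake the gateway, wait seven seconds, then wake the workstation. A fixed delay does not by itself guarantee the gateway lockdown is active before a workstation resumes.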


This ticket is a summary of the best way to implement T550 and supersedes it. Any technical references can be found in that thread.

Details

Impact
Normal

Event Timeline

HulaHoop created this task. Mar 1 2018, 6:42 PM
HulaHoop triaged this task as Normal priority.

HulaHoop (HulaHoop):

-System suspend:
*Upon host system suspend it just sends a suspend-to-ram signal for all VMs via guest ACPI.

-System resume:
*GW resumed first (to prevent race-conditions), monit checks logs, triggers iptables lockdown for enough time until WS iptables takes over, WSs timesync traffic exempt. timesync is started for the GW.
*WSs resumed later 5-10s later, start iptables lockdown, WS timesync initiated.

This still leaves room for race conditions. The proper way to implement
this is to lock the Whonix-Gateway network at pre-suspend.