Seamless Clock Adjustment on Whonix Gateway after Suspend-Resume
Closed, Invalid, Public

Description

Facts:

*qemu-guest-agent successfully syncs the guest clock when the host system resumes, and it does not attempt to constantly adjust the clock.

*kvmclock can only sync time during guest startup, not during the guest's lifetime:
https://serverfault.com/a/334734

*kvmclock used to sync time after suspend, but this no longer applies (this
explains my earlier experiences):
http://thread.gmane.org/gmane.comp.emulators.libvirt/92431

*qemu-guest-agent is a solution. It is unsafe if used in untrusted guests, but OK for Whonix-Gateway because it's trusted:
https://serverfault.com/a/635273
http://wiki.qemu.org/Features/QAPI/GuestAgent

*It's not safe because it relies on a JSON parser that's still not
hardened enough for hostile guest environments. It has to be enabled on the host by
adding a qemu-guest-agent channel for it to work - without this it has
no effect and no security implications.

*Using qemu-guest-agent is currently stalled because of permission problems on Jessie. AppArmor workarounds are not recommended, as they could be harmful to security:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1393842

*To have the same functionality for VirtualBox the resume hooks in Guest Additions will be used.


There is a Debian package:
https://packages.debian.org/jessie/qemu-guest-agent

We could add it as a weak dependency, below here:
https://github.com/Whonix/Whonix/blob/979c4393bd5e2d6ae20c690e39bb377d6244809e/build-steps.d/1700_install-packages#L405

Similar to this commit:
https://github.com/Whonix/Whonix/commit/979c4393bd5e2d6ae20c690e39bb377d6244809e


Rejected Solution:

Using Tordate as a coarse clock-setting mechanism on Whonix-Gateway so that Tor can connect.

It's fingerprintable. (All info/link/quotes on that wiki page.)
(Could be tied directly to Tails or Whonix. Unrelated to local clock leaks.)

Details

Impact
Normal

Event Timeline

HulaHoop updated the task description.
HulaHoop raised the priority of this task to Normal.
HulaHoop set Impact to Normal.
HulaHoop added a subscriber: HulaHoop.
Patrick updated the task description. Jul 26 2015, 8:45 AM

When do you want to do this? Does this require Debian stretch first?

> *qemu-guest-agent is a solution. It is unsafe if used in untrusted guests, but OK for Whonix-Gateway because it's trusted:
>
> *It's not safe because it relies on a JSON parser that's still not
> hardened enough for hostile guest environments. It has to be enabled on the host by
> adding a qemu-guest-agent channel for it to work - without this it has
> no effect and no security implications.

So, can you enable this for Whonix-Gateway only? I guess it will need some libvirt XML changes in whonix-libvirt? Something else?

Where is the documentation for the hooks? By hook I mean that /path/to/some/script gets executed by the hypervisor upon resume. The script would be located inside the VM. Or does this require adding scripts to the host that then run some code within the VM?

Is it possible for a (Qubes) VM to be notified (pre or) post suspend? (Referring to Qubes VM Manager, Pause VM feature.) @marmarek

The use case here would be to dispatch (a) script(s) when that happens (post-suspend, i.e. after resume).

HulaHoop added a comment. Edited Jul 27 2015, 10:10 PM

> So, can you enable this for Whonix-Gateway only? I guess it will need some libvirt XML changes in whonix-libvirt? Something else?

Yes, I can. Once upstream is ready, an on_resume tag will also be needed in the XML; more below.

> Where is the documentation for the hooks? By hook I mean that /path/to/some/script gets executed by the hypervisor upon resume. The script would be located inside the VM. Or does this require adding scripts to the host that then run some code within the VM?

No official documentation yet, mostly mailing-list patches and discussions. No scripts of any kind will be needed, just: a qemu channel for the GW, the on_resume flag in the GW XML, and qemu-guest-agent inside the guest. That's it.

By Debian Stretch, libvirt upstream should have added an <on_resume> XML option to automatically trigger guest time syncing on resume. The AppArmor problem should hopefully be fixed by then as well. I tested, and it's only an AppArmor problem, not a missing directories/permissions issue: setting the GW profile to aa-complain lets it start with the guest-agent channel with no errors. You can add the support now if it's simple enough, or just wait; there is no rush to roll this out while workarounds and manual interaction are still needed.

At the moment it's manual:

[1]

Create the missing directories and change group permissions. Note that this will remain necessary unless you reinstall the package, because a package update doesn't create the directory:

sudo mkdir -p /var/lib/libvirt/qemu/channel/target
sudo chown -R libvirt-qemu:kvm /var/lib/libvirt/qemu/channel

[2]

Install apparmor-utils:

sudo apt-get install apparmor-utils

Make sure only Whonix-Gateway is running and find out its profile name:

sudo apparmor_status

Set the profile to complain mode (the profile name below is only an example). Note that it's safe because the Gateway is trusted:

sudo aa-complain /etc/apparmor.d/libvirt/libvirt-73aa79f1-0023-4e37-8001-5be78e88434

The VM should start normally now.

[3] [5]

guest] sudo apt-get install qemu-guest-agent
guest] sudo systemctl start qemu-guest-agent.service
host ] sudo virsh domtime Whonix-Gateway --now
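
Before running virsh domtime, it can help to confirm that the domain really has the guest-agent channel, since without it the agent has no effect. A minimal sketch, assuming the channel uses libvirt's usual target name org.qemu.guest_agent.0; the helper names has_guest_agent_channel and sync_guest_clock are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the libvirt domain XML read from stdin
# declares a channel target named org.qemu.guest_agent.0.
has_guest_agent_channel() {
    grep -q "name=['\"]org\.qemu\.guest_agent\.0['\"]"
}

# Hypothetical wrapper (not called here): only ask libvirt to set the guest
# clock if the channel is configured for the given domain.
sync_guest_clock() {
    domain="$1"
    if virsh dumpxml "$domain" | has_guest_agent_channel; then
        virsh domtime "$domain" --now
    else
        echo "no guest-agent channel configured for $domain" >&2
        return 1
    fi
}
```

On the host this could be used as `sudo virsh dumpxml Whonix-Gateway | has_guest_agent_channel && echo "channel present"`. It only checks the XML; it does not tell whether the agent inside the guest is actually running.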


Libvirt upstream plans: [4]

This is still a frequently-requested issue on upstream lists; qemu 2.0 made it possible for the guest-agent to do guest-set-time without arguments to have the guest re-read the hardware clock and adjust software time from that, and qemu 2.1 is adding rtc-reset-reinjection to tell qemu when the agent has been used to force guest time and therefore qemu no longer needs to slew clock interrupts. Libvirt 1.2.5 added a virDomainSetTime API to trigger the guest agent command, and future libvirt versions may add a flag to that API to also trigger the rtc-reset-reinjection followup. There is also an idea of adding an <on_resume> action to the domain XML of libvirt to allow libvirt to automatically trigger time reset on resume (it can't be on by default, because it requires guest interaction which is only safe if the guest is trusted; but could be explicitly enabled by management as needed). Of course, as this is still under active work upstream, there is no telling how soon it can be backported downstream, or even if such a backport is feasible without a rebase.


[1] https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1393842/comments/2
[2] https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1407434/comments/6
[3] https://bugzilla.redhat.com/show_bug.cgi?id=1156280#c25
[4] https://bugzilla.redhat.com/show_bug.cgi?id=821988#c7
[5] https://bugzilla.redhat.com/show_bug.cgi?id=1211938#c0

But this won't restart sdwdate?

It depends. Is it possible to trigger sdwdate upon detecting a clock jump on GW? This will need additions to the sdwdate code in the guest of course.

Another far-fetched idea is to have some type of unidirectional signal being sent from GW to WS to (safely) trigger timesync on the WS too without manual intervention there. These events follow from the actions above.

The main idea behind the ticket is for setting an acceptable time so Tor can even connect, but there is no reason it can't be developed more.

First of all we don't have qemu in Qubes.

But we have qrexec services triggered on suspend/resume. A few of them:

  • qubes.SuspendPre - pre suspend
  • qubes.SuspendPost - post resume
  • qubes.SetDateTime - after resume and every 6 minutes

The first two are currently called only for VMs with PCI devices (to
tear down devices before suspend). The last one is called in every VM
with the current timestamp in the format of date -Iseconds at stdin - exactly
for this purpose - to synchronize time.
AFAIR Whonix does not want to use the last method (perhaps it should be
blocked somehow?). But that mechanism can be repurposed to call sdwdate.
Or we can enable qubes.SuspendPre and qubes.SuspendPost for every VM.
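
To illustrate the stdin interface described above, here is a minimal sketch of a handler that reads the date -Iseconds timestamp and only reports its offset from a reference time instead of setting the clock. It assumes GNU date (for -d) and grep -E; the function name offset_from_stdin is hypothetical and is not part of Qubes or Whonix:

```shell
#!/bin/sh
# Hypothetical sketch, not part of Qubes or Whonix: read one timestamp in
# `date -Iseconds` format from stdin and print its offset in seconds from a
# reference time (default: the local clock). Nothing here sets the clock.
# Assumes GNU date (for -d) and grep -E.
offset_from_stdin() {
    local reference stamp remote
    reference="${1:-$(date +%s)}"
    # Read a single line; a missing trailing newline is tolerated.
    IFS= read -r stamp || true
    # Accept only the expected shape, e.g. 2015-07-27T22:10:00+0000 or
    # ...+00:00, so free-form strings such as "tomorrow" never reach date -d.
    printf '%s\n' "$stamp" | grep -Eq \
        '^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}[+-][0-9]{2}:?[0-9]{2}$' \
        || return 1
    # Convert the received timestamp to Unix time.
    remote="$(date -u -d "$stamp" +%s)" || return 1
    echo $(( remote - reference ))
}
```

A positive result means the received time is ahead of the reference. What to do with that number (ignore it, log it, or notify something else) is left open here, since the thread has not settled that.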

There is also a fourth service: qubes.SyncNtpClock, which is called in
a selected VM (aka ClockVM) to synchronize time with the "outside world".
Later, time from this VM is used to synchronize time in dom0 and every
other VM (using said qubes.SetDateTime). Perhaps when using Whonix it
should be advised to set the Whonix gateway as ClockVM, which would use
sdwdate in the qubes.SyncNtpClock service?

In T381#6095, @HulaHoop wrote:

> It depends. Is it possible to trigger sdwdate upon detecting a clock jump on GW? This will need additions to the sdwdate code in the guest of course.

Maybe. A script could write down the current Unix time every 10 seconds and compare consecutive values. If the difference is bigger than, let's say, 60 seconds or so, a clock jump likely occurred. (Or the user just manually adjusted the clock.) And sdwdate would have to avoid noticing the clock jumps it causes itself (when setting time using date instead of sclockadj). Might be doable. But sounds like an unclean solution, awful hack to me.
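
The sampling idea above could be sketched roughly as follows. This is only an illustration, not sdwdate code: the function names are made up, the 10 second interval and 60 second threshold are the numbers mentioned above, and the sketch subtracts the sampling interval before comparing, which deviates slightly from comparing the raw difference. It does not address ignoring jumps that sdwdate causes itself.

```shell
#!/bin/sh
# Hypothetical sketch: decide whether the gap between two samples of the wall
# clock (Unix time in seconds) deviates from the sampling interval by more
# than a threshold. Returns 0 if a jump is suspected, 1 otherwise.
clock_jumped() {
    local prev now interval threshold drift
    prev="$1"
    now="$2"
    interval="${3:-10}"
    threshold="${4:-60}"
    # Deviation of the observed gap from the expected sampling interval.
    drift=$(( now - prev - interval ))
    # Take the absolute value so backward jumps are caught as well.
    if [ "$drift" -lt 0 ]; then
        drift=$(( -drift ))
    fi
    [ "$drift" -gt "$threshold" ]
}

# Polling loop (defined only, never started here). on_jump is a
# caller-supplied command that receives the previous and current sample.
watch_clock() {
    local on_jump interval threshold prev now
    on_jump="$1"
    interval="${2:-10}"
    threshold="${3:-60}"
    prev="$(date +%s)"
    while sleep "$interval"; do
        now="$(date +%s)"
        if clock_jumped "$prev" "$now" "$interval" "$threshold"; then
            "$on_jump" "$prev" "$now"
        fi
        prev="$now"
    done
}
```

For example, `watch_clock echo` would print the two samples whenever a jump is suspected. A manual clock adjustment by the user looks the same as a resume to this check, as noted above.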

> Another far-fetched idea is to have some type of unidirectional signal being sent from GW to WS to (safely) trigger timesync on the WS too without manual intervention there. These events follow from the actions above.

Would probably be only required for KVM due to the host security problem. Having a service listening just for that... I don't know. Perhaps KVM images should be separate builds then.

> The main idea behind the ticket is for setting an acceptable time so Tor can even connect, but there is no reason it can't be developed more.

What you're suggesting at the moment (virsh domtime) would directly conflict with sdwdate. KVM would set the clock to the host's clock (what we don't want - or not -> different discussion) and sdwdate's sclockadj might still be running unaffected by this. Not good.

Wondering, if there are virtualizer unspecific hooks.
virtualizer (Xen) unspecific hooks pre-suspend / post-suspend? pm-utils?:
https://groups.google.com/forum/#!topic/qubes-devel/Idct7OthKc0

In T381#6096, @marmarek wrote:

> But we have qrexec services triggered on suspend/resume. Few of them:
>
>   • qubes.SuspendPre - pre suspend
>   • qubes.SuspendPost - post resume
>   • qubes.SetDateTime - after resume and every 6 minutes
>
> The first two currently are called only for VMs with PCI devices (to tear down devices before suspend).
>
> The last one is called in every VM
> with current timestamp in format of date -Iseconds at stdin - exactly
> for this purpose - to synchronize time.
> AFAIR Whonix does not want to use the last method (perhaps it should be
> blocked somehow?).

Yes. That is a security issue. Created T384 for it.

> But that mechanism can be repurposed to call sdwdate.

I think for now "after resume and every 6 minutes" (qubes.SetDateTime) doesn't play well together with sdwdate. It keeps running after boot. Just a notification after resume is required.

> Or we can enable qubes.SuspendPre and qubes.SuspendPost for every VM.

Yes. Unless there are virtualizer-agnostic hooks
(https://groups.google.com/forum/#!topic/qubes-devel/Idct7OthKc0),
this would be great.

> There is also a fourth service: qubes.SyncNtpClock, which is called in
> a selected VM (aka ClockVM) to synchronize time with the "outside world".
> Later, time from this VM is used to synchronize time in dom0 and every
> other VM (using said qubes.SetDateTime). Perhaps when using Whonix it
> should be advised to set the Whonix gateway as ClockVM, which would use
> sdwdate in the qubes.SyncNtpClock service?

The design goal is that the clocks of the host, [any VMs], Whonix-Gateway and any Whonix-Workstation should each differ slightly.
(Rationale: Prevent adversaries from linking anonymous and pseudonymous activity. Described in more detail on the Dev/TimeSync wiki page.)

HulaHoop added a comment. Edited Jul 28 2015, 8:55 PM

> Wondering, if there are virtualizer unspecific hooks.

Almost certainly no.

That kind of information sharing between guest and host will always need some hypervisor-specific mechanism. For KVM it's the guest agent; Qubes implements it as part of its integration protocol, and so on.

> What you're suggesting at the moment (virsh domtime) would directly conflict with sdwdate. KVM would set the clock to the host's clock (what we don't want - or not -> different discussion) and sdwdate's sclockadj might still be running unaffected by this. Not good.

I'm not suggesting that Whonix support that feature before automatic on-resume time syncing lands in libvirt. The virsh domtime command is manual, and the workarounds currently needed to make it work make it impractical to use. I documented it for archival purposes and do not recommend it for now.

> Might be doable. But sounds like an unclean solution, awful hack to me.

On WS you can have the already running rinetd listening for signals. On GW you can use the cpfpd infrastructure to relay the unidirectional signal out.

If you think the automated signalling to the WS is too ugly then write it off. At the moment timesync needs to be run manually on WS anyway (and works only when GW is connected to Tor).

The 'awful hack' statement applied only to what I quoted: sdwdate clock jump detection.

I don't think cpfpy is the right place for adding a gw -> ws command protocol, and I don't want to overload it. It's doing one thing well. Such a protocol would require an additional server listening on the ws. That can't be cpfpy, because that's listening on the gw. And this protocol would require something sending commands to the ws from the gw. I hope this can be avoided. Getting this right is difficult, also because the ws would then have to verify that commands are really coming from the gateway. (Multiple ws's behind the same gw in the same internal network issue for non-Qubes-Whonix (documented).) If we needed a protocol between gw and ws, I would be inclined to first check if the Qubes specific qrexec would work well for that task.

> The 'awful hack' statement applied only to what I quoted: sdwdate clock jump detection.

I see. Never mind the rest of the post. I was trying to figure out ways you didn't ask for.

> If we needed a protocol between gw and ws, I would be inclined to first check if the Qubes specific qrexec would work well for that task.

OK, but the solution should potentially be hypervisor-agnostic.

Sure. But I already explained why this is [too] difficult.

This ticket is deprecated by https://phabricator.whonix.org/T385

closing.

HulaHoop closed this task as Resolved. Jul 29 2015, 8:46 PM

In T381#6110, @Patrick wrote:
> In T381#6096, @marmarek wrote:
>
> > There is also a fourth service: qubes.SyncNtpClock, which is called in
> > a selected VM (aka ClockVM) to synchronize time with the "outside world".
> > Later, time from this VM is used to synchronize time in dom0 and every
> > other VM (using said qubes.SetDateTime). Perhaps when using Whonix it
> > should be advised to set the Whonix gateway as ClockVM, which would use
> > sdwdate in the qubes.SyncNtpClock service?
>
> The design goal is that the clocks of the host, [any VMs], Whonix-Gateway and any Whonix-Workstation should each differ slightly.
> (Rationale: Prevent adversaries from linking anonymous and pseudonymous activity. Described in more detail on the Dev/TimeSync wiki page.)

I forgot to add: I think Whonix-Gateway should therefore not be the ClockVM for all other VMs. At least not "directly".

Apart from this... Generally... The idea is good. Reusing Whonix-Gateway and sdwdate. Having the time securely provided by sdwdate. (Since sdwdate depends on Tor, Tor traffic and Tor configuration...) Having a second instance of sdwdate running within Whonix-Gateway that provides time for dom0 and all non-Whonix VMs would be an improvement against time related attacks. Better than NTP. For those who are willing to use Tor.

In T381#6150, @Patrick wrote:
> In T381#6110, @Patrick wrote:
> > In T381#6096, @marmarek wrote:
> >
> > > There is also a fourth service: qubes.SyncNtpClock, which is called in
> > > a selected VM (aka ClockVM) to synchronize time with the "outside world".
> > > Later, time from this VM is used to synchronize time in dom0 and every
> > > other VM (using said qubes.SetDateTime). Perhaps when using Whonix it
> > > should be advised to set the Whonix gateway as ClockVM, which would use
> > > sdwdate in the qubes.SyncNtpClock service?
> >
> > The design goal is that the clocks of the host, [any VMs], Whonix-Gateway and any Whonix-Workstation should each differ slightly.
> > (Rationale: Prevent adversaries from linking anonymous and pseudonymous activity. Described in more detail on the Dev/TimeSync wiki page.)
>
> I forgot to add: I think Whonix-Gateway should therefore not be the ClockVM for all other VMs. At least not "directly".
>
> Apart from this... Generally... The idea is good. Reusing Whonix-Gateway and sdwdate. Having the time securely provided by sdwdate. (Since sdwdate depends on Tor, Tor traffic and Tor configuration...) Having a second instance of sdwdate running within Whonix-Gateway that provides time for dom0 and all non-Whonix VMs would be an improvement against time related attacks. Better than NTP. For those who are willing to use Tor.

Created T387 for it.

Patrick changed the task status from Resolved to Invalid. Jul 30 2015, 9:39 PM
Patrick claimed this task.

'resolved' is confusing here. Not really resolved as in implemented. New ticket: T385. But we might reconsider this one.

Patrick removed Patrick as the assignee of this task. Jul 30 2015, 9:39 PM
HulaHoop reopened this task as Open. Aug 4 2015, 5:53 PM
HulaHoop added a comment. Edited Aug 4 2015, 5:56 PM

I'm OK with this ticket being implemented minimally, rather than waiting or abandoning it because of things that are hard to implement. By minimal I mean the timer loop checker on GW that triggers a clock sync and starts Tor before running timesync if it detects a large delta between what qemu-guest-agent says and the current host clock.

There is no need to implement most of this "watch out for clock jumps and dispatch hooks when detected" code directly in sdwdate. Can be implemented as an independent generic package.
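
A generic package like that could keep detection separate from the reaction by running hook scripts from a directory. A minimal sketch, assuming a hooks directory whose path is only a placeholder chosen by the caller; dispatch_hooks is an illustrative name, not an existing Whonix tool:

```shell
#!/bin/sh
# Hypothetical sketch: run every executable regular file in a hooks directory,
# passing the previous and current clock samples as arguments.
# Returns non-zero if any hook failed.
dispatch_hooks() {
    local hooks_dir prev now hook failed
    hooks_dir="$1"
    prev="$2"
    now="$3"
    failed=0
    for hook in "$hooks_dir"/*; do
        # Skip anything that is not an executable regular file. This also
        # covers an empty directory, where the glob stays unexpanded.
        [ -f "$hook" ] && [ -x "$hook" ] || continue
        "$hook" "$prev" "$now" || failed=1
    done
    return "$failed"
}
```

Whatever detects the jump would call something like `dispatch_hooks /path/to/hooks.d "$prev" "$now"`, and packages such as sdwdate could then ship their own hook without the detector knowing about them.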

HulaHoop closed this task as Resolved. Dec 16 2015, 10:42 PM
HulaHoop claimed this task.
Patrick changed the task status from Resolved to Invalid. Jun 2 2016, 1:35 PM