
iptables block network access until sdwdate succeeded
Open, NormalPublic

Description

Use iptables to block network access until sdwdate has succeeded. Reasons:

  • cover cases where sdwdate is slow or failing
  • catch race conditions where sdwdate is slower than a user starting a client program, server or daemon that already issued network traffic and leaked the time

Previously this was implemented in the form of the timesync progress bar. But such a progress bar was bad for various reasons:

  • not enforced, easily ignored
  • does not stop automatically starting applications and/or the user from using the network
  • a popup which is bad for usability
  • two or more popups [when using multiple Whonix-Workstations] on the same desktop when using Qubes [due to its nature of using seamless mode]

A follow up task of T300.

Implementation:

  • after boot whonix-gw-firewall / whonix-ws-firewall should block the network for everything but Tor and sdwdate
    • should create a /var/run/whonix_firewall/first_run_current_boot.status file
    • when Whonix firewall gets restarted and /var/run/whonix_firewall/first_run_current_boot.status already exists, it should unblock the network and create a status file /var/run/whonix_firewall/consecutive_run.status.
  • after the first time synchronization succeeded, sdwdate should issue unlocking the network
    • sdwdate already creates a status file /var/run/sdwdate/first_success, then
    • reload whonix_firewall
  • enabled by default
  • configuration options to disable all of this
  • all of this should include a safeguard that lets the user allow network access even if one day a case is met where sdwdate is permanently failing
  • sdwdate-gui should show the status of network time synchronization
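
The status file handling for the Whonix firewall described above could look roughly like the following. This is only a minimal sketch: the function name whonix_firewall_network_state and the overridable directory argument are made up for illustration, and the actual iptables calls are left out.

```shell
#!/bin/sh
# Sketch only, not the real whonix_firewall code.
# Hypothetical helper: prints "block" on the first run of the current boot
# and "unblock" on every later run, creating the status files on the way.
whonix_firewall_network_state() (
    status_dir="${1:-/var/run/whonix_firewall}"
    mkdir -p "$status_dir" || exit 1

    if [ -e "$status_dir/first_run_current_boot.status" ]; then
        # The firewall was already started once during this boot,
        # so this is a restart or reload: unblock the network.
        touch "$status_dir/consecutive_run.status"
        echo "unblock"
    else
        # First run after boot: keep everything but Tor and sdwdate blocked.
        touch "$status_dir/first_run_current_boot.status"
        echo "block"
    fi
)
```

With this layout, sdwdate reloading the firewall after creating /var/run/sdwdate/first_success would hit the "unblock" branch, and so would a manual firewall restart by the user when sdwdate is permanently failing.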

Testing:

sudo rm /var/run/sdwdate/* && sudo service sdwdate restart && sudo service tor restart && whonixcheck_tor_bootstrap_wait_max=10 whonixcheck --gui --cli

Details

Impact
Normal

Event Timeline

Patrick created this task.Aug 3 2016, 5:56 PM
Patrick updated the task description. (Show Details)Aug 23 2016, 6:57 PM
Patrick updated the task description. (Show Details)

I like the idea but how do you plan to tackle the case when a user resumes a guest from sleep?

A clock jump detection script that looks at hwclock maybe?

Thank you for participating in this one! I can really use some input here.

I like the idea but how do you plan to tackle the case when a user resumes a guest from sleep?

It's a good question. Please create a separate ticket for that since this is very hard already. (At least it looks like that atm.)

A clock jump detection script that looks at hwclock maybe?

Restarting sdwdate is done for Qubes-Whonix. References:

I am not sure we have a ticket for Non-Qubes-Whonix clock jump detection. Probably not. We discussed this in T381 and T385. Non-Qubes-Whonix sdwdate restart on suspend / clock jump detection would be useful independently from this ticket also.
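
To make the clock jump idea a bit more concrete, here is a rough sketch. Note that it does not read hwclock; it swaps in a simpler technique and compares how far the system clock moved against how long a loop actually slept. The function name clock_jumped, the 10 second interval and the 5 second tolerance are assumptions for illustration only, and whether a suspended or paused guest shows up this way would have to be tested per platform.

```shell
#!/bin/sh
# Sketch only. Hypothetical helper: succeeds (exit 0) if the wall clock moved
# by noticeably more or less than the time that was expected to pass.
# Arguments: earlier epoch seconds, later epoch seconds,
#            expected elapsed seconds, tolerance in seconds (default 5).
clock_jumped() (
    before="$1"
    after="$2"
    expected="$3"
    tolerance="${4:-5}"

    drift=$(( $after - $before - $expected ))
    if [ "$drift" -lt 0 ]; then
        drift=$(( 0 - $drift ))
    fi
    [ "$drift" -gt "$tolerance" ]
)

# Example loop (commented out so that sourcing this file does nothing):
# while :; do
#     before="$(date +%s)"; sleep 10; after="$(date +%s)"
#     if clock_jumped "$before" "$after" 10; then
#         echo "clock jump detected"   # e.g. restart sdwdate here
#     fi
# done
```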


Designing the overall interaction of this is not simple. Working on that atm. I am considering introducing two new concepts:

  • Whonix firewall in restricted mode
    • the mode after booting Whonix
    • where only sdwdate and whonixcheck [local-only] Tor bootstrap test are allowed
    • hidden services blocked
    • any other client applications blocked, including apt-get
  • Whonix firewall in full mode
    • the mode on Whonix firewall restart (?)
    • the mode on sdwdate restart (?)
    • where all torified connections are allowed as usual
    • like in Whonix 13

Any suggestions on the concept names and concepts?

I am wondering on how to integrate this with whonixcheck Tor bootstrap test. (i.e. the passive popups "Connecting to Tor...", "Connected to Tor.")

  • keeping sdwdate unspecific to Whonix and Whonix firewall
  • sdwdate-gui vs whonixcheck Tor bootstrap test
  • keeping it simple for the user

I guess in Whonix 14 whonixcheck, user notification "Connecting to Tor..." should technically mean "Tor bootstrap not finished yet, in progress AND sdwdate in progress".... As well as user notification "Connected to Tor." should technically mean "Tor bootstrap succeeded AND sdwdate time synchronization succeeded"...?

For Non-Qubes-Whonix it is a bit simpler, since we have separate passive popups for gateway and workstation in two separate contained VM windows.

For Qubes-Whonix it looks weird to have the gateway passive popup say "Connected to Tor." while the workstation sdwdate synchronization may not be done yet and workstation networking may therefore still be locked. And in case sdwdate fails on the workstation, the network stays locked "forever" without manual user interaction. sdwdate-gui for Qubes, i.e. T534, would solve this, but that is a Python GUI task that I am not sure I can finish for Whonix 14.

Patrick updated the task description. (Show Details)Aug 24 2016, 9:52 PM

clock jump detection would be useful independently from this ticket also.

OK. I'll look around to see if this is already implemented in some package we can reuse. If not I'll try finding more about the best way to do this.

Any suggestions on the concept names and concepts?

I was thinking we can call them

whonix firewall: timesync fail closed mode & fail open mode - it's more specific than restricted and full, which can imply many other things. I like the concepts and how they work.

I think it's best to display simplified progress dialogs: "Tor is connecting" is preferable to "Tor is bootstrapping". However we should still keep the sdwdate dialogs separate from Tor ones to not confuse a user when reporting a bug - mistaking sdwdate sync failure for a Tor one. I definitely agree with keeping sdwdate unspecific to Whonix and Whonix firewall because it belongs as a project in its own right.

In case sdwdate fails hard, we should display a dialog that explains how to turn off the connection restrictions. This is important because an adversary who can somehow root some of the servers sdwdate relies on could refuse to let sync requests succeed, which can be used to DoS the entire Whonix user base. So we want to make sure that users have an easy way to override this, because they won't be able to connect to research the problem or ask for help.

Instead of monitoring the clock for changes, we could assume that an interrupted Tor connection was caused by a suspend event and use that to initiate syncing. Is the tearing down of stale circuits when waking up the machine detectable in Tor's log? Can this be checked via a controlport event?

- separate whonixcheck help message when Tor bootstrap succeeded but timesync failed
- avoid too technical word "bootstrap"
- output
- comments

https://phabricator.whonix.org/T533

https://github.com/Whonix/whonixcheck/commit/7ab9126eb76e5ee6fc78971d1e96e40cc5265994


whonix firewall: timesync fail closed mode & fail open mode - it's more specific than restricted and full, which can imply many other things. I like the concepts and how they work.

Ok.

I think it's best to display simplified progress dialogs: "Tor is connecting" is preferable to "Tor is bootstrapping".

Doing that already with passive popups. "Connecting to Tor..." "Connected to Tor." With above commit I totally removed bootstrapping from any output.

However we should still keep the sdwdate dialogs separate from Tor ones to not confuse a user when reporting a bug - mistaking sdwdate sync failure for a Tor one.

Done with above commit.

I definitely agree with keeping sdwdate unspecific to Whonix and Whonix firewall because it belongs as a project in its own right.

Yes, fortunately I was able to keep it that way.

In case sdwdate fails hard, we should display a dialog that explains how to turn off the connection restrictions. This is important because an adversary who can somehow root some of the servers sdwdate relies on could refuse to let sync requests succeed, which can be used to DoS the entire Whonix user base. So we want to make sure that users have an easy way to override this, because they won't be able to connect to research the problem or ask for help.

Another way to phrase this is: "To have all Whonix users give up on Whonix's time related deanonymization protections, we just need to take down enough of its time sources." We'll find a better way. Explaining to users how to manually set the time.

The following message is the whonixcheck error active popup message in case initial Tor bootstrap fails after boot. (The markup is not visible, but not handy to create a screenshot.)

ERROR: Time Synchronization Result: 
Whonixcheck gave up waiting. 

Time synchronization status: pending 
sdwdate reports: Maximum allowed number of failures reached in pool 1 (4 of 14). Giving up.
If the problem occurs too frequently, please report it.

Sleeping for 10 minutes. 

Possible issues: 
- sdwdate will need a few more moments for fetching the time. 
- sdwdate time sources might be dysfunctional. 

Recommendations: 

A) Rerun whonixcheck: 
dom0 -> Start Menu -> ServiceVM: sys-whonix -> Whonix Check
or in Terminal: whonixcheck 
or in Terminal with debugging: whonixcheck --debug --verbose --gui --cli 

B) Restart sdwdate. 
dom0 -> Start Menu -> ServiceVM: sys-whonix -> sdwdate-gui -> right click on sdwdate-gui systray -> Restart sdwdate - instantly adjust the time
or in Terminal: sudo service sdwdate restart 

C) Manually set the time. 

As last resort... 

1. Open a terminal. (dom0 -> Start Menu -> ServiceVM: sys-whonix -> Terminal) 
2. Set the clock to the correct time in UTC. (Example.) 
sudo date --set "Mon Aug 29 21:43:23 UTC 2016"
3. Simulate sdwdate success. 
sudo touch /var/run/sdwdate/first_success
4. Rerun whonixcheck.

Still need to somehow explain, that the time should slightly differ for host/gw/ws.

Maybe instruct them to:

-> "Set the clock close to but not exactly the correct time in UTC. +/- a minute."

It's a bit more difficult.

Related:

Humans are not good at choosing random values. Specifically, I doubt someone would pick 0, 1 or 2 or so. Without too much effort, I think this can be made safer and more usable by providing a script for manually setting the clock.

OK. Do you suggest a simple sdwdate input box for them to put their current time in, then it applies the offset range we think is safe before setting the guest time?

Added to bootclockrandomization package. Non-ideal, but less overhead (no additional package just for this) and more code can be reused.

  • bootclockrandomization uses at boot a +/- 180 second offset.
  • Manual clock randomization (for sdwdate permanent failure cases) where the user (hopefully) enters a reasonable correct clock uses a +/- 30 seconds offset to emulate being closer to sdwdate accuracy.
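
For illustration, the manual case could be sketched as below. This is not the code that went into bootclockrandomization; the helper names random_offset_seconds and adjusted_epoch are hypothetical, the sketch assumes GNU date and a readable /dev/urandom, and only the +/- 30 second range is taken from the comment above.

```shell
#!/bin/sh
# Sketch only. Hypothetical helper: print a random whole-second offset
# in the range [-max, +max] (default max: 30).
random_offset_seconds() (
    max="${1:-30}"
    span=$(( 2 * $max + 1 ))
    # Read 4 random bytes as one unsigned decimal integer.
    raw="$(od -An -N4 -tu4 /dev/urandom | tr -d ' \n')"
    echo $(( $raw % $span - $max ))
)

# Hypothetical helper: print the epoch time for a user-entered UTC date
# plus an offset in seconds. Relies on GNU date's --date option.
adjusted_epoch() (
    user_time="$1"
    offset="$2"
    base="$(date --utc --date "$user_time" +%s)" || exit 1
    echo $(( $base + $offset ))
)

# Example (commented out because it changes the system clock):
# offset="$(random_offset_seconds 30)"
# sudo date --utc --set "@$(adjusted_epoch "Mon Aug 29 21:43:23 UTC 2016" "$offset")"
```

The point of the design is that the user only enters what they believe is the correct time, and the script, not the human, picks the offset.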
Patrick updated the task description. (Show Details)Sep 5 2016, 4:20 PM

restricted mode -> timesync-fail-closed mode

Should we still go for full mode -> timesync-fail-open mode? I think that sounds a bit strange. An alternative to 'full' would be 'normal' or 'regular' or something like that.

Up to you but I still think timesync-fail-open sounds more technically descriptive from a dev POV than using normal/regular. That isn't a problem because regular users should not even know about it.

On Whonix-Gateway:

GUI applications such as kwrite currently cannot be opened in Whonix firewall timesync-fail-closed mode because localhost connections are also blocked.

But if localhost connections are allowed, then apt-get -> localhost -> Tor -> debian-tor user is allowed to connect anywhere -> connection succeeds. This is a failure, since we are in timesync-fail-closed mode.

Any idea how to elegantly fix this?

Network shouldn't be needed for GUI applications as long as DISPLAY
environment variable is correctly set. Make sure it's :0, and not
localhost:0.

Only kwrite does not work without localhost access. Strange.

echo "$DISPLAY"
:0
IN= OUT=lo SRC=127.0.0.1 DST=127.0.0.1 LEN=52 TOS=0x00 PREC=0x00 TTL=64 ID=3088 DF PROTO=TCP SPT=52713 DPT=111 WINDOW=43690 RES=0x00 SYN URGP=0

However, leafpad works. Another reason to move away from KDE.

Are any other issues to be expected without localhost access?

I'd expect some more problems, but nothing serious. For example CUPS may
not work...

Patrick changed the task status from Open to Review.Sep 16 2016, 4:54 PM
Patrick changed the task status from Review to Open.Dec 16 2016, 5:48 PM

Blocking outgoing connections to 127.0.0.1 in timesync-fail-closed mode creates massive issues. For example, konsole starts but is then unresponsive (frozen) because its localhost TCP packets are blocked. Since we will also stay with kwrite, a solution needs to be found.

temporary workaround for 127.0.0.1 connections issues to unbreak developers repository:

The current workaround (to unbreak Whonix developers repository) allowing full outgoing access to 127.0.0.1 is as bad as not implementing this ticket. (One could run apt-get update which results in uwt apt-get update connecting to 127.0.0.1, where Tor would accept it.)

Rather than blocking all connections to 127.0.0.1 and therefore breaking konsole, kwrite among other things we could blacklist all connections to ports included in socks_ports_list (i.e. block connections to all Tor SocksPorts).
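
A rough sketch of that blacklist idea, printing the rules instead of applying them so it can be inspected safely. socks_ports_list is the name used above, but its format (space separated) and the two port numbers here are placeholders, and the function name print_socksport_reject_rules is hypothetical.

```shell
#!/bin/sh
# Sketch only. Placeholder value; the real list lives in the firewall config.
socks_ports_list="9050 9150"

# Hypothetical helper: print one iptables command per Tor SocksPort that
# rejects local TCP connections to that port on 127.0.0.1.
print_socksport_reject_rules() (
    iptables_cmd="${iptables_cmd:-iptables}"
    for port in $socks_ports_list; do
        echo "$iptables_cmd -A OUTPUT -o lo -d 127.0.0.1 -p tcp --dport $port -j REJECT"
    done
)

print_socksport_reject_rules
```

Because iptables evaluates rules in order, these rejects would have to be added before any rule that accepts all traffic to 127.0.0.1, otherwise they never match. Other Tor listeners that are not SocksPorts are not covered by this sketch.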


Another issue is, that whonix-gw / whonix-ws TemplateVM's qubes-whonix-torified-updates-proxy-check.service fails as long as sys-whonix is still in firewall_mode=timesync-fail-closed mode. Any idea how to solve that?

The TemplateVM is up (and runs qubes-whonix-torified-updates-proxy-check.service) faster than sdwdate succeeded on sys-whonix [not accepting traffic yet].

One rather hacky way might be for qubes-whonix-torified-updates-proxy-check.service to use qrexec to communicate with sys-whonix and wait until sys-whonix is in firewall_mode=full mode.

Alternatively, is there a way to delay TemplateVM start until sys-whonix signals "ready"?

What about retrying qubes-whonix-torified-updates-proxy-check.service on
connection failure?

That's a good idea.
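
A retry wrapper for the check could be as small as the sketch below. The function name retry_until_success, the attempt count and the delay are assumptions for illustration; the actual check command used by the service is not shown in this thread, so a placeholder stands in for it in the usage comment.

```shell
#!/bin/sh
# Sketch only. Hypothetical helper: run a command until it succeeds or the
# attempts run out.
# Usage: retry_until_success ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_until_success() (
    attempts="$1"
    delay="$2"
    shift 2

    n=1
    while [ "$n" -le "$attempts" ]; do
        if "$@"; then
            exit 0
        fi
        # Do not sleep after the final failed attempt.
        if [ "$n" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        n=$(( $n + 1 ))
    done
    exit 1
)

# Example (placeholder command name, commented out):
# retry_until_success 30 10 some-proxy-check-command
```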

First I thought allowing incoming traffic on Whonix-Workstation in timesync-fail-closed mode would be okay, since outgoing traffic would be blocked. On second thought, it would not be useful if a hidden service was reachable but the backend server could not reply (still blocked in timesync-fail-closed mode). So...

TODO:

Patrick changed the task status from Open to Review.Dec 25 2016, 3:52 AM

Note to self: try to disable and see if konsole and kwrite are still functional in timesync-fail-closed mode.

## TODO: temporary - https://phabricator.whonix.org/T533#10288
$iptables_cmd -A OUTPUT -m iprange --dst-range "127.0.0.1" -j ACCEPT

https://github.com/Whonix/whonix-ws-firewall/blob/master/usr/bin/whonix_firewall#L318

Patrick changed the task status from Review to Open.Dec 21 2017, 5:55 PM
Patrick edited projects, added Whonix 15; removed Whonix 14.
In T533#13328, @Patrick wrote:

Note to self: try to disable and see if konsole and kwrite are still functional in timesync-fail-closed mode.

## TODO: temporary - https://phabricator.whonix.org/T533#10288
$iptables_cmd -A OUTPUT -m iprange --dst-range "127.0.0.1" -j ACCEPT

https://github.com/Whonix/whonix-ws-firewall/blob/master/usr/bin/whonix_firewall#L318

Done.

https://github.com/Whonix/whonix-ws-firewall/commit/af8dc373e301060de48454ceba7f42dcbd1c5e92