Page MenuHomePhabricator

cpfp.py hangs on Tor Browser 4.5a[3-4] new requests
Closed, ResolvedPublic

Description

When using cpfp-python in place of cpfp-bash, after filtering the three requests (authenticate "password", setevents stream, getinfo config-text), cpfp.py stops passing subsequent requests, including SIGNAL NEWNYM, as long as Tor Browser is running.

The same happens when CONTROL_PORT_FILTER_DISABLE_FILTERING is set to true.

Already tried a lot on that problem. @Patrick Could you try to reproduce the bug? I'd like to make sure it's not a specific issue on my side

Event Timeline

troubadour claimed this task.
troubadour raised the priority of this task from to Needs Triage.
troubadour updated the task description. (Show Details)
troubadour added subscribers: troubadour, Patrick.

Looks like you are working on cool stuff. :)


Yes, there is some bug going on here.


Developer toolbox:

Normal start command.

sudo bash -x /etc/init.d/control-port-filter-python start

Stop command.

sudo bash -x /etc/init.d/control-port-filter-python stop

If the stop command goes fast, almost instantly, then the process could be killed using sigterm. If it takes "long" (~5 seconds), then cpfpy did not respond to sigterm, and the process is killed using sigkill. The latter is bad. On new startup attempts, you might see in the log.

CRITICAL:19927:Server error [Errno 98] Address already in use

To find any eventual leftovers, stale processes, try this.

ps aux | grep cpf

or

ps aux | grep python

Sometimes I was not able to start a new instance even if ps aux didn't show an old error. Same address already in use error. That baffles me.


Will post more soon.

More stuff the the dev toolbox:

On the gateway, talk to cpfpy directly.

nc 10.152.152.10 9052

To check if the port is free, if cpfpy really has shut down.

sudo lsof -i :9052
COMMAND   PID       USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
cpfp.py 27653 debian-tor    4u  IPv4  73153      0t0  TCP 10.152.152.10:9052 (LISTEN)

Issue:

The sigterm trap doesn't really work.

kill -sigterm 24140

Example from log that I've just seen:

INFO:24140:Configuration read from "/etc/cpfpy.d/30_controlportfilt_default"
INFO:24140:Configuration read from "/etc/cpfpy.d/50_user"
INFO:24140:Trying to start Tor control port filter on IP 10.152.152.10 port 9052
INFO:24140:Tor control port filter started, listening on IP 10.152.152.10 port 9052
INFO:24140:Request: SIGNAL NEWNYM
INFO:24140:Answer : 250 OK
INFO:24140:Request: xxx
INFO:24140:Answer : 510 Unrecognized command "xxx"
INFO:24140:Signal sigterm received. Exiting.
INFO:24140:Request: GETINFO status/bootstrap-phase
INFO:24140:Answer : 250-status/bootstrap-phase=NOTICE BOOTSTRAP PROGRESS=100 TAG=done SUMMARY="Done"

It says exiting, but then doesn't really exit. It does on the the requests, and then when you send another sigterm, then it actually does exit. Could all be somewhat related to this bug. Perhaps the first sigterm only closes the current still open connection? And the next sigterm really shuts it down?


I couldn't reproduce the original bug you reported here. Perhaps what I have reported is still somewhat connected. I don't think it's Tor Browser specific. Perhaps try to connect to it using nc twice using two Konsole tabs.

Patrick triaged this task as Normal priority.Mar 5 2015, 9:55 PM

I believe what you report is connected, and it's not Tor specific because ther is no issue with cpfp-bash.

Before I go deeper following your tests:
When saying cpfp.py hangs, it's not true. The process is stil running, but not accepting any request. (to monitor it, in another terminal I use htop -> F4 (Filter) -> cpf).

When running

sudo service control-port-filter restart

the process stops but does not restart, because of

CRITICAL:12387:Server error [Errno 98] Address already in use

It takes ~1 minute waiting (trying to restart it or not) and it starts again.

So far, the whole thing baffles me too. Have replaced the TCP server by a socket in a while True loop, like in the Tails original script, to no avail.

Now I am sure of it. You cannot have two connections of nc (or anything) talking to cpfpy at the same time. I think this is one of the two original limitations. [1)Only one connection at a time; 2) connections to source aren't help open until other side closes them, not listening to what Tor continues to output, therefore subscribing to events unsupported] [Tails is also affected by those, I think.]

T221 is super simple and will help us debugging this a bit. (Because then 'sudo service control-port-filter-python start/restart' will tell us if that actually failed.)

I thought python would clean up after us. I thought after exiting the script, all listens would be automatically closed. That might not be the case? Did some searching... Perhaps we have to cleanly shut down the server using python code before we exit? (That would be a bit cleaner anyhow.)

Just saw, there is a comment # Accept parallel connections.. Doesn't seem to actually work anymore.

The problem of this thread seems to be, that Tor Browser is keeping the connection open and that multiple connections are broken at the moment. Maybe Tor Browser is using multiple connections. Wouldn't wonder about that.

The problem of this thread seems to be, that Tor Browser is keeping the connection open and that multiple connections are broken at the moment. Maybe Tor Browser is using multiple connections. Wouldn't wonder about that.

That's what seems to happen, Tor Browser keeping the connection(s) open.

I thought that using a TCP server instead of a socket would solve the concurrent connections issue. It does not and the comment in the script is false.

The obvious path was to try threading the process, with all the alledged performance and error penalties. Fortunately, there is a neat solution: gevent gives you threads, but without using threads.

In the following commit, cpfp.py handles simultaneous connections. Tested with nc, telnet, whonixcheck and Tor Browser (open/close, new identity) at the same time. The number of concurrent connections is limited to 5, modifiable in

server = StreamServer((IP, PORT), handle, spawn=5)

https://github.com/troubadoour/control-port-filter-python/commit/9447486676f8720e444b330b28dbf2e40d3e5362

We are left with the way python handles the clean up after receiving the terminate signal. At the moment, with

sudo service control-port-filter stop

or

sudo bash -x /etc/init.d/control-port-filter start

the process really stops (it takes ~2 seconds), but nothing is logged. Looking into it.
The latter is looking for cpf-tcpserver (pgrep) but that should not be an issue.

Will test now.

/etc/init.d/control-port-filter

Please don't use that anymore. I recommend:

sudo update-rc.d control-port-filter remove

And.

sudo rm /etc/init.d/control-port-filter

/etc/init.d/control-port-filter-python and/or control-port-filter-python instead please.

Tested. Merged. Works for me.

/etc/init.d/control-port-filter-python and/or control-port-filter-python instead please.

Yes, done. Removed control-port-filter and installed control-port-filter-python. Had to modify whonixcheck/check_control_port_filter with /var/run/control-port-filter-python/pid.

cpfp.py exits and restarts normally (no 2 seconds delay), stopping the server and logging the signal.
https://github.com/troubadoour/control-port-filter-python/commit/ef69916111a4c24780623b0dd6be42c2f88796da

I can only keep repeating myself. :)
Tested. Merged. Works for me.