Discussion:
[Twisted-Python] twisted listening on UDP port, why?
Glyph
2016-04-20 00:01:22 UTC
I'm trying to lock down a box, and came across a peculiarity with a twisted daemon -- it's binding to 0.0.0.0 for UDP on an arbitrary port.
I can't figure out why this is happening -- I'm not consciously/explicitly using anything on UDP, and the port changes every time I start up a daemon.
Does anyone have a clue what it could be?
Perhaps this is libc's DNS client? Twisted doesn't do anything like this.

-glyph
Phil Mayers
2016-04-20 11:33:44 UTC
Post by Glyph
Perhaps this is libc's DNS client? Twisted doesn't do anything like this.
It does something similar with win32reactor IIRC?

http://twistedmatrix.com/trac/browser/tags/releases/twisted-16.1.1/twisted/internet/posixbase.py#L60

...but that binds to 127.0.0.1 not 0.0.0.0

glibc doesn't hold its DNS sockets open AFAIK - it closes them once the
reply is done.
Jean-Paul Calderone
2016-04-20 13:22:58 UTC
Hi,

What do the logs for the app say? Twisted logs a message when it binds a
UDP port.

Or, another thought, you could put a breakpoint on listenUDP (or socket.bind
or something) and then run the process under pdb and look at the stack
trace.
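
A rough sketch of that idea without pdb, assuming the daemon creates its sockets through Python's standard socket.socket class: wrap bind and sendto so that each call prints a short stack trace. The names trace_socket_method and seen are made up for illustration. sendto is wrapped as well because a UDP socket can end up bound without any explicit bind() call.

```python
import socket
import traceback

seen = []  # hypothetical record of which wrapped methods were called


def trace_socket_method(name):
    """Wrap socket.socket.<name> so every call prints where it came from."""
    original = getattr(socket.socket, name)

    def wrapper(self, *args, **kwargs):
        seen.append(name)
        print("socket.%s called with %r" % (name, args))
        traceback.print_stack(limit=6)  # written to stderr
        return original(self, *args, **kwargs)

    setattr(socket.socket, name, wrapper)


# Install the wrappers before the application starts creating sockets.
for method_name in ("bind", "sendto"):
    trace_socket_method(method_name)
```

Importing a module like this at the very top of the daemon's entry point should be enough for the traces to show which library is touching UDP.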

You could also try sending some traffic to the port and see what happens.
:) Maybe you'll get something back that identifies it or maybe you'll
provoke some more logging code somewhere.

Jean-Paul
_______________________________________________
Twisted-Python mailing list
http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
Glyph
2016-04-21 01:15:55 UTC
A specific library was keeping the port open. I'm tracking down how/why right now.
So this was a fun <sarcasm> thing to learn...
An undocumented (yay) feature of python appears to be... python binds to a random port on all interfaces (0.0.0.0) once you send UDP data through it. I assume this is to allow for a response to come back.
This isn't so much a feature of Python as it is a feature of the BSD sockets API. Sending traffic through a socket, whether it's TCP or UDP, has to bind a client port. Given the nature of UDP, binding on all interfaces is the expectation unless you specify.

I didn't have time to test a simple C program before sending this message, but https://github.com/python/cpython/blob/master/Modules/socketmodule.c only calls "bind()" from sock_bind, not from send(), nor does https://github.com/python/cpython/blob/master/Lib/socket.py engage in any such shenanigans.
We're using statsd for metrics in our twisted daemon, to detect issues post-deployment.
If you haven't used it, it's a node.js daemon from etsy that collects udp data and pipes it into python's graphite/carbon libraries. Then you get fancy graphics.
There's also a Twisted version :) https://pypi.python.org/pypi/txStatsD

txStatsD contains both server and client, so maybe you want to use that client if you want better control over the UDP port.
# this does nothing...
sock = socket.socket(family, socket.SOCK_DGRAM)
# but this binds to 0.0.0.0
sock.sendto(data.encode('ascii'), addr)
Sending data to the stats collector on 127.0.0.1:8125 inherently made python bind to 0.0.0.0, and on a port that seems to be in the 40000-60000 range.
That range is the ephemeral client port range <https://en.wikipedia.org/wiki/Ephemeral_port>, so that's what would be expected of an implicitly-bound socket.
Since a socket to the stats collector is only created once for the process, Python holds that open the entire time.
If it needs to send UDP traffic, it needs to be able to receive UDP traffic as well. You can bind it to a more specific interface, but you can't prevent the port from opening to receive traffic.
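
As a minimal sketch of binding to a more specific interface: call bind() on the loopback address with port 0 before the first sendto(), so the kernel still picks an ephemeral port but only on 127.0.0.1. The collector socket here is a hypothetical local stand-in for the stats collector, and the metric payload is made up for illustration.

```python
import socket

# Hypothetical stand-in for the stats collector (127.0.0.1:8125 in the thread).
collector = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
collector.bind(("127.0.0.1", 0))
collector.settimeout(5)
collector_addr = collector.getsockname()

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# Bind explicitly to loopback; port 0 lets the kernel choose the port.
sender.bind(("127.0.0.1", 0))
sender.sendto(b"example.metric:1|c", collector_addr)

bound_host, bound_port = sender.getsockname()
print("sender is bound to %s:%d" % (bound_host, bound_port))
```

The port is still open for the lifetime of the socket, but it should no longer show up as listening on every interface.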

-glyph
Phil Mayers
2016-04-23 08:52:35 UTC
Thanks for all this.
Post by Glyph
This isn't so much a feature of Python as it is a feature of the BSD
sockets API. Sending traffic through a socket, whether it's TCP or
UDP, has to bind a client port. Given the nature of UDP, binding on
all interfaces is the expectation unless you specify.
I didn't have time to test a simple C program before sending this
message, but
https://github.com/python/cpython/blob/master/Modules/socketmodule.c only
calls "bind()" from sock_bind, not from send(), nor does
https://github.com/python/cpython/blob/master/Lib/socket.py engage in
any such shenanigans.
From what I could tell, the actual communication and binding happens
somewhere in the c module.
Not so. It's down inside the kernel. All applications using the socket
API in this way will display this behaviour, regardless of language.

Seriously, try it and see:

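#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>
#include <stdio.h>

int main(int argc, char* argv[]) {
    int s;
    struct sockaddr_in dst;

    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons(37);
    /* INADDR_LOOPBACK is in host byte order, so convert it */
    dst.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    /* the socket starts out unbound */
    s = socket(AF_INET, SOCK_DGRAM, 0);
    printf("socket created\n");
    sleep(30);
    /* the first send makes the kernel pick a local port */
    sendto(s, "foo", 3, 0, (struct sockaddr *)&dst, sizeof(dst));
    printf("socket used\n");
    sleep(30);
    return 0;
}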

Compile & run the program, then quickly run lsof on the process, and you'll see:

test 16258 pjm3 3u sock 0,8 0t0 87111053 protocol: UDP

...wait until it has printed that it has used the socket, repeat and
you'll see:

test 16258 pjm3 3u IPv4 87111053 0t0 UDP *:51669

As glyph says, this is an inherent feature of the socket API. When you
create a socket, it is unbound because you might be about to call bind()
yourself.

If you then use it without binding it, the kernel has to allocate a
source port, and in turn an interface, and the only sensible choice
absent any instructions from userland is INADDR_ANY.

This is definitely not Python doing this.
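
The same thing can be observed from Python without lsof, as a rough sketch assuming a POSIX system where getsockname() on an unbound socket reports port 0: check the socket's local address before and after the first sendto(). The receiver socket is a hypothetical local destination so the example does not depend on any service listening.

```python
import socket

# Hypothetical local destination, so nothing external is required.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
before = sock.getsockname()   # unbound: no local port assigned yet
sock.sendto(b"foo", receiver.getsockname())
after = sock.getsockname()    # the kernel has now chosen a local port

print("before sendto:", before)
print("after sendto: ", after)
```

No bind() call appears anywhere for sock, yet it ends up with a local port on the wildcard address, which matches the lsof output above.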
