Discussion:
[Twisted-Python] Endpoint composition syntax
Kevin Conway
2016-09-07 02:11:33 UTC
I'm not opposed to a fresh syntax, but I do believe the current
implementation can be used for composition. The parser for endpoint strings
is simplistic, as Glyph points out, but there is nothing preventing it
from having nested endpoint definitions. We used the existing syntax when
writing the HAProxy endpoint wrapper:
https://github.com/twisted/twisted/blob/trunk/src/twisted/protocols/haproxy/_parser.py
.

Granted, this case doesn't come with any configuration options but it shows
a potential path for adding wrapping functionality in the current
implementation. I think there are some downsides when it comes to args, kwargs
management. To support them across multiple, arbitrary nested endpoints the
kwargs would need to have non-colliding names and the wrapper would need to
do some amount of introspection on args to determine if the leading values
are for it or another endpoint.
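
To make the wrapping idea concrete, here is a minimal sketch of the re-join-and-re-parse approach, assuming the outer parser has already split the description into positional args and kwargs. The function names and the `timeout` option are hypothetical and are not taken from the HAProxy parser; the point is only to show why the wrapper's keyword names must not collide with those of the inner endpoint.

```python
def quote(value):
    """Escape backslashes and colons so a value survives as one argument."""
    return value.replace("\\", "\\\\").replace(":", "\\:")


def unparse(args, kwargs):
    """Rebuild a description string from already-parsed args and kwargs."""
    parts = [quote(str(arg)) for arg in args]
    parts += sorted(
        "%s=%s" % (quote(str(key)), quote(str(value)))
        for key, value in kwargs.items()
    )
    return ":".join(parts)


def parse_wrapper(args, kwargs, wrapper_kwargs=("timeout",)):
    """Return (wrapper_options, inner_description).

    wrapper_kwargs is a hypothetical set of names reserved for the wrapper.
    Any inner endpoint that also accepts one of these names can no longer
    receive it, which is the collision problem described above.
    """
    own = {k: v for k, v in kwargs.items() if k in wrapper_kwargs}
    rest = {k: v for k, v in kwargs.items() if k not in wrapper_kwargs}
    return own, unparse(args, rest)


# "wrap:tcp:8080:interface=127.0.0.1:timeout=5" after the outer parser has
# stripped the "wrap" prefix and tokenized the remainder:
own, inner = parse_wrapper(
    ("tcp", "8080"), {"interface": "127.0.0.1", "timeout": "5"}
)
```

In this sketch `inner` is what the wrapper would hand back to serverFromString; positional arguments are passed through untouched, so the wrapper cannot claim any of them without the introspection mentioned above.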

I don't think the developer experience would be a good one, but there _is_
a way to compose endpoints if you're set on doing so.
Currently there is no way to explicitly compose Twisted endpoints, but
several endpoint implementations have arisen that explicitly wrap another
endpoint, and so have needed a way to do this. So far, this has been
implemented by passing in an endpoint description, and then calling
serverFromString/clientFromString internally in the endpoint to construct
the wrapped endpoint. I've seen two different ways of encoding the "inner"
1. We may want a syntax that supports composing multiple endpoints,
not just 2.
2. The existing syntax is kind of crummy; ":" as a separator has
serious problems, considering its presence in both URLs and IPv6 literals.
I wouldn't say we should *necessarily* re-design the whole syntax to
accommodate this, but just having a whole new syntax might not be a bad
thing either.
-glyph
_______________________________________________
Twisted-Python mailing list
http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
Kevin Conway
2016-09-07 13:21:36 UTC
However, it’s worth highlighting that endpoints are inches away from
being a really powerful composable tool for saying “tunnel this protocol
over this other protocol”.

I'm not sure if this is the same concern as the OP. What you've described
is mixing the ideas of composing protocols and composing transports (which
may be inherent to endpoints). TCP, UDP, and UNIX sockets are transport layer
choices and adapting them to each other requires opening new file
descriptors. Protocols are transport agnostic and can already be infinitely
composed if implemented as well-behaved protocol wrappers.
I mentioned this casually to Tristan in IRC, but the current syntax and
use of endpoints in Twisted gets close to a quite profound idea about
protocol nesting that is lurking in the space of convention. Extending the
endpoint syntax to have a blessed way of essentially composing endpoints
together gives the potential of using the endpoint syntax to design and
implement various “tunneling” features that are very useful.
If we take the arrow syntax, for a moment, you could conceive a truly
insane client wanting to write an endpoint to run FTP over that’s a bit
tcp:host=someftp.server:port=21->http:verb=connect->tcp:->socks5:targetname=mypersonalhttpproxy.server->tcp:host=mycorporatesocksproxy.server:port=2121
This would represent tunneling FTP over TCP over HTTP over TCP over SOCKS
over TCP. For extra fun you can throw in some Tor.
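As a rough illustration of how a top-level parser might read such a chain, the sketch below splits a description on the arrow and lists the endpoint type of each layer. It is a sketch only: the function names are hypothetical, and a naive split like this would misbehave if "->" ever appeared inside an argument value.

```python
def split_layers(description, separator="->"):
    """Split a composed description into per-layer endpoint strings."""
    return description.split(separator)


def layer_names(description):
    """Return the endpoint type (text before the first ':') of each layer."""
    return [layer.split(":", 1)[0] for layer in split_layers(description)]


chain = (
    "tcp:host=someftp.server:port=21"
    "->http:verb=connect"
    "->tcp:"
    "->socks5:targetname=mypersonalhttpproxy.server"
    "->tcp:host=mycorporatesocksproxy.server:port=2121"
)
names = layer_names(chain)
```

Each element of `split_layers(chain)` is an ordinary endpoint string in the existing syntax, so the individual plugins would not need to know about the arrow at all.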
Alternatively, and quite a bit more realistically, you could have a
userspace SCTP implementation that supports being tunnelled over UDP. In
this instance, rather than write a single “sctp-over-udp” endpoint, you
could write a generic sctp endpoint that, if it is composed with another
endpoint, expects that endpoint to provide a datagram-style transport to it.
All of this is rather pie in the sky, and potentially the purest example
of YAGNI that it’s possible to imagine. However, it’s worth highlighting
that endpoints are inches away from being a really powerful composable tool
for saying “tunnel this protocol over this other protocol”. We may not
*want* that, but it’s an interesting thought.
Cory
Tom Prince
2016-10-12 21:11:45 UTC
I think if we are reconsidering the endpoint syntax, we should explicitly
have nested delimiters for quoting (so something like () or {}), to easily
allow multiple levels of nested endpoint strings (or any other kind of
string).
Glyph Lefkowitz
2016-10-12 21:19:17 UTC
I think if we are reconsidering the endpoint syntax, we should explicitly have nested delimiters for quoting (so something like () or {}), to easily allow multiple levels of nested endpoint strings (or any other kind of string).
Do you have a suggestion or an example of how this might be used?

-g
Tom Prince
2016-10-13 02:35:22 UTC
Post by Glyph Lefkowitz
Do you have a suggestion or an example of how this might be used?
The idea I have in my head isn't backwards compatible, but I was thinking
of something like

haproxy:(tls:hostname.example:endpoint=(tcp:7.6.5.4:443))

This would break any endpoint description that starts with `(` but allows
arbitrarily nested endpoints (or even just data containing `:`).
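
A minimal sketch of how parenthesis quoting could be parsed, assuming `:` only separates arguments when it appears outside any parentheses. Nothing here exists in Twisted; the function names and the returned (name, args) structure are invented for illustration, and a value such as `(a)(b)` is not handled.

```python
def split_top_level(description):
    """Split on ':' only where the parenthesis depth is zero."""
    parts, current, depth = [], [], 0
    for ch in description:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced ')'")
        if ch == ":" and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    if depth != 0:
        raise ValueError("unbalanced '('")
    parts.append("".join(current))
    return parts


def parse_nested(description):
    """Return (name, [(key_or_None, value), ...]), recursing into (...)."""
    name, *raw_args = split_top_level(description)
    args = []
    for raw in raw_args:
        if raw.startswith("("):
            key, value = None, raw  # positional, quoted value
        else:
            key, eq, value = raw.partition("=")
            if not eq:
                key, value = None, raw  # positional, plain value
        if value.startswith("(") and value.endswith(")"):
            value = parse_nested(value[1:-1])
        args.append((key, value))
    return name, args


tree = parse_nested("haproxy:(tls:hostname.example:endpoint=(tcp:7.6.5.4:443))")
```

Because the splitter tracks depth rather than escaping, the inner descriptions stay readable at any nesting level, which is the main advantage over backslash quoting.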

Thinking about it some more, there aren't currently any endpoint
descriptions that have an empty name, so we could have an entirely new
syntax that starts with `:`. If we went in that direction, we'd definitely
want to think about future extensibility when designing it.
Glyph Lefkowitz
2016-10-13 07:23:33 UTC
Thinking about it some more, there aren't currently any endpoint descriptions that have an empty name, so we could have an entirely new syntax that starts with `:`. If we went in that direction, we'd definitely want to think about future extensibility when designing it.
This applies more generally; no need for any weird hacks. Any 'new' plugin could just opt in to a different syntax; we can just look up until the first ':'; we just need to define a new interface for a new syntax.
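
One way to picture this opt-in is sketched below, assuming two plugin registries: one for plugins that receive pre-tokenized args and kwargs, and one for plugins that receive the raw remainder of the string. The registries, the tokenizer, and the plugin functions are all hypothetical stand-ins, not Twisted interfaces, and the tokenizer ignores backslash escapes for brevity.

```python
LEGACY_PLUGINS = {}  # prefix -> callable(*args, **kwargs)
RAW_PLUGINS = {}     # prefix -> callable(rest_of_description)


def legacy_tokenize(rest):
    """Rough stand-in for the existing ':' / '=' tokenizer (no escapes)."""
    args, kwargs = [], {}
    for token in rest.split(":"):
        key, eq, value = token.partition("=")
        if eq:
            kwargs[key] = value
        else:
            args.append(token)
    return args, kwargs


def parse(description):
    """Dispatch on everything before the first ':'."""
    prefix, _, rest = description.partition(":")
    if prefix in RAW_PLUGINS:
        # New-style plugin: it decides how to interpret the remainder.
        return RAW_PLUGINS[prefix](rest)
    if prefix in LEGACY_PLUGINS:
        args, kwargs = legacy_tokenize(rest)
        return LEGACY_PLUGINS[prefix](*args, **kwargs)
    raise ValueError("unknown endpoint type: %r" % (prefix,))


def tcp(port, interface=""):
    return ("tcp", int(port), interface)


def haproxy(rest):
    # A wrapping plugin can simply feed the remainder back through parse().
    return ("haproxy", parse(rest))


LEGACY_PLUGINS["tcp"] = tcp
RAW_PLUGINS["haproxy"] = haproxy

result = parse("haproxy:tcp:8080:interface=127.0.0.1")
```

Existing plugins keep working unchanged in this arrangement, while a plugin registered under the raw interface is free to adopt whatever quoting rules it likes.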

-glyph
Kevin Conway
2016-10-13 11:14:27 UTC
Post by Glyph Lefkowitz
we can just look up until the first ':'; we just need to define a new
interface for a new syntax.

What do you think of adding a special argument for endpoint strings called
"wraps" or "pipe" that tells the parser to recombine the right-hand side
and send it back through the parser? For example:

haproxy:pipe=ssl:port=443:privateKey=/etc/ssl/server.pem:extraCertChain=/etc/ssl/chain.pem:sslmethod=SSLv3_METHOD:dhParameters=dh_param_1024.pem:pipe=tcp:7.6.5.4:443:pipe=unix:path=/var/run/web.sock

I believe this would fit as a backwards compatible change to the syntax and
would also position us to add composition support to the existing endpoints
in backwards compatible ways. As endpoints gain composition support,
existing users can opt-in by adding the new argument to existing string
descriptors.
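
A rough sketch of the recombination step, reading the emphasized marker in the example as a plain `pipe=` argument. The function name and the returned (name, own_tokens, inner) tuple are hypothetical; the sketch splits naively on `:` and ignores backslash escapes.

```python
def split_pipes(description, marker="pipe="):
    """Return (name, own_tokens, inner), where inner is the parsed
    right-hand side of the first 'pipe=' token, or None if there is none."""
    tokens = description.split(":")
    name, rest = tokens[0], tokens[1:]
    own = []
    for index, token in enumerate(rest):
        if token.startswith(marker):
            # Recombine everything to the right and send it back through.
            inner = ":".join([token[len(marker):]] + rest[index + 1:])
            return name, own, split_pipes(inner, marker)
        own.append(token)
    return name, own, None


nested = split_pipes(
    "haproxy:pipe=tcp:7.6.5.4:443:pipe=unix:path=/var/run/web.sock"
)
```

One consequence visible in the sketch is that every token before the first `pipe=` belongs to the outer endpoint and everything after it belongs to the inner one, so `pipe` has to be the last argument an endpoint receives.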

This topic coming back up is timely for me. I was recently talking with a
user of the haproxy endpoint wrapper who was hitting an issue with the SSL
endpoint not playing well when used in composition. I'll spin off another
thread for that topic, but coming up with a syntax for composition is going
to be a prerequisite for having true composition support.
Tom Prince
2016-10-13 19:47:54 UTC
Post by Glyph Lefkowitz
This applies more generally; no need for any weird hacks. Any 'new'
plugin could just opt in to a different syntax; we can just look up until
the first ':'; we just need to define a new interface for a new syntax.

I don't think that this provides a good user experience.

1) There are existing endpoints that want nestable endpoints, so either
a) They don't change, somewhat defeating the purpose of having a new
syntax (or cluttering the endpoint namespace with less than useful
endpoints).
b) They change incompatibly, defeating the purpose of trying to
maintain backwards compatibility.

2) As a user, I need to learn which endpoints support the new syntax, thus
potentially needing to know both methods of quoting and switch between them
as appropriate.


There are a couple of possible ways around this, without requiring a weird
hack.
- I wonder how many endpoint strings have ever been written whose value
starts with any of `[` `(` or `{`? I suspect that the number might in fact
be 0. In which case, although the change is technically incompatible, in
practice it wouldn't be.
- Alternatively, we could deprecate an unquoted [, (, { at the beginning of
a value, and then after a suitable deprecation period (perhaps additionally
a release where it is just an error), we could repurpose one of them to act
as quoting (leaving the other two for future extensibility).
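
The deprecation route could look roughly like the sketch below, assuming a hook that sees each argument value during parsing. The function names and the warning text are made up for illustration; only the `warnings` module calls are real.

```python
import warnings

RESERVED_OPENERS = ("[", "(", "{")


def starts_with_reserved(value):
    """True if the value begins with a character reserved for future quoting."""
    return value.startswith(RESERVED_OPENERS)


def check_value(value):
    """Warn about values that would change meaning once quoting is added."""
    if starts_with_reserved(value):
        warnings.warn(
            "endpoint argument values starting with '[', '(' or '{' are "
            "deprecated; escape the leading character instead",
            DeprecationWarning,
            stacklevel=2,
        )
    return value
```

After the deprecation period the same check could raise an error for one release before the chosen character is given its new quoting meaning.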
Glyph Lefkowitz
2016-10-13 23:40:40 UTC
Post by Tom Prince
Post by Glyph Lefkowitz
This applies more generally; no need for any weird hacks. Any 'new' plugin could just opt in to a different syntax; we can just look up until the first ':'; we just need to define a new interface for a new syntax.
I don't think that this provides a good user experience.
1) There are existing endpoints that want nestable endpoints, so either
a) They don't change, somewhat defeating the purpose of having a new syntax (or cluttering the endpoint namespace with less than useful endpoints).
We already have this problem, and we will need to do a doc cleanup / consolidation / deprecation pass soon. (see: tcp, tcp6, host, ssl, tls...)
Post by Tom Prince
b) They change incompatibly, defeating the purpose of trying to maintain backwards compatibility.
As you've noticed, we may have several potential "outs" to have practically-compatible parsing syntaxes; the real problem is the internal factoring of the parsing APIs rather than the syntax.
Post by Tom Prince
2) As a user, I need to learn which endpoints support the new syntax, thus potentially needing to know both methods of quoting and switch between them as appropriate.
As a user you're going to need to read the parameter documentation anyway; learning about new syntax is not much different than learning about a new parameter. And you may not realize there _is_ a syntax; most configuration of this type is just copying and pasting a reasonable-looking example. Not to say that we should be spuriously incompatible for those who have learned the rules, but the only rule to learn at this point is ": separates arguments, \ escapes :". We could add one more rule without unduly stressing the cognitive burden of the endpoint system.
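The single rule mentioned here (": separates arguments, \ escapes :") can be sketched in a few lines. This is an illustrative tokenizer, not Twisted's own; the function name is hypothetical, and it treats a backslash as escaping whatever character follows it.

```python
def split_unescaped(description, sep=":", escape="\\"):
    """Split on sep, treating escape + any character as that literal character."""
    parts, current = [], []
    chars = iter(description)
    for ch in chars:
        if ch == escape:
            # Take the next character literally, even if it is the separator.
            current.append(next(chars, ""))
        elif ch == sep:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


tokens = split_unescaped(r"tcp:host=\:\:1:port=80")
```

Here `tokens` keeps the IPv6 literal `::1` intact inside the `host=` argument, at the cost of the author having to write two backslashes.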
Post by Tom Prince
There are a couple of possible ways around this, without requiring a weird hack.
- I wonder how many endpoint strings have ever been written whose value starts with any of `[` `(` or `{`? I suspect that the number might in fact be 0. In which case, although the change is technically incompatible, in practice it wouldn't be.
- Alternatively, we could deprecate an unquoted [, (, { at the beginning of a value, and then after a suitable deprecation period (perhaps additionally a release where it is just an error), we could repurpose one of them to act as quoting (leaving the other two for future extensibility).
I suspect that this would be overkill here; we also have other options, like '(: :)', which would be totally compatible (there are no _arguments_ anywhere presently named "(").

-g

meejah
2016-09-06 20:43:08 UTC
Currently there is no way to explicitly compose Twisted endpoints, but
several endpoint implementations have arisen that explicitly wrap
another endpoint, and so have needed a way to do this.
A couple other examples:

Autobahn provides 'Web Application Messaging Protocol' (WAMP) endpoints
that can use either a Unix, TCP or WebSockets protocol under the hood --
having a proper endpoint syntax for this would be nice. There is *some*
support for endpoint-strings in Autobahn using the backslash trick, but
this also results in some ugliness like:

r"autobahn:tcp\:9000\:interface\=0.0.0.0:url=ws\://localhost\:9000:compress=false"
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ~~~~~~~~~~~~~~~~~~~~~
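
For reference, the string above can be produced mechanically. The sketch below assumes a quoting helper that escapes backslashes, `:` and `=`; the helper is written here for illustration and is not Autobahn's or Twisted's function.

```python
def quote(value):
    """Escape the characters the endpoint tokenizer treats specially."""
    return (
        value.replace("\\", "\\\\")
        .replace(":", "\\:")
        .replace("=", "\\=")
    )


inner = "tcp:9000:interface=0.0.0.0"   # the wrapped endpoint
url = "ws://localhost:9000"            # a URL-valued argument

description = (
    "autobahn:" + quote(inner) + ":url=" + quote(url) + ":compress=false"
)

# Wrapping the whole thing once more escapes the existing backslashes too,
# so the quoting grows with every level of nesting.
wrapped_again = quote(description)
```

Comparing `description` with `wrapped_again` shows why the backslash trick gets ugly quickly: each extra layer doubles the backslashes that are already there and adds new ones.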

Ideally txtorcon would have a way to specify "how" to connect for
client-side connections meaning "I want a stream over the Tor whose
control-port is 'unix:/foo' that connects via SOCKS at unix:/bar to
https://meejah.ca". (This can only be done in code, currently, not
endpoint-strings) Now think about specifying that you want the above
Autobahn connection to go over Tor ;) that is, replacing
"https://meejah.ca" with the monster above...

* * *

On a possible tangent: I wonder if this also ties in with trying to wrap
protocols to "hand off" their transport to another one? Examples of this
are: SOCKS (e.g. speaking one protocol "after" another), or in Autobahn
where it's nice to listen for both "normal" Web requests and also
WebSockets requests on the same port (so there's a protocol that "peeks"
and hands off to HTTP or WebSockets handlers).

I can also imagine doing the same thing with http://magic-wormhole.io
where you would establish a connection via the wormhole mechanism, and
then pass over the established transport to the "real" protocol
(e.g. could be HTTP, WAMP, SSH, WebSockets, something custom, etc) and
this at least has an obvious need for a corresponding string-parser
syntax like the one suggested.
tls:awesome.site.example.com:443->tcp:7.6.5.4:443

A less whimsical syntax than "->" might be better; for example, semicolons, or
something like that.
I wonder if a simple space would work? Downsides would be: requiring
quoting on shells; maybe it would present problems in config-file
use-cases; ...

So, we want "something" the highest-level parser can split on before
handing off bits (or all) of it to the actual plugins:

- can't have ":" since that's already a separator
- can't have "=" since it's already used
- shouldn't already be in popular protocols/options (e.g. anything
valid in a URI?)

In some ways having a two-character separator could be really nice, as
it's far less likely to collide with things? I like that the intent of
"->" is also reasonably obvious, I think. Or at least hopefully looks
strange enough that you'll look it up ;)
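
The criteria above can be turned into a quick check on candidate separators. This is only a sketch: the function name is hypothetical, and the set of characters allowed in a URI is taken from RFC 3986 (unreserved, reserved, and "%"). A candidate that contains at least one character outside that set cannot appear verbatim in a well-formed URI.

```python
import string

# Characters that may appear literally in a URI per RFC 3986.
URI_CHARS = set(
    string.ascii_letters + string.digits
    + "-._~"            # unreserved punctuation
    + ":/?#[]@"         # gen-delims
    + "!$&'()*+,;="     # sub-delims
    + "%"               # percent-encoding
)
IN_USE = {":", "="}     # already meaningful in endpoint strings


def separator_problems(candidate):
    """Return the list of criteria a candidate separator fails."""
    problems = []
    if any(ch in IN_USE for ch in candidate):
        problems.append("uses a character the current syntax already uses")
    if all(ch in URI_CHARS for ch in candidate):
        problems.append("could appear verbatim inside a URI")
    return problems


report = {sep: separator_problems(sep) for sep in ["->", ";", " ", ":"]}
```

Under these assumptions "->" and a space pass both checks, while ";" fails the URI check, which matches the intuition that a two-character separator is less likely to collide.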

Thanks for kicking off discussion :)
--
meejah