Discussion:
[Twisted-Python] Request for help with Twisted bindings in M2Crypto
Matěj Cepl
2016-07-25 09:03:30 UTC
Hello,

I took over maintenance of the (surprisingly) still quite popular M2Crypto
project last year. I have just released 0.25.0, which is my fifth
release during that time, and I think we are slowly but surely moving
towards porting to py3k, cleaning up the code, etc.

I am now working on porting to py3k, but the biggest PITA for me (aside
from the Windows Pain™ ;)) is the Twisted integration module
(https://gitlab.com/m2crypto/m2crypto/blob/python3/M2Crypto/SSL/TwistedProtocolWrapper.py).

1) I get bugs like https://gitlab.com/m2crypto/m2crypto/issues/111 which
I have no idea how to solve, because I don't understand the deep magic
that Twisted seems to me to be, and I am not even sure that M2Crypto is at
fault here (not to mention that I have a hard time reproducing it).

2) Even more pressing is that the Twisted module breaks my tests when
porting to py3k (https://travis-ci.org/mcepl/M2Crypto/jobs/146633964).
Given the opaque and complicated data types in Twisted, I see a horribly
complicated task of diving into it ahead of me, and I am not eager.

3) Moreover, I would like to know how much interest there is in
maintaining the M2Crypto module for Twisted. I got some hope from
http://twistedmatrix.com/trac/wiki/TransportLayerSecurity which seems
like there is an interest in more complete OpenSSL bindings, but OTOH I
see on the list that Twisted now seems to use more and more of
Cryptography (why in the world would somebody give their project such a
confusing name ...). Obviously the simplest way for me would be to cut
the Twisted module from M2Crypto and let it be (although I am afraid I
still have some legacy users who would like to see it maintained, and given
that legacy support is still the most important reason for
maintaining M2Crypto, I don't want to give up lightly).

Moreover, I suspect that for somebody who actually understands
Twisted, most of my problems are trivial and could be solved
easily. So, before I start studying
http://krondo.com/slow-poetry-and-the-apocalypse/ (is there some better
tutorial from the ground up for complete idiots?), I would like to solicit
help with this module here.

Would somebody raise up their hand to help me and help Twisted?

Best,

Matěj
--
https://matej.ceplovi.cz/blog/, Jabber: ***@ceplovi.cz
GPG Finger: 3C76 A027 CA45 AD70 98B5 BC1D 7920 5802 880B C9D8

Give a man a regular expression and he’ll match a string

teach him to make his own regular expressions and you’ve got a man with
problems.
-- yakugo in http://regex.info/blog/2006-09-15/247#comment-3022
Craig Rodrigues
2016-07-25 09:33:20 UTC
Post by Matěj Cepl
Hello,
Would somebody raise up their hand to help me and help Twisted?
Hi,

Earlier this year, I contributed lots of patches to you in M2Crypto to port
it to py3k.
Now I have shifted efforts to Twisted, where in the past month I have
contributed hundreds of patches to help improve py3k support in Twisted.

I'm not sure I have much bandwidth to help more on M2Crypto, but I'll give
advice where I can.

Twisted code is definitely not py3k clean in the parts where it interacts
with OpenSSL.

For example, if you do the following in a Python 3 virtual environment:

pip install pyOpenSSL
git clone https://github.com/twisted/twisted twisted_test
cd twisted_test
python -Wall -bb bin/trial twisted.test.test_sslverify

The tests will pass, but you will get warnings like:

twisted/internet/_sslverify.py:1648: DeprecationWarning: str for buf is no
longer accepted, use bytes
twisted/internet/_sslverify.py:1652: DeprecationWarning: str for
cipher_list is no longer accepted, use bytes
twisted/internet/_sslverify.py:1791: DeprecationWarning: str for
cipher_list is no longer accepted, use bytes

Getting correct usage of bytes vs. str is really important on py3k. I've
observed weird errors when it isn't correct.
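
As a rough illustration of those warnings (not Twisted's actual fix), one way to stay on the safe side is to normalize text to bytes right at the boundary where a value is handed to an OpenSSL wrapper. The helper name `ensure_bytes` and the cipher string are hypothetical, chosen only for this sketch:

```python
def ensure_bytes(value, encoding="ascii"):
    """Return value as bytes, encoding it first if it is text.

    Hypothetical helper for this sketch; it is not part of Twisted,
    pyOpenSSL or M2Crypto.
    """
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)


# A text cipher list becomes bytes before it would be passed on.
cipher_list = ensure_bytes("ECDHE-RSA-AES128-GCM-SHA256:HIGH")

# Values that are already bytes pass through unchanged.
already_bytes = ensure_bytes(b"HIGH")
```

Doing the conversion once at the boundary keeps the rest of the code free of ad hoc encode() calls.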

--
Craig


Craig Rodrigues
2016-07-25 09:55:11 UTC
Post by Matěj Cepl
2) Even more pressing is that the Twisted module breaks my tests when
porting to py3k (https://travis-ci.org/mcepl/M2Crypto/jobs/146633964).
Given the opaque and complicated data types in Twisted, I see a horribly
complicated task of diving into it ahead of me, and I am not eager.
I call shenanigans on you.

Twisted is open source, so none of the data types are opaque.
Twisted is probably the best open source project I have worked with
in terms of having documentation which is generated from the code (
https://twistedmatrix.com/documents/current/api/ ).

Twisted is also absolutely *the* best project I have worked with in terms
of having unit tests with very high coverage
of the code.

If you are unfamiliar with Twisted's code and data types, and don't have
the energy to dig in,
then be honest about that, but don't accuse Twisted of being "opaque",
because it isn't.

Regarding your code example which is failing,
your code is failing because you are intermixing bytes and strings which is
a big no-no for Python 3.

If I look at this line for example:
https://gitlab.com/m2crypto/m2crypto/blob/master/M2Crypto/SSL/TwistedProtocolWrapper.py#L357

I see the code is doing stuff like:
data = ''
encryptedData = ''

Those are of type str, and need to be of type bytes:

data = b''
encryptedData = b''

You need to clean stuff like that up in your code so that you are only
using bytes.
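
A minimal sketch of why the initial value matters on Python 3, assuming a buffer that accumulates chunks read from a socket (the function name `append_chunk` and the sample bytes are made up for illustration and are not taken from TwistedProtocolWrapper.py):

```python
def append_chunk(buffer, chunk):
    """Append a received chunk to an accumulation buffer."""
    return buffer + chunk


# Starting from a bytes buffer works on both Python 2 and Python 3.
data = b''
data = append_chunk(data, b'\x16\x03\x01')

# Starting from a str buffer fails on Python 3 as soon as bytes arrive.
try:
    append_chunk('', b'\x16\x03\x01')
    mixed_ok = True
except TypeError:
    mixed_ok = False
```

On Python 2 the second call would silently succeed, which is why this kind of bug only shows up during the port.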

I learned this lesson the hard way after contributing hundreds of
py3k fixes for Twisted:

Python 2:
str is bytes
str is not unicode
b"foo" == "foo"
type("foo") is not type(u"foo")

Python 3:
str is not bytes
unicode is gone
b"foo" != "foo"
"foo" == u"foo"

There is lots of code out there which uses Python strings and bytes
interchangeably
which "works" under Python2, but breaks big time on Python 3.
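
The Python 3 half of that comparison can be checked directly. This is only a small sketch to run under a Python 3 interpreter; the variable names are arbitrary:

```python
text = "foo"
raw = b"foo"

# str and bytes are distinct types on Python 3.
same_type = type(text) is type(raw)

# Comparing bytes with str is simply unequal (and raises BytesWarning
# as an error if the interpreter is started with -bb).
equal_without_decoding = (raw == text)

# An explicit decode (or encode) is needed to compare the contents.
equal_after_decoding = (raw.decode("ascii") == text)

# The u"" prefix is accepted on Python 3.3+ and means plain str.
unicode_literal_is_str = ("foo" == u"foo")
```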

--
Craig
Daniel Sank
2016-07-25 10:10:27 UTC
I realize this is not the main point of this thread, but I'd like to make a
comment regarding Twisted being opaque.
Post by Craig Rodrigues
Twisted is open source, so none of the data types are opaque.
That's a non sequitur. A bunch of open source text in a language you don't
understand is opaque, or perhaps better called "obscure". Among other
things, Twisted's use of interfaces makes the code very hard to understand.
Post by Craig Rodrigues
Twisted is probably the best open source project I have worked with
in terms of having documentation which is generated from the code
I agree that the documentation is generally excellent and that the test
coverage is similarly excellent. However, I still find large fractions of
the code very hard to comprehend. A while ago I made a serious effort to
understand PB and fix some bugs, but the interface stuff combined with some
very odd contortions of python class innards eventually led me to give up.
This is despite the friendly helpful attitude of the main developers both
here and in IRC (seriously, thanks everyone for your help back then!).

I just randomly clicked through the docs to this:
https://twistedmatrix.com/documents/current/api/twisted._threads.IWorker.html

Note that:

1. It is an interface, and I still don't _really_ understand what that
means in Twisted.

2. I have no idea what a "task" is. I realize this is python and yay
duck-typing but not specifying the expected behavior of an argument seems
like a big omission.

So, while the Twisted docs are great, consider not faulting people for
being confused/daunted.
--
Daniel Sank
Cory Benfield
2016-07-25 12:19:55 UTC
I just randomly clicked through the docs to this: https://twistedmatrix.com/documents/current/api/twisted._threads.IWorker.html
1. It is an interface, and I still don't _really_ understand what that means in Twisted.
2. I have no idea what a "task" is. I realize this is python and yay duck-typing but not specifying the expected behavior of an argument seems like a big omission.
So, while the Twisted docs are great, consider not faulting people for being confused/daunted.
Well, at this point I should argue that _threads, being prefixed by an underscore, is technically a private module to Twisted. That means that, realistically, you shouldn’t really need to consult this *at all*: if anyone outside of Twisted is using IWorker then they’re taking their life into their own hands. That largely argues for part 2, though modern Twisted development practice would still require that we document types and interfaces (see https://twistedmatrix.com/documents/current/api/twisted.web._http2.H2Connection.html, for example).

As to interfaces, that’s a separate problem. Interfaces in Twisted are documented. See https://twistedmatrix.com/documents/current/core/howto/components.html for a very lengthy discussion of interfaces in Twisted. Note that Glyph has also written heavily about interfaces: https://glyph.twistedmatrix.com/2009/02/explaining-why-interfaces-are-great.html. Have those documents not helped, or have you been unable to find them?

Cory
Glyph Lefkowitz
2016-07-25 23:50:29 UTC
Post by Cory Benfield
Well, at this point I should argue that _threads, being prefixed by an underscore, is technically a private module to Twisted. That means that, realistically, you shouldn’t really need to consult this *at all*: if anyone outside of Twisted is using IWorker then they’re taking their life into their own hands. That largely argues for part 2, though modern Twisted development practice would still require that we document types and interfaces (see https://twistedmatrix.com/documents/current/api/twisted.web._http2.H2Connection.html, for example).
However, we do still recognize that the presentation of said private documentation to end-users in apparently the same way as public documentation is a problem. You can track that here: <https://github.com/twisted/pydoctor/issues/49>. Some of the things there have been done, but we need to complete it.

-glyph
Glyph Lefkowitz
2016-07-26 06:05:17 UTC
I realize this is not the main point of this thread, but I'd like to make a comment regarding Twisted being opaque.
Post by Craig Rodrigues
Twisted is open source, so none of the data types are opaque.
That's a non sequitur. A bunch of open source text in a language you don't understand is opaque, or perhaps better called "obscure". Among other things, Twisted's use of interfaces makes the code very hard to understand.
This is not the first time that someone has made this comment, and I find it very discouraging.

An interface is a very simple concept - an abstract description of what an object is expected to provide if you're going to do something useful with it. As a specific programmatic instantiation of this concept, zope.interface provides us with two chief advantages, one purely informational and one useful at run-time. The informational one is that rather than saying "this parameter must provide four methods, the first of which is makeConnection, the second of which is (optionally) connectionMade, the third of which is dataReceived, the fourth of which is connectionLost, which take parameters, respectively,..." we can say "@type: L{IProtocol} provider". The programmatic advantage is that we can ask the question directly; not 'do you have attributes with these names' or 'does this function signature match', but IProtocol.providedBy(something), which asks if 'something' even _intends_ to implement all the relevant functionality described by IProtocol.
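
To make the "declared intent" idea concrete without requiring zope.interface to be installed, here is a rough analogue using the standard library's abc module. This is a stand-in, not zope.interface's API: `ProtocolLike`, `Echo`, `LooksSimilar` and `deliver` are hypothetical names, and isinstance() plays the role that IProtocol.providedBy(something) plays in the description above.

```python
from abc import ABC, abstractmethod


class ProtocolLike(ABC):
    """Hypothetical stand-in for an interface such as IProtocol."""

    @abstractmethod
    def makeConnection(self, transport):
        """Called once with the transport to write to."""

    @abstractmethod
    def dataReceived(self, data):
        """Called with each chunk of incoming bytes."""

    @abstractmethod
    def connectionLost(self, reason):
        """Called when the connection goes away."""


class Echo(ProtocolLike):
    """Declares its intent by subclassing the abstract description."""

    def makeConnection(self, transport):
        self.transport = transport

    def dataReceived(self, data):
        self.transport.append(data)

    def connectionLost(self, reason):
        self.transport = None


class LooksSimilar:
    """Has matching method names but never declares the intent."""

    def makeConnection(self, transport):
        pass

    def dataReceived(self, data):
        pass

    def connectionLost(self, reason):
        pass


def deliver(protocol, chunks):
    """Feed chunks to anything that declares itself ProtocolLike."""
    if not isinstance(protocol, ProtocolLike):
        raise TypeError("expected a ProtocolLike provider")
    received = []  # a plain list stands in for a transport here
    protocol.makeConnection(received)
    for chunk in chunks:
        protocol.dataReceived(chunk)
    return received
```

The point of the sketch is that `deliver` never inspects attribute names; it asks one question about the declared abstract type, and `LooksSimilar` is rejected even though its method names match.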

In my view, interface definition is the primary activity of software development; deciding how the pieces of the system fit together and precisely describing what they do. The fact that so many people seem to find either the basic idea of an abstract type, or the concrete instantiation of that idea in the Zope Interface library, so horribly confusing, makes me despair of ever communicating the actually hard stuff that Twisted gets up to in its internals.

The main thing that I have heard in the past about what makes interfaces confusing is that people want to trace through the implementation to see what twisted is 'really doing', and the fact that there is -more than one- implementation of a method like 'listenTCP' is the source of the real confusion, underlying the problem with "interfaces". But, that's the whole point: 'listenTCP' is an abstract concept. The better parts of twisted are _more_ abstract, like 'IStreamServerEndpoint.listen', which has dozens of implementations rather than just 3 or 4, or Deferred.callback, which might do anything at all - if the callback chain did one specific thing there would hardly be any point.

So, this is more just an emotional appeal from me, than anything the project needs, but I would very much like to understand *what* is so confusing about "interfaces". Is it, as Cory posited, just that the documentation is not properly linked? Or is it that the average Python developer needs a gentle introduction to the entire idea of abstract rather than concrete types? If they do - is it really Twisted's responsibility to provide it to them? Should Zope Interface just have a snazzier website?
Post by Craig Rodrigues
Twisted is probably the best open source project I have worked with
in terms of having documentation which is generated from the code
I agree that the documentation is generally excellent and that the test coverage is similarly excellent. However, I still find large fractions of the code very hard to comprehend. A while ago I made a serious effort to understand PB and fix some bugs, but the interface stuff combined with some very odd contortions of python class innards eventually led me to give up. This is despite the friendly helpful attitude of the main developers both here and in IRC (seriously, thanks everyone for your help back then!).
Given that Twisted is often translating network protocol data into Python method calls, one needs both a working domain knowledge of the protocol involved and a robust understanding of Python metaprogramming constructs. It sounds here like where you fell down was mostly in the "Python metaprogramming" area, where PB is especially intense. Unfortunately, framework code just looks like that (the inner guts of Django are similarly, if not more, obscure, for example); it's not especially specific to Twisted.

Don't get me wrong, Python metaprogramming _is_ hard. It's something that I know pretty well, but I can recognize that each additional layer of indirection is additional complexity for someone to learn, and metaprogramming is by definition at least 3 layers indirected from your actual problem. But it's also not really specific to Twisted either. This is another case where I'm not sure what to do except to refer people to the language reference and tell them to work through it slowly.
I just randomly clicked through the docs to this: https://twistedmatrix.com/documents/current/api/twisted._threads.IWorker.html
1. It is an interface, and I still don't _really_ understand what that means in Twisted.
The interface stuff in Twisted is a wholly separate library, 'zope.interface'. You can read its documentation - both narrative tutorials and API documentation - here:

https://docs.zope.org/zope.interface/README.html

This is referenced by the Twisted documentation which Cory referred to earlier, but I think it would be helpful to call that out specifically - you can read about interfaces completely separate from Twisted. They don't mean anything special within Twisted itself (beyond our addition of registerAdapter, which is used less and less often in modern Twisted apps).
2. I have no idea what a "task" is. I realize this is python and yay duck-typing but not specifying the expected behavior of an argument seems like a big omission.
Did you miss the part where it said "type: 0-argument callable" in the documentation? The expected behavior of the argument is that it is a thing that can be called, and it takes 0 arguments. Its return value is unspecified because IWorker providers aren't allowed to use its return value.
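
For what a "0-argument callable" looks like in practice, here is a small sketch. `InlineWorker` is a hypothetical class that only mirrors the shape of a do(task) method; it is not one of Twisted's IWorker providers. Arguments are bound ahead of time with functools.partial or a lambda, so the worker can call the task with nothing at all.

```python
from functools import partial


class InlineWorker:
    """Hypothetical worker for illustration: runs each task immediately."""

    def __init__(self):
        self.completed = 0

    def do(self, task):
        # The task is called with zero arguments and its return value
        # is deliberately ignored.
        task()
        self.completed += 1


results = []


def record(value):
    results.append(value)
    return "ignored by the worker"


worker = InlineWorker()
worker.do(partial(record, 42))       # arguments bound in advance
worker.do(lambda: record("hello"))   # or closed over by a lambda
```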
So, while the Twisted docs are great, consider not faulting people for being confused/daunted.
I don't want to fault people, and there are many issues we can address within Twisted's documentation. For example, culling things from the API documentation and presenting it as it is really intended to be consumed, eliding private methods that have underscores or live in test_* packages to present a smaller surface, improving tutorial documentation, and so on. It's not entirely the user's fault.

But certain things - the fact that we might have an abstract interface with multiple concrete implementations, for example - are inherent parts of the problem domain that Twisted is trying to address, and this is what Interfaces let us express. When people have said things like "Twisted's use of interfaces makes

-glyph
Daniel Sank
2016-07-26 07:26:44 UTC
Glyph,
Post by Daniel Sank
2. I have no idea what a "task" is. I realize this is python and yay
duck-typing but not specifying the expected behavior of an argument seems
like a big omission.
Post by Glyph Lefkowitz
Did you miss the part where it said "type: 0-argument callable" in the
documentation?

Yes :(

I can only guess that I missed it because the type is defined near the end
of the description line, whereas I'm used to seeing

    def foo(x, y):
        """Do something.

        Args:
            x (int): blah blah
            y (banana): yadda yadda
        """

FWIW, now that I look at the code, the type specification is way more
visually apparent there than it is in the generated HTML.

tl,dr: I take it all back and thanks for pointing out the obvious.
Post by Glyph Lefkowitz
An interface is a very simple concept - an abstract description of what
an object is expected to provide if you're going to do something useful
with it.
Indeed, a general understanding of interfaces is not the problem.
Post by Glyph Lefkowitz
In my view, interface definition is the primary activity of software
development

Agreed 100%.
Post by Glyph Lefkowitz
The fact that so many people seem to find either the basic idea of an
abstract type, or the concrete instantiation of that idea in the Zope
Interface library, so horribly confusing, makes me despair of ever
communicating the actually *hard* stuff that Twisted gets up to in its
internals.
Post by Glyph Lefkowitz
I would very much like to understand *what* is so confusing about
"interfaces". Is it, as Cory posited, just that the documentation is not
properly linked? Or is it that the average Python developer needs a gentle
introduction to the entire idea of abstract rather than concrete types? If
they do - is it really Twisted's responsibility to provide it to them?
Should Zope Interface just have a snazzier website?

Some years ago when I tried to understand Twisted's use of interfaces via
Twisted's own documentation (which included something about hair dryers and
voltage standards) I was puzzled by the fact that the examples didn't
really show me how to solve a useful problem (or I was too stupid to
understand that the examples did in fact do that) *despite the fact that I
knew what an interface was in general terms*. It was a case of
understanding the intent but none of the examples. A brief look at the zope
documentation just now makes me think the situation has improved.

The other problem was that interfaces were sprinkled somewhat haphazardly
around the code I was trying to understand (perspective broker) and it was
just plain hard to keep navigating around the code files to understand who
was implementing what interfaces. This could have been my own fault for not
having an editor set up. I don't know.
Post by Glyph Lefkowitz
Should Zope Interface just have a snazzier website?
I think the real issue is the need for compelling and simple examples.

- Daniel


P.S. Everything below here is completely off topic of this thread and I
probably shouldn't have written it.
Post by Glyph Lefkowitz
Given that Twisted is often translating network protocol data into Python
method calls, one needs both a working domain knowledge of the protocol
involved and a robust understanding of Python metaprogramming constructs.
It sounds here like where you fell down was mostly in the "Python
metaprogramming" area, where PB is especially intense.
Post by Glyph Lefkowitz
But it's also not really specific to Twisted either. This is another case
where I'm not sure what to do except to refer people to the language
reference and tell them to work through it slowly.
Post by Glyph Lefkowitz
Unfortunately, framework code just looks like that
I spent a considerable amount of time reading the PB code, reproducing
parts of it myself, and talking to people on IRC and the mailing list to
understand a particularly weird issue in PB. See here for the bug I was
trying to fix (note in particular my first comment to the one existing
answer):

http://stackoverflow.com/questions/23421423/why-are-dummy-objects-created-in-twisteds-pb-system

I distinctly recall that near the end of my efforts you (Glyph) or someone
else more or less told me that the PB code was old, horrible, and that the
issues I was trying to understand were probably incidental complexity due
to poor design etc. You guys were joking around on IRC about how ridiculous
all the dummy object construction is. So, I think this *particular*
incident was less due to a lack of understanding of python metaprogramming
and more due to PB having some bizarre warts.
Glyph Lefkowitz
2016-07-26 08:18:41 UTC
Post by Daniel Sank
Glyph,
Post by Glyph Lefkowitz
2. I have no idea what a "task" is. I realize this is python and yay duck-typing but not
specifying the expected behavior of an argument seems like a big omission.
Did you miss the part where it said "type: 0-argument callable" in the documentation?
Yes :(
I can only guess that I missed it because the type is defined near the end of the description line, whereas I'm used to seeing
    def foo(x, y):
        """Do something.

        Args:
            x (int): blah blah
            y (banana): yadda yadda
        """
FWIW, now that I look at the code, the type specification is way more visually apparent there than it is in the generated HTML.
tl,dr: I take it all back and thanks for pointing out the obvious.
This does at least point to a real problem with pydoctor in the way it presents types. It should probably put them in their own colored box, not use the string 'type' or parentheses to offset them, and put the type closer to (rather than farther from) the parameter name. Would you mind filing a bug on pydoctor? Or commenting on one if it already exists? :)
Post by Daniel Sank
Post by Glyph Lefkowitz
An interface is a very simple concept - an abstract description of what an object is expected
to provide if you're going to do something useful with it.
Indeed, a general understanding of interfaces is not the problem.
Post by Glyph Lefkowitz
In my view, interface definition is the primary activity of software development
Agreed 100%.
OK. Glad to hear it.
Post by Daniel Sank
Post by Glyph Lefkowitz
The fact that so many people seem to find either the basic idea of an abstract type, or the concrete
instantiation of that idea in the Zope Interface library, so horribly confusing, makes me despair of
ever communicating the actually hard stuff that Twisted gets up to in its internals.
I would very much like to understand *what* is so confusing about "interfaces". Is it, as Cory posited,
just that the documentation is not properly linked? Or is it that the average Python developer needs
a gentle introduction to the entire idea of abstract rather than concrete types? If they do - is it really
Twisted's responsibility to provide it to them? Should Zope Interface just have a snazzier website?
Some years ago when I tried to understand Twisted's use of interfaces via Twisted's own documentation (which included something about hair dryers and voltage standards) I was puzzled by the fact that the examples didn't really show me how to solve a useful problem (or I was too stupid to understand that the examples did in fact do that) despite the fact that I knew what an interface was in general terms. It was a case of understanding the intent but none of the examples.
OK... it's a fair cop. That documentation is not the best. Among other things, it's mainly trying to explain adaptation, which sort of puts the cart before the horse, and automatic adaptation is increasingly considered spooky action-at-a-distance within Twisted code. You can see it here: <http://twisted.readthedocs.io/en/latest/core/howto/components.html>.

You're the perfect person to submit patches against this doc, by the way, since you have a firm grasp of the whole "abstract interface" thing but also found it confusing. Personally, I find the examples very clear - I say the documentation is "not the best" because I could see how it could confuse somebody _else_, but it doesn't confuse _me_ at all, so it's a bit hard for me to improve it (especially incrementally).
Post by Daniel Sank
A brief look at the zope documentation just now makes me think the situation has improved.
Well that's good, at least. perhaps we should link to it more prominently.
Post by Daniel Sank
The other problem was that interfaces were sprinkled somewhat haphazardly around the code I was trying to understand (perspective broker) and it was just plain hard to keep navigating around the code files to understand who was implementing what interfaces. This could have been my own fault for not having an editor set up. I don't know.
Setting up your editor to have a 'jump to definition' key definitely helps; but then, it generally helps with any large codebase. So, hard to say.
Post by Daniel Sank
Post by Glyph Lefkowitz
Should Zope Interface just have a snazzier website?
I think the real issue is the need for compelling and simple examples.
Do you think it would be better to put things in terms of a concrete Twisted interface, like "IProtocol"? I am pretty sure these docs were trying to stay away from anything "real" because this is a highly abstract concept that could apply to anything, and when we drag a concrete example in
Post by Daniel Sank
P.S. Everything below here is completely off topic of this thread and I probably shouldn't have written it.
In for a penny...

(We may want to spin out into a different thread for talking about the PB issue, but it looks like this was unresolved for you, and I can definitely shed some more light.)
Post by Daniel Sank
Post by Glyph Lefkowitz
Given that Twisted is often translating network protocol data into Python method calls, one needs both
a working domain knowledge of the protocol involved and a robust understanding of Python
metaprogramming constructs. It sounds here like where you fell down was mostly in the "Python
metaprogramming" area, where PB is especially intense.
But it's also not really specific to Twisted either. This is another case where I'm not sure what to do
except to refer people to the language reference and tell them to work through it slowly.
Unfortunately, framework code just looks like that
http://stackoverflow.com/questions/23421423/why-are-dummy-objects-created-in-twisteds-pb-system
I distinctly recall that near the end of my efforts you (Glyph) or someone else more or less told me that the PB code was old, horrible, and that the issues I was trying to understand were probably incidental complexity due to poor design etc. You guys were joking around on IRC about how ridiculous all the dummy object construction is. So, I think this particular incident was less due to a lack of understanding of python metaprogramming and more due to PB having some bizarre warts.
OK. Maybe I made this sound a bit too simple, but it's still not really Twisted's fault. The bizarre warts here - and they are definitely here - are mostly an outgrowth of the bizarre mismatch between old-style and new-style classes, and the mad shuffle of random API deprecations, often without suitable replacements, or without suitable portable replacements, within the standard library.

In the old-style world, you had Class objects. Class objects could be created in a variety of ways, but the Right™ way to make a new, empty instance that hadn't had its initializer run was 'new.instance'. Of course, new.instance doesn't work with new-style classes, because now the Right™ way to make a new, empty instance that hadn't had its initializer run was yourclass.__new__(). Unless of course you overrode __new__, which is totally allowed, so the caller can't know what that signature is supposed to be. So then you use `object.__new__(yourclass)` in order to get a known signature - but of course that won't work at all with old-style classes. PB is bridging the gap between these two worlds, and it has to find hacks which work in both and don't draw in any deprecated APIs that are gone in python 3 in order to do it. This means you can't use the Right™ way at all, and instead must resort to contortions which depend on implementation details that happen to be held in common between both new-style and old-style objects.
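
For the new-style half of that story, a minimal Python 3 sketch of creating an instance without running its initializer looks like this. `Account` is a made-up class, and this shows only the `object.__new__` technique mentioned above, not what PB actually does:

```python
class Account:
    def __init__(self, owner):
        self.owner = owner
        self.initialized = True


# Allocate an Account without calling Account.__init__.
blank = object.__new__(Account)

# No attribute set by __init__ exists yet.
init_ran = hasattr(blank, "initialized")

# State can then be filled in directly, for example from data that
# arrived over the wire.
blank.__dict__.update({"owner": "alice"})
```

On Python 3 every class is new-style, so this works for any class that does not rely on a custom __new__ or on __slots__ for its state.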

-glyph
Glyph Lefkowitz
2016-07-26 10:09:41 UTC
Post by Daniel Sank
http://stackoverflow.com/questions/23421423/why-are-dummy-objects-created-in-twisteds-pb-system
BTW, since this discussion raised this question again, and since I now understand better what I think you were _actually_ asking, I put a new answer on that question. Hopefully it resolves the mystery for you :).

-glyph
Matěj Cepl
2016-07-25 23:02:48 UTC
Permalink
Post by Craig Rodrigues
Earlier this year, I contributed lots of patches to you in M2Crypto to port
it to py3k.
Now I have shifted efforts to Twisted, where in the past month I have
contributed hundreds of patches to help improve py3k support in Twisted.
Hi,

can I ask for a piece of advice on the documentation?
M2Crypto.SSL.TwistedProtocolWrapper.TLSProtocolWrapper.startTLS
implements the ITLSTransport.startTLS interface method, whose first parameter
is called ``ctx``. In `the current implementation in M2Crypto`_ it is an
SSL Context (that's M2Crypto.SSL.Context.Context), but I am not sure
whether it should instead be a factory generating such Contexts (which is what
https://twistedmatrix.com/documents/current/api/twisted.internet._newtls.ConnectionMixin.html
says, at least as I understand it). Is my current
implementation wrong?

Thank you for any answer in advance,

Matěj

.. _`the current implementation in M2Crypto`:
https://gitlab.com/m2crypto/m2crypto/blob/python3/M2Crypto/SSL/TwistedProtocolWrapper.py#L231
--
https://matej.ceplovi.cz/blog/, Jabber: ***@ceplovi.cz
GPG Finger: 3C76 A027 CA45 AD70 98B5 BC1D 7920 5802 880B C9D8

To err is human, to purr feline.
Glyph Lefkowitz
2016-07-26 06:28:23 UTC
Permalink
Post by Matěj Cepl
Post by Craig Rodrigues
Earlier this year, I contributed lots of patches to you in M2Crypto to port
it to py3k.
Now I have shifted efforts to Twisted, where in the past month I have
contributed hundreds of patches to help improve py3k support in Twisted.
Hi,
can I ask for a piece of advice on the documentation?
M2Crypto.SSL.TwistedProtocolWrapper.TLSProtocolWrapper.startTLS
implements the ITLSTransport.startTLS interface method, whose first parameter
is called ``ctx``. In `the current implementation in M2Crypto`_ it is an
SSL Context (that's M2Crypto.SSL.Context.Context), but I am not sure
whether it should instead be a factory generating such Contexts (which is what
https://twistedmatrix.com/documents/current/api/twisted.internet._newtls.ConnectionMixin.html
says, at least as I understand it). Is my current
implementation wrong?
Technically speaking, your implementation is wrong because it claims to implement <https://twistedmatrix.com/documents/16.3.0/api/twisted.internet.interfaces.ITLSTransport.html>, which documents the method startTLS <https://twistedmatrix.com/documents/16.3.0/api/twisted.internet.interfaces.ITLSTransport.html#startTLS> to accept a `contextFactory` which is a provider of either <https://twistedmatrix.com/documents/16.3.0/api/twisted.internet.interfaces.IOpenSSLClientConnectionCreator.html> or <https://twistedmatrix.com/documents/16.3.0/api/twisted.internet.interfaces.IOpenSSLServerConnectionCreator.html>. Both of these interfaces return pyOpenSSL-specific objects. If you want to do TLS with M2Crypto, you must therefore give up on supporting any of Twisted's interfaces directly, because (for example) optionsForClientTLS <https://twistedmatrix.com/documents/16.3.0/api/twisted.internet.ssl.html#optionsForClientTLS> is not going to work with your implementation, since you don't call clientConnectionForTLS on it.

If you want to provide TLS purely with M2Crypto, then you should have interfaces which describe exactly how it should work with M2Crypto. You can make it take a concrete context if you want, or a factory, whichever makes sense for how you're going to set it up. Personally my recommendation would be to go in the direction that Twisted itself has been moving and pass a thing that can create Connection objects (or, in OpenSSL-speak, an "SSL*", or in M2Crypto, an _SSLProxy(ssl_new())). No existing Twisted code which is going to call startTLS() can be made to work with these interfaces without extensive monkey-patching, and even then, anything which expects to be able to cut in at the OpenSSL layer will break.
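
To make the "pass a thing that can create Connection objects" idea concrete, here is a rough sketch using only the standard library. Everything in it is hypothetical: `ConnectionCreator`, `connection_for_tls`, `FakeConnection`, `FixedHostCreator`, and `start_tls` are illustration-only names and stand-ins, not M2Crypto or Twisted APIs, and `typing.Protocol` is used in place of a zope.interface declaration:

```python
from typing import Protocol, runtime_checkable


class FakeConnection:
    """Stand-in for a backend connection object (hypothetical)."""

    def __init__(self, hostname):
        self.hostname = hostname


@runtime_checkable
class ConnectionCreator(Protocol):
    """Describes 'a thing that can create connection objects'."""

    def connection_for_tls(self, tls_protocol):
        """Return a new connection object for the given protocol."""
        ...


class FixedHostCreator:
    """One possible provider: always creates connections for one host."""

    def __init__(self, hostname):
        self._hostname = hostname

    def connection_for_tls(self, tls_protocol):
        return FakeConnection(self._hostname)


def start_tls(creator, tls_protocol=None):
    """Accept a creator rather than a ready-made context."""
    if not isinstance(creator, ConnectionCreator):
        raise TypeError("creator must provide connection_for_tls()")
    return creator.connection_for_tls(tls_protocol)


connection = start_tls(FixedHostCreator("example.org"))
```

The point of the shape is that the caller of `start_tls` decides how connections are configured, while the transport only needs the one method.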

Basically, Twisted doesn't have a mechanism for abstracting away the TLS backend yet. I'd really like it if it did! If you want M2Crypto to be able to do what it's currently trying to do, you could contribute code to Twisted to make things like optionsForClientTLS more abstract, and to isolate the TLS implementation more closely to the TLS wrapper factory. This would make it easier to adopt Cryptography's TLS API eventually, which is what we'll need to do as pyOpenSSL eventually becomes less relevant.

This is still several years away, of course. But it would be nice to have some help getting there in advance.

-glyph
Glyph Lefkowitz
2016-07-26 07:41:08 UTC
Permalink
Post by Matěj Cepl
3) Moreover, I would like to know how much interest there is in
maintaining the M2Crypto module for Twisted. I got some hope from
http://twistedmatrix.com/trac/wiki/TransportLayerSecurity which seems
like there is an interest in more complete OpenSSL bindings,
That is a very old wiki page. I will delete it to avoid confusing people in the future! Not only has pyOpenSSL had complete enough bindings to implement the feature described on that wiki page for several years now (a Twisted developer, Jean-Paul Calderone, actually took over maintenance of pyOpenSSL expressly for the purpose of adding those APIs), we actually implemented the TLS implementation based on those APIs <https://github.com/twisted/twisted/blob/trunk/twisted/internet/_newtls.py> in 2011, and fully finished transitioning to that new API in 2014 <https://github.com/twisted/twisted/commit/ee2070fe9e3f539ff702d9ff133aafa33ea19ac5> when we deleted the older, SSLSocket based API.
Post by Matěj Cepl
but OTOH I see on the list that Twisted now seems to use more and more of
Cryptography (why in the world somebody made such confusing name of
their project ...).
The choice of name is intentional: it is designed to convey a sense of authoritativeness. I.e. if you need cryptography in Python, you should 'import cryptography', and ignore everything else. The Cryptography project specifically calls out M2Crypto, PyCrypto, and PyOpenSSL as having problems and lacking maintenance: <https://cryptography.io/en/latest/#why-a-new-crypto-library-for-python>. pyOpenSSL is now just a thin wrapper over Cryptography itself, and PyCrypto's maintainer now generally suggests Cryptography <https://github.com/dlitz/pycrypto/issues/158#issuecomment-140833926>. So it is 2/3 of the way to achieving its goal of eliminating these libraries which duplicate so much effort - M2Crypto is all that remains :).
Post by Matěj Cepl
Obviously the most simple way for me to be cutting
Twisted module from M2Crypto and let it be (although I am afraid I have
still some legacy users who would like to see it maintained, and given
that the legacy support is still the most important reason for
maintaining M2Crypto, I don't want to give up lightly).
I do not want to denigrate the work you've done maintaining a legacy library. I think it's noble to take on this kind of work. But if you don't have any particular reason for needing to maintain this library beyond "legacy support", and it is not different from Cryptography in any meaningful way, the best thing that you could do for its existing users would be to do the same thing that was done with pyOpenSSL: make it a thin wrapper over the bindings layer in Cryptography, get rid of all of the SWIG code in M2Crypto, and start gently directing users in the direction of Cryptography for any new code. This would get everyone onto a supported base platform for their security primitives, allow them to share code with other parts of large systems that already use Cryptography or pyOpenSSL, and provide a well-maintained path forward.

M2Crypto's main claim to superiority in past years was its higher degree of completeness of OpenSSL APIs, but Cryptography has since far surpassed it. Even if there are some APIs that Cryptography's bindings layer doesn't export, it's easier and safer to add more bindings there than in M2Crypto.

Even if you're not going to get rid of M2Crypto entirely, M2Crypto's implementation of Twisted TLS copies the terrible no-security defaults it inherits from OpenSSL, the same ones that Twisted had 5 years ago, and Twisted has moved on to have actual security (certificate verification, service identity, and trust root configuration). We also continue to improve that security regularly. Given all that, this is a rare case where I would not mind seeing Twisted support disappear from a library. Unless I were to get some new information I would have to very strongly discourage anyone who wanted to use the SSL backend in M2Crypto. I realize you have users, but possibly the best thing you could do for their own good would be to force them to move to Twisted's much better native TLS support, which thankfully is easy to adopt at this point.
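
As a rough analogy only, the difference between "no-security defaults" and "actual security" can be seen with the Python standard library's `ssl` module. This is not Twisted's or M2Crypto's API; it just shows, under that assumption, what certificate verification and hostname (service identity) checking look like as context settings:

```python
import ssl

# Verifying configuration: trust roots loaded from the system,
# certificates required, and the peer's hostname checked.
verified = ssl.create_default_context()

# The kind of configuration to avoid: nothing about the peer is checked.
unverified = ssl.create_default_context()
unverified.check_hostname = False        # must be disabled before verify_mode
unverified.verify_mode = ssl.CERT_NONE

print(verified.verify_mode, verified.check_hostname)
print(unverified.verify_mode, unverified.check_hostname)
```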

-glyph
Matěj Cepl
2016-07-26 15:43:58 UTC
Permalink
Post by Glyph Lefkowitz
I do not want to denigrate the work you've done maintaining
a legacy library. I think it's noble to take on this kind of
work.
I was watching “A Special Day” (1977) yesterday so I have
somewhat lesser tolerance for the pompous superiority complex,
but I will think about your reply and the support for Twisted
will probably go.

Matěj
--
https://matej.ceplovi.cz/blog/, Jabber: ***@ceplovi.cz
GPG Finger: 3C76 A027 CA45 AD70 98B5 BC1D 7920 5802 880B C9D8

http://xkcd.com/743/ 
 enough said.
Glyph Lefkowitz
2016-07-26 17:55:12 UTC
Permalink
Post by Glyph Lefkowitz
I do not want to denigrate the work you've done maintaining a legacy library. I think it's noble to take on this kind of work.
I was watching “A Special Day” (1977) yesterday so I have somewhat lesser tolerance for the pompous superiority complex,
Despite your quite rude introduction to the mailing list (twisted is "opaque and complicated", "deep magic", learning it is "horribly complicated"), several people (including myself) invested quite a bit of time to try to answer your questions in detail. And, for taking the trouble to be diplomatic, you have now, as I understand it, made a veiled allegation that I'm a fascist? If you are not _trying_ to be heinously offensive, perhaps you should stop posting here.

-glyph

Matěj Cepl
2016-07-26 15:45:25 UTC
Permalink
Post by Glyph Lefkowitz
An interface is a very simple concept
Actually I found
https://twistedmatrix.com/documents/current/core/howto/components.html
to be a very good description. Yes, the concept is not that
complicated, but it is very uncommon in the Pythonic world and
the experience with M2Crypto and your previous reply seem to
indicate that even you in the end prefer hard-wiring
Cryptography to Twisted instead of using your own interfaces.
Isn’t simple better than complex?

Whatever,

Matěj
--
https://matej.ceplovi.cz/blog/, Jabber: ***@ceplovi.cz
GPG Finger: 3C76 A027 CA45 AD70 98B5 BC1D 7920 5802 880B C9D8

http://xkcd.com/743/ 
 enough said.
Matěj Cepl
2016-07-26 15:47:10 UTC
Permalink
On 2016-07-25, 09:55 GMT, Craig Rodrigues wrote:
First of all, thank you very much for all the help you gave to
M2Crypto. However, ...
Post by Craig Rodrigues
I call shenanigans on you.
Nothing I have been accused of has anything to do with
what I meant. When I said „opaque“, I didn’t mean to offend
Twisted, just to say that while I was annotating the whole library
with PEP-484 type hints, I usually dealt with bytes, str, ints,
and very few rather simple objects. With Twisted I get objects like

@implementer(ITLSTransport)
class TLSProtocolWrapper(ProtocolWrapper):
    def __init__(self, factory, wrappedProtocol,
                 startPassThrough, client,
                 contextFactory, postConnectionCheck):
        # type: policies.WrappingFactory, object, int, int, object, Checker

(and with those two objects, I don't even dare to guess what
types these are, and all that covered in some weird decorator
from Zope (?)).

There's nothing wrong with Twisted, just that it is really
difficult for an idiot like me to understand what's going on.
Post by Craig Rodrigues
If you are unfamiliar with Twisted's code and data types, and
don't have the energy to dig in, then be honest about that,
but don't accuse Twisted of being "opaque", because it isn't.
I don't see what's dishonest in saying that the Twisted API is
quite complicated and that I am too stupid to understand
what's going on.
Post by Craig Rodrigues
Regarding your code example which is failing,
your code is failing because you are intermixing bytes and strings which is
a big no-no for Python 3.
Of course I know that (it is not the first place where I have to
deal with the bytes × str dichotomy in py3k), but in order to
understand what's going on, I first have to understand where these
values come from and where they go, i.e., to decipher
Twisted. Thus I was asking for help.
Post by Craig Rodrigues
https://gitlab.com/m2crypto/m2crypto/blob/master/M2Crypto/SSL/TwistedProtocolWrapper.py#L357
data = ''
encryptedData = ''
data = b''
encryptedData = b''
You need to clean stuff like that up in your code so that you are only
using bytes.
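
A small sketch of why the buffer's initial value matters on Python 3. The helper `append_chunk` is a hypothetical name for illustration and is not taken from TwistedProtocolWrapper.py; the only assumption is that data arriving from a transport is `bytes`:

```python
def append_chunk(buffer, chunk):
    """Append a received chunk to an accumulating buffer."""
    return buffer + chunk


# A bytes buffer accumulates bytes chunks without trouble.
data = b""
data = append_chunk(data, b"hello ")
data = append_chunk(data, b"world")

# A str buffer cannot take a bytes chunk on Python 3.
try:
    append_chunk("", b"hello")
    mixed_error = None
except TypeError as exc:
    mixed_error = exc

# Text has to be encoded explicitly before it is written to the wire.
wire = "héllo".encode("utf-8")
```

On Python 2 the `''` and `b''` literals were the same type, which is why code like this only starts failing during a py3k port.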
I believe I have fixed everything I can without actually
understanding Twisted in
https://gitlab.com/mcepl/m2crypto/commit/6cd5f87b31e50016ebb7e44f3f2ae46610bc24e0.
So now, if Twisted is so transparent and perfectly
understandable, could you please suggest what I am doing wrong, such that
the test ends up in an endless loop
(https://travis-ci.org/mcepl/M2Crypto/builds/147175901)?

Thank you,

Matěj Cepl
--
https://matej.ceplovi.cz/blog/, Jabber: ***@ceplovi.cz
GPG Finger: 3C76 A027 CA45 AD70 98B5 BC1D 7920 5802 880B C9D8

Never ascribe to malice that which is adequately explained by
stupidity.
-- Napoleon Bonaparte (or many other people to whom this
quote is ascribed)