Discussion:
[Twisted-Python] Problems with inlineCallback, Deferred, yield and Python 3 in buildbot
Craig Rodrigues
2017-01-22 02:15:45 UTC
Permalink
Hi,

I have been submitting many patches to get buildbot working on Python 3:

http://bit.ly/2jCCMPW

I have run into one problem involving inlineCallback, Deferred, and yield
which I am having difficulty solving. Can someone help me?

If I do the following inside a Python 3 virtualenv to set things up:

git clone https://github.com/buildbot/buildbot buildbot_test
cd buildbot_test
pip install -e pkg
pip install -e worker
pip install -e 'master[tls,tests]'

trial buildbot.test.unit.test_process_buildrequestdistributor
I get this error:

==================================================================
File
"/Users/crodrigues/buildbot_test/master/buildbot/process/buildrequestdistributor.py",
line 93, in <lambda>^M
brdicts.sort(key=lambda brd: brd['submitted_at'])
builtins.TypeError: 'Deferred' object is not subscriptable

buildbot.test.unit.test_process_buildrequestdistributor.TestMaybeStartBuilds.test_slow_db^M
==================================================================


The code causing this problem is on line 93 of
master/buildbot/process/buildrequestdistributor.py:

78 @defer.inlineCallbacks
79 def _fetchUnclaimedBrdicts(self):
80 # Sets up a cache of all the unclaimed brdicts. The cache is
81 # saved at self.unclaimedBrdicts cache. If the cache already
82 # exists, this function does nothing. If a refetch is desired,
set
83 # the self.unclaimedBrdicts to None before calling."""
84 if self.unclaimedBrdicts is None:
85 # TODO: use order of the DATA API
86 brdicts = yield self.master.data.get(('builders',
87 (yield
self.bldr.getBuilderId()),
88 'buildrequests'),
89
[resultspec.Filter('claimed',
90
'eq',
91
[False])])
92 # sort by submitted_at, so the first is the oldest
93 brdicts.sort(key=lambda brd: brd['submitted_at'])
94 self.unclaimedBrdicts = brdicts
95 defer.returnValue(self.unclaimedBrdicts)

If I run the test on Python 2, I don't get the error, and on line 93,
brdicts is a dict.
However, if I run the test on Python 3, brdicts is a Deferred, thus the
error.

Can someone point me in the right direction for solving this problem?

I am not so familiar with the differences in generators and Deferreds
between Python 2 and 3.

Thanks.

--
Craig
Glyph Lefkowitz
2017-01-22 04:06:19 UTC
Permalink
If I run the test on Python 2, I don't get the error, and on line 93, brdicts is a dict.
However, if I run the test on Python 3, brdicts is a Deferred, thus the error.
This really ought to be impossible. If I were seeing this, single-stepping through inlineCallbacks in pudb would probably be my next step. Have you made any attempts to simplify it down to remove some of the buildbot-specific code?

-glyph
Craig Rodrigues
2017-01-24 00:08:57 UTC
Permalink
If I run the test on Python 2, I don't get the error, and on line 93, brdicts is a dict.
However, if I run the test on Python 3, brdicts is a Deferred, thus the error.
This really ought to be impossible. If I were seeing this,
single-stepping through inlineCallbacks in pudb would probably be my next
step. Have you made any attempts to simplify it down to remove some of the
buildbot-specific code?
-glyph
I did some more debugging. The callstack is quite deep. :/
I used this command to trigger the error:

trial
buildbot.test.unit.test_process_buildrequestdistributor.TestMaybeStartBuilds.test_slow_db

I found in this function in buildbot's BuildRequestsEndpoint.get() function
(
https://github.com/buildbot/buildbot/blob/master/master/buildbot/data/buildrequests.py#L146
) there is this line:



* defer.returnValue( [(yield self.db2data(br)) for br in
buildrequests])*
On Python 2, this line returns:
an object list
each list entry is a dictionary of name/value pairs that looks like:

[{'buildrequestid': 10, 'complete': False, 'waited_for': False,
'claimed_at': None, 'results': -1, 'claimed': False, 'buildsetid': 11,
'complete_at': None, 'submitted_at': datetime.datetime(1970, 1, 2, 12, 6,
40, tzinfo=tzutc()), 'builderid': 77, 'claimed_by_masterid': None,
'priority': 0}, {'buildrequestid': 11, 'complete': False, 'waited_for':
False, 'claimed_at': None, 'results': -1, 'claimed': False, 'buildsetid':
11, 'complete_at': None, 'submitted_at': datetime.datetime(1970, 1, 2, 13,
30, tzinfo=tzutc()), 'builderid': 77, 'claimed_by_masterid': None,
'priority': 0}]

On Python 3, this returns:
an object <generator object BuildRequestsEndpoint.get.<locals>.<listcomp>
of type <class 'generator'>

The self.db2data() function returns defer.succeed(data) (
https://github.com/buildbot/buildbot/blob/master/master/buildbot/data/buildrequests.py#L32
)

So it looks like on Python 3, this is not getting evaluated properly, and
the upper layer is trying
to index a Deferred instead of a list of dictionaries.

I'm not so familiar with generators/Deferreds in Py2 vs. Py3. What is the
best way
to fix this to get consistent behavior on Py2 and Py3?

Thanks.
--
Craig
Jean-Paul Calderone
2017-01-24 01:10:56 UTC
Permalink
Post by Craig Rodrigues
If I run the test on Python 2, I don't get the error, and on line 93, brdicts is a dict.
However, if I run the test on Python 3, brdicts is a Deferred, thus the error.
This really ought to be impossible. If I were seeing this,
single-stepping through inlineCallbacks in pudb would probably be my next
step. Have you made any attempts to simplify it down to remove some of the
buildbot-specific code?
-glyph
I did some more debugging. The callstack is quite deep. :/
trial buildbot.test.unit.test_process_buildrequestdistributor.
TestMaybeStartBuilds.test_slow_db
I found in this function in buildbot's BuildRequestsEndpoint.get()
function ( https://github.com/buildbot/buildbot/blob/master/master/
* defer.returnValue( [(yield self.db2data(br)) for br in
buildrequests])*
an object list
[{'buildrequestid': 10, 'complete': False, 'waited_for': False,
'claimed_at': None, 'results': -1, 'claimed': False, 'buildsetid': 11,
'complete_at': None, 'submitted_at': datetime.datetime(1970, 1, 2, 12, 6,
40, tzinfo=tzutc()), 'builderid': 77, 'claimed_by_masterid': None,
11, 'complete_at': None, 'submitted_at': datetime.datetime(1970, 1, 2, 13,
30, tzinfo=tzutc()), 'builderid': 77, 'claimed_by_masterid': None,
'priority': 0}]
an object <generator object BuildRequestsEndpoint.get.<
locals>.<listcomp>
of type <class 'generator'>
Yep.

On Python 2, [(yield self.db2data(br)) for br in buildrequests] is a list
comprehension. It will have len(buildrequests) elements and each will be
the value sent back in to the generator via the yield expression.

On Python 3, the same expression is a list of one element which is a
generator expression.

This form is much clearer, I think, and behaves as intended on both
versions of Python:

results = []
for br in buildrequests:
results.append((yield self.db2data(br)))
defer.returnValue(results)

Jean-Paul
Craig Rodrigues
2017-01-24 20:48:07 UTC
Permalink
On Mon, Jan 23, 2017 at 5:10 PM, Jean-Paul Calderone <
Post by Jean-Paul Calderone
Post by Craig Rodrigues
I did some more debugging. The callstack is quite deep. :/
trial buildbot.test.unit.test_process_buildrequestdistributor.Tes
tMaybeStartBuilds.test_slow_db
I found in this function in buildbot's BuildRequestsEndpoint.get()
function ( https://github.com/buildbot/buildbot/blob/master/master/buil
* defer.returnValue( [(yield self.db2data(br)) for br
in buildrequests])*
an object list
[{'buildrequestid': 10, 'complete': False, 'waited_for': False,
'claimed_at': None, 'results': -1, 'claimed': False, 'buildsetid': 11,
'complete_at': None, 'submitted_at': datetime.datetime(1970, 1, 2, 12, 6,
40, tzinfo=tzutc()), 'builderid': 77, 'claimed_by_masterid': None,
11, 'complete_at': None, 'submitted_at': datetime.datetime(1970, 1, 2, 13,
30, tzinfo=tzutc()), 'builderid': 77, 'claimed_by_masterid': None,
'priority': 0}]
an object <generator object BuildRequestsEndpoint.get.<loc
als>.<listcomp>
of type <class 'generator'>
Yep.
On Python 2, [(yield self.db2data(br)) for br in buildrequests] is a list
comprehension. It will have len(buildrequests) elements and each will be
the value sent back in to the generator via the yield expression.
On Python 3, the same expression is a list of one element which is a
generator expression.
This form is much clearer, I think, and behaves as intended on both
results = []
results.append((yield self.db2data(br)))
defer.returnValue(results)
Wow, thanks! That fixed it. I submitted those fixes upstream to buildbot.

I'm not so familiar with generator expressions and Deferred's so have been
playing
around with things to try to understand.

I went back to the problem code, and changed:


* defer.returnValue( [(yield self.db2data(br)) for br in
buildrequests])*

to:

* defer.returnValue(*
* list([**(yield self.db2data(br)) for br in buildrequests])*

and ran the code under Python 3 and got a return value of:

[ Deferred ]

instead of:
[ {......} ]

where the Deferred.results value was the dictionary.

How does Python 2 directly go to returning the dictionary out of the
Deferred,
while Python 3 returns the Deferred inside the list? self.db2data()
returns a Deferred.

Thanks.

--
Craig
Glyph Lefkowitz
2017-01-24 21:14:32 UTC
Permalink
Post by Craig Rodrigues
I did some more debugging. The callstack is quite deep. :/
trial buildbot.test.unit.test_process_buildrequestdistributor.TestMaybeStartBuilds.test_slow_db
defer.returnValue(
[(yield self.db2data(br)) for br in buildrequests])
an object list
[{'buildrequestid': 10, 'complete': False, 'waited_for': False, 'claimed_at': None, 'results': -1, 'claimed': False, 'buildsetid': 11, 'complete_at': None, 'submitted_at': datetime.datetime(1970, 1, 2, 12, 6, 40, tzinfo=tzutc()), 'builderid': 77, 'claimed_by_masterid': None, 'priority': 0}, {'buildrequestid': 11, 'complete': False, 'waited_for': False, 'claimed_at': None, 'results': -1, 'claimed': False, 'buildsetid': 11, 'complete_at': None, 'submitted_at': datetime.datetime(1970, 1, 2, 13, 30, tzinfo=tzutc()), 'builderid': 77, 'claimed_by_masterid': None, 'priority': 0}]
an object <generator object BuildRequestsEndpoint.get.<locals>.<listcomp>
of type <class 'generator'>
Yep.
On Python 2, [(yield self.db2data(br)) for br in buildrequests] is a list comprehension. It will have len(buildrequests) elements and each will be the value sent back in to the generator via the yield expression.
On Python 3, the same expression is a list of one element which is a generator expression.
results = []
results.append((yield self.db2data(br)))
defer.returnValue(results)
Wow, thanks! That fixed it. I submitted those fixes upstream to buildbot.
I'm not so familiar with generator expressions and Deferred's so have been playing
around with things to try to understand.
defer.returnValue(
[(yield self.db2data(br)) for br in buildrequests])
defer.returnValue(
list([(yield self.db2data(br)) for br in buildrequests])
[ Deferred ]
[ {......} ]
where the Deferred.results value was the dictionary.
How does Python 2 directly go to returning the dictionary out of the Deferred,
while Python 3 returns the Deferred inside the list? self.db2data() returns a Deferred.
I've encountered this before and quickly worked around it, but I think this might actually be a bug in python 3, or at least its documentation.
Post by Craig Rodrigues
[(yield 1) for x in range(10)]
<generator object <listcomp> at 0x10cd210f8>

That is a comprehension enclosed in square brackets, but I'm getting a generator object out rather than a list.

Is there a PEP for this that just hasn't updated the language reference?

-glyph
Glyph Lefkowitz
2017-01-24 21:17:25 UTC
Permalink
I had intended to paste the reference, which is here: https://docs.python.org/2/reference/expressions.html#list-displays <https://docs.python.org/2/reference/expressions.html#list-displays>

-glyph
Phil Mayers
2017-01-25 18:06:59 UTC
Permalink
Post by Glyph Lefkowitz
I've encountered this before and quickly worked around it, but I think
this might actually be a bug in python 3, or at least its documentation.
The language docs officially say that a "list display" (which is what I
believe we're looking at here) "yields a new list object". But that is
[(yield 1) for x in range(10)]
Related, see:

http://stackoverflow.com/questions/32139885/yield-in-list-comprehensions-and-generator-expressions

http://bugs.python.org/issue10544

Basically, don't use yield inside comprehensions if you don't want
weirdness, AFAICT :o/
Craig Rodrigues
2017-01-26 09:45:37 UTC
Permalink
Post by Phil Mayers
http://stackoverflow.com/questions/32139885/yield-in-list-
comprehensions-and-generator-expressions
http://bugs.python.org/issue10544
Basically, don't use yield inside comprehensions if you don't want
weirdness, AFAICT :o/
Yes, that's the exact issue!
Pointed out to me on the python-dev list:

https://mail.python.org/pipermail/python-dev/2017-January/147242.html

My vote is that this is a bug in Python.

--
Craig
Glyph Lefkowitz
2017-01-26 22:34:04 UTC
Permalink
http://stackoverflow.com/questions/32139885/yield-in-list-comprehensions-and-generator-expressions <http://stackoverflow.com/questions/32139885/yield-in-list-comprehensions-and-generator-expressions>
http://bugs.python.org/issue10544 <http://bugs.python.org/issue10544>
Basically, don't use yield inside comprehensions if you don't want weirdness, AFAICT :o/
Yes, that's the exact issue!
https://mail.python.org/pipermail/python-dev/2017-January/147242.html <https://mail.python.org/pipermail/python-dev/2017-January/147242.html>
My vote is that this is a bug in Python.
Discussing on that bug, I discovered that this is sort-of fixed in python 3.6; not for generators, but for coroutines. You can have an 'async def' function that does 'await' inside a generator and it should work.

Note that this is new in 3.6, and doesn't work in 3.5.

-glyph

Craig Rodrigues
2017-01-26 09:41:51 UTC
Permalink
Post by Craig Rodrigues
On Mon, Jan 23, 2017 at 5:10 PM, Jean-Paul Calderone <
Post by Jean-Paul Calderone
Post by Craig Rodrigues
I did some more debugging. The callstack is quite deep. :/
trial buildbot.test.unit.test_process_buildrequestdistributor.Tes
tMaybeStartBuilds.test_slow_db
I found in this function in buildbot's BuildRequestsEndpoint.get()
function ( https://github.com/buildbot/buildbot/blob/master/master/buil
* defer.returnValue( [(yield self.db2data(br)) for br
in buildrequests])*
an object list
[{'buildrequestid': 10, 'complete': False, 'waited_for': False,
'claimed_at': None, 'results': -1, 'claimed': False, 'buildsetid': 11,
'complete_at': None, 'submitted_at': datetime.datetime(1970, 1, 2, 12, 6,
40, tzinfo=tzutc()), 'builderid': 77, 'claimed_by_masterid': None,
11, 'complete_at': None, 'submitted_at': datetime.datetime(1970, 1, 2, 13,
30, tzinfo=tzutc()), 'builderid': 77, 'claimed_by_masterid': None,
'priority': 0}]
an object <generator object BuildRequestsEndpoint.get.<loc
als>.<listcomp>
of type <class 'generator'>
Yep.
On Python 2, [(yield self.db2data(br)) for br in buildrequests] is a list
comprehension. It will have len(buildrequests) elements and each will be
the value sent back in to the generator via the yield expression.
On Python 3, the same expression is a list of one element which is a
generator expression.
This form is much clearer, I think, and behaves as intended on both
results = []
results.append((yield self.db2data(br)))
defer.returnValue(results)
Wow, thanks! That fixed it. I submitted those fixes upstream to buildbot.
I'm not so familiar with generator expressions and Deferred's so have
been playing
around with things to try to understand.
* defer.returnValue( [(yield self.db2data(br)) for br in
buildrequests])*
* defer.returnValue(*
* list([**(yield self.db2data(br)) for br in buildrequests])*
[ Deferred ]
[ {......} ]
where the Deferred.results value was the dictionary.
How does Python 2 directly go to returning the dictionary out of the
Deferred,
while Python 3 returns the Deferred inside the list? self.db2data()
returns a Deferred.
Thanks.
--
Craig
On the python-dev mailing list, Joe Jevnik gave a really excellent
explanation of what
is going on here, down to the Python bytecode level:

https://mail.python.org/pipermail/python-dev/2017-January/147244.html


--
Craig
Loading...