Discussion:
[Twisted-Python] Windows spawnProcess - Child process inherits files and sockets (POSIX does not)
exvito here
2016-02-22 03:14:06 UTC
Permalink
Hello all,

I filed https://twistedmatrix.com/trac/ticket/7970 last summer while
focused on a particular project that was hit by it. Since then I had to
attend to other things and I'm now back to refocusing on that project.

The subject says most of it, but the ticket has more information, including
why this isn't an issue on Python >= 3.4 ( my project is on 2.7...) :/

Before submitting a patch for review, I'm looking for preliminary feedback,
assuming you agree that the Windows vs POSIX semantics should be the same
(if not, why?).

My patch calls a few Windows APIs via ctypes, however, as far as I can
tell, Twisted on Windows requires pywin32 and, recently, there has been
some discussion around dropping that dependency and moving towards
something based on cffi.

What would you say the way forward is? Should I submit the patch for review
anyway? Is there any other work that needs to be done first that I may
contribute to?

Thanks in advance for Twisted and for your feedback,
--
exvito
Glyph
2016-03-31 21:27:08 UTC
Permalink
Post by exvito here
Hello all,
I filed https://twistedmatrix.com/trac/ticket/7970 <https://twistedmatrix.com/trac/ticket/7970> last summer while focused on a particular project that was hit by it. Since then I had to attend to other things and I'm now back to refocusing on that project.
Thanks for following up. Sorry I've taken so long to respond; this is a highly complex issue that is tempting to just punt on thinking about. But we really should address it, and thank you for writing a great bug report.
Post by exvito here
Before submitting a patch for review, I'm looking for preliminary feedback, assuming you agree that the Windows vs POSIX semantics should be the same (if not, why?).
After much thought: Yes. They should be the same. The reason they're not is largely ignorance of the relevant APIs and abstractions on Windows, not any desire that they differ. The one place they have to differ a little bit is handle inheritance: we need to figure out some way to express the 'childFDs' mapping in terms of both file descriptors and handles.
Post by exvito here
My patch calls a few Windows APIs via ctypes, however, as far as I can tell, Twisted on Windows requires pywin32 and, recently, there has been some discussion around dropping that dependency and moving towards something based on cffi.
ctypes is dangerous and error-prone. If you screw up the bit-width of a type, or the type in a header changes on some future version, ctypes gives you no way of finding out until some poor user's process segfaults, and usually not at the relevant call site. So we'd prefer not to maintain more ctypes-using code.

The APIs in pywin32 very closely mirror the underlying Windows API, so for addressing this issue, please just go ahead and use pywin32 APIs; porting them to a new API along with everything else should be relatively straightforward.

If we do move forward with that change, we will probably use https://pypi.python.org/pypi/pywincffi and not move anything within Twisted.
Post by exvito here
What would you say the way forward is? Should I submit the patch for review anyway? Is there any other work that needs to be done first that I may contribute to?
Yes, just go ahead and write the patch.
Post by exvito here
Thanks in advance for Twisted and for your feedback,
Thanks for using Twisted, thanks doubly for trying to contribute, and very sorry for the slow response.

-glyph
exvito here
2016-04-05 21:11:19 UTC
Permalink
Post by exvito here
Before submitting a patch for review, I'm looking for preliminary
feedback, assuming you agree that the Windows vs POSIX semantics should be
the same (if not, why?).
After much thought: Yes. They should be the same. The reason they're not
is largely ignorance of the relevant APIs and abstractions on Windows, not
any desire that they differ. The one place they have to differ a little
bit is handle inheritance: we need to figure out some way to express the
'childFDs' mapping in terms of both file descriptors and handles.
Agreed. My code does not go into 'childFDs' mapping territory, though. All
it does is prevent all file- and socket-handles from being inherited by the
child process -- other than the pipes used for STDIO communication.

My patch calls a few Windows APIs via ctypes, however, as far as I can
Post by exvito here
tell, Twisted on Windows requires pywin32 and, recently, there has been
some discussion around dropping that dependency and moving towards
something based on cffi.
ctypes is dangerous and error-prone. If you screw up the bit-width of a
type, or the type in a header changes on some future version, ctypes gives
you no way of finding out until some poor user's process segfaults, and
usually not at the relevant call site. So we'd prefer not to maintain more
ctypes-using code.
The APIs in pywin32 very closely mirror the underlying Windows API, so for
addressing this issue, please just go ahead and use pywin32 APIs; porting
them to a new API along with everything else should be relatively
straightforward.
If we do move forward with that change, we will probably use
https://pypi.python.org/pypi/pywincffi and not move anything within
Twisted.
Agreed with your ctypes comment -- I've been hit by such faults which
"magically" went away using cffi when coding against Windows TAPI.

pywin32, unfortunatelly, does not include two Windows APIs (out of four) my
code requires -- I just grepped the source for latest release I could find
on SourceForge, 220.

For completeness, the missing APIs are NtQuerySystemInformation [1] and
NtQueryObject [2].
The others are GetHandleInformation [3] and SetHandleInformation [4].

What would you say the way forward is? Should I submit the patch for review
Post by exvito here
anyway? Is there any other work that needs to be done first that I may
contribute to?
Yes, just go ahead and write the patch.
Given that pywin32 does not provide two of the required APIs, maybe this
issue is somewhat blocked.

Adding to that is the fact that one particular API call in my code --
NtQuerySystemInformation [1] -- is being used with what seems to be an
undocumented option -- SystemHandleInformation (enum = 16) -- and
returning, again, an apparently undocumented data structure --
SYSTEM_HANDLE_INFORMATION. I downloaded and installed the available SDKs
and WDKs (driver dev kits) from Microsoft and could not find any reference
to those particular options or data structures.

My code was created after much investigation on how to obtain the list of
open handles for the current process.
The gist of it is:
- Call NtQuerySystemInformation with the SystemInformationClass arg set
to SystemHandleInformation.
- This returns all (!!!) of the handles in the system (no need for special
privileges).
- Filter those out by the current process PID and type, such that only
files and sockets are left.
- Use the GetHandleInformation to get the inheritance flag and clear it
with SetHandleInformation if needed.

It is mostly based on SysInternals information at
http://forum.sysinternals.com/howto-enumerate-handles_topic18892.html.
There are many other references across the web to those undocumented
options and data structures. But none of those come from formal Microsoft
documents, that I could find.

An alternative approach, which I also tried, for completeness sake, was to
try and Get/SetHandleInformation on all possible handles -- this is
completely unfeasible given that handles are at least 32 bit numbers.

After all of this -- including some frustration, I confess -- I decided to
go ahead and create a cffi ABI-mode variation of my original patch, anyway:
it passes the same tests and, much like the ctypes approach, works nicely
on my environment: Win 7 32, Win 2008R2 64, Win XP 32 (!!!), Python 2.7.11
32, Twisted 16.1, cffi 1.5.2.

Just for kicks I compared the performance of the ctypes vs cffi
implementation:
- The ctypes code runs in 0.014s - 0.016s.
- The cffi code runs in 0.03s - 0.04s.

This makes sense given that the code is mostly calling out to DLLs and,
AFAICT, cffi does the nice extra work of validating/converting types back
and forth.


Wrapping up: I'm really not sure how to more forward with this: not only
does pywin32 not provide the needed APIs, but also one of those APIs --
documented -- is being used in an undocumented fashion.

Even though I'd love to submit a patch, I don't think we're at that point
yet. However, for posterity's sake and if anyone wants to take a look at
the code, it is avalable at https://github.com/exvito/twisted in branches
win32-fix-handle-inherit-cffi and win32-fix-handle-inherit-ctypes. They add
two tests to twisted/test/test_process.py, one line to
twisted/internet/_dumbwin32proc.py and one module named
twisted/internet/_win32handleinherit.py

I'd love to hear feedback or ideas on this.
Thanks again

[1]
https://msdn.microsoft.com/en-us/library/windows/desktop/ms724509(v=vs.85).aspx
[2] https://msdn.microsoft.com/en-us/library/bb432383(v=vs.85).aspx
[3]
https://msdn.microsoft.com/en-us/library/windows/desktop/ms724329(v=vs.85).aspx
[4]
https://msdn.microsoft.com/en-us/library/windows/desktop/ms724935(v=vs.85).aspx
--
exvito
Oliver Palmer
2016-04-07 03:57:18 UTC
Permalink
Owner of pywincffi here, I'd certainly welcome a PR or two for pywincffi
with the necessary functions/constants/etc so you don't have to manage that
code and I'd be happy to help write it too. I think the consensus is
Twisted is going to eventually replace calls into pywin32 with calls into
pywincffi rather than implement all of that code inside of Twisted itself.
I've already started doing this in a couple of places,
twisted.python.lockfile
<https://github.com/twisted/twisted/compare/trunk...opalmer:windows-cffi>
being the one I'm actively working on because it's simpler to start with.
But the code inside of dumbwin32proc.py and _win32handleinherit.py are both
high on my list to convert too so it probably makes sense that we work on
this together if you're open to it.

Anyway as for some of your questions and points I've tried to reply to some
of them below. I will admit, it's slanted towards making as much of this a
possible a part of pywincffi having plugged away at the 'what to do about
pywin32' and 'how to make Windows API calls more pythonic' quite a bit now
so keep that in mind.
Post by exvito here
After all of this -- including some frustration, I confess -- I decided
to go ahead and create a cffi ABI-mode variation of my original patch,
anyway: it passes the same tests and, much like the ctypes approach, works
nicely on my environment: Win 7 32, Win 2008R2 64, Win XP 32 (!!!), Python
2.7.11 32, Twisted 16.1, cffi 1.5.2.

Looking at your code, some of it could be put into pywincffi already. It
would need more tests and some additional code so the API calls are closer
to what's already in the project (type checking, default arguments,
documentation, etc) but overall it seems like you've already done the major
work of understanding how it all fits together. The other advantages of
putting this code into pywincffi is testing and releases are easier because
the project is using AppVeyor to test all PRs and build the wheel files for
most major Python versions including both 32 and 64 bit variants. From
Twisted's perspective, it's just a dependency on another library.
Post by exvito here
Just for kicks I compared the performance of the ctypes vs cffi
implementation: ...

Have you tried a comparison between out-of-line modules and those using
dlopen? I imagine they'd end up being pretty similar in the end
performance wise but I am a little curious. In pywincffi I started out
using dlopen but moved away from it because I needed to write some extra
code which couldn't be included from a header. The other advantage I saw
is that you don't have to rely on the DLL being present and/or Windows
being able to locate it so you can include code which might only
be available if you have some extra library installed.
Post by exvito here
This makes sense given that the code is mostly calling out to DLLs and,
AFAICT, cffi does the nice extra work of validating/converting types back
and forth.

It usually does handle type conversion nicely but it can also do odd things
depending on the Python version (like bytes vs. unicode vs. strings) if
you're not careful. The other issue is user input which can sometimes
result in odd error messages at the C-level that are difficult to
understand or correct without specific knowledge of the underlying code.
In non-public APIs this is not as big of an issue because the code is task
specific but in something like pywincffi or Twisted where people can use
APIs in unexpected ways the automatic conversion/validation can be somewhat
lacking (again, something I wanted to centralize and improve in pywincffi).
Post by exvito here
It is mostly based on SysInternals information at
http://forum.sysinternals.com/howto-enumerate-handles_topic18892.html.
There are many other references across the web to those undocumented
options and data structures. But none of those come from formal Microsoft
documents, that I could find.

Some of the people on that site either have contacts within Microsoft or
have worked for Microsoft at one point so I usually trust what's there if
it's the only source. The other place I often look is the ReactOS project
where they've managed to reverse engineer quite a bit of the Windows kernel
which can either hint at the info you need or validate what you already
know.
Post by exvito here
Wrapping up: I'm really not sure how to more forward with this: not only
does pywin32 not provide the needed APIs, but also one of those APIs --
documented -- is being used in an undocumented fashion.

IMHO (again, with some bias), I think implementing the calls you need in
pywincffi is the first step. If the calls are undocumented it would also
be a good place to do the necessary research, testing and development I
think in isolation from Twisted itself so it's clear we're going in the
right direction. Once that's done a patch set for Twisted, which calls
into pywincffi, can be opened and tested across the supported platforms.
This makes the patch set smaller but also makes it easier to understand
what if anything the new code breaks.


Regardless, even if you don't want to go the route of putting this into
pywincffi thanks for working on this because it helps in some of the work
I'm doing too.

---Oliver
Post by exvito here
Post by exvito here
Before submitting a patch for review, I'm looking for preliminary
feedback, assuming you agree that the Windows vs POSIX semantics should be
the same (if not, why?).
After much thought: Yes. They should be the same. The reason they're
not is largely ignorance of the relevant APIs and abstractions on Windows,
not any desire that they differ. The one place they have to differ a
little bit is handle inheritance: we need to figure out some way to express
the 'childFDs' mapping in terms of both file descriptors and handles.
Agreed. My code does not go into 'childFDs' mapping territory, though. All
it does is prevent all file- and socket-handles from being inherited by the
child process -- other than the pipes used for STDIO communication.
My patch calls a few Windows APIs via ctypes, however, as far as I can
Post by exvito here
tell, Twisted on Windows requires pywin32 and, recently, there has been
some discussion around dropping that dependency and moving towards
something based on cffi.
ctypes is dangerous and error-prone. If you screw up the bit-width of a
type, or the type in a header changes on some future version, ctypes gives
you no way of finding out until some poor user's process segfaults, and
usually not at the relevant call site. So we'd prefer not to maintain more
ctypes-using code.
The APIs in pywin32 very closely mirror the underlying Windows API, so
for addressing this issue, please just go ahead and use pywin32 APIs;
porting them to a new API along with everything else should be relatively
straightforward.
If we do move forward with that change, we will probably use
https://pypi.python.org/pypi/pywincffi and not move anything within
Twisted.
Agreed with your ctypes comment -- I've been hit by such faults which
"magically" went away using cffi when coding against Windows TAPI.
pywin32, unfortunatelly, does not include two Windows APIs (out of four)
my code requires -- I just grepped the source for latest release I could
find on SourceForge, 220.
For completeness, the missing APIs are NtQuerySystemInformation [1] and
NtQueryObject [2].
The others are GetHandleInformation [3] and SetHandleInformation [4].
What would you say the way forward is? Should I submit the patch for
Post by exvito here
review anyway? Is there any other work that needs to be done first that I
may contribute to?
Yes, just go ahead and write the patch.
Given that pywin32 does not provide two of the required APIs, maybe this
issue is somewhat blocked.
Adding to that is the fact that one particular API call in my code --
NtQuerySystemInformation [1] -- is being used with what seems to be an
undocumented option -- SystemHandleInformation (enum = 16) -- and
returning, again, an apparently undocumented data structure --
SYSTEM_HANDLE_INFORMATION. I downloaded and installed the available SDKs
and WDKs (driver dev kits) from Microsoft and could not find any reference
to those particular options or data structures.
My code was created after much investigation on how to obtain the list of
open handles for the current process.
- Call NtQuerySystemInformation with the SystemInformationClass arg set
to SystemHandleInformation.
- This returns all (!!!) of the handles in the system (no need for special
privileges).
- Filter those out by the current process PID and type, such that only
files and sockets are left.
- Use the GetHandleInformation to get the inheritance flag and clear it
with SetHandleInformation if needed.
It is mostly based on SysInternals information at
http://forum.sysinternals.com/howto-enumerate-handles_topic18892.html.
There are many other references across the web to those undocumented
options and data structures. But none of those come from formal Microsoft
documents, that I could find.
An alternative approach, which I also tried, for completeness sake, was to
try and Get/SetHandleInformation on all possible handles -- this is
completely unfeasible given that handles are at least 32 bit numbers.
After all of this -- including some frustration, I confess -- I decided to
it passes the same tests and, much like the ctypes approach, works nicely
on my environment: Win 7 32, Win 2008R2 64, Win XP 32 (!!!), Python 2.7.11
32, Twisted 16.1, cffi 1.5.2.
Just for kicks I compared the performance of the ctypes vs cffi
- The ctypes code runs in 0.014s - 0.016s.
- The cffi code runs in 0.03s - 0.04s.
This makes sense given that the code is mostly calling out to DLLs and,
AFAICT, cffi does the nice extra work of validating/converting types back
and forth.
Wrapping up: I'm really not sure how to more forward with this: not only
does pywin32 not provide the needed APIs, but also one of those APIs --
documented -- is being used in an undocumented fashion.
Even though I'd love to submit a patch, I don't think we're at that point
yet. However, for posterity's sake and if anyone wants to take a look at
the code, it is avalable at https://github.com/exvito/twisted in branches
win32-fix-handle-inherit-cffi and win32-fix-handle-inherit-ctypes. They add
two tests to twisted/test/test_process.py, one line to
twisted/internet/_dumbwin32proc.py and one module named
twisted/internet/_win32handleinherit.py
I'd love to hear feedback or ideas on this.
Thanks again
[1]
https://msdn.microsoft.com/en-us/library/windows/desktop/ms724509(v=vs.85).aspx
[2] https://msdn.microsoft.com/en-us/library/bb432383(v=vs.85).aspx
[3]
https://msdn.microsoft.com/en-us/library/windows/desktop/ms724329(v=vs.85).aspx
[4]
https://msdn.microsoft.com/en-us/library/windows/desktop/ms724935(v=vs.85).aspx
--
exvito
_______________________________________________
Twisted-Python mailing list
http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
exvito here
2016-04-07 17:09:04 UTC
Permalink
Post by Oliver Palmer
Owner of pywincffi here, I'd certainly welcome a PR or two for pywincffi
with the necessary functions/constants/etc so you don't have to manage that
code and I'd be happy to help write it too. I think the consensus is
Twisted is going to eventually replace calls into pywin32 with calls into
pywincffi rather than implement all of that code inside of Twisted itself.
I've already started doing this in a couple of places,
twisted.python.lockfile being the one I'm actively working on because it's
simpler to start with. But the code inside of dumbwin32proc.py and
_win32handleinherit.py are both high on my list to convert too so it
probably makes sense that we work on this together if you're open to it.
Certainly. Thanks for reaching out.
Post by Oliver Palmer
Looking at your code, some of it could be put into pywincffi already. It
would need more tests and some additional code so the API calls are closer
to what's already in the project (type checking, default arguments,
documentation, etc) but overall it seems like you've already done the major
work of understanding how it all fits together. The other advantages of
putting this code into pywincffi is testing and releases are easier because
the project is using AppVeyor to test all PRs and build the wheel files for
most major Python versions including both 32 and 64 bit variants. From
Twisted's perspective, it's just a dependency on another library.
Agreed. I'll be happy to follow your guidance in that.
Post by Oliver Palmer
Post by exvito here
Just for kicks I compared the performance of the ctypes vs cffi
implementation: ...
Have you tried a comparison between out-of-line modules and those using
dlopen? I imagine they'd end up being pretty similar in the end performance
wise but I am a little curious. In pywincffi I started out using dlopen but
moved away from it because I needed to write some extra code which couldn't
be included from a header. The other advantage I saw is that you don't have
to rely on the DLL being present and/or Windows being able to locate it so
you can include code which might only be available if you have some extra
library installed.
I did not and I confess that I haven't fully grasped (yet) the
different cffi "approaches", if I may call them so.
Post by Oliver Palmer
Post by exvito here
It is mostly based on SysInternals information at
http://forum.sysinternals.com/howto-enumerate-handles_topic18892.html. There
are many other references across the web to those undocumented options and
data structures. But none of those come from formal Microsoft documents,
that I could find.
Some of the people on that site either have contacts within Microsoft or
have worked for Microsoft at one point so I usually trust what's there if
it's the only source. The other place I often look is the ReactOS project
where they've managed to reverse engineer quite a bit of the Windows kernel
which can either hint at the info you need or validate what you already
know.
AFAICT, SysInternals was bought by / integrated with Microsoft: tteir
tools are now available under Microsoft domains (example: Process
Explorer at https://technet.microsoft.com/en-us/sysinternals/processexplorer.aspx).
Post by Oliver Palmer
Post by exvito here
Wrapping up: I'm really not sure how to more forward with this: not only
does pywin32 not provide the needed APIs, but also one of those APIs --
documented -- is being used in an undocumented fashion.
IMHO (again, with some bias), I think implementing the calls you need in
pywincffi is the first step. If the calls are undocumented it would also be
a good place to do the necessary research, testing and development I think
in isolation from Twisted itself so it's clear we're going in the right
direction. Once that's done a patch set for Twisted, which calls into
pywincffi, can be opened and tested across the supported platforms. This
makes the patch set smaller but also makes it easier to understand what if
anything the new code breaks.
Sounds like a sane and safe approach.
Post by Oliver Palmer
Regardless, even if you don't want to go the route of putting this into
pywincffi thanks for working on this because it helps in some of the work
I'm doing too.
Thank you to, again. I will issue PRs against pywincffi such that the
APIs are available. They most probably won't be up to pywincffi's
standards/requirements but I expect we can cooperate under that
context and go from there.
--
exvito
exvito here
2016-04-08 12:30:19 UTC
Permalink
Post by exvito here
Thank you to, again. I will issue PRs against pywincffi such that the
APIs are available. They most probably won't be up to pywincffi's
standards/requirements but I expect we can cooperate under that
context and go from there.
For completeness, if anyone wants to follow-along or contribute, first
PR is https://github.com/opalmer/pywincffi/pull/66.

Work will continue there. Once it becomes "Twisted ready" we'll bring
the discussion back here.

Thanks to all.
--
exvito
Loading...