Discussion:
[Twisted-Python] adbapi.ConnectionPool - reconnecting to postgresql
Jonathan Vanasco
2014-06-10 18:24:24 UTC
Permalink
I had to restart postgresql , and that triggered some endless errors in my logs.

i don't have anything that can handle a suddenly dropped / resumed connection.

does anyone have a recommendation for trying to reconnect or handle this in general

from what I can tell from my logs...

When I catch an error, my cleanup code tries to

self._connection.rollback()

which raises

psycopg2.InterfaceError: connection already closed

My immediate thoughts are:

1. catching the correct disconnect error
2. telling the connection/pool to reconnect
Phil Mayers
2014-06-10 19:02:23 UTC
Permalink
Post by Jonathan Vanasco
I had to restart postgresql , and that triggered some endless errors in my logs.
i don't have anything that can handle a suddenly dropped / resumed connection.
does anyone have a recommendation for trying to reconnect or handle this in general
from what I can tell from my logs...
When I catch an error, my cleanup code tries to
self._connection.rollback()
which raises
psycopg2.InterfaceError: connection already closed
Yes, this is irritating.

Related tickets see:

http://twistedmatrix.com/trac/ticket/4404
http://twistedmatrix.com/trac/ticket/4964

You absolutely want to be setting cp_reconnect=1 on ConnectionPool.
Anything else is asking for major trouble, and I really think this
should be the default.

cp_reconnect will not avoid all problems - for a thread pool of size N,
the next N transactions will have to fail to throw away all the dead
connections - but it might stop them happing forever.

In some other, non-Twisted contexts, people have had problem even with
reconnecting, because some versions of some DBAPI drivers emit a
connection from .close() if it's already closed, which interrupts the
reconnection logic :o(

It seems your version of psycopg is doing this - how old is it? I
thought that was fixed these days?

You might also look at something like pgbouncer, which you can configure
to try a "good" SQL query on the outbound connections before handing to
an inbound client; this probes the connections. It adds latency, but
personally I would far rather pay the small number of milliseconds on
"select 1" than bomb out to the upper layers.

An annoying problem.
Dustin J. Mitchell
2014-06-11 01:02:32 UTC
Permalink
SQLAlchemy makes this kind of pessimistic reconnection pretty easy; we
do it for MySQL in
https://github.com/buildbot/buildbot/blob/master/master/buildbot/db/enginestrategy.py
I'm not sure why we don't do it for Postgres, but we've never heard a
complaint from a postgres user, so perhaps other layers are handling
it.

In general, I consider this a requirement for a production-ready
database layer, along with correctly setting SO_KEEPALIVE.

Dustin
Post by Phil Mayers
Post by Jonathan Vanasco
I had to restart postgresql , and that triggered some endless errors in my logs.
i don't have anything that can handle a suddenly dropped / resumed connection.
does anyone have a recommendation for trying to reconnect or handle this in general
from what I can tell from my logs...
When I catch an error, my cleanup code tries to
self._connection.rollback()
which raises
psycopg2.InterfaceError: connection already closed
Yes, this is irritating.
http://twistedmatrix.com/trac/ticket/4404
http://twistedmatrix.com/trac/ticket/4964
You absolutely want to be setting cp_reconnect=1 on ConnectionPool. Anything
else is asking for major trouble, and I really think this should be the
default.
cp_reconnect will not avoid all problems - for a thread pool of size N, the
next N transactions will have to fail to throw away all the dead connections
- but it might stop them happing forever.
In some other, non-Twisted contexts, people have had problem even with
reconnecting, because some versions of some DBAPI drivers emit a connection
from .close() if it's already closed, which interrupts the reconnection
logic :o(
It seems your version of psycopg is doing this - how old is it? I thought
that was fixed these days?
You might also look at something like pgbouncer, which you can configure to
try a "good" SQL query on the outbound connections before handing to an
inbound client; this probes the connections. It adds latency, but personally
I would far rather pay the small number of milliseconds on "select 1" than
bomb out to the upper layers.
An annoying problem.
_______________________________________________
Twisted-Python mailing list
http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
Glyph
2014-06-11 05:23:56 UTC
Permalink
You might want to just try using SQLAlchemy via <https://pypi.python.org/pypi/alchimia> if you don't want to do your own thread-pool management as buildbot appears to be.

-glyph
Post by Dustin J. Mitchell
SQLAlchemy makes this kind of pessimistic reconnection pretty easy; we
do it for MySQL in
https://github.com/buildbot/buildbot/blob/master/master/buildbot/db/enginestrategy.py
I'm not sure why we don't do it for Postgres, but we've never heard a
complaint from a postgres user, so perhaps other layers are handling
it.
In general, I consider this a requirement for a production-ready
database layer, along with correctly setting SO_KEEPALIVE.
Dustin
Post by Phil Mayers
Post by Jonathan Vanasco
I had to restart postgresql , and that triggered some endless errors in my logs.
i don't have anything that can handle a suddenly dropped / resumed connection.
does anyone have a recommendation for trying to reconnect or handle this in general
from what I can tell from my logs...
When I catch an error, my cleanup code tries to
self._connection.rollback()
which raises
psycopg2.InterfaceError: connection already closed
Yes, this is irritating.
http://twistedmatrix.com/trac/ticket/4404
http://twistedmatrix.com/trac/ticket/4964
You absolutely want to be setting cp_reconnect=1 on ConnectionPool. Anything
else is asking for major trouble, and I really think this should be the
default.
cp_reconnect will not avoid all problems - for a thread pool of size N, the
next N transactions will have to fail to throw away all the dead connections
- but it might stop them happing forever.
In some other, non-Twisted contexts, people have had problem even with
reconnecting, because some versions of some DBAPI drivers emit a connection
from .close() if it's already closed, which interrupts the reconnection
logic :o(
It seems your version of psycopg is doing this - how old is it? I thought
that was fixed these days?
You might also look at something like pgbouncer, which you can configure to
try a "good" SQL query on the outbound connections before handing to an
inbound client; this probes the connections. It adds latency, but personally
I would far rather pay the small number of milliseconds on "select 1" than
bomb out to the upper layers.
An annoying problem.
_______________________________________________
Twisted-Python mailing list
http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
_______________________________________________
Twisted-Python mailing list
http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
Jan Urbański
2014-06-11 08:01:47 UTC
Permalink
Post by Glyph
You might want to just try using SQLAlchemy via
<https://pypi.python.org/pypi/alchimia> if you don't want to do your own
thread-pool management as buildbot appears to be.
Or you can go threadless and use txpostgres, which now has automatic
reconnection handling (dislaimer: I'm the author):

http://txpostgres.readthedocs.org/en/latest/usage.html#automatic-reconnection

Jan
Jonathan Vanasco
2014-06-11 17:21:25 UTC
Permalink
Post by Glyph
You might want to just try using SQLAlchemy via <https://pypi.python.org/pypi/alchimia> if you don't want to do your own thread-pool management as buildbot appears to be.
thanks.

i think we might need to go the buildbot route.

we mainly use SqlAlchemy ; our twisted apps are under 5% of our code.

alchimia looks to be headed in the direction of what we always wanted ( to get sqlalchemy in twisted ), but not quite there yet. i think it would be too confusing right now to have 2 separate sqlalchemy packages in-house.
Glyph
2014-06-12 05:13:14 UTC
Permalink
Post by Jonathan Vanasco
Post by Glyph
You might want to just try using SQLAlchemy via <https://pypi.python.org/pypi/alchimia> if you don't want to do your own thread-pool management as buildbot appears to be.
thanks.
i think we might need to go the buildbot route.
we mainly use SqlAlchemy ; our twisted apps are under 5% of our code.
alchimia looks to be headed in the direction of what we always wanted ( to get sqlalchemy in twisted ), but not quite there yet. i think it would be too confusing right now to have 2 separate sqlalchemy packages in-house.
alchimia is not a re-implementation or a separate sqlalchemy package. It *is* sqlalchemy; it's just a slight extension for integration sqlalchemy operations with the Twisted main loop.

-glyph
Jonathan Vanasco
2014-06-12 18:35:11 UTC
Permalink
Post by Glyph
alchimia is not a re-implementation or a separate sqlalchemy package. It *is* sqlalchemy; it's just a slight extension for integration sqlalchemy operations with the Twisted main loop.
Yeah, I get that. I'm really excited for it.

I use the declarative method (orm) for setting up the structure -- even when not using the ORM / Session. It's just easier to write classes and maintain a single package for ORM & non-orm usage by using the ORM to set up the metadata.

there's an unaddressed ticket about that in the alchemia package.

i also noticed the line "Some methods simply haven't been implemented yet. If you file a bug, we'll implement them! "

looking through the source and docs, there seemed to be no savepoint/begin_nested support yet.

this looks like the package I want in the future -- i'm just uneasy about what might/might not be supported, on top of the migration work.

the buildbot option looks like a better alternative for now -- we don't incur much more technical debt, and free up more time for a proper solution later.
Glyph
2014-06-13 07:27:23 UTC
Permalink
Post by Jonathan Vanasco
this looks like the package I want in the future -- i'm just uneasy about what might/might not be supported, on top of the migration work.
This is understandable; alchimia is definitely a bit raw. I'd just encourage you to actually file those bugs and participate in its community. Unless some interested parties like you actually start doing that, it will just languish in this state forever :).

-glyph

Loading...