The Theo Spears Blog

Blogging Considered Harmful (Considered Harmful)?

PlanetPlanet with HTTPS feeds

First posted 2008-12-25 22:17:00.000003+00:00

PlanetPlanet is a popular RSS aggregator written in Python, which I use to maintain Planet HashPHP. Unfortunately, as Andy Millar recently discovered, it does not support https URLs by default. This is because planetplanet wraps the Python socket library so that connection requests time out after a reasonable interval (20 seconds by default). This wrapper, the timeoutsocket library, is incompatible with the Python ssl() call used for https connections. There are two possible solutions.

Disable the timeoutsocket library

If you disable socket timeouts, planetplanet does not load the timeoutsocket library, so everything works fine. To do this, add 'feed_timeout = 0' to the [Planet] section of your configuration file.

Advantages: Shouldn't break anything, requires no source changes

Disadvantages: If any of the hosts you aggregate goes down, planet will take a long time to run, because nothing cuts the stalled connection short
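To make the setting concrete, here is a minimal sketch of a [Planet] section with the timeout disabled, parsed with Python's standard configparser module. This is only an illustration, not planetplanet's own loading code: the 'name' value is a placeholder, the variable names are made up, and it uses the modern (Python 3) module name.

```python
import configparser

# Illustrative config text. Only the [Planet] section and feed_timeout
# come from the post; the name value is a placeholder.
CONFIG_TEXT = """
[Planet]
name = Example Planet
feed_timeout = 0
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)

# A timeout of 0 means "no timeout", so the timeout wrapper is not needed.
feed_timeout = parser.getfloat("Planet", "feed_timeout")
use_timeout_wrapper = feed_timeout > 0
print("load timeout wrapper:", use_timeout_wrapper)
```

With any positive value instead of 0, use_timeout_wrapper would come out true, which corresponds to the default behaviour that causes the https problem.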

Patch the timeoutsocket library

It's straightforward to hack the timeoutsocket library to work with SSL. Edit your planet/timeoutsocket.py file. The end of the file should look like this:

# end TimeoutFile

#
# Silently replace the socket() builtin function with
# our timeoutsocket() definition.
#
if not hasattr(socket, "_no_timeoutsocket"):
    socket._no_timeoutsocket = socket.socket
    socket.socket = timeoutsocket
del socket
socket = timeoutsocket
#   Finis

Change it to look like this:

# end TimeoutFile

#
# Also use a replacement ssl function to unwrap the object
#
if hasattr(socket,'ssl'):
    _realssl = socket.ssl
    def ssl(sock, keyfile=None, certfile=None):
        if hasattr(sock, "_sock"):
            sock = sock._sock
        return _realssl(sock, keyfile, certfile)

#
# Silently replace the socket() builtin function with
# our timeoutsocket() definition.
#
if not hasattr(socket, "_no_timeoutsocket"):
    socket._no_timeoutsocket = socket.socket
    socket.socket = timeoutsocket
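    # Only install the wrapper if it was defined above, i.e. if this
    # Python build has SSL support; otherwise 'ssl' would be undefined.
    if hasattr(socket, 'ssl'):
        socket.ssl = ssl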
del socket
socket = timeoutsocket
#   Finis

Advantages: You keep timeout support when a host goes down, and it seems to work well for me

Disadvantages: This may have side effects I don't fully understand, so it could break horribly
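The unwrapping idea in the patch can be tried in isolation. The sketch below uses stand-in classes rather than real sockets, and it assumes (as the patch suggests, but I have not verified) that the underlying SSL call needs the raw socket object rather than the timeout wrapper. RawSocket, WrappedSocket and real_ssl are hypothetical names for illustration only.

```python
class RawSocket(object):
    """Stand-in for the low-level socket object the SSL layer expects."""


class WrappedSocket(object):
    """Stand-in for a timeout wrapper that keeps the real socket in _sock."""

    def __init__(self, raw):
        self._sock = raw


def real_ssl(sock, keyfile=None, certfile=None):
    # Stand-in for the real SSL call: assumed to accept only raw sockets.
    if not isinstance(sock, RawSocket):
        raise TypeError("expected a raw socket")
    return ("ssl", sock)


def ssl(sock, keyfile=None, certfile=None):
    # Same shape as the patch: unwrap if there is a _sock, then delegate.
    if hasattr(sock, "_sock"):
        sock = sock._sock
    return real_ssl(sock, keyfile, certfile)


raw = RawSocket()
wrapped = WrappedSocket(raw)

# Both calls reach real_ssl with the raw socket.
print(ssl(wrapped)[1] is raw)
print(ssl(raw)[1] is raw)
```

Calling real_ssl(wrapped) directly raises TypeError in this sketch, which mirrors the kind of failure the replacement ssl function is meant to avoid.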