<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-5663931432891696372</id><updated>2011-11-27T18:30:38.117-05:00</updated><title type='text'>Python Quirks</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://pythonquirks.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://pythonquirks.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Jason</name><uri>http://www.blogger.com/profile/00489496856755184870</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://4.bp.blogspot.com/-ICrk1TFfN4s/TimsDKoczDI/AAAAAAAAAcM/6Au4EVw5h1M/s220/jrennie-defense-crop.jpg'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>25</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-5663931432891696372.post-1947041560595922158</id><published>2011-04-02T20:56:00.001-04:00</published><updated>2011-09-13T10:08:40.840-04:00</updated><title type='text'>Twisted: Asynchronous HTTP Request</title><content type='html'>&lt;p&gt;
Note that &lt;a href="http://twistedmatrix.com/documents/current/web/howto/client.html"&gt;how to make an HTTP request with Twisted is already documented&lt;/a&gt;.  But, unless you're already familiar with &lt;a href="http://twistedmatrix.com/"&gt;Twisted&lt;/a&gt;, my guess is that extending the example code to download a large number of web pages, with a limit on the number of simultaneous requests, is not easy.  Below, you'll find example code for exactly that, followed by a walk-through of the details.
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;
from pprint import pformat

from twisted.internet import reactor
import twisted.internet.defer
from twisted.internet.protocol import Protocol
from twisted.web.client import Agent
from twisted.web.http_headers import Headers

class PrinterClient(Protocol):
    def __init__(self, whenFinished):
        self.whenFinished = whenFinished

    def dataReceived(self, bytes):
        print '##### Received #####\n%s' % (bytes,)

    def connectionLost(self, reason):
        print 'Finished:', reason.getErrorMessage()
        self.whenFinished.callback(None)

def handleResponse(r):
    print "version=%s\ncode=%s\nphrase='%s'" % (r.version, r.code, r.phrase)
    for k, v in r.headers.getAllRawHeaders():
        print "%s: %s" % (k, '\n  '.join(v))
    whenFinished = twisted.internet.defer.Deferred()
    r.deliverBody(PrinterClient(whenFinished))
    return whenFinished

def handleError(reason):
    reason.printTraceback()
    reactor.stop()

def getPage(url):
    print "Requesting %s" % (url,)
    d = Agent(reactor).request('GET', url, Headers({'User-Agent': ['twisted']}), None)
    d.addCallbacks(handleResponse, handleError)
    return d

semaphore = twisted.internet.defer.DeferredSemaphore(2)
dl = list()
dl.append(semaphore.run(getPage, 'http://google.com'))
dl.append(semaphore.run(getPage, 'http://cnn.com'))
dl.append(semaphore.run(getPage, 'http://nytimes.com'))
dl = twisted.internet.defer.DeferredList(dl)
dl.addCallbacks(lambda x: reactor.stop(), handleError)

reactor.run()
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
&lt;code&gt;getPage&lt;/code&gt; handles an entire single HTTP request.  &lt;code&gt;Agent(reactor).request()&lt;/code&gt; creates an &lt;a href="http://twistedmatrix.com/documents/current/api/twisted.web.client.Agent.html"&gt;&lt;code&gt;Agent&lt;/code&gt;&lt;/a&gt; and sends the HTTP request.  &lt;code&gt;request()&lt;/code&gt; returns a &lt;a href="http://twistedmatrix.com/documents/current/api/twisted.internet.defer.Deferred.html"&gt;deferred&lt;/a&gt; which is fired when the &lt;b&gt;headers&lt;/b&gt; are retrieved.  The &lt;code&gt;addCallbacks&lt;/code&gt; line specifies that &lt;code&gt;handleResponse&lt;/code&gt; is called upon successful header retrieval and &lt;code&gt;handleError&lt;/code&gt; is called if there is an error in retrieving the headers.
&lt;/p&gt;
&lt;p&gt;
&lt;code&gt;handleResponse&lt;/code&gt; is given a &lt;a href="http://twistedmatrix.com/documents/current/api/twisted.web.client.Response.html"&gt;&lt;code&gt;Response&lt;/code&gt;&lt;/a&gt; object which contains the HTTP header and includes a method, &lt;code&gt;deliverBody&lt;/code&gt;, to specify a &lt;a href="http://twistedmatrix.com/documents/current/api/twisted.internet.interfaces.IProtocol.html"&gt;&lt;code&gt;Protocol&lt;/code&gt;&lt;/a&gt; to handle delivery of the HTTP body.  A &lt;code&gt;Protocol&lt;/code&gt; is used for body delivery because it may come in chunks and an error may occur in the middle of delivery (e.g. someone pulls your network plug).  &lt;code&gt;PrinterClient&lt;/code&gt; is a very simple &lt;code&gt;Protocol&lt;/code&gt; which (1) prints received data, (2) logs the reason for termination (if not &lt;a href="http://twistedmatrix.com/documents/current/api/twisted.web.client.ResponseDone.html"&gt;&lt;code&gt;twisted.web.client.ResponseDone&lt;/code&gt;&lt;/a&gt;, there was an error), and (3) fires a deferred &lt;code&gt;whenFinished&lt;/code&gt;.
&lt;/p&gt;
&lt;p&gt;
The trickiest part of this code is following the &lt;code&gt;Deferred&lt;/code&gt; chain, which is essential to understanding how we limit the maximum number of outstanding requests.  A key point to understand about &lt;code&gt;Deferred&lt;/code&gt;s is that, if a callback returns a &lt;code&gt;Deferred&lt;/code&gt;, the parent &lt;code&gt;Deferred&lt;/code&gt; waits for the child &lt;code&gt;Deferred&lt;/code&gt; to fire before handing a value to the next &lt;code&gt;Deferred&lt;/code&gt; in the chain.  See documentation on &lt;a href="http://twistedmatrix.com/documents/current/core/howto/defer.html#auto12"&gt;Chaining Deferreds&lt;/a&gt;.  Because of this, each &lt;code&gt;semaphore.run&lt;/code&gt; waits for the &lt;code&gt;PrinterClient&lt;/code&gt; protocol to complete before releasing its semaphore.  The &lt;a href="http://twistedmatrix.com/documents/current/api/twisted.internet.defer.DeferredSemaphore.html"&gt;&lt;code&gt;DeferredSemaphore&lt;/code&gt;&lt;/a&gt; is basically a &lt;code&gt;Deferred&lt;/code&gt;-aware semaphore.  Its only argument is the number of tokens it allows to be "checked-out" simultaneously.  When we make the &lt;tt&gt;nytimes.com&lt;/tt&gt; &lt;code&gt;semaphore.run&lt;/code&gt; call, the semaphore doesn't call &lt;code&gt;getPage&lt;/code&gt; until one of the other requests has completed.
&lt;/p&gt;
&lt;p&gt;
The &lt;a href="http://twistedmatrix.com/documents/current/api/twisted.internet.defer.DeferredList.html"&gt;&lt;code&gt;DeferredList&lt;/code&gt;&lt;/a&gt; is used to clean up after all requests have completed.  Under normal circumstances, we just want to stop the reactor so our process will exit.  But, if there is an error, we want to see what happened, hence we use &lt;code&gt;handleError&lt;/code&gt; in that case.
&lt;/p&gt;
&lt;p&gt;
&lt;b&gt;Update 9/13/11:&lt;/b&gt; Minor code formatting change.
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5663931432891696372-1947041560595922158?l=pythonquirks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pythonquirks.blogspot.com/feeds/1947041560595922158/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pythonquirks.blogspot.com/2011/04/twisted-asynchronous-http-request.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/1947041560595922158'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/1947041560595922158'/><link rel='alternate' type='text/html' href='http://pythonquirks.blogspot.com/2011/04/twisted-asynchronous-http-request.html' title='Twisted: Asynchronous HTTP Request'/><author><name>Jason</name><uri>http://www.blogger.com/profile/00489496856755184870</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://4.bp.blogspot.com/-ICrk1TFfN4s/TimsDKoczDI/AAAAAAAAAcM/6Au4EVw5h1M/s220/jrennie-defense-crop.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5663931432891696372.post-1818962188406160125</id><published>2011-03-09T14:18:00.004-05:00</published><updated>2011-03-09T16:26:00.055-05:00</updated><title type='text'>Twisted: Beware: Returning a Value from dataReceived</title><content type='html'>&lt;p&gt;
We just lost approximately 10 man-hours to undocumented behavior of Twisted.  If you return a &lt;a href="http://docs.python.org/library/stdtypes.html#truth-value-testing"&gt;true value&lt;/a&gt; from your &lt;a href="http://twistedmatrix.com/documents/current/api/twisted.internet.protocol.Protocol.html#dataReceived"&gt;dataReceived&lt;/a&gt; function (after it is called by the reactor), the reactor will destroy your protocol, and close the corresponding connection.  Fortunately, this behavior is &lt;a href="http://twistedmatrix.com/trac/ticket/2491"&gt;recognized as a bug&lt;/a&gt; and a deprecation warning will likely be issued for this behavior in the 11.0 release.  But, since many of us are stuck with Twisted 10.2 or earlier for months, if not years, to come, it's good to be aware of this issue.
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5663931432891696372-1818962188406160125?l=pythonquirks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pythonquirks.blogspot.com/feeds/1818962188406160125/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pythonquirks.blogspot.com/2011/03/twisted-beware-returning-value-from.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/1818962188406160125'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/1818962188406160125'/><link rel='alternate' type='text/html' href='http://pythonquirks.blogspot.com/2011/03/twisted-beware-returning-value-from.html' title='Twisted: Beware: Returning a Value from dataReceived'/><author><name>Jason</name><uri>http://www.blogger.com/profile/00489496856755184870</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://4.bp.blogspot.com/-ICrk1TFfN4s/TimsDKoczDI/AAAAAAAAAcM/6Au4EVw5h1M/s220/jrennie-defense-crop.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5663931432891696372.post-7473372415473110693</id><published>2011-02-25T15:55:00.001-05:00</published><updated>2011-02-25T15:55:56.746-05:00</updated><title type='text'>If you Misspell "protocol", this is what you get</title><content type='html'>&lt;p&gt;
&lt;code&gt;&lt;pre&gt;
Traceback (most recent call last):
  File "/usr/lib/python2.5/site-packages/twisted/python/log.py", line 51, in callWithLogger
    return callWithContext({"system": lp}, func, *args, **kw)
  File "/usr/lib/python2.5/site-packages/twisted/python/log.py", line 36, in callWithContext
    return context.call({ILogContext: newCtx}, func, *args, **kw)
  File "/usr/lib/python2.5/site-packages/twisted/python/context.py", line 59, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/usr/lib/python2.5/site-packages/twisted/python/context.py", line 37, in callWithContext
    return func(*args,**kw)
--- &amp;lt;exception caught here&amp;gt; ---
  File "/usr/lib/python2.5/site-packages/twisted/internet/selectreactor.py", line 146, in _doReadOrWrite
    why = getattr(selectable, method)()
  File "/usr/lib/python2.5/site-packages/twisted/internet/tcp.py", line 563, in doConnect
    self._connectDone()
  File "/usr/lib/python2.5/site-packages/twisted/internet/tcp.py", line 566, in _connectDone
    self.protocol = self.connector.buildProtocol(self.getPeer())
  File "/usr/lib/python2.5/site-packages/twisted/internet/base.py", line 930, in buildProtocol
    return self.factory.buildProtocol(addr)
  File "/usr/lib/python2.5/site-packages/twisted/internet/protocol.py", line 98, in buildProtocol
    p = self.protocol()
&lt;/pre&gt;&lt;/code&gt;
This is worth remembering.  Note that nothing here refers to the corresponding factory or the code where the error was made.  I think this is one reason &lt;a href="http://twistedmatrix.com/"&gt;Twisted&lt;/a&gt; can be frustrating.
&lt;/p&gt;
&lt;p&gt;
You can see this error with a simple example:
&lt;code&gt;&lt;pre&gt;
from twisted.internet import protocol
from twisted.internet import reactor
class MyProtocol(protocol.Protocol):
    pass
class MyFactory(protocol.ReconnectingClientFactory):
    protcol = MyProtocol
reactor.connectTCP('google.com', 80, MyFactory())
reactor.run()
&lt;/pre&gt;&lt;/code&gt;
It's certainly convenient to be able to set the protocol so simply, but it's disappointing that the error isn't caught at the source.  I wonder why the factories don't have an &lt;code&gt;__init__&lt;/code&gt; method that checks for a valid &lt;code&gt;protocol&lt;/code&gt; field?
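Here is a minimal sketch of what such a check could look like.  To keep it runnable without Twisted, &lt;code&gt;FakeFactory&lt;/code&gt; is a hypothetical stand-in for a Twisted factory base class, on the assumption that the real base class defaults &lt;code&gt;protocol&lt;/code&gt; to &lt;code&gt;None&lt;/code&gt;; &lt;code&gt;CheckedFactory&lt;/code&gt; and the error message are likewise invented for illustration and are not part of Twisted.

```python
class FakeFactory(object):
    # Hypothetical stand-in for a Twisted factory base class; assumes the
    # real base class leaves protocol as None until a subclass sets it.
    protocol = None

class CheckedFactory(FakeFactory):
    # Hypothetical base class that validates protocol at construction time.
    def __init__(self):
        if not callable(self.protocol):
            raise TypeError(
                "%s.protocol is not set; is the attribute name misspelled?"
                % (self.__class__.__name__,))

class MyProtocol(object):
    pass

class GoodFactory(CheckedFactory):
    protocol = MyProtocol

class BadFactory(CheckedFactory):
    protcol = MyProtocol   # deliberate misspelling, as in the example above

GoodFactory()              # passes the check

try:
    BadFactory()
    error = None
except TypeError as e:
    error = str(e)         # names the offending factory class
```

With a check like this, the failure would surface where the factory is instantiated, and the message would name the factory class, rather than appearing deep inside the reactor.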
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5663931432891696372-7473372415473110693?l=pythonquirks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pythonquirks.blogspot.com/feeds/7473372415473110693/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pythonquirks.blogspot.com/2011/02/if-you-misspell-protocol-this-is-what.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/7473372415473110693'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/7473372415473110693'/><link rel='alternate' type='text/html' href='http://pythonquirks.blogspot.com/2011/02/if-you-misspell-protocol-this-is-what.html' title='If you Misspell &quot;protocol&quot;, this is what you get'/><author><name>Jason</name><uri>http://www.blogger.com/profile/00489496856755184870</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://4.bp.blogspot.com/-ICrk1TFfN4s/TimsDKoczDI/AAAAAAAAAcM/6Au4EVw5h1M/s220/jrennie-defense-crop.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5663931432891696372.post-3813140960308001586</id><published>2011-01-26T11:39:00.001-05:00</published><updated>2011-01-26T11:41:57.461-05:00</updated><title type='text'>Using exceptions for goto</title><content type='html'>&lt;p&gt;
Goto is a shunned construct in modern programming languages, but occasionally there is a case where it makes sense, such as breaking-out from a set of nested &lt;code&gt;for&lt;/code&gt; statements when a solution is found.  But, modern programming languages contain a better construct for such cases---exceptions.  For example, say we are trying to find an item from each of three sets which jointly satisfy some criterion.  A naive implementation might look like:
&lt;code&gt;&lt;pre&gt;
foundMatch = False
for item1 in set1:
    for item2 in set2:
        for item3 in set3:
            if satisfiesCriterion(item1, item2, item3):
                foundMatch = True
                break
        if foundMatch:
            break
    if foundMatch:
        break
&lt;/pre&gt;&lt;/code&gt;
But, this code can be simplified by introducing an exception:
&lt;code&gt;&lt;pre&gt;
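# FoundMatch is a user-defined exception, used only to exit the loops
class FoundMatch(Exception):
    pass

try: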
    for item1 in set1:
        for item2 in set2:
            for item3 in set3:
                if satisfiesCriterion(item1, item2, item3):
                    raise FoundMatch()
except FoundMatch:
    pass
&lt;/pre&gt;&lt;/code&gt;
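A slightly fuller sketch of the same idea lets the exception carry the matching items out of the loops.  The criterion used here (three items summing to 10) and the &lt;code&gt;findMatch&lt;/code&gt; wrapper are hypothetical, chosen only so the example can be run as-is; lists are used instead of sets so the iteration order is predictable.

```python
class FoundMatch(Exception):
    """Raised to leave all three loops at once; carries the matching items."""
    def __init__(self, items):
        Exception.__init__(self, items)
        self.items = items

def satisfiesCriterion(item1, item2, item3):
    # Hypothetical criterion for illustration: the items sum to 10.
    return item1 + item2 + item3 == 10

def findMatch(set1, set2, set3):
    try:
        for item1 in set1:
            for item2 in set2:
                for item3 in set3:
                    if satisfiesCriterion(item1, item2, item3):
                        raise FoundMatch((item1, item2, item3))
    except FoundMatch as e:
        return e.items
    return None   # no combination satisfied the criterion

match = findMatch([1, 2], [3, 4], [5, 6])
noMatch = findMatch([1], [1], [1])
```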
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5663931432891696372-3813140960308001586?l=pythonquirks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pythonquirks.blogspot.com/feeds/3813140960308001586/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pythonquirks.blogspot.com/2011/01/using-exceptions-for-goto.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/3813140960308001586'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/3813140960308001586'/><link rel='alternate' type='text/html' href='http://pythonquirks.blogspot.com/2011/01/using-exceptions-for-goto.html' title='Using exceptions for goto'/><author><name>Jason</name><uri>http://www.blogger.com/profile/00489496856755184870</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://4.bp.blogspot.com/-ICrk1TFfN4s/TimsDKoczDI/AAAAAAAAAcM/6Au4EVw5h1M/s220/jrennie-defense-crop.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5663931432891696372.post-2108149138072281831</id><published>2011-01-20T10:00:00.003-05:00</published><updated>2011-01-20T10:16:53.166-05:00</updated><title type='text'>Twisted Documentation</title><content type='html'>&lt;p&gt;
There is currently much &lt;a href="http://permalink.gmane.org/gmane.comp.python.twisted/22102"&gt;discussion on the twisted mailing list about improving twisted documentation&lt;/a&gt;.  I'm one of many who think the documentation could be improved.  I found a major problem to be a lack of introduction to the twisted mental model---the fact that it uses cooperative timesharing and blocking calls to handle events.
&lt;/p&gt;
&lt;p&gt;
Victor Norman suggested &lt;a href="http://krondo.com/blog/?page_id=1327"&gt;Dave Peticolas' Twisted Introduction&lt;/a&gt;.  Reading &lt;a href="http://krondo.com/blog/?p=1209"&gt;the first article which explains the Twisted "mental model"&lt;/a&gt; felt like a breath of fresh air.  I disagree with his use of &lt;a href="http://en.wikipedia.org/wiki/Asynchrony"&gt;asynchronous&lt;/a&gt;, which implies parallel, non-blocking, etc.  But, starting with the mental model is definitely the right approach.  Now, if only this documentation could be integrated with the main documentation...
&lt;/p&gt;
&lt;p&gt;
P.S. Dave Peticolas---I've heard that name before.  Sure enough, he worked on &lt;a href="http://gnucash.org/"&gt;GnuCash&lt;/a&gt;, my accounting program of choice.
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5663931432891696372-2108149138072281831?l=pythonquirks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pythonquirks.blogspot.com/feeds/2108149138072281831/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pythonquirks.blogspot.com/2011/01/twisted-documentation.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/2108149138072281831'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/2108149138072281831'/><link rel='alternate' type='text/html' href='http://pythonquirks.blogspot.com/2011/01/twisted-documentation.html' title='Twisted Documentation'/><author><name>Jason</name><uri>http://www.blogger.com/profile/00489496856755184870</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://4.bp.blogspot.com/-ICrk1TFfN4s/TimsDKoczDI/AAAAAAAAAcM/6Au4EVw5h1M/s220/jrennie-defense-crop.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5663931432891696372.post-7717493060166535778</id><published>2011-01-12T16:01:00.001-05:00</published><updated>2011-01-12T21:47:16.648-05:00</updated><title type='text'>Twisted: callWhenRunning, callFromThread or callLater?</title><content type='html'>&lt;p&gt;
When I first learned of &lt;a href="http://twistedmatrix.com/documents/current/api/twisted.internet.base.ReactorBase.html#callWhenRunning"&gt;&lt;code&gt;reactor.callWhenRunning&lt;/code&gt;&lt;/a&gt;, I apparently didn't read the documentation and/or source code sufficiently carefully.  I correctly understood that it was the function to use when you wanted to queue a function to be called immediately after reactor start.  My mistake was to believe that it queued the function if the reactor had already been started.  In fact, if the reactor is in the "running" state, it simply calls the specified function.  I wonder if part of the reason for this design is how it handles the not-running case.  If the reactor is not running, &lt;code&gt;callWhenRunning&lt;/code&gt; adds a startup trigger for the specified function.  Such a trigger cannot be used to queue-up a task/call.
&lt;/p&gt;
&lt;p&gt;
I learned (the hard way) of the need for &lt;a href="http://twistedmatrix.com/documents/current/api/twisted.internet.interfaces.IReactorThreads.html#callFromThread"&gt;&lt;code&gt;callFromThread&lt;/code&gt;&lt;/a&gt; when trying to run a web server and twisted reactor in separate threads of the same process ("don't try this at home").  Jean-Paul's &lt;a href="http://twistedmatrix.com/pipermail/twisted-python/2011-January/023314.html"&gt;answer to my question about &lt;code&gt;reactor.wakeUp&lt;/code&gt;&lt;/a&gt; provides the reason for this requirement.  The reactor must make blocking calls (e.g. &lt;code&gt;select()&lt;/code&gt;) for certain functionality (e.g. networking).  The &lt;code&gt;wakeUp&lt;/code&gt; trips the blocking call by, e.g., "writ[ing] a byte to a pipe the reactor is select()ing (etc) on".  In my case, I found that an attempt by the web server code to write to the network might be ignored indefinitely unless the call was wrapped with &lt;code&gt;callFromThread&lt;/code&gt;.  What does &lt;code&gt;callFromThread&lt;/code&gt; do?  It adds the function to the &lt;a href="http://twistedmatrix.com/trac/browser/tags/releases/twisted-10.2.0/twisted/internet/base.py#L751"&gt;&lt;code&gt;threadCallQueue&lt;/code&gt;&lt;/a&gt; and &lt;a href="http://twistedmatrix.com/trac/browser/tags/releases/twisted-10.2.0/twisted/internet/iocpreactor/reactor.py#L133"&gt;"wakes up"&lt;/a&gt; the reactor.  Unlike &lt;code&gt;callWhenRunning&lt;/code&gt; the specified function call isn't made until after &lt;code&gt;callFromThread&lt;/code&gt; returns, so it can be used to queue-up a function for running when the reactor (re-)gains control.
&lt;/p&gt;
&lt;p&gt;
If you read the &lt;a href="http://twistedmatrix.com/documents/current/api/twisted.internet.interfaces.IReactorThreads.html#callFromThread"&gt;&lt;code&gt;callFromThread&lt;/code&gt;&lt;/a&gt; documentation, you'll find that &lt;a href="http://twistedmatrix.com/documents/current/api/twisted.internet.interfaces.IReactorTime.html#callLater"&gt;&lt;code&gt;callLater&lt;/code&gt;&lt;/a&gt; is the recommended way (with delay=0) to queue a function for calling in the next &lt;code&gt;mainLoop&lt;/code&gt; iteration.  Like &lt;code&gt;callFromThread&lt;/code&gt;, &lt;code&gt;callLater&lt;/code&gt; uses a queue(s) to manage the calls.  Two queues are kept: one for calls which haven't waited long enough (&lt;code&gt;_newTimedCalls&lt;/code&gt;), and one for calls which have waited long enough, but haven't been called yet (&lt;code&gt;_pendingTimedCalls&lt;/code&gt;).  The &lt;code&gt;_pendingTimedCalls&lt;/code&gt; are called during the next &lt;a href="http://twistedmatrix.com/trac/browser/tags/releases/twisted-10.2.0/twisted/internet/base.py#L1161"&gt;&lt;code&gt;mainLoop&lt;/code&gt;&lt;/a&gt; iteration. 
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5663931432891696372-7717493060166535778?l=pythonquirks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pythonquirks.blogspot.com/feeds/7717493060166535778/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pythonquirks.blogspot.com/2011/01/twisted-callwhenrunning-callfromthread.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/7717493060166535778'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/7717493060166535778'/><link rel='alternate' type='text/html' href='http://pythonquirks.blogspot.com/2011/01/twisted-callwhenrunning-callfromthread.html' title='Twisted: callWhenRunning, callFromThread or callLater?'/><author><name>Jason</name><uri>http://www.blogger.com/profile/00489496856755184870</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://4.bp.blogspot.com/-ICrk1TFfN4s/TimsDKoczDI/AAAAAAAAAcM/6Au4EVw5h1M/s220/jrennie-defense-crop.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5663931432891696372.post-1017247706279741524</id><published>2010-12-01T15:21:00.005-05:00</published><updated>2010-12-01T17:26:08.815-05:00</updated><title type='text'>Half-closing a TCP connection in Twisted</title><content type='html'>&lt;p&gt;
&lt;code&gt;loseWriteConnection&lt;/code&gt; is the function I had been looking for all day.  In retrospect, it was obvious---just look at the &lt;a href="http://twistedmatrix.com/documents/10.2.0/api/twisted.internet.interfaces.ITCPTransport.html"&gt;ITCPTransport&lt;/a&gt; manual page.  But, at first I didn't know what I was looking for---I was just confused as to why &lt;a href="http://en.wikipedia.org/wiki/Netcat"&gt;netcat&lt;/a&gt; wasn't working as expected.
&lt;/p&gt;
&lt;p&gt;
I was trying to get server status information which required sending a simple command to the server.  When I used a custom netcat-like utility, it worked, but when I used netcat or python/&lt;a href="http://twistedmatrix.com"&gt;twisted&lt;/a&gt;, it didn't.  At first, I thought the special utility might have been sending an extra EOF-like character, but some testing eliminated that possibility.  Then, I thought it might be a line-feed issue.  Nope.  Finally, I realized the problem---netcat and python/twisted weren't half-closing the write connection after sending the command.  How did I come to this conclusion?  I tried the netcat &lt;code&gt;-q&lt;/code&gt; option and immediately got back the server status information (before the specified timeout).
&lt;/p&gt;
&lt;p&gt;
Earlier, I had &lt;i&gt;tried&lt;/i&gt; to (half-)close the connection with python/twisted using &lt;code&gt;ITransport.loseConnection&lt;/code&gt;.  But, after fully realizing the half-close issue and making additional &lt;code&gt;loseConnection&lt;/code&gt; attempts, I concluded that &lt;code&gt;loseConnection&lt;/code&gt; fully closes the connection, losing the response.  Next, I found &lt;a href="http://twistedmatrix.com/documents/10.1.0/api/twisted.internet.tcp.Connection.html#_closeWriteConnection"&gt;_closeWriteConnection&lt;/a&gt; which &lt;i&gt;sounded&lt;/i&gt; like it would do exactly what I wanted.  The &lt;a href="http://twistedmatrix.com/trac/browser/tags/releases/twisted-10.2.0/twisted/internet/tcp.py#L484"&gt;source&lt;/a&gt; even looked like it would work, but for whatever reason it didn't.  Finally, I was clued-into &lt;a href="http://twistedmatrix.com/documents/10.2.0/api/twisted.internet.abstract.FileDescriptor.html#loseWriteConnection"&gt;loseWriteConnection&lt;/a&gt; which closed the write-side of the connection while still allowing reading of the server response.
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5663931432891696372-1017247706279741524?l=pythonquirks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pythonquirks.blogspot.com/feeds/1017247706279741524/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pythonquirks.blogspot.com/2010/12/losewriteconnection.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/1017247706279741524'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/1017247706279741524'/><link rel='alternate' type='text/html' href='http://pythonquirks.blogspot.com/2010/12/losewriteconnection.html' title='Half-closing a TCP connection in Twisted'/><author><name>Jason</name><uri>http://www.blogger.com/profile/00489496856755184870</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://4.bp.blogspot.com/-ICrk1TFfN4s/TimsDKoczDI/AAAAAAAAAcM/6Au4EVw5h1M/s220/jrennie-defense-crop.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5663931432891696372.post-7635953852253384108</id><published>2010-09-24T08:42:00.000-04:00</published><updated>2010-09-24T08:42:51.580-04:00</updated><title type='text'>Running Tests</title><content type='html'>&lt;p&gt;
For a python project I worked on, we used the standard python &lt;a href="http://docs.python.org/library/unittest.html"&gt;unittest&lt;/a&gt; module and placed test classes within an &lt;code&gt;if __name__=='__main__':&lt;/code&gt; block at the bottom of each module.  This makes tests easy to run and has the advantage of keeping the testing code close to the source code.  But, as I've learned, there's a better way to do it.
&lt;/p&gt;
&lt;p&gt;
The major drawback of the above framework is a lack of control over tests.  One cannot selectively run tests from within a module nor can test results be compiled in a nice way (since screen-scraping is the only option).  I've since learned about &lt;a href="http://code.google.com/p/python-nose/"&gt;nose&lt;/a&gt;, which is a "test runner."  Instead of wrapping unit test classes in an &lt;code&gt;if __name__=='__main__':&lt;/code&gt; block, you simply place test classes somewhere in your source code hierarchy.  Options include alongside the module code, or in "test" files within a "test" directory.  To run tests, you simply run &lt;code&gt;nosetests&lt;/code&gt; with arguments specifying what tests you want to run.  This could be the root directory of your source code tree, or a list of python module names.  Further refinement of which tests to run can be had by using &lt;a href="http://somethingaboutorange.com/mrl/projects/nose/0.11.2/plugins/attrib.html"&gt;nose attributes&lt;/a&gt;.
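To make the "control over tests" point concrete, here is a minimal sketch that uses only the standard &lt;code&gt;unittest&lt;/code&gt; module rather than nose itself.  &lt;code&gt;ArithmeticTest&lt;/code&gt; is a hypothetical test class; the point is that a test class left outside any &lt;code&gt;__main__&lt;/code&gt; block can be picked up by a runner, a single test can be selected by name, and the outcome is available as data instead of screen output.

```python
import io
import unittest

class ArithmeticTest(unittest.TestCase):
    # Hypothetical test class; note there is no __main__ block around it.
    def test_add(self):
        self.assertEqual(1 + 1, 2)

    def test_multiply(self):
        self.assertEqual(2 * 3, 6)

# Select one test by name instead of running everything in the module.
suite = unittest.TestSuite([ArithmeticTest('test_add')])

# Send the textual report to a buffer; the result object holds the data.
runner = unittest.TextTestRunner(stream=io.StringIO())
result = runner.run(suite)
summary = (result.testsRun, len(result.failures), len(result.errors))
```

A test runner such as nose layers discovery and command-line selection on top of this kind of programmatic access.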
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5663931432891696372-7635953852253384108?l=pythonquirks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pythonquirks.blogspot.com/feeds/7635953852253384108/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pythonquirks.blogspot.com/2010/09/running-tests.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/7635953852253384108'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/7635953852253384108'/><link rel='alternate' type='text/html' href='http://pythonquirks.blogspot.com/2010/09/running-tests.html' title='Running Tests'/><author><name>Jason</name><uri>http://www.blogger.com/profile/00489496856755184870</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://4.bp.blogspot.com/-ICrk1TFfN4s/TimsDKoczDI/AAAAAAAAAcM/6Au4EVw5h1M/s220/jrennie-defense-crop.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5663931432891696372.post-2320792922042798006</id><published>2010-07-13T08:17:00.008-04:00</published><updated>2010-09-23T08:29:13.821-04:00</updated><title type='text'>An absolutely relative import</title><content type='html'>&lt;p&gt;Part of the "What's New" documentation for python 2.5 describes &lt;a href="http://docs.python.org/whatsnew/2.5.html#pep-328-absolute-and-relative-imports"&gt;how to make use of absolute imports&lt;/a&gt;.  After reading this, you might find the following example confusing.  I sure was confused after trying it.
&lt;/p&gt;&lt;p&gt;Create &lt;tt&gt;string.py&lt;/tt&gt;:
&lt;code&gt;&lt;pre&gt;
import string
a = 1
&lt;/pre&gt;&lt;/code&gt;
Create &lt;tt&gt;main.py&lt;/tt&gt;:
&lt;code&gt;&lt;pre&gt;
from __future__ import absolute_import
import string
print string.a
&lt;/pre&gt;&lt;/code&gt;
Both scripts should be placed in the same directory.  Run &lt;tt&gt;main.py&lt;/tt&gt;:
&lt;code&gt;&lt;pre&gt;
$ python main.py
&lt;/pre&gt;&lt;/code&gt;
You'll see &lt;tt&gt;main.py&lt;/tt&gt; print "1", the value set by &lt;tt&gt;string.py&lt;/tt&gt;.  A reading of the python documentation might lead you to believe that this behavior is incorrect---it should instead import the standard library string module and raise an &lt;tt&gt;AttributeError&lt;/tt&gt;.  This interpretation is correct except that, by default, python includes the script directory in the list of "absolute" import paths.  So, the easy fix is to delete this entry, which conveniently is always found at the beginning of &lt;tt&gt;sys.path&lt;/tt&gt;.  The revised &lt;tt&gt;main.py&lt;/tt&gt; is:
&lt;code&gt;&lt;pre&gt;
from __future__ import absolute_import
import sys
sys.path = sys.path[1:]
import string
print string.a
&lt;/pre&gt;&lt;/code&gt;
&lt;/p&gt;&lt;p&gt;I appreciate that python has moved to a cleaner import system.  But, leaving the script/current directory in the list of "absolute" import paths seems like a huge oversight.
&lt;/p&gt;&lt;p&gt;What's especially ridiculous about the default behavior is that if you have a module with the same name as a standard library module, import the standard library module, and include unittests at the bottom, the unittests won't work because the import will behave differently depending on whether the module is imported or run as a script.  This is the problem that initially brought me down this path...
&lt;/p&gt;&lt;p&gt;&lt;b&gt;Update 9/23&lt;/b&gt;: After talking with different people about this issue, I've learned that it's easy to think that &lt;code&gt;sys.path.remove('.')&lt;/code&gt; is the right thing to do here.  It's not.  The default local path inserted by python may be a full path or an empty string in which case &lt;code&gt;sys.path.remove('.')&lt;/code&gt; won't fix the problem.  Trying to remove all local directory entries is also incorrect since the user may genuinely want to include the local directory in the search path.
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5663931432891696372-2320792922042798006?l=pythonquirks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pythonquirks.blogspot.com/feeds/2320792922042798006/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pythonquirks.blogspot.com/2010/07/absolutely-relative-import.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/2320792922042798006'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/2320792922042798006'/><link rel='alternate' type='text/html' href='http://pythonquirks.blogspot.com/2010/07/absolutely-relative-import.html' title='An absolutely relative import'/><author><name>Jason</name><uri>http://www.blogger.com/profile/00489496856755184870</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://4.bp.blogspot.com/-ICrk1TFfN4s/TimsDKoczDI/AAAAAAAAAcM/6Au4EVw5h1M/s220/jrennie-defense-crop.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5663931432891696372.post-7361628484162815826</id><published>2010-07-07T18:37:00.000-04:00</published><updated>2010-07-07T18:37:51.929-04:00</updated><title type='text'>jsonlib</title><content type='html'>&lt;p&gt;For a project I worked on at ITA, we decided to use &lt;a href="http://docs.python.org/library/pickle.html"&gt;pickle&lt;/a&gt; for internal object serialization/communication.  Pickle certainly makes coding simple, but I've occasionally wondered whether we made the best choice.  
I found &lt;a href="http://metaoptimize.com/blog/2009/03/22/fast-deserialization-in-python/"&gt;this article comparing deserialization libraries&lt;/a&gt; to be interesting.  It sounds like the two main competing camps are &lt;a href="http://www.json.org/"&gt;json&lt;/a&gt; and &lt;a href="http://code.google.com/p/protobuf/"&gt;Google's protocol buffers&lt;/a&gt;.  It sounds like protocol buffers is slow (in python) because it is pure python and not optimized for speed.  One json library, &lt;a href="http://pypi.python.org/pypi/jsonlib/"&gt;jsonlib&lt;/a&gt;, sounds like the right way to go as it provides faster speeds and more compact storage than pickle.&lt;br /&gt;
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5663931432891696372-7361628484162815826?l=pythonquirks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pythonquirks.blogspot.com/feeds/7361628484162815826/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pythonquirks.blogspot.com/2010/07/jsonlib.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/7361628484162815826'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/7361628484162815826'/><link rel='alternate' type='text/html' href='http://pythonquirks.blogspot.com/2010/07/jsonlib.html' title='jsonlib'/><author><name>Jason</name><uri>http://www.blogger.com/profile/00489496856755184870</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://4.bp.blogspot.com/-ICrk1TFfN4s/TimsDKoczDI/AAAAAAAAAcM/6Au4EVw5h1M/s220/jrennie-defense-crop.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5663931432891696372.post-8680331214811442919</id><published>2010-06-08T14:08:00.002-04:00</published><updated>2010-06-08T14:09:51.462-04:00</updated><title type='text'>Returning an exit status with Twisted</title><content type='html'>&lt;p&gt;When I had a need for returning an exit status from a Twisted process, my first instinct was to look for a &lt;code&gt;reactor.stop&lt;/code&gt; argument.  In fact, there have been multiple requests for such, e.g. tickets &lt;a href="http://twistedmatrix.com/trac/ticket/718"&gt;#718&lt;/a&gt; and &lt;a href="http://twistedmatrix.com/trac/ticket/2182"&gt;#2182&lt;/a&gt;.  
But, then, I realized that &lt;code&gt;reactor.stop&lt;/code&gt; &lt;i&gt;doesn't&lt;/i&gt; stop the reactor, it merely initiates the shutdown process.  The reactor is not shut down until &lt;code&gt;reactor.run&lt;/code&gt; exits.  This realization made it clear what I should do to return a specific exit code---simply add&lt;br /&gt;
&lt;pre&gt;    &lt;code&gt;sys.exit(code)&lt;/code&gt;
&lt;/pre&gt;immediately after &lt;code&gt;reactor.run&lt;/code&gt;.&lt;br /&gt;
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5663931432891696372-8680331214811442919?l=pythonquirks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pythonquirks.blogspot.com/feeds/8680331214811442919/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pythonquirks.blogspot.com/2010/06/returning-exit-status-with-twisted.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/8680331214811442919'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/8680331214811442919'/><link rel='alternate' type='text/html' href='http://pythonquirks.blogspot.com/2010/06/returning-exit-status-with-twisted.html' title='Returning an exit status with Twisted'/><author><name>Jason</name><uri>http://www.blogger.com/profile/00489496856755184870</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://4.bp.blogspot.com/-ICrk1TFfN4s/TimsDKoczDI/AAAAAAAAAcM/6Au4EVw5h1M/s220/jrennie-defense-crop.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5663931432891696372.post-5961058802902437976</id><published>2010-03-22T10:30:00.003-04:00</published><updated>2010-03-22T10:40:25.811-04:00</updated><title type='text'>More ElementTree Annoyances</title><content type='html'>&lt;p&gt;
&lt;ul&gt;
&lt;li&gt; Cannot serialize &lt;code&gt;int&lt;/code&gt;.  I can see the value in not automatically serializing every possible object with a &lt;code&gt;__str__&lt;/code&gt; method.  But, not converting an int?  C'mon!
&lt;li&gt; Cannot serialize &lt;code&gt;None&lt;/code&gt;.  Wouldn't &lt;code&gt;None&lt;/code&gt; be the perfect value to indicate "don't serialize this attribute"?
&lt;/ul&gt;
I'm generally a fail-fast-and-loudly kind of guy, but I also don't like having to write more code when it's obvious what I mean.  These seem like two cases where I think the tradeoff is in favor of writing less code...
&lt;/p&gt;
&lt;p&gt;
Examples:
&lt;pre&gt;
&gt;&gt;&gt; import xml.etree.ElementTree as et
&gt;&gt;&gt; et.tostring(et.Element('Foo', attrib={ 'a': 1}))
Traceback (most recent call last):
  File "&lt;stdin&gt;", line 1, in &lt;module&gt;
  File "/usr/lib/python2.5/xml/etree/ElementTree.py", line 1009, in tostring
    ElementTree(element).write(file, encoding)
  File "/usr/lib/python2.5/xml/etree/ElementTree.py", line 663, in write
    self._write(file, self._root, encoding, {})
  File "/usr/lib/python2.5/xml/etree/ElementTree.py", line 698, in _write
    _escape_attrib(v, encoding)))
  File "/usr/lib/python2.5/xml/etree/ElementTree.py", line 830, in _escape_attrib
    _raise_serialization_error(text)
  File "/usr/lib/python2.5/xml/etree/ElementTree.py", line 777, in _raise_serialization_error
    "cannot serialize %r (type %s)" % (text, type(text).__name__)
TypeError: cannot serialize 1 (type int)
&gt;&gt;&gt; et.tostring(et.Element('Foo', attrib={ 'a': None}))
Traceback (most recent call last):
  File "&lt;stdin&gt;", line 1, in &lt;module&gt;
  File "/usr/lib/python2.5/xml/etree/ElementTree.py", line 1009, in tostring
    ElementTree(element).write(file, encoding)
  File "/usr/lib/python2.5/xml/etree/ElementTree.py", line 663, in write
    self._write(file, self._root, encoding, {})
  File "/usr/lib/python2.5/xml/etree/ElementTree.py", line 698, in _write
    _escape_attrib(v, encoding)))
  File "/usr/lib/python2.5/xml/etree/ElementTree.py", line 830, in _escape_attrib
    _raise_serialization_error(text)
  File "/usr/lib/python2.5/xml/etree/ElementTree.py", line 777, in _raise_serialization_error
    "cannot serialize %r (type %s)" % (text, type(text).__name__)
TypeError: cannot serialize None (type NoneType)
&lt;/pre&gt;
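One possible workaround for both errors above is to clean the attribute dictionary before handing it to &lt;code&gt;Element&lt;/code&gt;.  This is only a sketch: &lt;code&gt;make_element&lt;/code&gt; is a hypothetical helper, not part of ElementTree, and treating &lt;code&gt;None&lt;/code&gt; as "skip this attribute" is my assumption about the desired behavior.

```python
import xml.etree.ElementTree as et


def make_element(tag, attrib):
    # Hypothetical helper: drop attributes whose value is None and
    # convert every remaining value to a string before serialization
    clean = dict((k, str(v)) for k, v in attrib.items() if v is not None)
    return et.Element(tag, attrib=clean)


elem = make_element('Foo', {'a': 1, 'b': None})
# tostring returns bytes on Python 3, so decode for a text result
xml_text = et.tostring(elem).decode('ascii')
```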
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5663931432891696372-5961058802902437976?l=pythonquirks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pythonquirks.blogspot.com/feeds/5961058802902437976/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pythonquirks.blogspot.com/2010/03/more-elementtree-annoyances.html#comment-form' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/5961058802902437976'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/5961058802902437976'/><link rel='alternate' type='text/html' href='http://pythonquirks.blogspot.com/2010/03/more-elementtree-annoyances.html' title='More ElementTree Annoyances'/><author><name>Jason</name><uri>http://www.blogger.com/profile/00489496856755184870</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://4.bp.blogspot.com/-ICrk1TFfN4s/TimsDKoczDI/AAAAAAAAAcM/6Au4EVw5h1M/s220/jrennie-defense-crop.jpg'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5663931432891696372.post-1768253777225604925</id><published>2010-02-08T17:10:00.003-05:00</published><updated>2010-02-08T17:14:25.864-05:00</updated><title type='text'>__missing__</title><content type='html'>&lt;p&gt;
According to the &lt;a href="http://docs.python.org/library/stdtypes.html#mapping-types-dict"&gt;python documentation&lt;/a&gt;:
&lt;blockquote&gt;
If a subclass of dict defines a method __missing__(), if the key key is not present, the d[key] operation calls that method with the key key as argument. The d[key] operation then returns or raises whatever is returned or raised by the __missing__(key) call if the key is not present. No other operations or methods invoke __missing__(). If __missing__() is not defined, KeyError is raised. __missing__() must be a method; it cannot be an instance variable. For an example, see collections.defaultdict.
&lt;/blockquote&gt;
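This is, at least, incomplete: the &lt;tt&gt;d[key]&lt;/tt&gt; operation does not store the value that &lt;tt&gt;__missing__&lt;/tt&gt; returns, so to get &lt;tt&gt;defaultdict&lt;/tt&gt;-like behavior, &lt;tt&gt;__missing__&lt;/tt&gt; must not only return the default value, but also assign it internally.  This is made clear in the documentation for &lt;tt&gt;collections.defaultdict&lt;/tt&gt;: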
&lt;blockquote&gt;
If default_factory is not None, it is called without arguments to provide a default value for the given key, this value is inserted in the dictionary for the key, and returned.
&lt;/blockquote&gt;
Surprisingly, the &lt;tt&gt;__missing__&lt;/tt&gt; method is not mentioned in the &lt;a href="http://docs.python.org/reference/datamodel.html#special-method-names"&gt;special method names&lt;/a&gt; section of the python documentation.
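The difference between returning and storing can be seen in a small sketch.  Both class names are made up for illustration, and the default of 0 is arbitrary.

```python
class ReturnOnly(dict):
    # Returns a default but never stores it
    def __missing__(self, key):
        return 0


class StoreAndReturn(dict):
    # Stores the default before returning it, as defaultdict does
    def __missing__(self, key):
        self[key] = 0
        return self[key]


a = ReturnOnly()
b = StoreAndReturn()
a_value = a['x']  # default is returned, dictionary stays empty
b_value = b['x']  # default is returned and 'x' is now a key
```

After these lookups, &lt;tt&gt;'x' in a&lt;/tt&gt; is false while &lt;tt&gt;'x' in b&lt;/tt&gt; is true.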
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5663931432891696372-1768253777225604925?l=pythonquirks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pythonquirks.blogspot.com/feeds/1768253777225604925/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pythonquirks.blogspot.com/2010/02/missing.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/1768253777225604925'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/1768253777225604925'/><link rel='alternate' type='text/html' href='http://pythonquirks.blogspot.com/2010/02/missing.html' title='__missing__'/><author><name>Jason</name><uri>http://www.blogger.com/profile/00489496856755184870</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://4.bp.blogspot.com/-ICrk1TFfN4s/TimsDKoczDI/AAAAAAAAAcM/6Au4EVw5h1M/s220/jrennie-defense-crop.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5663931432891696372.post-9103400279083001302</id><published>2010-02-04T11:05:00.007-05:00</published><updated>2010-02-08T17:15:47.419-05:00</updated><title type='text'>collections.defaultdict</title><content type='html'>&lt;p&gt;
&lt;a href="http://docs.python.org/library/collections.html#collections.defaultdict"&gt;collections.defaultdict&lt;/a&gt; is nice, especially when counting things.  But, &lt;tt&gt;defaultdict&lt;/tt&gt; only lets you use zero-argument constructors.  Pffft!  Fortunately, it's easy to write a &lt;tt&gt;defaultdict&lt;/tt&gt; which passes arguments to the constructor:
&lt;code&gt;
&lt;pre&gt;
class defaultdict2(dict):
    def __init__(self, factory, factArgs=(), dictArgs=()):
        dict.__init__(self, *dictArgs)
        self.factory = factory
        self.factArgs = factArgs
    def __missing__(self, key):
        self[key] = self.factory(*self.factArgs)
        return self[key]
&lt;/pre&gt;
&lt;/code&gt;
&lt;/p&gt;
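&lt;p&gt;
For comparison, here is an alternative that stays with the standard &lt;tt&gt;defaultdict&lt;/tt&gt;.  It is a sketch of a different technique than the class above: &lt;tt&gt;functools.partial&lt;/tt&gt; bundles a constructor and its arguments into a zero-argument callable.  The choice of &lt;tt&gt;list&lt;/tt&gt; and the starting value are arbitrary.
&lt;/p&gt;

```python
from collections import defaultdict
from functools import partial

# partial(list, (0, 0)) is a zero-argument callable that builds [0, 0]
pairs = defaultdict(partial(list, (0, 0)))

pairs['a'][0] += 1  # creates a fresh [0, 0] for 'a', then increments
first = pairs['a']
second = pairs['b']  # each missing key gets its own new list
```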
&lt;p&gt;
&lt;b&gt;Update 2/8/10&lt;/b&gt;: added "return" line to &lt;tt&gt;__missing__&lt;/tt&gt; per discussion in &lt;a href="http://pythonquirks.blogspot.com/2010/02/missing.html"&gt;this post on &lt;tt&gt;__missing__&lt;/tt&gt;&lt;/a&gt;.
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5663931432891696372-9103400279083001302?l=pythonquirks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pythonquirks.blogspot.com/feeds/9103400279083001302/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pythonquirks.blogspot.com/2010/02/collectionsdefaultdict.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/9103400279083001302'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/9103400279083001302'/><link rel='alternate' type='text/html' href='http://pythonquirks.blogspot.com/2010/02/collectionsdefaultdict.html' title='collections.defaultdict'/><author><name>Jason</name><uri>http://www.blogger.com/profile/00489496856755184870</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://4.bp.blogspot.com/-ICrk1TFfN4s/TimsDKoczDI/AAAAAAAAAcM/6Au4EVw5h1M/s220/jrennie-defense-crop.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5663931432891696372.post-401745370312279878</id><published>2010-02-03T15:29:00.007-05:00</published><updated>2010-02-04T11:05:17.420-05:00</updated><title type='text'>Kid Template Recompilation</title><content type='html'>&lt;p&gt;
I'm involved in a project which uses the &lt;a href="http://turbogears.org/"&gt;TurboGears&lt;/a&gt; framework for serving web pages.  The templating language we use is &lt;a href="http://www.kid-templating.org/"&gt;Kid&lt;/a&gt;.  Recently, we ran into a problem where web pages did not correspond to the installed templates.  After a bit of detective work, we suspected that TurboGears/Kid was not using the templates, but rather stale, compiled versions of old templates (&lt;tt&gt;.pyc&lt;/tt&gt; files).  Some &lt;a href="http://thread.gmane.org/gmane.comp.python.kid.general/1527"&gt;Kid mailing list discussion&lt;/a&gt; confirmed our suspicions.  The problem is that Kid only recompiles if the mtime of the source (&lt;tt&gt;.kid&lt;/tt&gt;) file is after the mtime of the corresponding compiled (&lt;tt&gt;.pyc&lt;/tt&gt;) file.  In contrast, &lt;a href="http://docs.python.org/tutorial/modules.html#compiled-python-files"&gt;Python recompiles unless the mtime stored in the &lt;tt&gt;.pyc&lt;/tt&gt; file exactly matches the mtime of the source (&lt;tt&gt;.py&lt;/tt&gt;) file&lt;/a&gt;.
&lt;/p&gt;
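&lt;p&gt;
The two policies described above can be compared in a small sketch.  The function names and the timestamps are hypothetical.  Neither function is taken from Kid or from Python's import machinery; they only restate the comparisons.
&lt;/p&gt;

```python
def kid_style_needs_recompile(source_mtime, compiled_file_mtime):
    # Inequality policy: recompile only when the source looks newer
    return source_mtime > compiled_file_mtime


def python_style_needs_recompile(source_mtime, recorded_source_mtime):
    # Exact-match policy: recompile on any difference at all
    return source_mtime != recorded_source_mtime


# Hypothetical timeline (in seconds): the compiled file was written at
# t=200 from a source whose mtime was 150.  The source is then replaced
# by an older copy that keeps its original mtime of 100 (e.g. rsync -a).
restored_source_mtime = 100
recorded_source_mtime = 150
compiled_file_mtime = 200

kid_recompiles = kid_style_needs_recompile(
    restored_source_mtime, compiled_file_mtime)
python_recompiles = python_style_needs_recompile(
    restored_source_mtime, recorded_source_mtime)
```

In this scenario the inequality check keeps the stale compiled file, while the exact-match check triggers a recompile.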
&lt;p&gt;My understanding is that, ideally, Python would use a &lt;a href="http://en.wikipedia.org/wiki/Cryptographic_hash_function"&gt;one-way hash&lt;/a&gt; of the source and only use the compiled file if there is an exact match.  The exact mtime comparison is, in practice, nearly as good and much, much faster.  But, the mtime inequality comparison is a poor approximation of the ideal and only works when you can guarantee that (1) the system clock is perfect and never changes timezone (e.g. no switch between EDT and EST), and (2) mtimes are always updated to "now" whenever contents or locations are changed (i.e. even "mv" must affect mtime and &lt;tt&gt;rsync -a&lt;/tt&gt; is right out).  I don't know of any &lt;a href="http://en.wikipedia.org/wiki/Operating_system"&gt;OS&lt;/a&gt; which provides these guarantees.  The good news is that &lt;a href="http://thread.gmane.org/gmane.comp.python.kid.general/1527"&gt;there is no disagreement on the existence of the problem&lt;/a&gt;; so, this is likely to be fixed in a future version of &lt;a href="http://www.kid-templating.org/"&gt;Kid&lt;/a&gt;.
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5663931432891696372-401745370312279878?l=pythonquirks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pythonquirks.blogspot.com/feeds/401745370312279878/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pythonquirks.blogspot.com/2010/02/kid-template-recompilation.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/401745370312279878'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/401745370312279878'/><link rel='alternate' type='text/html' href='http://pythonquirks.blogspot.com/2010/02/kid-template-recompilation.html' title='Kid Template Recompilation'/><author><name>Jason</name><uri>http://www.blogger.com/profile/00489496856755184870</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://4.bp.blogspot.com/-ICrk1TFfN4s/TimsDKoczDI/AAAAAAAAAcM/6Au4EVw5h1M/s220/jrennie-defense-crop.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5663931432891696372.post-1799474491464188058</id><published>2010-01-19T13:22:00.002-05:00</published><updated>2010-01-19T13:50:03.666-05:00</updated><title type='text'>numpy.dot</title><content type='html'>&lt;p&gt;
I should have known.  &lt;code&gt;numpy.dot&lt;/code&gt; doesn't work with sparse matrices.  What's worse is that it happily accepts a sparse matrix as an argument and yields some convoluted array of sparse matrices.  What I should be doing is &lt;code&gt;x.dot(y)&lt;/code&gt; where &lt;code&gt;x&lt;/code&gt; is a &lt;code&gt;scipy.sparse.sparse.spmatrix&lt;/code&gt; and &lt;code&gt;y&lt;/code&gt; is a &lt;code&gt;numpy.ndarray&lt;/code&gt;.
&lt;/p&gt;
&lt;p&gt;
Note that I'm using the &lt;a href="http://www.debian.org/"&gt;Debian&lt;/a&gt; &lt;a href="http://www.debian.org/releases/stable/"&gt;stable&lt;/a&gt; versions of these packages: numpy 1.1.0 and scipy 0.6.0.
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5663931432891696372-1799474491464188058?l=pythonquirks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pythonquirks.blogspot.com/feeds/1799474491464188058/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pythonquirks.blogspot.com/2010/01/numpydot.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/1799474491464188058'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/1799474491464188058'/><link rel='alternate' type='text/html' href='http://pythonquirks.blogspot.com/2010/01/numpydot.html' title='numpy.dot'/><author><name>Jason</name><uri>http://www.blogger.com/profile/00489496856755184870</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://4.bp.blogspot.com/-ICrk1TFfN4s/TimsDKoczDI/AAAAAAAAAcM/6Au4EVw5h1M/s220/jrennie-defense-crop.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5663931432891696372.post-258235906556330977</id><published>2010-01-08T11:51:00.005-05:00</published><updated>2010-01-08T13:29:13.294-05:00</updated><title type='text'>urllib2.HTTPErrorProcessor</title><content type='html'>&lt;p&gt;
With code similar to what I posted in &lt;a href="http://pythonquirks.blogspot.com/2009/12/asynchronous-http-request.html"&gt;Asynchronous HTTP Request&lt;/a&gt;, I was occasionally getting empty responses to my requests.  When I added &lt;a href="http://docs.python.org/library/urllib2.html#httperrorprocessor-objects"&gt;urllib2.HTTPErrorProcessor&lt;/a&gt; to the inheritance list for MyHandler, the problem went away.  My guess is the server was generating &lt;a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html"&gt;503 Service Unavailable&lt;/a&gt; responses and my client code wasn't handling them.  How one was supposed to know to do this from the documentation, I am unsure.  I'm guessing that if the server might provide a redirect for your url, you would also want to inherit from &lt;a href="http://docs.python.org/library/urllib2.html#urllib2.HTTPRedirectHandler"&gt;urllib2.HTTPRedirectHandler&lt;/a&gt;.
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5663931432891696372-258235906556330977?l=pythonquirks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pythonquirks.blogspot.com/feeds/258235906556330977/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pythonquirks.blogspot.com/2010/01/urllib2httperrorprocessor.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/258235906556330977'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/258235906556330977'/><link rel='alternate' type='text/html' href='http://pythonquirks.blogspot.com/2010/01/urllib2httperrorprocessor.html' title='urllib2.HTTPErrorProcessor'/><author><name>Jason</name><uri>http://www.blogger.com/profile/00489496856755184870</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://4.bp.blogspot.com/-ICrk1TFfN4s/TimsDKoczDI/AAAAAAAAAcM/6Au4EVw5h1M/s220/jrennie-defense-crop.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5663931432891696372.post-1023125139537202406</id><published>2009-12-28T11:58:00.013-05:00</published><updated>2010-01-08T13:30:11.952-05:00</updated><title type='text'>Element.text and other ElementTree Annoyances</title><content type='html'>&lt;p&gt;I have a love/hate relationship with ElementTree.  It generally makes processing and generating XML very easy.  But, some of the design decisions feel like they were meant to frustrate, rather than help, the programmer:
&lt;ul&gt;
&lt;li&gt;Many &lt;tt&gt;__str__&lt;/tt&gt; and &lt;tt&gt;__repr__&lt;/tt&gt; methods return near-useless strings like &lt;tt&gt;&amp;lt;Element ElementName at 7fb1d0f63e60&amp;gt;&lt;/tt&gt;.  Would methods that specify attributes and text/tail properties really be so difficult to define?  Even "&lt;tt&gt;def __str__(self): return tostring(self)&lt;/tt&gt;" would be an improvement.&lt;/li&gt;
&lt;li&gt;&lt;tt&gt;Element()&lt;/tt&gt; cannot specify &lt;tt&gt;text&lt;/tt&gt;.  The &lt;tt&gt;Element&lt;/tt&gt; factory only lets you specify the tag name and attributes.  There is no argument you can pass to specify the &lt;tt&gt;text&lt;/tt&gt; or &lt;tt&gt;tail&lt;/tt&gt;.  See, for example, &lt;a href="http://aspn.activestate.com/ASPN/Mail/Message/python-list/2876699"&gt;a discussion about setting the text property&lt;/a&gt;.  I see no point in forcing the programmer to write the extra line of code.&lt;/li&gt;
&lt;li&gt;Interfaces.  ElementTree hides the Element class and provides a factory which returns an object which implements the _ElementInterface.  There are other languages in which I can see this being a useful practice.  But, python does not have sufficient language support and I find that this half-hearted attempt at abstraction simply makes the module more difficult to use.  Python already provides &lt;b&gt;plenty&lt;/b&gt; of tools to hide "magic" which don't interrupt programmer intuitions.  Why not use those?&lt;/li&gt;
&lt;/ul&gt;
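The extra line needed for &lt;tt&gt;text&lt;/tt&gt; can at least be hidden behind a small wrapper.  This is a sketch only: &lt;tt&gt;element_with_text&lt;/tt&gt; is a hypothetical helper of my own naming, not something ElementTree provides.

```python
import xml.etree.ElementTree as et


def element_with_text(tag, text=None, tail=None, **attrib):
    # Hypothetical helper: create the element, then set text and tail
    elem = et.Element(tag, attrib)
    elem.text = text
    elem.tail = tail
    return elem


greeting = element_with_text('Greeting', text='hello', lang='en')
# tostring returns bytes on Python 3, so decode for a text result
serialized = et.tostring(greeting).decode('ascii')
```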
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5663931432891696372-1023125139537202406?l=pythonquirks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pythonquirks.blogspot.com/feeds/1023125139537202406/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pythonquirks.blogspot.com/2009/12/elementtext.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/1023125139537202406'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/1023125139537202406'/><link rel='alternate' type='text/html' href='http://pythonquirks.blogspot.com/2009/12/elementtext.html' title='Element.text and other ElementTree Annoyances'/><author><name>Jason</name><uri>http://www.blogger.com/profile/00489496856755184870</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://4.bp.blogspot.com/-ICrk1TFfN4s/TimsDKoczDI/AAAAAAAAAcM/6Au4EVw5h1M/s220/jrennie-defense-crop.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5663931432891696372.post-4470948428324873561</id><published>2009-12-23T15:43:00.015-05:00</published><updated>2011-04-02T20:58:29.675-04:00</updated><title type='text'>Asynchronous HTTP Request</title><content type='html'>&lt;p&gt;&lt;i&gt;Note (4/2/11): Please see my recent post detailing &lt;a href="http://pythonquirks.blogspot.com/2011/04/twisted-asynchronous-http-request.html"&gt;asynchronous HTTP requests using Twisted&lt;/a&gt;.&lt;/i&gt;&lt;/p&gt;

&lt;p&gt;&lt;i&gt;Note (3/13/11): I originally wrote this post while looking for callback-style HTTP request functionality in Python.  I made the mistake of thinking that "callback-style" is the same as "asynchronous".  The following details my efforts to achieve a callback-style HTTP request using &lt;a href="http://docs.python.org/library/urllib2.html"&gt;urllib2&lt;/a&gt;.  The final (updated) code example illustrates how to use threads to achieve asynchronicity.  I'd recommend using a &lt;a href="http://pypi.python.org/pypi/threadpool"&gt;thread pool&lt;/a&gt; if you plan to make more than a handful of requests.  And, as others have noted, &lt;a href="http://twistedmatrix.com/"&gt;Twisted&lt;/a&gt; is really the best Python framework for asynchronous programming.  Also, I'd like to thank the commenters for pointing out my mistakes; I'm sorry for not realizing my errors sooner.&lt;/i&gt;&lt;/p&gt;

&lt;p&gt;You might think it would be easy to write Python code to &lt;s&gt;perform an asynchronous&lt;/s&gt; make a callback-style web request.  It ought to be as simple as providing a url and callback function to some Python library routine, no?  Well, technically, it is that simple.  But somehow, the documentation makes the task surprisingly difficult.&lt;/p&gt;

&lt;p&gt;One option, of course, is &lt;a href="http://twistedmatrix.com/"&gt;Twisted&lt;/a&gt;.  But, reading through the (sparse, fractured) documentation made me think there had to be something easier.  This led me to &lt;a href="http://docs.python.org/library/urllib2.html"&gt;urllib2&lt;/a&gt;.  The short answer is that, yes, urllib2 does what I want.  But, the documentation is sufficiently backwards that it took me over an hour to figure out how to accomplish the task.&lt;/p&gt;

&lt;p&gt;Making a &lt;s&gt;blocking&lt;/s&gt; simple HTTP request with urllib2 is easy and the documentation reflects that: use &lt;tt&gt;urlopen&lt;/tt&gt;.  The return value of &lt;tt&gt;urlopen&lt;/tt&gt; provides the response and additional information in a file-like object.  The problem is how to achieve the same result in an &lt;s&gt;asynchronous&lt;/s&gt; callback-style manner.  One would think &lt;tt&gt;urlopen&lt;/tt&gt; could simply take an additional &lt;b&gt;handler&lt;/b&gt; object which is called with the response as its only argument when the request completes.  Ha!  &lt;tt&gt;build_opener&lt;/tt&gt; looked vaguely promising as it accepted handler(s).  This led me to create a class which inherited from &lt;tt&gt;BaseHandler&lt;/tt&gt; and defined &lt;tt&gt;protocol_response&lt;/tt&gt;.  No dice.  And, as I later realized, &lt;tt&gt;protocol_response&lt;/tt&gt; takes three arguments (self, req, response), not two, and changes names depending on the &lt;tt&gt;protocol&lt;/tt&gt; (for HTTP, the method is &lt;tt&gt;http_response&lt;/tt&gt;).  Of course, at that point, I was at a loss as to how the &lt;tt&gt;protocol&lt;/tt&gt; name was determined (the &lt;tt&gt;BaseHandler&lt;/tt&gt; documentation ignored this issue).  And, the examples were useless since they all used standard handlers.  Next, I tried inheriting from &lt;tt&gt;HTTPHandler&lt;/tt&gt;, overriding &lt;tt&gt;http_response&lt;/tt&gt; with a method that simply prints the url, info and response text.  This &lt;i&gt;almost&lt;/i&gt; worked.  It successfully retrieved the web page and printed it.  But, then, it raised the following exception:
&lt;pre&gt;
Traceback (most recent call last):
  File "./webtest.py", line 14, in &amp;lt;module&amp;gt;
    o.open('http://www.google.com/')
  File "/usr/lib/python2.6/urllib2.py", line 389, in open
    response = meth(req, response)
  File "/usr/lib/python2.6/urllib2.py", line 496, in http_response
    code, msg, hdrs = response.code, response.msg, response.info()
AttributeError: 'NoneType' object has no attribute 'code'
&lt;/pre&gt;
After much searching, I finally realized that I had failed to return a response-like object from my &lt;tt&gt;http_response&lt;/tt&gt; method.  This seems like an odd requirement for a callback method.  And, it could have been easily clarified in the documentation with an example.
&lt;/p&gt;
&lt;p&gt;
Still, after all that, I was able to use &lt;tt&gt;urllib2&lt;/tt&gt; to successfully make an asynchronous HTTP request, so I can't complain too much.  Here's the code for anyone who's interested:
&lt;pre&gt;
#!/usr/bin/env python

import urllib2
import threading

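class MyHandler(urllib2.HTTPHandler):
    # Called by the opener once the HTTP response has arrived.
    def http_response(self, req, response):
        print "url: %s" % (response.geturl(),)
        print "info: %s" % (response.info(),)
        for l in response:
            print l
        # urllib2 keeps processing the return value, so a response-like
        # object must be returned (see the traceback above).
        return response

o = urllib2.build_opener(MyHandler())
# Run the blocking open() call in its own thread so this script carries on.
t = threading.Thread(target=o.open, args=('http://www.google.com/',))
t.start()
print "I'm asynchronous!"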
&lt;/pre&gt;
&lt;/p&gt;
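&lt;p&gt;
The callback-style interface described above (a url plus a callback function) can be sketched in a few lines on top of &lt;tt&gt;threading&lt;/tt&gt;.  This is only a rough sketch and not part of urllib2: &lt;tt&gt;request_with_callback&lt;/tt&gt; and &lt;tt&gt;fake_fetch&lt;/tt&gt; are made-up names, and &lt;tt&gt;fetch&lt;/tt&gt; is assumed to be any blocking function that takes a url and returns the response body (for example, a small wrapper around &lt;tt&gt;urllib2.urlopen&lt;/tt&gt;).
&lt;pre&gt;
import threading

def request_with_callback(fetch, url, callback):
    """Run the blocking fetch(url) in a background thread and hand the
    result to callback(url, result).  Returns the started thread."""
    def worker():
        callback(url, fetch(url))
    t = threading.Thread(target=worker)
    t.start()
    return t

# Stand-in for a real blocking request (hypothetical; no network involved).
def fake_fetch(url):
    return "response for %s" % (url,)

results = []
t = request_with_callback(fake_fetch, 'http://www.example.com/',
                          lambda url, body: results.append((url, body)))
t.join()  # only so the example can inspect results afterward
&lt;/pre&gt;
Note that the callback runs in the worker thread, not in the caller's thread, so it should only touch data that is safe to share between threads.
&lt;/p&gt;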
&lt;p&gt;
&lt;b&gt;Update (3/12/11)&lt;/b&gt;: My comment before the sample code indicated that the sample code was asynchronous.  But, it wasn't.  I've updated it to be asynchronous.  When originally writing this post, I intended the example code to show the urllib2 handler approach.
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5663931432891696372-4470948428324873561?l=pythonquirks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pythonquirks.blogspot.com/feeds/4470948428324873561/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pythonquirks.blogspot.com/2009/12/asynchronous-http-request.html#comment-form' title='11 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/4470948428324873561'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/4470948428324873561'/><link rel='alternate' type='text/html' href='http://pythonquirks.blogspot.com/2009/12/asynchronous-http-request.html' title='Asynchronous HTTP Request'/><author><name>Jason</name><uri>http://www.blogger.com/profile/00489496856755184870</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://4.bp.blogspot.com/-ICrk1TFfN4s/TimsDKoczDI/AAAAAAAAAcM/6Au4EVw5h1M/s220/jrennie-defense-crop.jpg'/></author><thr:total>11</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5663931432891696372.post-6252656026715194467</id><published>2009-12-17T10:00:00.002-05:00</published><updated>2009-12-17T10:11:40.892-05:00</updated><title type='text'>Reworking the GIL</title><content type='html'>The title of this post is stolen from &lt;a href="http://mail.python.org/pipermail/python-dev/2009-October/093321.html"&gt;an email which describes steps the author has made to address GIL issues&lt;/a&gt; &lt;a href="http://www.dabeaz.com/"&gt;David Beazley&lt;/a&gt; raised with his &lt;a href="http://pythonquirks.blogspot.com/2009/10/global-interpreter-lock.html"&gt;talk on the Global Interpreter Lock&lt;/a&gt;.  
The proposed changes certainly won't turn Python into a completely thread-friendly language (the GIL is not going away any time soon), but it sounds like these changes will greatly reduce thread overhead and give the effect of running on a single-core machine that one would expect with a global interpreter lock.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5663931432891696372-6252656026715194467?l=pythonquirks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pythonquirks.blogspot.com/feeds/6252656026715194467/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pythonquirks.blogspot.com/2009/12/reworking-gil.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/6252656026715194467'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/6252656026715194467'/><link rel='alternate' type='text/html' href='http://pythonquirks.blogspot.com/2009/12/reworking-gil.html' title='Reworking the GIL'/><author><name>Jason</name><uri>http://www.blogger.com/profile/00489496856755184870</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://4.bp.blogspot.com/-ICrk1TFfN4s/TimsDKoczDI/AAAAAAAAAcM/6Au4EVw5h1M/s220/jrennie-defense-crop.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5663931432891696372.post-5094901151977733134</id><published>2009-12-03T12:50:00.003-05:00</published><updated>2009-12-03T13:21:59.114-05:00</updated><title type='text'>Twisted Annoyances</title><content type='html'>While Twisted is generally an excellent network library, it certainly has its quirks.
&lt;ul&gt;
&lt;li&gt;&lt;b&gt;Exception Trapping&lt;/b&gt;: by default, the reactor will trap &lt;i&gt;all&lt;/i&gt; exceptions.  See &lt;a href="http://twistedmatrix.com/trac/browser/tags/releases/twisted-9.0.0/twisted/internet/base.py#L741"&gt;ReactorBase.runUntilCurrent&lt;/a&gt; for the code that implements this horrible behavior.  This breaks intuitions most developers have for how exceptions are supposed to work.  For example, the &lt;a href="http://docs.python.org/tutorial/errors.html#handling-exceptions"&gt;python tutorial&lt;/a&gt; says:
&lt;blockquote&gt;
The last except clause may omit the exception name(s), to serve as a wildcard. Use this with extreme caution, since it is easy to mask a real programming error in this way!
&lt;/blockquote&gt;
Twisted does an excellent job of masking real programming errors.  I'm surprised that &lt;a href="http://twistedmatrix.com/pipermail/twisted-python/2008-June/017844.html"&gt;there is no option to turn off this behavior&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;b&gt;&lt;a href="http://twistedmatrix.com/trac/browser/tags/releases/twisted-9.0.0/twisted/internet/base.py#L959"&gt;callFromThread&lt;/a&gt;&lt;/b&gt;: calling a reactor method (e.g. protocol.writeData) from a non-reactor thread doesn't work without wrapping it with this method.&lt;/li&gt;
&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5663931432891696372-5094901151977733134?l=pythonquirks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pythonquirks.blogspot.com/feeds/5094901151977733134/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pythonquirks.blogspot.com/2009/12/twisted-annoyances.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/5094901151977733134'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/5094901151977733134'/><link rel='alternate' type='text/html' href='http://pythonquirks.blogspot.com/2009/12/twisted-annoyances.html' title='Twisted Annoyances'/><author><name>Jason</name><uri>http://www.blogger.com/profile/00489496856755184870</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://4.bp.blogspot.com/-ICrk1TFfN4s/TimsDKoczDI/AAAAAAAAAcM/6Au4EVw5h1M/s220/jrennie-defense-crop.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5663931432891696372.post-4552050872674050650</id><published>2009-11-17T18:51:00.005-05:00</published><updated>2009-11-18T07:03:16.460-05:00</updated><title type='text'>Python Coil</title><content type='html'>&lt;p&gt;
&lt;a href="http://mike.marineau.org/coil"&gt;Coil&lt;/a&gt; is a nice configuration language for &lt;a href="http://www.python.org/"&gt;Python&lt;/a&gt; &lt;s&gt;created by &lt;a href="http://mike.marineau.org/"&gt;Michael Marineau&lt;/a&gt;&lt;/s&gt; created by &lt;a href="http://itamarst.org/"&gt;Itamar Turner-Trauring&lt;/a&gt; and currently maintained by &lt;a href="http://mike.marineau.org/"&gt;Michael Marineau&lt;/a&gt;.  It is used here at &lt;a href="http://itasoftware.com/"&gt;ITA&lt;/a&gt;.  It is &lt;b&gt;much&lt;/b&gt; less verbose than XML but is very readable and minimizes duplication.  I really wish there were a &lt;a href="http://www.debian.org/distrib/packages"&gt;Debian package&lt;/a&gt; for it!
&lt;/p&gt;

&lt;p&gt;Update (11/18/09): Michael M. informed me of the correct history :)
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5663931432891696372-4552050872674050650?l=pythonquirks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pythonquirks.blogspot.com/feeds/4552050872674050650/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pythonquirks.blogspot.com/2009/11/python-coil.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/4552050872674050650'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/4552050872674050650'/><link rel='alternate' type='text/html' href='http://pythonquirks.blogspot.com/2009/11/python-coil.html' title='Python Coil'/><author><name>Jason</name><uri>http://www.blogger.com/profile/00489496856755184870</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://4.bp.blogspot.com/-ICrk1TFfN4s/TimsDKoczDI/AAAAAAAAAcM/6Au4EVw5h1M/s220/jrennie-defense-crop.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5663931432891696372.post-4287235288880003413</id><published>2009-10-28T13:46:00.007-04:00</published><updated>2009-10-28T14:08:53.330-04:00</updated><title type='text'>Block Scoping</title><content type='html'>&lt;p&gt;
Unlike Java, C++ and Perl, Python does not have block scoping.  That is, if you define a variable inside a loop in Python, it will still be in scope after that loop and will overwrite any previous binding of that name.  Python instead delimits scope at the levels of module, class and function.  I defer to the &lt;a href="http://docs.python.org/reference/executionmodel.html"&gt;authoritative source&lt;/a&gt; for the gory details.  Note that according to the definitions used therein, Python does have block-level scoping, but that is only if you define block delimiters to be modules, classes and functions :-)
&lt;/p&gt;
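&lt;p&gt;
A small illustration of the behavior described above may help.  The names &lt;tt&gt;i&lt;/tt&gt;, &lt;tt&gt;square&lt;/tt&gt;, &lt;tt&gt;f&lt;/tt&gt; and &lt;tt&gt;inner&lt;/tt&gt; are made up for the example:
&lt;pre&gt;
i = 'before'
for i in range(3):
    square = i * i   # first bound inside the loop body

# Both names are still bound once the loop has finished, and the loop
# variable has replaced the earlier binding of i.
after_loop = (i, square)

def f():
    inner = 'local'  # function bodies do delimit scope
    return inner

f()
try:
    inner
    leaked = True
except NameError:
    leaked = False   # inner is not visible outside f
&lt;/pre&gt;
After this runs, &lt;tt&gt;after_loop&lt;/tt&gt; is &lt;tt&gt;(2, 4)&lt;/tt&gt; and &lt;tt&gt;leaked&lt;/tt&gt; is &lt;tt&gt;False&lt;/tt&gt;.
&lt;/p&gt;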

&lt;p&gt;
Python scoping is a drawback in the sense that if you are used to (Java/C++/Perl) block scoping, you are likely to accidentally introduce bugs as a result of using the same variable name at different block levels.  I've introduced a few such bugs.  On the other hand, Python scoping eliminates the need to pre-define/declare variables which are set in a loop/if block but which you need access to afterward.  So, even though I've been burned by this style of scoping, I find that it helps me write better (i.e. more readable) code.
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5663931432891696372-4287235288880003413?l=pythonquirks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pythonquirks.blogspot.com/feeds/4287235288880003413/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pythonquirks.blogspot.com/2009/10/block-scoping.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/4287235288880003413'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/4287235288880003413'/><link rel='alternate' type='text/html' href='http://pythonquirks.blogspot.com/2009/10/block-scoping.html' title='Block Scoping'/><author><name>Jason</name><uri>http://www.blogger.com/profile/00489496856755184870</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://4.bp.blogspot.com/-ICrk1TFfN4s/TimsDKoczDI/AAAAAAAAAcM/6Au4EVw5h1M/s220/jrennie-defense-crop.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5663931432891696372.post-5132007223871397903</id><published>2009-10-27T15:12:00.003-04:00</published><updated>2009-12-23T17:00:28.693-05:00</updated><title type='text'>The Global Interpreter Lock</title><content type='html'>&lt;p&gt;Python technically has threading capabilities.  And, it can work quite well if the threads are i/o-bound.  However, Python threading doesn't work so well when threads are cpu-bound.  The following hour-long video explains why.  Read the &lt;a href="http://www.dabeaz.com/python/GIL.pdf"&gt;slides&lt;/a&gt; if you are impatient.
&lt;blockquote&gt;
&lt;a href="http://blip.tv/file/2232410"&gt;http://blip.tv/file/2232410&lt;/a&gt;
&lt;/blockquote&gt;
&lt;/p&gt;
&lt;p&gt;
One observation that David Beazley makes is that only the "main" thread can deal with signals like Control-C.  However, if this thread is blocked via a join(), the signal will not get handled.  So, it may be worth creating a thread separate from the "main" thread to spawn and join threads.  Haven't yet tested this, though...
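One way the idea might look, as a sketch only: a separate "supervisor" thread spawns and joins the workers, while the main thread polls with short sleeps instead of blocking in join().  The names &lt;tt&gt;work&lt;/tt&gt;, &lt;tt&gt;supervisor&lt;/tt&gt; and &lt;tt&gt;run_workers&lt;/tt&gt; are hypothetical, and whether this actually improves Control-C handling is an assumption that may depend on the Python version and platform.
&lt;pre&gt;
import threading
import time

def work(n, results):
    # Stand-in for a real task (hypothetical).
    results.append(n * n)

def supervisor(count, results, done):
    # Spawn and join the workers here, away from the main thread.
    threads = [threading.Thread(target=work, args=(n, results))
               for n in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    done.set()

def run_workers(count):
    results = []
    done = threading.Event()
    s = threading.Thread(target=supervisor, args=(count, results, done))
    s.daemon = True  # don't keep the process alive after Control-C
    s.start()
    # Poll with short sleeps rather than blocking in join(), so the main
    # thread regularly gets a chance to handle signals.
    while not done.is_set():
        time.sleep(0.05)
    return sorted(results)

squares = run_workers(4)
&lt;/pre&gt;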
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5663931432891696372-5132007223871397903?l=pythonquirks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pythonquirks.blogspot.com/feeds/5132007223871397903/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pythonquirks.blogspot.com/2009/10/global-interpreter-lock.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/5132007223871397903'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/5132007223871397903'/><link rel='alternate' type='text/html' href='http://pythonquirks.blogspot.com/2009/10/global-interpreter-lock.html' title='The Global Interpreter Lock'/><author><name>Jason</name><uri>http://www.blogger.com/profile/00489496856755184870</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://4.bp.blogspot.com/-ICrk1TFfN4s/TimsDKoczDI/AAAAAAAAAcM/6Au4EVw5h1M/s220/jrennie-defense-crop.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5663931432891696372.post-8165640859195480998</id><published>2009-10-19T17:13:00.000-04:00</published><updated>2009-10-19T17:54:02.493-04:00</updated><title type='text'>Controlling Printing in Numpy</title><content type='html'>Numpy has &lt;a href="http://docs.scipy.org/doc/numpy/reference/generated/numpy.set_printoptions.html"&gt;numpy.set_printoptions&lt;/a&gt; for controlling the printing of arrays.  See the doc for full details.  Now that I know about it, I'll be using something displaying fewer precision digits, allowing a larger linewidth and not summarizing until the array is substantially larger:&lt;pre&gt;
import numpy

numpy.set_printoptions(precision=4,
                       threshold=10000,
                       linewidth=150)&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5663931432891696372-8165640859195480998?l=pythonquirks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pythonquirks.blogspot.com/feeds/8165640859195480998/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pythonquirks.blogspot.com/2009/10/controlling-printing-in-numpy.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/8165640859195480998'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5663931432891696372/posts/default/8165640859195480998'/><link rel='alternate' type='text/html' href='http://pythonquirks.blogspot.com/2009/10/controlling-printing-in-numpy.html' title='Controlling Printing in Numpy'/><author><name>Jason</name><uri>http://www.blogger.com/profile/00489496856755184870</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='27' height='32' src='http://4.bp.blogspot.com/-ICrk1TFfN4s/TimsDKoczDI/AAAAAAAAAcM/6Au4EVw5h1M/s220/jrennie-defense-crop.jpg'/></author><thr:total>0</thr:total></entry></feed>
