Saturday, June 06, 2009

Is Python's select.poll unreliable?

I've been building a DHT using Stackless Python, along with a nonblocking IO layer that provides IO event notification using epoll, poll, or select, depending on which of them is available on the operating system.
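A minimal sketch of that kind of fallback, assuming the preference order is epoll, then poll, then select. The function name pick_backend is made up for illustration and is not taken from the actual IO layer.

```python
import select


def pick_backend():
    """Return the name of the first notification mechanism the platform offers.

    The order (epoll, poll, select) is an assumption based on the description
    above; pick_backend is a hypothetical helper, not the real module's API.
    """
    for name in ("epoll", "poll", "select"):
        # The select module only exposes the calls the OS actually supports.
        if hasattr(select, name):
            return name
    raise RuntimeError("no IO notification mechanism available")


backend = pick_backend()
```

On a platform without epoll, this returns "poll" whenever select.poll exists, which matches the behaviour described in the next paragraph.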

I'm developing on OSX, so my module cannot use epoll and falls back to the regular select.poll object. This is the first time I've used a select.poll object instead of select.select.

I'm coming up against some nasty problems. Every now and then, select.poll does weird things, e.g. returning a file descriptor which I have not registered, which might be 1 or some seemingly random number. Sometimes it even returns negative numbers. Recently, in some circumstances, it has failed to return a write event when it clearly should have. And of course, these bugs are intermittent and hard to reproduce.
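One way to catch the first symptom is to compare every descriptor that poll returns against the set that was actually registered. This is only a sketch: split_events is a hypothetical helper, and it assumes the caller keeps its own collection of registered descriptors.

```python
def split_events(poller, registered, timeout_ms=0):
    """Poll once and separate expected events from unexpected ones.

    poller     -- any object with a poll(timeout) method, e.g. select.poll()
    registered -- a set (or dict) of the file descriptors we registered
    Returns (known, bogus), each a list of (fd, event) tuples.
    """
    known, bogus = [], []
    for fd, event in poller.poll(timeout_ms):
        if fd in registered:
            known.append((fd, event))
        else:
            # Unregistered or negative descriptors end up here.
            bogus.append((fd, event))
    return known, bogus
```

Logging the bogus list along with the registered set at the moment it happens might make the intermittent cases easier to pin down.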

So... I switched the IO layer to use select.select instead of select.poll. Voila, everything works perfectly.
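For comparison, here is roughly what waiting for readiness looks like with select.select. This is a self-contained sketch using a socket pair, not the code from my IO layer; wait_ready and demo are illustrative names.

```python
import select
import socket


def wait_ready(readers, writers, timeout=1.0):
    """Return (readable, writable) lists as reported by select.select."""
    readable, writable, _ = select.select(readers, writers, [], timeout)
    return readable, writable


def demo():
    """Check write readiness, then read readiness, on a local socket pair."""
    a, b = socket.socketpair()
    try:
        a.setblocking(False)
        b.setblocking(False)
        # A fresh socket with an empty send buffer should be writable.
        _, writable = wait_ready([], [a])
        a.send(b"ping")
        # Once data has been sent, the peer should become readable.
        readable, _ = wait_ready([b], [])
        return a in writable, b in readable
    finally:
        a.close()
        b.close()
```

select.select takes and returns the socket objects themselves, so there is no separate registration step and no file descriptor numbers to map back.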

Is select.poll known to be buggy? It's hard to find much example Python code which uses it, so I wonder how well tested it is across a range of platforms.

4 comments:

Anonymous said...

I think poll() is broken on OSX. See http://www.cherrypy.org/ticket/598 for some starting points.

Paul said...

I found that socket pairs didn't work properly with poll on OpenSolaris whereas pipes did. Perhaps the proprietary UNIX heritage brings with it issues like this.

Simon Wittber said...

Right, thanks.

It does seem that poll is broken on OSX. How did that get by Apple QA?!

Ryan Phillips said...

It's because poll is based on kqueue on OSX, and kqueue is broken on OSX.

http://openradar.appspot.com/6444043
