Watch out for list(dict.keys()) in Python 3

As everyone is probably aware by now, in Python 3 dict.keys(), dict.values() and dict.items() all return iterable views instead of lists. The standard suggestion for getting the original behavior back, when it was actually intended, is to simply use list(dict.keys()). This is usually fine, but not in all cases.
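To make the difference concrete, here is a small sketch of how a view and a list diverge once the dictionary changes. The dictionary and variable names are made up for illustration.

```python
# Hypothetical example dictionary; the names are illustrative only.
settings = {"host": "localhost"}

keys_view = settings.keys()        # live view onto the dictionary (Python 3)
keys_list = list(settings.keys())  # independent snapshot taken right now

settings["port"] = 8080            # mutate the dictionary afterwards

view_after = sorted(keys_view)     # the view reflects the new key
list_after = sorted(keys_list)     # the snapshot does not
```

After the update, the view reports both keys while the list still holds only the key that existed when it was built.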

One of the reasons why someone might actually opt to perform a more expensive copying operation is because, with the pre-3.0 semantics, the keys() method is atomic, in the sense that the whole operation of converting all dictionary keys to a list is done while the global interpreter lock is held. Thus, it’s thread-safe to run dict.keys() with Python 2.X.

The suggested replacement in Python 3, list(dict.keys()), is not. There’s a chance that the interpreter will let another thread run before or during the iteration of the view, and if the dictionary is modified in the meantime, the iteration fails with an exception (a RuntimeError when the dictionary changes size). To fix the problem, either a lock must protect the iteration, or a more expensive operation such as dict.copy().keys() must be used.
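As a rough sketch of the lock-based option, the snippet below guards both the writer and the key snapshot with the same threading.Lock. The names shared, shared_lock, add_key and snapshot_keys are hypothetical, and the approach only helps if every piece of code that modifies the dictionary takes the lock as well. The last helper reproduces the failure mode in a single thread, so it does not depend on thread timing.

```python
import threading

# Hypothetical shared state; all names here are illustrative.
shared = {"a": 1, "b": 2}
shared_lock = threading.Lock()

def add_key(key, value):
    """Writer: modify the dictionary only while holding the lock."""
    with shared_lock:
        shared[key] = value

def snapshot_keys():
    """Reader: build the list of keys while holding the same lock."""
    with shared_lock:
        return list(shared.keys())

def iterate_while_mutating(d):
    """Resize a dict during iteration of its view and report the error.

    This runs in one thread, so the failure is deterministic.
    """
    try:
        for key in d.keys():
            d[key + "_copy"] = d[key]  # changes the dict's size mid-iteration
    except RuntimeError as exc:
        return str(exc)
    return None
```

Calling snapshot_keys() from any thread then returns a plain list that later writes cannot disturb, at the cost of serializing readers and writers on one lock.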

The 2to3 tool won’t help you there, unfortunately. So, keep an eye on it!

MagLev and distributed VMs

Avi Bryant is working on MagLev, a Ruby interpreter based on Gemstone’s Smalltalk VM, with some amazing features, like transactional objects distributed across several VMs:

The integrated VMs, cache, and storage conspire to create an illusion that global state is shared across all instances: no matter how many VMs you add, over however many machines, they all see and work with the same set of Ruby objects.

My geek side finds this highly exciting, and I’m eager to see it released to find out how people will deal with it in practice.

At the same time, my let’s-build-stable-and-maintainable-software side is a bit skeptical. As Joe Armstrong puts it so enthusiastically, shared state is hard to manage correctly, global transactions reduce scalability, and transparent RPC is seductive, but dangerous.

I’m also curious about the speed gains pointed out. It’s well known that the Ruby VM isn’t very fast, which means that there must be opportunities for speedups. Even then, 100x faster is impressive, and history shows that significant improvements are sometimes harder to achieve when the semantics must stay precisely the same. Let’s hope Avi can manage to run the Ruby tests successfully.

Improving reading habits

Today, Sunday, on the mailman day, I decided to change my reading habits.

You’d certainly laugh if I told you how many mailing lists, blogs, and IRC channels I try to follow (won’t include IM networks here as I don’t really read them asynchronously). What I look for is pretty obvious: I want to exchange volume for quality.

The first thing I’m doing is unsubscribing from all high-traffic lists I’m part of. The reason is clear: one hundred messages a day can’t possibly be all interesting. I’m not saying there are no interesting posts among these, of course. But with such a vibrant community of followers, a few smart readers usually bring up the most interesting discussions in more selective formats. I’ll track these instead.

For the same reason, I’m unsubscribing from most feed aggregators. Planet and similar aggregators are a great way to subscribe to many feeds quickly, but let’s face it: how many posts in an aggregator with lots of feeds are interesting to a single individual? While leaving them, I’m selectively picking the feeds that interest me and subscribing to each.

Then, for the not-so-high volume sources, I’m checking the last 5 posts or so (or days, for IRC channels). Anything that hasn’t had information worth tracking will be phased out too. Interesting topics will eventually find their way to the sources I’ll still follow.

I want to read less, to read more. I want to go faster through the queue of pending books, and also follow a wider variety of topics with less pain.