Archive for the 'Project' Category

The last 4 years (and the next N?)

Some interesting changes have been happening in my professional life, so I wanted to share it here to update friends and also for me to keep track of things over time (at some point I will be older and will certainly laugh at what I called “interesting changes” in the ol’days). Given the goal, I apologize but this may come across as more egocentric than usual, so please feel free to jump over to your next blog post at any time.

It’s been little more than four years since I left Conectiva / Mandriva and joined Canonical, in August of 2005. Shortly after I joined, I had the luck of spending a few months working on the different projects which the company was pushing at the time, including Launchpad, then Bazaar, then a little bit on some projects which didn’t end up seeing much light. It was a great experience by itself, since all of these projects were abundant in talent. Following that, in the beginning of 2006, counting on the trust of people which knew more than I did, I was requested/allowed to lead the development of a brand new project the company wanted to attempt. After a few months of research I had the chance to sit next to Chris Armstrong and Jamu Kakar to bootstrap the development of what is now known as the Landscape distributed systems management project.

Fast forward three and a half years, in mid 2009, and Landscape became a massive project with hundreds of thousands of very well tested lines, sprawling not only a client branch, but also external child projects such as the Storm Object Relational Mapper, in use also by Launchpad and Ubuntu One. In the commercial side of things it looks like Landscape’s life is just starting, with its hosted and standalone versions getting more and more attention from enterprise customers. And the three guys which started the project didn’t do it alone, for sure. The toy project of early 2006 has grown to become a well structured team, with added talent spreading areas such as development, business and QA.

While I wasn’t watching, though, something happened. Facing that great action, my attention was slowly being spread thinly among management, architecture, development, testing, code reviews, meetings, and other tasks, sometimes in areas not entirely related, but very interesting of course. The net result of increased attention sprawl isn’t actually good, though. If it persists, even when the several small tasks may be individually significant, the achievement just doesn’t feel significant given the invested effort as a whole. At least not for someone that truly enjoys being a software architect, and loves to feel that the effort invested in the growth of a significant working software is really helping people out in the same magnitude of that investment. In simpler words, it felt like my position within the team just wasn’t helping the team out the same way it did before, and thus it was time for a change.

Last July an external factor helped to catapult that change. Eucalyptus needed a feature to be released with Ubuntu 9.10, due in October, to greatly simplify the installation of some standard machine images.. an Image Store. It felt like a very tight schedule, even more considering that I hadn’t been doing Java for a while, and Eucalyptus uses some sexy (and useful) new technology called the Google Web Toolkit, something I had to get acquainted with. Two months looked like a tight schedule, and a risky bet overall, but it also felt like a great opportunity to strongly refocus on a task that needed someone’s attention urgently. Again I was blessed with trust I’m thankful for, and by now I’m relieved to look back and perceive that it went alright, certainly thanks to the help of other people like Sidnei da Silva and Mathias Gug. Meanwhile, on the Landscape side, my responsibilities were distributed within the team so that I could be fully engaged on the problem.

Moving this forward a little bit we reach the current date. Right now the Landscape project has a new organizational structure, and it actually feels like it’s moving along quite well. Besides the internal changes, a major organizational change also took place around Landscape over that period, and the planned restructuring led me to my current role. In practice, I’m now engaging into the research of a new concept which I’m hoping to publish openly quite soon, if everything goes well. It’s challenging, it’s exciting, and most importantly, allows me to focus strongly on something which has a great potential (I will stop teasing you now). In addition to this, I’ll definitely be spending some of that time on the progress of Landscape and the Image Store, but mostly from an architectural point of view, since both of these projects will have bright hands taking care of them more closely.

Sit by the fireside if you’re interested in the upcoming chapters of that story. ;-)

Geocaching on the Easter Island

This post is not about what you think it is, unfortunately. I actually do hope to go to the Easter Island at some point, but this post is about a short story which involves geohash.org, Groundspeak (from geocaching.com), and very very poor minded behavior.

The context

So, before anything else, it’s important to understand what geohash.org is. As announced when the service was launched (also as a post on Groundspeak’s own forum), geohash.org offers short URLs which encode a latitude/longitude pair, so that referencing them in emails, forums, and websites is more convenient, and that’s pretty much it.

When people go to geohash.org, they can enter geographic coordinates that they want to encode, and they get back a nice little map with the location, some links to useful services, and most importantly the actual Geohash they can use to link to the location, so as an example they could be redirected to the URL http://geohash.org/6gkzwgjf3.

Of course, it’s pretty boring to be copy & pasting coordinates around, so shortly after the service launched, the support for geocoding addresses was also announced, which means people could type a human oriented address and get back the Geohash page for it. Phew.. much more practical.

The problem

All was going well, until a couple of months ago, when a user reported that the geocoding of addresses wasn’t working anymore. After some investigation, it turned out that geohash.org was indeed going over the free daily quota allowed by the geocoding provider used. But, that didn’t quite fit with the overall usage reports for the system, so I went on to investigate what was up in the logs.

The cause

Something was wrong indeed. The system was getting thousands of queries a day from some application, and not only that, but the queries were entirely unrelated to Geohashes. The application was purely interested in the geocoding of addresses which the site supported for the benefit of Geohash users. Alright, that wasn’t something nice to do, but I took it lightly since the interface implemented could perhaps give the impression that the site was a traditional geocoding system. So, to fix the situation, the non-Geohash API was removed at this point, and requests for the old API then started to get an error saying something like 403 Forbidden: For geocoding without geohashes, please look elsewhere..

Unfortunately, that wasn’t the end of the issue. Last week I went on to see the logs, and the damn application was back, and this time it was using Geohashes, so I became curious about who was doing that. Could I be mistakingly screwing up some real user of Geohashes? So, based on the logs, I went on to search for who could possibly be using the system in such a way. It wasn’t too long until I found out that, to my surprise, it was Groundspeak’s iPhone application. Groundspeak’s paid iPhone application, to be more precise, because the address searching feature is only available for paying users.

Looking at the release notes for the application, there was no doubt. Version 2.3.1, sent to Apple on September 10th, shortly after the old API was blocked, fixes the Search by Address/Postal Code feature says the maintainer, and there’s even a thread discussing the breakage where the maintainer mentions:

The geocoding service we’ve been using just turned their service off. That’s why things are failing; it was relying on an external service for this feature. We’re fixing the issue on our end and using a service that shouldn’t fail as easily. Unfortunately we’ll have to do an update to the store to get this feature out to the users. This will take some time, but in version 2.4 this will work.

Wait, ok, so let’s see this again. First, they were indeed not using Geohashes at all, and instead using geohash.org purely as a geocoding service. Then, when the API they used is disabled with hints that the Geohash service is not a pure geocoding service, they workaround this by decoding the Geohash retrieved and grabbing the coordinates so that they can still use it as a pure geocoding service. At the same time, they tell their users that they changed to “a service that shouldn’t fail as easily”. Under no circumstances they contact someone at geohash.org to see what was going on (shouldn’t be necessary, really, but assuming immaculate innocence, sending an email would be pretty cool).

Redirecting users to the Easter Island

So, yeah, sorry, but I didn’t see many reasons to sustain the situation. Not only because it looks like an unfriendly behavior overall, but also because, on their way of using an unrelated free service to sustain their paid application, they were killing the free geocoding feature of geohash.org with thousands of geocoding requests a day, which impacted on the daily quota the service has by itself.

So, what to do? I could just disable the service again, or maybe contact the maintainers and ask them to please stop using the service in such a way, after all there are dozens of real geocoding services out there! But… hmmm… I figured a friendly poke could be nice at this point, before actually bringing up that whole situation.

And that’s what happened: rather than blocking their client, the service was modified so that all of their geocoding requests translated into the geographic coordinates of the Easter Island.

Of course, users quickly noticed it and started reporting the problem again.

The answer from Groundspeak

After users started complaining loudly, Bryan Roth, which signs as co-founder of Groundspeak, finally contacted me for the first time asking if there was a way to keep the service alive. Unfortunately, I really can’t, and provided the whole explanation to Bryan, and even mentioned that I actually use Google as the upstream geocoding provider and that I would be breaking the terms of service doing this, but offered to redirect their requests to their own servers if necessary.

Their answer to this? Pretty bad I must say. I got nothing via email, but they posted this in the forum:

But seriously, this bug actually has nothing to do with our app and everything to do with the external service we’ve been using to convert an address into GPS coordinates. For the next app update, we’re completely dropping that provider since they’ve now failed us twice. We’ll be using only Google from that point on, so hopefully their data will be more accurate.

I can barely believe what I read. They blame the upstream service, as if they were using a first class geocoding provider somewhere rather than sucking resources from a site they felt cool to link their paid application to, take my suggestion of using Google for geocoding, and lie about the fact that the data would be more accurate (it obviously can’t, since it was already Google that was being used).

I mentioned something about this in the forum itself, but I was moderated out immediately of course.

Way to go Groundspeak.

UPDATE

After some back and forth with Bryan and Josh, the last post got edited away to avoid the misleading details, and Bryan clarified the case in the forum. Then, we actually settled on my proposal of redirecting the iPhone Geocaching.com application requests to Groundspeak’s own servers so that users of previous versions of the application wouldn’t miss the feature while they work on the new release.

If such communication had taken place way back when the feature was being planned, or when it was “fixed” the first time, the whole situation would never have happened.

No matter what, I’m glad it ended up being sorted towards a more friendly solution.

Wiki + Spreadsheet

The underlying concept is very simple: spreadsheets are a way to organize text, numbers and formulas into what might be seen as a natively numeric environment: a matrix. So what would happen if we loosed some of the bolts of the numeric-oriented organization, and tried to reuse the same concepts into a more formatting-oriented environment which is naturally collaborative: a wiki.

While I do encourage you to answer this with some fantastic new online service (please provide me with an account and the best e-book reader device available once you’re rich) I had a try at answering this question myself a while ago by writing the Calc macro for Moin.

Basically, the Calc macro allows extracting values found in a wiki page into lists (think columns or rows), and applying formulas and further formatting as wanted.

I believe there’s a lot of potential on the basic concept, and the prototype, even though functional and useful, surely has a lot to evolve, so I’ve published the project in Launchpad to make contributions easier. I actually apologize for not publishing it earlier. There was hope that more features would be implemented before releasing, but now it’s clear that it won’t get many improvements from me anytime soon. If you do decide to improve it, please try to prepare patches which are mostly ready for integration, including full testing, since I can’t dedicate much time for it myself in the foreseeable future.

Google using Geohash

According to Dave Troy, Google seems to be using the Geohash algorithm:

Google is employing the GeoHash algorithm I’ve been pushing to do spatial searching using BigTable. Since database schemes like BigTable don’t support traditional GIS extensions/spatial indexes, GeoHash allows for a simple bounding box search using truncated GeoHash substrings. I will post separately about this shortly, as I am working on some GeoHash tools to expand this functionality. This is of particular interest to AppEngine developers.

Nice!

dateutil 1.4 is out

Friday I’ve released version 1.4 of dateutil. There are some interesting fixes there, so please upgrade if you have the chance.

Enhancements on geohash.org

Some improvements to geohash.org were made. Some of them were
motivated by a conversation with Rodrigo Stulzer.

  • Support for geocoding addresses (city names, whatever). E.g. http://geohash.org/?q=21 Millbank, London
  • Support for moving the Geohash marker in the embedded map, so that modifying the position visually is easier.
  • Support for providing a “name” to Geohashes, by appending a colon and the name, in a nice format. E.g. http://geohash.org/c216ne:Mt_Hood
  • Provided a bookmark to get a Geohash while in Google Maps.
  • Provided a Google Maps Mapplet. When enabled, it adds a Geohash marker identifying the Geohash position in Google Maps, and it may be moved around. Here is a screenshot:

Check out the Tips & Tricks page for details on these features.

geohash.org is public!

After about one year writing this service in my spare time, it’s finally out.

geohash.org offers short URLs which encode a latitude/longitude pair, so that referencing them in emails, forums, and websites is more convenient.

Geohashes offer properties like arbitrary precision, similar prefixes for nearby positions, and the possibility of gradually removing characters from the end of the code to reduce its size (and gradually lose precision). I’ve put the algorithm created in the public domain. Some details may be seen in the Wikipedia article about it (hopefully that’ll help establishing prior art, and prevent Microsoft from patenting it).

To obtain the Geohash, the user provides latitude and longitude coordinates in a single input box (most commonly used formats for latitude and longitude pairs are accepted), and performs the request.

Besides showing the latitude and longitude corresponding to the given Geohash, users who navigate to a Geohash at geohash.org are also presented with an embedded map, and may download a GPX file, or transfer the waypoint directly to certain GPS receivers. Links are also provided to external sites that may provide further details around the specified location.

Mocker 0.10 and trivial patch-mocking of existing objects

Mocker 0.10 is out, with a number of improvements!

While we’re talking about Mocker, here is another interesting use case, exploring a pretty unique feature it offers.

Suppose we want to test that a method hello() on an object will call self.show(“Hello world!”) at some point. Let’s say that the code we want to test is this:

 class Greeting(object):

     def show(self, sentence):
         print sentence

     def hello(self):
         self.show("Hello world!")

This is the entire test method:

def test_hello(self):
    # Define expectation.
    mock = self.mocker.patch(Greeting)
    mock.show("Hello world!")
    self.mocker.replay()

    # Rock on!
    Greeting().hello()

This has helped me in practice a few times already, when testing some involved situations.

Note that you can also passthrough the call. In other words, the call may actually be made on the real method, and mocker will just assert that the call was really made, whatever the effect is.

One more important point: mocker ensures that the real method exists in the real object, and has a specification compatible with the call made. If it doesn’t, and assertion error is raised in the test with a nice error message.

UPDATE: The method for doing this is actually mocker.patch() rather than mocker.mock(), as documented. Apologies.

Partial stubbing of os.path.isfile() with Mocker

One neat feature which Mocker offers is the ability to very easily implement custom behavior on specific functions or methods.

Take for instance the case where you want to pretend to some code that a given file exists, but you don’t want to get on the way of everything else which needs the same function:

>>> from mocker import *
>>> mocker = Mocker()
>>> isfile = mocker.replace("os.path.isfile", count=False)
>>> _ = expect(isfile("/non/existent")).result(True)
>>> _ = expect(isfile(ANY)).passthrough()

>>> mocker.replay()

>>> import os
>>> os.path.isfile("/non/existent")
True
>>> os.path.isfile("/etc/passwd")
True
>>> os.path.isfile("/other")
False

>>> mocker.restore()

>>> os.path.isfile("/non/existent")
False

Notice that the count=False parameter is available in version 0.9.2. Without it Mocker will act in a more mocking-strict way and enforce that the given expressions should be executed precisely the given number of times (which defaults to one, and may be modified with the count() method).

More releases: dateutil 1.3 and nicefloat 1.1

A couple of additional releases tonight: dateutil 1.3, and nicefloat 1.1.

They’re both bug fixing releases.