Archive for the 'Java' Category

Breaking into an Android password manager – Theory

For some time now I’ve been wanting to research more deeply about the internals of Android. Until now, though, this was just a sentiment. Then, a couple of weeks ago I’ve finally managed to replace my iPhone for an Android phone, and that was the final motivator for me to actually get into learning more about the inner workings of the Linux-based OS.

Now, I just had to pick an actual task for digging into. The Dalvik VM is certainly one of the most innovative and advertised technical details about the OS, so something around it would be a nice start.. some kind of bytecode fiddling perhaps, but what? Luckily, even without trying too hard, I eventually stumbled upon an interesting case for researching upon.

The “victim” of this research is the application gbaSafe version 1.1.0a, which claims to protect user passwords using unbreakable algorithms (how’s that for a hint of a Snake oil case?).

Before we get into some hacking, let’s see some words on the software security by the author himself, and then render some analysis on conceptual issues on it:

The confidential data can only be decrypted if the master key is known. You should choose a long key (at least 16 characters) with mixed case and unreadable text. Of course you cannot enter this key each time you want to access the confidential data, so it is stored in the user settings encrypted with a shorter key (4 to 6 digits) and normally you only have to enter this unlock key. Theoretically it is possible to try all possible values (brute force attack), but then you must use another program, since gbaSafe deletes the encrypted master key from the user settings when you enter the unlock key wrong three times repeatedly, and then you must enter the master key. If you wrote a program to decrypt the master key, you would have to know the algorithm used, the salt bytes and iteration count (used to augment the short unlock key), which are very hard to extract from the binary program module gbaSafe.

If you have some security background, I’m sure that by now you’re already counting the issues on this single paragraph.

The most obvious issue is the fact that there’s a “strong key” and a “weak key”, and the strong key is encrypted with the weak one. This is a very common cryptography sin, as would say my friend and coworker Andreas Hasenack (a security researcher himself). A security system is only as secure as its weakest spot. It obviously makes little difference for an attacker if he has to attempt decrypting a master key or the actual data, since decrypting the master key will give access to the data.

Then, it mentions en passant that the software enforces the use of digits for the weak key. This ensures that the weak key is really weak! Four digits is basically ten thousand attempts, which is absolutely nothing for nowadays’s hardware. This number would move up to about 15 million by simply allowing upper and lowercase letters as well (which isn’t great either, but a few orders of magnitude never hurt in this scenario).

It follows up encouraging people to think that it’s actually hard to figure the algorithm and other implementation details. Considering that there’s absolutely nothing preventing people from getting their hands in the implementation itself, this is in fact asserting that the security mechanism is based on the ignorance of the attacker. Counting on the ignorance of people is bad at all times, and in a security context it’s a major error.

There’s a final security issue in this description which is a bit more subtle, but further analysis on the logic used leaves no doubt. In cryptography, the salt is supposed to increase the work needed in a brute force attack by strengthening the number of bits of the actual passphrase, in a case where the salt is actually unavailable, or at least prevent that a single large word dictionary can be used to attack several encryptions or hashes at once, in a case where the salt is known but variable. In the latter case, it helps because encrypting a single key with two different salts must be done twice, rather than once, so it increases the computational task when attacking multiple items. A salt which is known and does not change across all processed items is worth pretty close to nothing.

So, indeed, considering the many security issues here, this isn’t something I’d store my passwords or credit card numbers on, and I suggest you don’t do it either.

In my next post on this topic I’ll actually implement a trivial brute force attack to prove that these issues are very real, and that, actually, it’s not even hard to break into a security system like this.

The application author has been contacted about this blog post, since he’ll likely want to fix some of these issues.

The last 4 years (and the next N?)

Some interesting changes have been happening in my professional life, so I wanted to share it here to update friends and also for me to keep track of things over time (at some point I will be older and will certainly laugh at what I called “interesting changes” in the ol’days). Given the goal, I apologize but this may come across as more egocentric than usual, so please feel free to jump over to your next blog post at any time.

It’s been little more than four years since I left Conectiva / Mandriva and joined Canonical, in August of 2005. Shortly after I joined, I had the luck of spending a few months working on the different projects which the company was pushing at the time, including Launchpad, then Bazaar, then a little bit on some projects which didn’t end up seeing much light. It was a great experience by itself, since all of these projects were abundant in talent. Following that, in the beginning of 2006, counting on the trust of people which knew more than I did, I was requested/allowed to lead the development of a brand new project the company wanted to attempt. After a few months of research I had the chance to sit next to Chris Armstrong and Jamu Kakar to bootstrap the development of what is now known as the Landscape distributed systems management project.

Fast forward three and a half years, in mid 2009, and Landscape became a massive project with hundreds of thousands of very well tested lines, sprawling not only a client branch, but also external child projects such as the Storm Object Relational Mapper, in use also by Launchpad and Ubuntu One. In the commercial side of things it looks like Landscape’s life is just starting, with its hosted and standalone versions getting more and more attention from enterprise customers. And the three guys which started the project didn’t do it alone, for sure. The toy project of early 2006 has grown to become a well structured team, with added talent spreading areas such as development, business and QA.

While I wasn’t watching, though, something happened. Facing that great action, my attention was slowly being spread thinly among management, architecture, development, testing, code reviews, meetings, and other tasks, sometimes in areas not entirely related, but very interesting of course. The net result of increased attention sprawl isn’t actually good, though. If it persists, even when the several small tasks may be individually significant, the achievement just doesn’t feel significant given the invested effort as a whole. At least not for someone that truly enjoys being a software architect, and loves to feel that the effort invested in the growth of a significant working software is really helping people out in the same magnitude of that investment. In simpler words, it felt like my position within the team just wasn’t helping the team out the same way it did before, and thus it was time for a change.

Last July an external factor helped to catapult that change. Eucalyptus needed a feature to be released with Ubuntu 9.10, due in October, to greatly simplify the installation of some standard machine images.. an Image Store. It felt like a very tight schedule, even more considering that I hadn’t been doing Java for a while, and Eucalyptus uses some sexy (and useful) new technology called the Google Web Toolkit, something I had to get acquainted with. Two months looked like a tight schedule, and a risky bet overall, but it also felt like a great opportunity to strongly refocus on a task that needed someone’s attention urgently. Again I was blessed with trust I’m thankful for, and by now I’m relieved to look back and perceive that it went alright, certainly thanks to the help of other people like Sidnei da Silva and Mathias Gug. Meanwhile, on the Landscape side, my responsibilities were distributed within the team so that I could be fully engaged on the problem.

Moving this forward a little bit we reach the current date. Right now the Landscape project has a new organizational structure, and it actually feels like it’s moving along quite well. Besides the internal changes, a major organizational change also took place around Landscape over that period, and the planned restructuring led me to my current role. In practice, I’m now engaging into the research of a new concept which I’m hoping to publish openly quite soon, if everything goes well. It’s challenging, it’s exciting, and most importantly, allows me to focus strongly on something which has a great potential (I will stop teasing you now). In addition to this, I’ll definitely be spending some of that time on the progress of Landscape and the Image Store, but mostly from an architectural point of view, since both of these projects will have bright hands taking care of them more closely.

Sit by the fireside if you’re interested in the upcoming chapters of that story. ;-)

Changing people or changing rules

In my previous post I made an open statement which I’d like to clarify a bit further:

(…) when the rules don’t work for people, the rules should be changed, not the people.

This leaves a lot of room for personal interpretation of what was actually meant, and TIm Hoffman pointed that out nicely with the following questioning in a comment:

I wonder when the rule is important enough to change the people though. For instance [, if your] development process is oriented to TDD and people don’t write the tests or do the job poorly will you change them then?

This is indeed a nice scenario to explore the idea. If it happens at some point that a team claims to be using TDD, but if in practice no developer actually writes tests first, the rules are clearly not working. If everyone in the team hates doing TDD, enforcing it most probably won’t show its intended benefits, and that was the heart of my comment. You can’t simply keep the rule as is if no one follows it, unless you don’t really care about the outcome of the rule.

One interesting point, though, is that when you have a high level of influence over the environment in which people are, it may be possible to tweak the rules or the processes to adapt to reality, and tweaking the processes may change the way that people feel about the rules as a consequence (arguably, changing people as a side effect).

As a more concrete example, if I found myself in the described scenario, I’d try to understand why TDD is not working, and would try to discuss with the team to see how we should change the process so that it starts to work for us somehow. Maybe what would be needed is more discussion to show the value of TDD, and perhaps some pair programming with people that do TDD very well so that the joy of doing it becomes more visible.

In either case, I wouldn’t be simply asking people “Everyone has to do TDD from now on!“, I’d be tweaking the process so that it feels better and more natural to people. Then, if nothing similar works either, well, let’s change the rule. I’d try to use more conventional unit testing or some other system which people do follow more naturally and that presents similar benefits.

Class member access control: enforcement vs. convention

For a long time I’ve been an advocate of Python’s notion of controlling access to private and protected members (attributes, methods, etc) with conventions, by simply naming them like “_name”, with an initial underline.  Even though Python does support the “__name” (with double underscore) for “private” members (this actually mangles the name rather than hiding it), you’ll notice that even this is rarely used in practice, and the largely agreed mantra is that convention should be enough and thus one underscore suffices. This always resonated quite well with me, since I generally prefer to handle situations by agreement rather than enforcement. Well, I’m now changing my opinion.that this works well for this purpose, at least in certain situations.

This methodology may work quite well in situations where the code scope is within a very controlled environment, with one or more teams which follow strictly a single development guideline, and have the power to refactor the affected code base somewhat easily when the original decisions are too limiting.

Having worked on a few major projects now, and some of them being libraries which are used by several teams within the same company or outside, I now perceive that people very often take shortcuts over these decisions for getting their job done quickly. It’s way easier to simply read the code and get to the private guts of a library than to try to get agreement over the right way to do something, or sending a patch with a suggested change which was carefully architected.

Many people by now are probably thinking: “Well, that’s their problem, isn’t it? If their code base breaks on the next upgrade they’ll get burden and won’t be able to upgrade cleanly.”, and I can honestly understand this feeling, since I shared it. But, for a number of reasons, I now understand that this isn’t just their problem, it’s very much my problem too.

Most importantly, on any serious software, these problems will usually come back to the implementors, and many times the problem will have a much larger magnitude by then than they had at the time a change could have been done “the right way” on the implementation, because code dependent on the private bits will have settled.

Most people are optimist by nature and believe that the implementation won’t change, but, of course, one of the reasons why private information is made private in the first place is exactly because the implementor believes that having the freedom to change these details in the future is important, and not rarely there’s already a plan of evolution in place for these private pieces, which may include revamping the implementation entirely for scalability or for other goals.

In the best case, the careless people will get burden on the upgrade and will ask for support or simply won’t upgrade silently, and both cases hurt implementors, because providing support for broken software takes time and energy, and amazingly can even hurt the software image. Lack of upgrades also means more ancient versions in the wild to give support for. Besides these, in the worst case scenario, the careless people have enough influence on the affected project to cause as much burden on it as if the private data was public in the first place.

As much as I’m a believer in handling situation by agreement rather than enforcement, I’m also a believer that when the rules don’t work for people, the rules should be changed, not the people. So my positioning now is that the language supported access constraints (public, protected, private), as available in languages like Java and C++, are a better alternative when compared to convention as used today in Python, since they provide an additional layer of encouragement for people to not break the rules carelessly, and that helps in the maintenance and reuse of software that has greater visibility.

Jython links

I have collected a few cool links about Jython, and I’d like to keep them somewhere. So, here they are:

Java bits

For the first time in my life, I’ve been paid to implement something in the Java language. Conectiva has been contracted to implement X500 address book support in a known Brazilian groupware tool, for the Brazilian government. This tool runs on top of the TomCat server, and I must mention that I’ve used an interesting feature of the Java language to debug it remotely: the Java Platform Debugger Architecture (JPDA) (thanks JSwat developers).