Comparing package versions in PostgreSQL

This weekend I’ve played a bit with PostgreSQL extensions written in C.

A while ago I wrote a Python C extension for Smart to compare Debian package versions. Now I was trying to do something similar inside PostgreSQL, and thus ported the original Python C extension code to a PostgreSQL C extension. It enables queries like the following:

# SELECT 'Matched' WHERE deb_version_match('1.2', '<=', '2.0');
  ?column?
----------
 Matched

The implementation of the PostgreSQL C extension was quite straightforward, but I'm a bit disappointed by the performance of PL/PGSQL.

I've made tests using two environments. One of them is a PL/Python function executed as a trigger, which calls the original Python C extension and executes SQL back in PostgreSQL using the plpy module. The other is a PL/PGSQL function which uses the implemented PostgreSQL C extension directly.

Considering that the function logic consisted of one loop over a SELECT statement, a few tests, and an INSERT statement, I was expecting that the overhead introduced by going back and forth between the PostgreSQL state and the Python interpreter state would be a lot more noticeable when compared with PL/PGSQL executing an internal PostgreSQL function. Tests have shown about 10% of improvement roughly, when doing a similar logic over about 5000 items.

I'm not yet sure if the speed improvement pays off the limited debugging feedback provided by the PL/PGSQL interpreter on errors.

This entry was posted in C/C++, PostgreSQL, Project, Python. Bookmark the permalink.

One Response to Comparing package versions in PostgreSQL

  1. Neil Conway says:

    You could consider writing the trigger function in C, rather than PL/PgSQL. I wouldn’t think the function call overhead for PL/PgSQL should be too high, but it should be even lower for C UDFs.

Leave a Reply

Your email address will not be published. Required fields are marked *