About 1 year after development started in Ensemble, today the stars finally aligned just the right way (review queue mostly empty, no other pressing needs, etc) for me to start writing the specification about the repository system we’ve been jointly planning for a long time. This is the system that the Ensemble client will communicate with for discovering which formulas are available, for publishing new formulas, for obtaining formula files for deployment, and so on.
Category Archives: PostgreSQL
Comparing package versions in PostgreSQL
This weekend I’ve played a bit with PostgreSQL extensions written in C.
A while ago I wrote a Python C extension for Smart to compare Debian package versions. Now I was trying to do something similar inside PostgreSQL, and thus ported the original Python C extension code to a PostgreSQL C extension. It enables queries like the following:
# SELECT 'Matched' WHERE deb_version_match('1.2', '<=', '2.0');
?column?
----------
Matched
The implementation of the PostgreSQL C extension was quite straightforward, but I'm a bit disappointed by the performance of PL/PGSQL.
I've made tests using two environments. One of them is a PL/Python function executed as a trigger, which calls the original Python C extension and executes SQL back in PostgreSQL using the plpy module. The other is a PL/PGSQL function which uses the implemented PostgreSQL C extension directly.
Considering that the function logic consisted of one loop over a SELECT statement, a few tests, and an INSERT statement, I was expecting that the overhead introduced by going back and forth between the PostgreSQL state and the Python interpreter state would be a lot more noticeable when compared with PL/PGSQL executing an internal PostgreSQL function. Tests have shown about 10% of improvement roughly, when doing a similar logic over about 5000 items.
I'm not yet sure if the speed improvement pays off the limited debugging feedback provided by the PL/PGSQL interpreter on errors.