Beware SQL libraries missing prepared statement support

Monday, July 20, 2009 by Pascal Meunier
in General

Just because your library or framework allows you to specify an SQL query and the data separately, doesn't mean that it's sending data separately from code to the database.

Imagine this scenario. You read that prepared statements are a good way to avoid SQL injection, because the database is given code and data explicitly and separately. You chose a database that supports prepared statements. The library you use also seems to support them as you can pass SQL code and data as two separate arguments. However... internally the library just constructs a string and sends that to the database, and doesn't use the database's prepared statement support!

An example is the library "pyPGSQL", which supports PostGreSQL in Python. It has an "execute" command taking a query and parameters as separate arguments. However, internally it constructs a string to send off (after escaping the parameters, so it *shouldn't* be vulnerable):
self.res = self.conn.conn.query(_qstr % parms)

The point is that escaping on the client side, while most likely OK, isn't as robust as letting the database handle the data separately, by using prepared statements. This particularity of pyPGSQL has been known since 2003 (forum answer). However, it's good to point it out again, as I had started writing a program using pyPGSQL, thinking that my code would be fine. It's possible that others have as well. pyPGSQL doesn't claim to support prepared statements (the absence of a "prepare" instruction should have been a clue!). Nevertheless, it surprises me that there still exist libraries not supporting prepared statements and that don't state this with an unmistakable, large warning. I found surprisingly few Python libraries supporting prepared statements:

py-postgresql (unfortunately requires Python 3; was pg_proboscis for Python 2)
Cristian Gafton's python-pgsql (not thread-safe)

I am not going to discuss why attempting to escape data in an SQL statement is complex and error-prone, and why prepared statements are a more secure alternative; this has been addressed elsewhere and supported by vulnerability announcements. Have you checked if and how your library or framework really uses prepared statements, or does it just look like it might be using them?

P.S.: Note that the work-around proposed in the forum link above does not provide the security of prepared statements properly supported by a library.

P.P.S.: To clarify, I haven't demonstrated an SQL injection vulnerability in pyPGSQL. It's not about the performance penalty either. It's about escaping done by the client library (the basic implementation of bind parameters without using the database's support) being a second-rate security solution to explicitly telling the database "here is the code. Now here's the data" (prepared statements). It's about decreasing code complexity and reducing chances for "misunderstandings" (and configuration, e.g., encoding, discrepancies). It's about assurance and choosing safer technologies and architectures.

P(3)S.: Why did I expect prepared statement support? Because the Python DBI 2.0 specification for the "execute" method suggests that implementations should be using prepared statements, at least internally:

            "A reference to the operation will be retained by the
            cursor.  If the same operation object is passed in again,
            then the cursor can optimize its behavior.  This is most
            effective for algorithms where the same operation is used,
            but different parameters are bound to it (many times)."

One more thought is that I should be more positive and congratulate the people working on Python 3 for fixing this long-standing problem. They deserve kudos!

Tags for this post:

Leave a comment (6 so far) »

Comments

Posted by Sam Quigley
on Monday, July 20, 2009 at 10:58 PM

ActiveRecord, the ORM used in Ruby on Rails applications, also only pretends to support prepared statements. See, eg, here:

http://github.com/rails/rails/blob/37658f15bb88e054635a496327a4a82bb50fd5d5/activerecord/lib/active_record/base.rb#L2379

for the code that replaces the ‘?’ characters in the SQL template with (quoted) values.

Posted by Craig Younkins
on Tuesday, July 21, 2009 at 08:30 AM

Probably the most widely used SQL toolkit in Python is SQLAlchemy, and it claims to use bind parameters.

http://www.sqlalchemy.org/docs/05/sqlexpression.html#bind-parameter-objects

I haven’t looked at the code though, and it may depend on which DB you are using.

Posted by Glyph Lefkowitz
on Wednesday, July 29, 2009 at 04:36 PM

This sounds like a very serious bug in pypgsql. It’s a violation of the supposed contract of the interface being described.

Granted, proper prepared statement support would be a better way to deal with this

Please report a bug here so that the problem can be dealt with in the proper place, the code for pypgsql, and not a million workarounds in applications.

You can report bugs on the pypgsql tracker on sourceforge.

In the meanwhile, you could consider using an alternate interface to your postgresql database, such as psycopg.

(Not to say it doesn’t have its own problems, it just doesn’t have any quite this serious, to my knowledge.)

I tried to link to these things, but the comment form kept telling me that I am “not authorized to perform this action”.

Posted by Pascal Meunier
on Thursday, July 30, 2009 at 05:58 AM

Thanks Craig. There was some talk about SQLAlchemy not supporting prepared statements (mail-archive sqlalchemy-users msg02468.html) despite supporting bind parameters, although it’s from 2006. Perhaps they have added it since then? I can’t figure it out from the documentation, which doesn’t discuss prepared statements at all. So, it probably doesn’t. <BR>
Also, there is some harsh criticism (related to attempting to “reproduce prepared statements in the SQLAlchemy code”) of it at:<BR>
“blog.pas-un-geek-en-tant-que-tel.ch (2008/11/04) SQLAlchemy: _With_ancient_methods_against_modern_IT/

It also turns out that python-pgsql isn’t thread-safe, returning odd results (or none) from within threads. I found the latest version of pg_proboscis v.1.05, but I can’t get it to work right, either with the DBI or greenthread interfaces; some queries work and some others inexplicably don’t, returning cryptic error messages.

Posted by Pascal Meunier
on Thursday, July 30, 2009 at 06:24 AM

Thanks Glyph,
However, psycopg also doesn’t support prepared statements, according to a Feb 17, 2004 message on the psycopg list by the developer:

lists.initd.org/pipermail/psycopg/2004-February/002572.html

Although there’s a possibility that v.2 might support it, but it’s unclear. I found no mention of prepared statements in psycopg’s documentation and FAQ. I’m really disappointed by the state of database support in Python 2.x.

Posted by Pascal Meunier
on Thursday, July 30, 2009 at 07:08 AM

Glyph,
I found a more recent thread (2007/3/9), where Federico Di Gregorio, the developer of psycopg, confirms that it doesn’t have true support for prepared statements:
lists.initd.org/pipermail/psycopg/2007-March/005391.html

Page 1 of 1 pages

Beware SQL libraries missing prepared statement support

Comments

Leave a comment

Recent Blogs

Archive