Beware SQL libraries missing prepared statement support
Imagine this scenario. You read that prepared statements are a good way to avoid SQL injection, because the database is given code and data explicitly and separately. You chose a database that supports prepared statements. The library you use also seems to support them as you can pass SQL code and data as two separate arguments. However... internally the library just constructs a string and sends that to the database, and doesn't use the database's prepared statement support!
An example is the library "pyPGSQL", which supports PostGreSQL in Python. It has an "execute" command taking a query and parameters as separate arguments. However, internally it constructs a string to send off (after escaping the parameters, so it *shouldn't* be vulnerable):
self.res = self.conn.conn.query(_qstr % parms)
The point is that escaping on the client side, while most likely OK, isn't as robust as letting the database handle the data separately, by using prepared statements. This particularity of pyPGSQL has been known since 2003 (forum answer). However, it's good to point it out again, as I had started writing a program using pyPGSQL, thinking that my code would be fine. It's possible that others have as well. pyPGSQL doesn't claim to support prepared statements (the absence of a "prepare" instruction should have been a clue!). Nevertheless, it surprises me that there still exist libraries not supporting prepared statements and that don't state this with an unmistakable, large warning. I found surprisingly few Python libraries supporting prepared statements:
- py-postgresql (unfortunately requires Python 3; was pg_proboscis for Python 2)
- Cristian Gafton's python-pgsql (not thread-safe)
P.S.: Note that the work-around proposed in the forum link above does not provide the security of prepared statements properly supported by a library.
P.P.S.: To clarify, I haven't demonstrated an SQL injection vulnerability in pyPGSQL. It's not about the performance penalty either. It's about escaping done by the client library (the basic implementation of bind parameters without using the database's support) being a second-rate security solution to explicitly telling the database "here is the code. Now here's the data" (prepared statements). It's about decreasing code complexity and reducing chances for "misunderstandings" (and configuration, e.g., encoding, discrepancies). It's about assurance and choosing safer technologies and architectures.
P(3)S.: Why did I expect prepared statement support? Because the Python DBI 2.0 specification for the "execute" method suggests that implementations should be using prepared statements, at least internally:
"A reference to the operation will be retained by the cursor. If the same operation object is passed in again, then the cursor can optimize its behavior. This is most effective for algorithms where the same operation is used, but different parameters are bound to it (many times)."
One more thought is that I should be more positive and congratulate the people working on Python 3 for fixing this long-standing problem. They deserve kudos!
on Monday, July 20, 2009 at 10:58 PM