-
Notifications
You must be signed in to change notification settings - Fork 11
Taint Analysis for PostreSQL
Node RASP includes the PostgreSQL parser library libpg_query
to run taint-lexical analysis on queries right before sending them to the database. The basic idea is, that a user should only be in control of a single, lexical token. A user should not be able to add further tokens after the one given control of. Therefore, taint ranges should always be in between single, lexical tokens with one exception - comments.
The default lexer resp. the scanner of libpg_query
and PostgreSQL treat all comments as whitespaces since they contain no semantical information for the database. For the use case of PostgreSQL injection detection, comment tokens are crucial. Let's take the following example:
db.query("SELECT * FROM tab WHERE foo='" + param1 + "' AND tenant=42;");
By injecting a comment --
as param1 only a single lexical token is created, but further conditions are also commented out. As a result, the semantics of our query has changed and the attack was successful. Therefore, we require the lexer to also return comment tokens.
Our approach here is to start with the scanner generation within PostgreSQL itself as libpg_query
does. PostgreSQL uses the tool Flex to generate the C implementation of the scanner scan.c
from a Flex file scan.f
.
In the first step, we check out the version of PostgreSQL we would like the scanner to be extracted from.
git clone https://github.com/postgres/postgres.git
git checkout <tag> // currently REL_10_1
Since we later want to integrate into the current release of libpg_query
only the Flex file scan.l
is required. Copy it to your working directory.
cp src/backend/parser/scan.l <yourWorkingDirectory>
According to the build of libpg_query
we first run some patches on the Flex file.
patch -p1 scan.l ~/libpg_query/patches/01_parse_replacement_char.patch
At this step the changes to the actual scanner behavior are made. In our case, the scanner should return comment tokens. This can be done by adding the following in scan.l
:
{comment} {
SET_YYLLOC();
return COMMENT;
}
This change can be seen in commit. A definition for the returned COMMENT token already exists.
flex -CF -o scan.c scan.l
Since libpg_query makes further modifications to the scanner during the build process, we only want to copy the transition table as well as the token switch from scan.c
to src/postgres/src_backend_parser_scan.c
. Unfortunately, this has currently be done manually.
Build your libpg_query
with the adapted scanner and adapt test conflicting with the new scanner output.
Statement used for benchmark:
const stmt = "SELECT * FROM Customers WHERE tenant=1 AND (name ~ '" +
search + "' OR notes ~ '" +
search + "' OR product ~ '" +
search + "');";
Node Version | Description | Results | Deviation |
---|---|---|---|
v8.11.1 | localhost, no taint | ~310,000 ops/sec | ±50% |
v8.11.1-taint | localhost, no taint | ~310,000 ops/sec | ±50% |
v8.11.1-taint | localhost, taint on search - no attack | ~18,000 ops/sec | ±4% |
v8.11.1-taint | localhost, taint on entire query - injection | ~7,000 ops/sec | ±4% |
v8.11.1-taint | localhost, pg_check.hasInjection(query) - no taint | ~21,000,000 ops/sec | ±1% |
v8.11.1-taint | localhost, pg_check.hasInjection(query) - taint | ~21,000 ops/sec | ±1% |
v8.11.1-taint | localhost, libpg_query.getLexicalTokens(query) | ~24,000 ops/sec | ±1% |
Previous test were run asynchronously, so the latency and database processing wan't part of the benchmark!
Node Version | Description | Results | Deviation |
---|---|---|---|
v8.11.1 | remote database, no taint | ~83,000 ops/sec | ±43% |
v8.11.1-taint | remote database, no taint | ~80,000 ops/sec | ±39% |
v8.11.1-taint | remote database, taint on search - no attack | ~19,000 ops/sec | ±5% |
v8.11.1-taint | remote database, taint on search - injection | ~4,500 ops/sec | ±3% |
Since for the taint-lexical analysis only a small part of the parser resp. libpg_query
is required, it may be reasonable to create a smaller lib only containing the scanner e.g. libpg_scanner
.