SQL Injection in Bash Scripts

Problem

When working on making a small application with a Zenity GUI, I ask the user for input. I need to somehow sanitize this input, or I will leave the gates wide open for two problems:

In this blog post, I’m going to look at a solution for the first problem.

Disclaimer

Of course, you shouldn’t write SQL queries in a programming language that doesn’t provide a way to supply parameterized queries. Also, when a user is executing a shell script themselves on their local machine, SQL and command injection become academic, because the user already has the right to modify the database or to execute random Bash commands. It would only be problematic if the script would be executed by an HTTP server that can reached by the internet at large. But in that case, you wouldn’t be writing a Bash script, you would use a proper programming language, right?

Solving SQL Injection The Normal Way: Parameterized Queries

The solution for SQL injection, is, of course, parameterized queries. You send a query to the database and leave placeholders (AKA “bind variables”) open, that take the form of a question mark or a named parameter:

SELECT  *
FROM    user
WHERE   name = ?
AND     password = ?;

SELECT  *
FROM    user
WHERE   name = :name
AND     password = :password;

The parameters are sent separately to the RDBMS, and are bound to a certain type (varchar, integer, etc) and thus cannot cause unintended side effects anymore.

However, Bash Doesn’t Let You Do This

In Bash, however, it’s hard to use this if the provided executable doesn’t accept parameters. Postgres’ psql takes a -v option, and allows you to write:

$ psql -u user -h host -v 'name=qwerty' -v 'password=12345' \
    'SELECT * FROM user WHERE name = :name AND password = :password;'

MariaDB/MySQL, however, provides nothing of the kind.

IFS='|' read -r description amount < <(zenity --forms \
        --add-entry description \
        --add-entry amount \
        --text 'Please specify a description and amount')
# assume env variable MYSQL_PWD is set
mysql -u user -h host \
    -e "INSERT INTO advance VALUES ('$description', $amount);"

The code fragment above leaves us completely exposed to SQL injection.

Solving SQL Injection In Bash

We could base64 encode the user data and use a function on the RDBMS to decode from base64.

IFS='|' read -r us_description us_amount < <(zenity --forms \
        --add-entry Description \
        --add-entry Amount \
        --text 'Please specify a description and amount')
description_base64="$(echo -n "$us_description" | base64)"
amount=$(parse_float "$us_amount") || { >&2 echo 'Invalid float'; exit 1; }
# assume MYSQL_PWD is set
mysql -u user -h host \
    -e "INSERT INTO advance VALUES (FROM_BASE64('$description_base64'), $amount));"

Why Does This Work?

You might think to yourself: why does this actually work? Doesn’t the function get expanded and you end up with the same problem? Let’s figure out how a SQL statement gets processed by the RDBMS.

When the RDBMS receives the SQL statement, it is handled by a SQL interpreter. The interpreter consists of two parts: a parser and an optimizer. First, the parser checks for syntax errors. If the syntax is correct, it will check whether the statement is semantically correct: for example, do all referenced tables and columns exist? The parser separates the pieces of a SQL statement into a data structure that other routines can process. (As we’ll see, this is important: after parsing, SQL injection is no longer possible.) After the interpreter is done with the statement, the statement is handed to the optimizer. The optimizer tries to determine how to get the best performance. The result is an execution plan, which is fed to a code generator and finally run by the database processor.

Schematically:

SQL statement processing (Read here how Oracle processes statements). A function is evaluated during the execution phase. The parser has no problem with a valid, existing function that promises to return a value that is expected in that position. The optimizer’s task is only to determine at what moment in the execution phase the function is best executed, but also doesn’t evaluate the function. We can prove all this to ourselves by calling a function that will throw an error:

MariaDB [foo]> \W
Show warnings enabled.
MariaDB [huishouden]> SELECT FROM_BASE64('!');
+------------------+
| FROM_BASE64('!') |
+------------------+
| NULL             |
+------------------+
1 row in set, 1 warning (0.004 sec)

Warning (Code 1958): Bad base64 data as position 0
MariaDB [foo]> -- no warning will show because function is not evaluated
MariaDB [foo]> SELECT FROM_BASE64('!') WHERE 1 = 0;
Empty set (0.003 sec)

Calling FROM_BASE64 with an invalid base64 string ! results in a runtime exception. When we add a predicate in the WHERE clause that always fails (1 = 0), the query gets executed nonetheless, showing us that MariaDB/MySQL will only evaluate the function at runtime. At runtime, the return value of FROM_BASE64 cannot be used anymore to alter the SQL statement, because that stage is long past. At this point in time, the return value is harmless.

Software I used

$ uname -a
Linux cirrus7 5.5.3-arch1-1 #1 SMP PREEMPT Tue, 11 Feb 2020 15:35:41 +0000 x86_64 GNU/Linux
$ mysql --version
mysql  Ver 15.1 Distrib 10.4.12-MariaDB, for Linux (x86_64) using readline 5.1
$ zenity --version
3.32.0