Prevent SQL Injection in Bash Scripts

Table of Contents

Problem

When working on making a small application with a Zenity GUI, I ask the user for input. I need to somehow sanitize this input, or I will leave the gates wide open for two problems:

  1. SQL injection.
  2. Command injection.

In this blog post, I’m going to look at a solution for the first problem.

Disclaimer

Of course, you shouldn’t write SQL queries in a programming language that doesn’t provide a way to supply parameterized queries. Also, when a user is executing a shell script themselves on their local machine, SQL and command injection become academic, because the user already has the right to modify the database or to execute random Bash commands. It would only be problematic if the script would be executed by an HTTP server that can reached by the internet at large. But in that case, you wouldn’t be writing a Bash script, you would use a proper programming language, right?

Solving SQL Injection The Normal Way: Parameterized Queries

The solution for SQL injection, is, of course, parameterized queries. You send a query to the database and leave placeholders (AKA “bind variables”) open, that take the form of a question mark or a named parameter:

SELECT  *
FROM    user
WHERE   name = ?
AND     password = ?;
SELECT  *
FROM    user
WHERE   name = :name
AND     password = :password;

The parameters are sent separately to the RDBMS, and are bound to a certain type (varchar, integer, etc) and thus cannot cause unintended side effects anymore.

However, Bash Doesn’t Let You Do This

In Bash, however, it’s hard to use this if the provided executable doesn’t accept parameters. Postgres’ psql takes a -v option, and allows you to write:

$ psql -u user -h host -v 'name=qwerty' -v 'password=12345' \
    'SELECT * FROM user WHERE name = :name AND password = :password;'

MariaDB/MySQL, however, provides nothing of the kind.

IFS='|' read -r description amount < <(zenity --forms \
        --add-entry description \
        --add-entry amount \
        --text 'Please specify a description and amount')
# assume env variable MYSQL_PWD is set
mysql -u user -h host \
    -e "INSERT INTO advance VALUES ('$description', $amount);"

The code fragment above leaves us completely exposed to SQL injection.

Solving SQL Injection In Bash

We could base64 encode the user data and use a function on the RDBMS to decode from base64.

IFS='|' read -r us_description us_amount < <(zenity --forms \
        --add-entry Description \
        --add-entry Amount \
        --text 'Please specify a description and amount')
description_base64="$(echo -n "$us_description" | base64)"
amount=$(parse_float "$us_amount") || { >&2 echo 'Invalid float'; exit 1; }
# assume MYSQL_PWD is set
mysql -u user -h host \
    -e "INSERT INTO advance VALUES (FROM_BASE64('$description_base64'), $amount));"

Why Does This Work?

You might think to yourself: why does this actually work? Doesn’t the function get expanded and you end up with the same problem?

Let’s figure out how a SQL statement gets processed by the RDBMS.

When the RDBMS receives the SQL statement, it is handled by a SQL interpreter. The interpreter consists of two parts: a parser and an optimizer. First, the parser checks for syntax errors. If the syntax is correct, it will check whether the statement is semantically correct: for example, do all referenced tables and columns exist? The parser separates the pieces of a SQL statement into a data structure that other routines can process. (As we’ll see, this is important: after parsing, SQL injection is no longer possible.)

After the interpreter is done with the statement, the statement is handed to the optimizer. The optimizer tries to determine how to get the best performance. The result is an execution plan, which is fed to a code generator and finally run by the database processor.

Schematically:

SQL Statement Processing
SQL statement processing

(Read here how Oracle processes statements).

A function is evaluated during the execution phase. The parser has no problem with a valid, existing function that promises to return a value that is expected in that position. The optimizer’s task is only to determine at what moment in the execution phase the function is best executed, but also doesn’t evaluate the function.

We can prove all this to ourselves by calling a function that will throw an error:

MariaDB [foo]> \W
Show warnings enabled.
MariaDB [huishouden]> SELECT FROM_BASE64('!');
+------------------+
| FROM_BASE64('!') |
+------------------+
| NULL             |
+------------------+
1 row in set, 1 warning (0.004 sec)

Warning (Code 1958): Bad base64 data as position 0
MariaDB [foo]> -- no warning will show because function is not evaluated
MariaDB [foo]> SELECT FROM_BASE64('!') WHERE 1 = 0;
Empty set (0.003 sec)

Calling FROM_BASE64 with an invalid base64 string ! results in a runtime exception. When we add a predicate in the WHERE clause that always fails (1 = 0), the query gets executed nonetheless, showing us that MariaDB/MySQL will only evaluate the function at runtime. At runtime, the return value of FROM_BASE64 cannot be used anymore to alter the SQL statement, because that stage is long past. At this point in time, the return value is harmless.

Software I used

$ uname -a
Linux cirrus7 5.5.3-arch1-1 #1 SMP PREEMPT Tue, 11 Feb 2020 15:35:41 +0000 x86_64 GNU/Linux
$ mysql --version
mysql  Ver 15.1 Distrib 10.4.12-MariaDB, for Linux (x86_64) using readline 5.1
$ zenity --version
3.32.0

Grokking Time Zones in Oracle’s DATE and TIMESTAMP

Table of Contents

Intro

When I was tasked with putting some data in BigQuery, Google Cloud’s data warehouse implementation, I had to look at the data we currently store in our local Oracle database. I found that we were using DATE for fields that store “moments in time”. Normally, I would expect Unix timestamps here, or some other data type that contains the time zone.

Naturally, I complained about this, but people were quick to point out that it doesn’t matter, because you should relate the DATE to the time zone of the database. Oracle, it turns out, is tracking two time zones: the time zone of the database itself:

SELECT dbtimezone
  FROM dual;
-- Europe/Amsterdam

and the time zone of the client session:

SELECT sessiontimezone
  FROM dual;
-- Europe/Moscow

(See here how to update these time zones.)

I was warned that Oracle might change the query results based on either of these time zones, meaning when you insert a data type related to dates or time, you might get back an altered representation based on the time zone of the database or the session of the client.

This did not put my mind at ease at all. So, let’s take a look at what some of the common types relating to date and time are in Oracle, and when we should be careful with the query results.

DATEs

A DATE is a calendar date with a wallclock time without a time zone. If you put 2020-02-20 09:11:43 in the database, you will always retrieve the same exact datetime, no matter whether you’re in Amsterdam or New York. (It is the equivalent of a java.time.LocalDateTime, if that’s your thing.)

SELECT sessiontimezone
  FROM dual;
-- Europe/Amsterdam

SELECT def_time
  FROM users
 WHERE username = 'foo@example.com'
-- 2020-02-20 09:04:50

ALTER SESSION SET time_zone = 'UTC';

SELECT def_time
  FROM users
 WHERE username = 'foo@example.com'
-- 2020-02-20 09:04:50

Plain TIMESTAMPs

A TIMESTAMP (without a time zone) is the same beast as a DATE, except that you can specify times up to a fraction of a second (a DATE needs to be rounded to the nearest second).

TIMESTAMP WITH LOCAL TIME ZONE

Say, you store the value 2020-02-20 09:11:43 +01:00. A client with a sessiontimezone of Europe/Amsterdam in the winter will retrieve this exact value whether or not they retrieve it from an Amsterdam, Sydney or New York database. Another client with a session time zone that is set to America/New_York, however, will get the value 2020-02-20 03:11:43 -05:00.

So: LOCAL means it adjusts the retrieved value to your session time zone.

ALTER SESSION SET time_zone = 'Europe/Moscow';  -- offset +03:00

SELECT to_char("ttz", 'yyyy-MM-dd HH24:MI:SS')
FROM (-- timestamp with local, New York time zone
      SELECT CAST(
              to_timestamp_tz('2020-02-20 09:11:43 -05:00',
                              'yyyy-MM-dd HH24:MI:SS TZH:TZM')
              AS TIMESTAMP WITH LOCAL TIME ZONE) AS "ttz"
      FROM dual);
-- 2020-02-20 17:11:43

See also this tutorial.

TIMESTAMP WITH TIME ZONE

This data type stores the time zone together with the datetime and displays it independently from the session time zone of the client.

If we store a value 2020-02-20 09:11:43 +01:00 in Amsterdam and copy this to a database in New York, this is the value that is returned to clients in both Amsterdam and New York regardless of their session time zone.

Conclusion

DATEs and plain TIMESTAMPs don’t store a time zone and queries will always return the exact same value that was inserted regardless of dbtimezone or sessiontimezone. This makes them safe choices if you somehow don’t care about time zones.

TIMESTAMP WITH LOCAL TIME ZONE and TIMESTAMP WITH TIME ZONE do store time zones. A TIMESTAMP WITH LOCAL TIME ZONE will be represented differently depending on your session time zone. A TIMESTAMP WITH TIME ZONE will not be represented differently; it will just include an offset.

You might now always want to use TIMESTAMP WITH TIME ZONE, but apparently, they have problems of their own.

Falling back to DATEs, however, is problematic because of Daylight Saving Time (DST) in the fall. When you store the current datetime in the fall, and you go through the dance of setting all clocks back from 3:00 to 2:00, you have no way of knowing at what exact point in time the row was inserted. This is bad if you rely on this for anything crucial, such as forensics (“so the hacker copied the entire database at 2:30, but was that before or after we reset our clocks?”).

Using Oracle DATEs in Java

As for me, having to deal with DATEs without an explicit time zone, this means that the application inserting/retrieving these datetimes will need to decide how to interpret them. In my Java application, I will map them to java.time.LocalDateTime, add a time zone and convert them to java.time.Instants:

Instant defTimeInstant = user.getDefTime()
        .atZone(ZoneId.of("Europe/Moscow"))
        .toInstant();

user.getDefTime() returns a LocalDateTime indicating when the user was created. We can give a LocalDateTime a time zone by invoking atZone(ZoneId) on it, returing a ZonedDateTime. Once we have converted the datetime to an actual point on the timeline (no longer being some kind of abstract datetime without a timezone), we can convert it to an Instant by invoking toInstant() on the ZonedDateTime.

Sources

Oracle documentation on datetime data types.

Oracle SQL*Plus Cheatsheet

Table of Contents

Intro

Last week, I suddenly had to work with an Oracle database again. I normally use Intellij’s DataGrip to connect to databases. I tried it this time, and I found I could not connect to the schema I wanted: the schema just turned up empty. Of course, everybody will recommend you use Oracle’s SQL Developer with any Oracle database you have to touch. So, after trying brew search sqldeveloper (yes, I’m on a Mac at work), coming up empty, reading this cask request and feeling the anticipation of endless frustration grow inside me, I went to Oracle’s web site to see if I could download the program. I can, except that they want me to turn in a DNA sample first:

Oracle's create-new-account page
Oracle’s create-new-account page

Of course, faking those kind of details is not impossible, but the hassle of going through something like that just for a lousy program so I can just run a couple of lousy queries puts me off.

Luckily, I had a Docker container with Oracle XE lying around. It includes SQL*Plus, the “venerable” command-line tool provided by Oracle to query and administer Oracle databases. Being a Arch Linux user with a predilection for anything with a command-line interface, it was not too bad, but getting the output formatted so that I could actually make sense of it required more effort than most users are willing to put up with.

So how do you find your way around when you have SQL*Plus and no clue how it works? How do you find the schema’s you want, the tables you need to query? How can you check your privileges?

This post aims to be a cheatsheet so you can find your way around. Keep in mind that the results you’re getting depend on the privileges your user has. Getting an empty result does not mean that the database object does not exist. Rather, it means you’re a lesser god.

Getting Help With sqlplus

sqlplus does not have a man page, but provides help when you pass -h/-help:

$ sqlplus -h

Connecting to an Oracle Database Using SQL*Plus

The basic syntax to connect as user alice with password qwerty to a database FOO which is located on db.domain.tld and listens on port 1521 (default port) is:

$ sqlplus alice/qwerty@db.domain.tld:1521/FOO

Show the Connected User

SHOW USER;

The SHOW command lets you look at the current state of your SQLPlus environment: SHOW ALL shows all settings.

Show All Schema’s

SELECT username
  FROM dba_users u;

Return only non-empty schema’s (excluding most users who never created any object in the database):

SELECT username
  FROM dba_users u
 WHERE EXISTS (-- filter users without database objects
               SELECT 1
               FROM dba_objects o
               WHERE o.owner = u.username);

Excluding Oracle’s built-in schema’s:

SELECT username
  FROM dba_users
 WHERE default_tablespace NOT IN ('SYSTEM', 'SYSAUX');

Source

Show All Tables/Views in a Particular Schema

SELECT table_name
  FROM all_tables
 WHERE owner = 'MYSCHEMA';

Related: find all views:

SELECT view_name, text
  FROM all_views
 WHERE owner = 'MYSCHEMA';

Describe a Table

DESCRIBE table;

Show DDL of a Table

SELECT dbms_metadata.get_ddl('VIEW', 'USERS', 'MYSCHEMA')
  FROM dual;

where the first argument is the type of object (e.g. 'TABLE', 'VIEW', 'TRIGGER'), the second is the name of the object, and the third the schema where the object is defined.

Show the Privileges of Your User

SELECT * FROM USER_SYS_PRIVS;
SELECT * FROM USER_TAB_PRIVS;
SELECT * FROM USER_ROLE_PRIVS;

Source

Get More Decent Output From a Command

If you want SQL*Plus to truncate the value of a column:

SET WRAP OFF;

otherwise it will allow the column to wrap to the next line (default ON).

Suppress all headings, page breaks, titles, the initial blank line and other formatting information:

SET PAGESIZE 0;

Source

Now, if only Oracle would support \G for vertical output…

Get and Alter Database Timezones

SELECT dbtimezone FROM dual;
ALTER DATABASE SET dbtimezone = 'Europe/Amsterdam';
SELECT sessiontimezone FROM dual;
ALTER SESSION SET sessiontimezone = 'Europe/Amsterdam';

Select DATE Rows in the Last Hour

Table t has a column c_date of type DATE:

SELECT *
  FROM table t
 WHERE c_date > sysdate - 1/24;

This works because you can subtract a fraction from a DATE type where the fraction is interpreted as a fraction of a day (1 is an entire day, 0.5 is 12 hours, etc.).

Working Around Oracle Not Having a LIMIT

Yes, Oracle does not have a LIMIT keyword, so this idiom will quickly become ingrained in your muscle memory:

SELECT *
FROM (-- actual query
      SELECT firstName, lastName
      FROM users)
WHERE ROWNUM <= 10;

For each row returned by a query, the ROWNUM pseudocolumn returns a number indicating the order in which Oracle selects the row from a table or set of joined rows. The first row selected has a ROWNUM of 1, the second has 2, and so on.

Oracle documentation

You could do without the subquery, plainly writing:

SELECT firstName, lastName
FROM users
WHERE ROWNUM <= 10;

but if you add a ORDER BY clause, Oracle will first select the first ten rows, and only then apply the ORDER BY clause, which might not be what you want. So that’s why it’s best to always use the first idiom above.