Automatic Rollback of Transactions in Spring Tests

TL;DR

Put @Transactional on your test class or methods and test-managed transactions will automatically be rolled back.

The Antipattern

When looking at system tests in Spring, it is not uncommon to see this pattern:

import org.junit.jupiter.api.Assertions.assertEquals
import org.junit.jupiter.api.extension.ExtendWith
import org.junit.jupiter.api.Test
import org.springframework.beans.factory.annotation.Autowired
import org.springframework.boot.test.context.SpringBootTest
import org.springframework.test.context.junit.jupiter.SpringExtension

@ExtendWith(SpringExtension::class)
@SpringBootTest
class TransactionalTestApplicationTests {

    @Autowired lateinit var mapper: Mapper

    @Test
    fun `insert user into db, verify and do cleanup`() {
        val user = mapper.createUser(username = "foo")

        assertEquals("foo", user.username)

        mapper.deleteUser(username = "foo")
    }
}

The cleanup is done manually, at the end of the test. This introduces a problem: what if the test fails? Then the cleanup statement at the end is never executed. A common solution is to introduce an @AfterEach method that does the cleanup in a more or less crude manner:

@AfterEach
fun cleanup() {
    mapper.deleteAllUsers()
}

This pattern is in fact unnecessary. Spring provides the @Transactional annotation that will do an automatic cleanup after every @Test method:

@ExtendWith(SpringExtension::class)
@SpringBootTest
@Transactional
class TransactionalTestApplicationTests {

    @Autowired lateinit var mapper: Mapper

    @Test
    fun `insert user into db, verify and do cleanup`() {
        /* code */
    }
}

Difference Between Test-, Spring- and Application-Managed Transactions

A test-managed transaction is one that is managed by TransactionalTestExecutionListener (or programmatically by TestTransaction). A test-managed transaction differs from Spring-managed transactions (transactions directly managed by Spring within the ApplicationContext that is loaded for the test) and application-managed transactions (transactions programmatically managed within the application code that is executed by the test):

Spring-managed and application-managed transactions will typically participate in test-managed transactions; however, caution should be taken if Spring-managed or application-managed transactions are configured with any propagation type other than REQUIRED or SUPPORTS.

Documentation for TransactionalTestExecutionListener

What Does @Transactional Do?

When tests are run with a Spring application context, a TransactionalTestExecutionListener is automatically configured. This listener provides support for executing tests with test-managed transactions. It notices the @Transactional annotation on the test class or on individual @Test methods and creates a new transaction that is automatically rolled back after the test completes.

The default is rollback. This behavior can be changed, however, by putting @Commit or @Rollback at the class or method level.

If you need more control over your transactions, you can use the static methods in TestTransaction. They allow you to start and end transactions, flag them for commit or rollback or check the transaction status.
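
A sketch of what that could look like inside the test class above (note that anything committed this way is no longer cleaned up for you):

import org.junit.jupiter.api.Assertions.assertTrue
import org.springframework.test.context.transaction.TestTransaction

@Test
fun `commit setup data, then continue in a new transaction`() {
    // this user is committed for real and must be cleaned up manually
    mapper.createUser(username = "foo")
    TestTransaction.flagForCommit()
    TestTransaction.end()

    // start a fresh test-managed transaction for the rest of the test
    TestTransaction.start()
    assertTrue(TestTransaction.isActive())
    mapper.createUser(username = "bar")
}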

Beware: Only Test-Managed Transactions Are Rolled Back

Transactions are not rolled back when the code that is invoked does not interact with the database. An example would be writing an end-to-end test that uses RestTemplate to make an HTTP request to some endpoint that then makes a modification in the database. Since the RestTemplate does not interact with the database (it just creates and sends an HTTP request), @Transactional will not roll back anything that is an effect of that HTTP request. This is because the server handles the request in a separate transaction, one not controlled by the test.

Set Up or Tear Down Outside of a Transaction

You can annotate a public void method with @BeforeTransaction or @AfterTransaction. This indicates to Spring that this method should be run before or after a test method is run within a transaction.
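
A sketch, reusing the Mapper from before (countUsers is a hypothetical method, used only for illustration):

import org.springframework.test.context.transaction.AfterTransaction
import org.springframework.test.context.transaction.BeforeTransaction

@BeforeTransaction
fun verifyInitialDatabaseState() {
    // runs before the test-managed transaction is started
    assertEquals(0, mapper.countUsers())
}

@AfterTransaction
fun verifyFinalDatabaseState() {
    // runs after the test-managed transaction has been rolled back
    assertEquals(0, mapper.countUsers())
}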

Versions

I wrote this code with Spring Boot version 2.2.5.

Prevent SQL Injection in Bash Scripts

Problem

When working on making a small application with a Zenity GUI, I ask the user for input. I need to somehow sanitize this input, or I will leave the gates wide open for two problems:

  1. SQL injection.
  2. Command injection.

In this blog post, I’m going to look at a solution for the first problem.

Disclaimer

Of course, you shouldn’t write SQL queries in a programming language that doesn’t provide a way to supply parameterized queries. Also, when a user is executing a shell script themselves on their local machine, SQL and command injection become academic, because the user already has the right to modify the database or to execute random Bash commands. It would only be problematic if the script were executed by an HTTP server that can be reached from the internet at large. But in that case, you wouldn’t be writing a Bash script; you would use a proper programming language, right?

Solving SQL Injection The Normal Way: Parameterized Queries

The solution for SQL injection, is, of course, parameterized queries. You send a query to the database and leave placeholders (AKA “bind variables”) open, that take the form of a question mark or a named parameter:

SELECT  *
FROM    user
WHERE   name = ?
AND     password = ?;

SELECT  *
FROM    user
WHERE   name = :name
AND     password = :password;

The parameters are sent separately to the RDBMS, and are bound to a certain type (varchar, integer, etc) and thus cannot cause unintended side effects anymore.

However, Bash Doesn’t Let You Do This

In Bash, however, it’s hard to use this if the provided executable doesn’t accept parameters. Postgres’ psql takes a -v option, and allows you to write:

$ psql -U user -h host -v name=qwerty -v password=12345 \
    <<< "SELECT * FROM users WHERE name = :'name' AND password = :'password';"

MariaDB/MySQL, however, provides nothing of the kind.

IFS='|' read -r description amount < <(zenity --forms \
        --add-entry description \
        --add-entry amount \
        --text 'Please specify a description and amount')
# assume env variable MYSQL_PWD is set
mysql -u user -h host \
    -e "INSERT INTO advance VALUES ('$description', $amount);"

The code fragment above leaves us completely exposed to SQL injection.

Solving SQL Injection In Bash

We could base64 encode the user data and use a function on the RDBMS to decode from base64.

IFS='|' read -r us_description us_amount < <(zenity --forms \
        --add-entry Description \
        --add-entry Amount \
        --text 'Please specify a description and amount')
description_base64="$(echo -n "$us_description" | base64)"
# parse_float validates the amount; a possible definition is sketched below
amount=$(parse_float "$us_amount") || { >&2 echo 'Invalid float'; exit 1; }
# assume MYSQL_PWD is set
mysql -u user -h host \
    -e "INSERT INTO advance VALUES (FROM_BASE64('$description_base64'), $amount));"

Why Does This Work?

You might think to yourself: why does this actually work? Doesn’t the function get expanded and you end up with the same problem?

Let’s figure out how a SQL statement gets processed by the RDBMS.

When the RDBMS receives the SQL statement, it is handled by a SQL interpreter. The interpreter consists of two parts: a parser and an optimizer. First, the parser checks for syntax errors. If the syntax is correct, it will check whether the statement is semantically correct: for example, do all referenced tables and columns exist? The parser separates the pieces of a SQL statement into a data structure that other routines can process. (As we’ll see, this is important: after parsing, SQL injection is no longer possible.)

After the parser is done with the statement, the statement is handed to the optimizer. The optimizer tries to determine how to get the best performance. The result is an execution plan, which is fed to a code generator and finally run by the database processor.

Schematically:

(Figure: SQL statement processing.)

(Read here how Oracle processes statements).

A function is evaluated only during the execution phase. The parser has no problem with a valid, existing function that promises to return a value of the type expected in that position. The optimizer doesn’t evaluate the function either; its task is only to determine at what moment during execution the function is best evaluated.

We can prove all this to ourselves by calling a function that will throw an error:

MariaDB [foo]> \W
Show warnings enabled.
MariaDB [foo]> SELECT FROM_BASE64('!');
+------------------+
| FROM_BASE64('!') |
+------------------+
| NULL             |
+------------------+
1 row in set, 1 warning (0.004 sec)

Warning (Code 1958): Bad base64 data as position 0
MariaDB [foo]> -- no warning will show because function is not evaluated
MariaDB [foo]> SELECT FROM_BASE64('!') WHERE 1 = 0;
Empty set (0.003 sec)

Calling FROM_BASE64 with the invalid base64 string ! results in a runtime warning. When we add a predicate to the WHERE clause that can never be true (1 = 0), the query still parses and runs, but the warning disappears: MariaDB/MySQL only evaluates the function at runtime, and only when it has to. At runtime, the return value of FROM_BASE64 can no longer be used to alter the SQL statement, because the parsing stage is long past. At this point in time, the return value is harmless.

Software I used

$ uname -a
Linux cirrus7 5.5.3-arch1-1 #1 SMP PREEMPT Tue, 11 Feb 2020 15:35:41 +0000 x86_64 GNU/Linux
$ mysql --version
mysql  Ver 15.1 Distrib 10.4.12-MariaDB, for Linux (x86_64) using readline 5.1
$ zenity --version
3.32.0

Skipping or Selectively Running Tests with Maven

Skip Unit or Integration Tests

Skip the execution of unit tests:

$ mvn package -DskipTests

This will still compile the unit tests, but not run them. To skip both compilation and execution, use -Dmaven.test.skip. Note that this setting is also honored by the Failsafe plugin.
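
For example, to build the project without compiling or running any tests at all:

$ mvn package -Dmaven.test.skip=true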

Skip executing integration tests:

$ mvn verify -DskipITs

Execute a Single Unit or Integration Test Class

Execute a single unit test class with Surefire:

$ mvn test -Dtest=MyUnitTest

In a multi-module project, we need to add -DfailIfNoTests=false, otherwise the build will fail in every module that does not contain a test class with that name.

Also note that we do not execute integration tests because we instruct Maven to stop after the test phase. To run a single unit test class and skip all integration tests, add -DskipITs.
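
Putting that together, running a single unit test class in a multi-module project looks like:

$ mvn test -Dtest=MyUnitTest -DfailIfNoTests=false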

Execute a single integration test class with Failsafe:

$ mvn verify -Dit.test=MyIntegrationTest

As you can guess by now, this will run the single integration test by the name of MyIntegrationTest, but it will also run all unit tests. To prevent this, you can add -DskipTests. Again, don’t forget to set -DfailIfNoTests=false if you run Maven on a multi-module project.
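
So a complete invocation that runs exactly one integration test class in a multi-module project could be:

$ mvn verify -Dit.test=MyIntegrationTest -DskipTests -DfailIfNoTests=false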

Execute a Single Unit or Integration Test Within a Test Class

Both the Surefire and Failsafe Maven plugins allow you to go crazy in their syntax for filtering which individual test classes or methods should be run. Specifying the parameters -Dtest and -Dit.test will override all includes and excludes defined in your pom.xml; Maven will create an include that looks like **/${test}.java.

We have seen how we can run single test classes by specifying the class name after -Dtest or -Dit.test. The next simplest format (available since version 2.7.3) takes the form of ClassName#methodName, so if you have a test class named OrderTest.java and a unit test method testEmptyOrderShouldThrow, you can run this test method by itself, excluding all other tests, by typing:

$ mvn test -Dtest=OrderTest#testEmptyOrderShouldThrow

More complex syntax, with both glob and regex patterns, is possible since version 2.19. The documentation gives this example:

$ mvn test '-Dtest=???Test, !Unstable*, pkg/**/Ci*leTest.java,
> *Test#test*One+testTwo?????, #fast*+slowTest'

Let’s break this down. As you can see, we specify multiple patterns in a single parameter (to prevent shell expansion, they are put inside single quotes):

  • ???Test will run any test that has a name that starts with any three characters and ends with Test (? is a glob character meaning any single character).
  • !Unstable* will not run any test that starts with Unstable (the initial ! negates the pattern).
  • pkg/**/Ci*leTest.java will run any test that has pkg somewhere in its package name, followed by any number of other child package names, while the class name should start with Ci followed by zero or more wildcard characters and ending with leTest.java (pkg/CileTest.java matches, as does foo/bar/pkg/baz/CiqwertyleTest.java).
  • *Test#test*One+testTwo????? will run any test whose class name ends in Test and whose method name matches test + any amount of characters + One or testTwo + any five characters. So this will run two test methods in every test class called *Test.
  • #fast*+slowTest will run the methods matching fast* and the method slowTest wherever they are found.

We can even have more fun by throwing in regular expressions:

$ mvn test '-Dtest=Basic*, !%regex[.*.Unstable.*],
> !%regex[.*.MyTest.class#one.*|two.*], %regex[#fast.*|slow.*]'

  • !%regex[.*.Unstable.*] will not run any tests that contain Unstable somewhere in their path.
  • !%regex[.*.MyTest.class#one.*|two.*] will not run methods whose names start with one or two in the class MyTest. We see that when using globs, we can use forward slashes to delimit packages. However, when using regex, we use dots.
  • %regex[#fast.*|slow.*] will run the test methods fast and slow wherever they are found. Note that (at least in version 3.0.0-M4), the trailing .* wildcard pattern is not needed.

Further Reading

You can read more about Surefire and Failsafe from the comfort of your terminal:

$ mvn surefire:help -Ddetail
$ mvn failsafe:help -Ddetail

You can also read the Surefire and Failsafe online documentation.

Grokking Time Zones in Oracle’s DATE and TIMESTAMP

Intro

When I was tasked with putting some data in BigQuery, Google Cloud’s data warehouse implementation, I had to look at the data we currently store in our local Oracle database. I found that we were using DATE for fields that store “moments in time”. Normally, I would expect Unix timestamps here, or some other data type that contains the time zone.

Naturally, I complained about this, but people were quick to point out that it doesn’t matter, because you should relate the DATE to the time zone of the database. Oracle, it turns out, is tracking two time zones: the time zone of the database itself:

SELECT dbtimezone
  FROM dual;
-- Europe/Amsterdam

and the time zone of the client session:

SELECT sessiontimezone
  FROM dual;
-- Europe/Moscow

(See here how to update these time zones.)

I was warned that Oracle might change the query results based on either of these time zones, meaning when you insert a data type related to dates or time, you might get back an altered representation based on the time zone of the database or the session of the client.

This did not put my mind at ease at all. So, let’s take a look at what some of the common types relating to date and time are in Oracle, and when we should be careful with the query results.

DATEs

A DATE is a calendar date with a wallclock time without a time zone. If you put 2020-02-20 09:11:43 in the database, you will always retrieve the same exact datetime, no matter whether you’re in Amsterdam or New York. (It is the equivalent of a java.time.LocalDateTime, if that’s your thing.)

SELECT sessiontimezone
  FROM dual;
-- Europe/Amsterdam

SELECT def_time
  FROM users
 WHERE username = 'foo@example.com'
-- 2020-02-20 09:04:50

ALTER SESSION SET time_zone = 'UTC';

SELECT def_time
  FROM users
 WHERE username = 'foo@example.com'
-- 2020-02-20 09:04:50

Plain TIMESTAMPs

A TIMESTAMP (without a time zone) is the same beast as a DATE, except that you can specify times down to fractions of a second (a DATE only stores whole seconds).
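
A quick way to see the difference, using a hypothetical table that stores the same moment in both types:

CREATE TABLE ts_demo (
  happened_date DATE,          -- whole seconds only
  happened_ts   TIMESTAMP(6)   -- up to microsecond precision
);

INSERT INTO ts_demo VALUES (systimestamp, systimestamp);

SELECT to_char(happened_date, 'yyyy-MM-dd HH24:MI:SS')   AS d,
       to_char(happened_ts, 'yyyy-MM-dd HH24:MI:SS.FF6') AS ts
  FROM ts_demo;
-- the DATE column has lost the fractional seconds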

TIMESTAMP WITH LOCAL TIME ZONE

Say, you store the value 2020-02-20 09:11:43 +01:00. A client with a sessiontimezone of Europe/Amsterdam in the winter will retrieve this exact value whether or not they retrieve it from an Amsterdam, Sydney or New York database. Another client with a session time zone that is set to America/New_York, however, will get the value 2020-02-20 03:11:43 -05:00.

So: LOCAL means it adjusts the retrieved value to your session time zone.

ALTER SESSION SET time_zone = 'Europe/Moscow';  -- offset +03:00

SELECT to_char("ttz", 'yyyy-MM-dd HH24:MI:SS')
FROM (-- timestamp with local, New York time zone
      SELECT CAST(
              to_timestamp_tz('2020-02-20 09:11:43 -05:00',
                              'yyyy-MM-dd HH24:MI:SS TZH:TZM')
              AS TIMESTAMP WITH LOCAL TIME ZONE) AS "ttz"
      FROM dual);
-- 2020-02-20 17:11:43

See also this tutorial.

TIMESTAMP WITH TIME ZONE

This data type stores the time zone together with the datetime and displays it independently from the session time zone of the client.

If we store a value 2020-02-20 09:11:43 +01:00 in Amsterdam and copy this to a database in New York, this is the value that is returned to clients in both Amsterdam and New York regardless of their session time zone.
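
Analogous to the earlier snippet, you can verify that the stored offset is returned untouched, regardless of the session time zone:

ALTER SESSION SET time_zone = 'Europe/Moscow';  -- offset +03:00

SELECT to_char(
        to_timestamp_tz('2020-02-20 09:11:43 -05:00',
                        'yyyy-MM-dd HH24:MI:SS TZH:TZM'),
        'yyyy-MM-dd HH24:MI:SS TZH:TZM')
FROM dual;
-- 2020-02-20 09:11:43 -05:00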

Conclusion

DATEs and plain TIMESTAMPs don’t store a time zone and queries will always return the exact same value that was inserted regardless of dbtimezone or sessiontimezone. This makes them safe choices if you somehow don’t care about time zones.

TIMESTAMP WITH LOCAL TIME ZONE and TIMESTAMP WITH TIME ZONE are time zone aware. A TIMESTAMP WITH LOCAL TIME ZONE will be represented differently depending on your session time zone. A TIMESTAMP WITH TIME ZONE will not be represented differently; it will just include the time zone offset.

You might now want to always use TIMESTAMP WITH TIME ZONE, but apparently, it has problems of its own.

Falling back to DATEs, however, is problematic because of Daylight Saving Time (DST) in the fall. When you store the current datetime in the fall, and you go through the dance of setting all clocks back from 3:00 to 2:00, you have no way of knowing at what exact point in time the row was inserted. This is bad if you rely on this for anything crucial, such as forensics (“so the hacker copied the entire database at 2:30, but was that before or after we reset our clocks?”).

Using Oracle DATEs in Java

As for me, having to deal with DATEs without an explicit time zone means that the application inserting and retrieving these datetimes has to decide how to interpret them. In my Java application, I map them to java.time.LocalDateTime, add a time zone and convert them to java.time.Instants:

Instant defTimeInstant = user.getDefTime()
        .atZone(ZoneId.of("Europe/Moscow"))
        .toInstant();

user.getDefTime() returns a LocalDateTime indicating when the user was created. We can give a LocalDateTime a time zone by invoking atZone(ZoneId) on it, returning a ZonedDateTime. Once we have converted the datetime to an actual point on the timeline (no longer some kind of abstract datetime without a time zone), we can convert it to an Instant by invoking toInstant() on the ZonedDateTime.

Sources

Oracle documentation on datetime data types.

Oracle SQL*Plus Cheatsheet

Intro

Last week, I suddenly had to work with an Oracle database again. I normally use Intellij’s DataGrip to connect to databases. I tried it this time, and I found I could not connect to the schema I wanted: the schema just turned up empty. Of course, everybody will recommend you use Oracle’s SQL Developer with any Oracle database you have to touch. So, after trying brew search sqldeveloper (yes, I’m on a Mac at work), coming up empty, reading this cask request and feeling the anticipation of endless frustration grow inside me, I went to Oracle’s web site to see if I could download the program. I can, except that they want me to turn in a DNA sample first:

(Figure: Oracle’s create-new-account page.)

Of course, faking those kinds of details is not impossible, but the hassle of going through something like that just for a lousy program, so I can run a couple of lousy queries, puts me off.

Luckily, I had a Docker container with Oracle XE lying around. It includes SQL*Plus, the “venerable” command-line tool provided by Oracle to query and administer Oracle databases. As an Arch Linux user with a predilection for anything with a command-line interface, I didn’t find that too bad, but getting the output formatted so that I could actually make sense of it required more effort than most users are willing to put up with.

So how do you find your way around when you have SQL*Plus and no clue how it works? How do you find the schemas you want, the tables you need to query? How can you check your privileges?

This post aims to be a cheatsheet so you can find your way around. Keep in mind that the results you’re getting depend on the privileges your user has. Getting an empty result does not mean that the database object does not exist. Rather, it means you’re a lesser god.

Getting Help With sqlplus

sqlplus does not have a man page, but provides help when you pass -h/-help:

$ sqlplus -h

Connecting to an Oracle Database Using SQL*Plus

The basic syntax to connect as user alice with password qwerty to a database FOO which is located on db.domain.tld and listens on port 1521 (default port) is:

$ sqlplus alice/qwerty@db.domain.tld:1521/FOO

Show the Connected User

SHOW USER;

The SHOW command lets you look at the current state of your SQL*Plus environment: SHOW ALL shows all settings.

Show All Schemas

SELECT username
  FROM dba_users u;

Return only non-empty schemas (excluding most users who never created any object in the database):

SELECT username
  FROM dba_users u
 WHERE EXISTS (-- filter users without database objects
               SELECT 1
               FROM dba_objects o
               WHERE o.owner = u.username);

Excluding Oracle’s built-in schemas:

SELECT username
  FROM dba_users
 WHERE default_tablespace NOT IN ('SYSTEM', 'SYSAUX');

Source

Show All Tables/Views in a Particular Schema

SELECT table_name
  FROM all_tables
 WHERE owner = 'MYSCHEMA';

Related: find all views:

SELECT view_name, text
  FROM all_views
 WHERE owner = 'MYSCHEMA';

Describe a Table

DESCRIBE table;

Show DDL of a Table

SELECT dbms_metadata.get_ddl('VIEW', 'USERS', 'MYSCHEMA')
  FROM dual;

where the first argument is the type of object (e.g. 'TABLE', 'VIEW', 'TRIGGER'), the second is the name of the object, and the third the schema where the object is defined.

Show the Privileges of Your User

SELECT * FROM USER_SYS_PRIVS;
SELECT * FROM USER_TAB_PRIVS;
SELECT * FROM USER_ROLE_PRIVS;

Source

Get More Decent Output From a Command

If you want SQL*Plus to truncate the value of a column:

SET WRAP OFF;

otherwise it will allow the column to wrap to the next line (default ON).

Suppress all headings, page breaks, titles, the initial blank line and other formatting information:

SET PAGESIZE 0;

Source

Now, if only Oracle would support \G for vertical output…

Get and Alter Database Timezones

SELECT dbtimezone FROM dual;
ALTER DATABASE SET time_zone = 'Europe/Amsterdam';
SELECT sessiontimezone FROM dual;
ALTER SESSION SET time_zone = 'Europe/Amsterdam';

Select DATE Rows in the Last Hour

Table t has a column c_date of type DATE:

SELECT *
  FROM    t
 WHERE c_date > sysdate - 1/24;

This works because you can subtract a fraction from a DATE type where the fraction is interpreted as a fraction of a day (1 is an entire day, 0.5 is 12 hours, etc.).
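
If the fraction arithmetic is hard to read, an INTERVAL literal expresses the same thing:

SELECT *
  FROM    t
 WHERE   c_date > sysdate - INTERVAL '1' HOUR;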

Working Around Oracle Not Having a LIMIT

Yes, Oracle does not have a LIMIT keyword, so this idiom will quickly become ingrained in your muscle memory:

SELECT *
FROM (-- actual query
      SELECT firstName, lastName
      FROM users)
WHERE ROWNUM <= 10;

For each row returned by a query, the ROWNUM pseudocolumn returns a number indicating the order in which Oracle selects the row from a table or set of joined rows. The first row selected has a ROWNUM of 1, the second has 2, and so on.

Oracle documentation

You could do without the subquery, plainly writing:

SELECT firstName, lastName
FROM users
WHERE ROWNUM <= 10;

but if you add an ORDER BY clause, Oracle will first select the first ten rows, and only then apply the ORDER BY clause, which might not be what you want. That is why it’s best to always use the first idiom above, with the ORDER BY inside the subquery.
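
So with an ORDER BY, sort inside the subquery and only then apply the ROWNUM filter:

SELECT *
FROM (-- sort first, then limit
      SELECT firstName, lastName
      FROM users
      ORDER BY lastName)
WHERE ROWNUM <= 10;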

MariaDB/MySQL Password Exposure in Bash

How to Provide a Password to mysql?

Recently, I was writing a small Bash program with a GUI (using zenity) that would prompt the user for their MariaDB credentials and then retrieve some information. Passing username and password to the mysql program looks like:

$ mysql -u user -h host -ppassword mydatabase -e 'select 1;'

(Note that there cannot be a space between -p and the beginning of the password.)

Now, this obviously works fine, except for two problems:

  1. We cannot reuse the connection to the database, so we have to make a new connection every time we want to execute a query.
  2. The password of user user is given in the command itself. This means that everyone on the same machine can list the processes and see the command-line arguments passed to mysql.

In this post, I want to talk about the second problem.

$ pgrep -af mysql
8046 mysql -u user -h 192.168.123.5 -px xxxxxxxxxxxxxxxxxxxxxxx mydatabase
$ ls -l /proc/8046/cmdline
-r--r--r-- 1 neftas neftas 0 16 feb 10:33 /proc/8046/cmdline
$ cat -et /proc/8046/cmdline
mysql^@-u^@stefan^@-h^@192.168.123.5^@-px^@xxxxxxxxxxxxxxxxxxxxxxx^@mydatabase^@

As we can see, mysql replaces the provided password with x characters, but at the same time man mysql warns that

Specifying a password on the command line should be considered insecure. You can use an option file to avoid giving the password on the command line.

The option file man mysql is talking about is either one of the configuration files located at $HOME/.my.cnf or /etc/my.cnf, or a custom one specified on the command line. So, one option would be to put the credentials in $HOME/.my.cnf before executing the program, stopping this issue cold in its tracks.
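
Such an option file is plain INI-style text; a minimal example could look like this (make sure it is readable only by you, e.g. chmod 600):

[client]
host=192.168.123.5
user=user
password=s3cret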

But let’s say that we want to connect to a database dynamically. Asking the user to provide credentials and then writing them to a file would pose the same problem as passing them directly to the mysql command. The only difference would be that we now pass the sensitive information to a different command:

printf '[client]\nhost=%s\nusername=%s\npassword=%s\n' \
    host username password > creds.tmp
mysql --defaults-extra-file=creds.tmp mydatabase

The Environment Solution

Although not mentioned in my version of man mysql (10.4 from 29 March 2019), the safest way, it turns out, is to set an environment variable named MYSQL_PWD. Why is this safer?

$ ls -l /proc/8046/environ
-r-------- 1 neftas neftas 0 16 feb 11:00 /proc/8046/environ

At least on Linux, environment variables are only readable by the user that started the process. This means other users (beware of root) cannot look at them.

To summarize:

declare -x MYSQL_PWD
IFS='|' read -r username MYSQL_PWD < <(zenity --password --username)
mysql -u "$username" -h host mydatabase

declare -x declares a variable that is going to be exported on assignment. This eliminates the need to export MYSQL_PWD explicitly somewhere down the road: when MYSQL_PWD gets a value assigned, it will export that value.

In the second line, I use process substitution (see Process Substitution in man bash) to provide a pipe-separated string to read that will split the string and assign the values to username and MYSQL_PWD.

Now that MYSQL_PWD is set, we no longer have to worry about providing the password directly to the mysql command.

Versions Used In This Post

This post was written using the following versions:

$ uname -a
Linux cirrus7 5.5.3-arch1-1 #1 SMP PREEMPT Tue, 11 Feb 2020 15:35:41 +0000 x86_64 GNU/Linux
$ mysql --version
mysql  Ver 15.1 Distrib 10.4.12-MariaDB, for Linux (x86_64) using readline 5.1
$ zenity --version
3.32.0

What’s the Difference Between a Login and a Nonlogin Shell?

This is addressed nicely in the book Unix Power Tools (O’Reilly):

When you first log in to a Unix system from a terminal, the system normally starts a login shell. A login shell is typically the top-level shell in the “tree” of processes that starts with the init process. Many characteristics of processes are passed from parent to child process down this “tree” — especially environment variables, such as the search path. The changes you make in a login shell will affect all the other processes that the top-level shell starts — including any subshells.

So, a login shell is where you do general setup that’s done only the first time you log in — initialize your terminal, set environment variables, and so on. […]

So you could think about a login shell as a shell that is started at startup by the init process (or systemd nowadays). Or as a shell that logs you into the system by your providing a username and a password. A nonlogin shell, by contrast, is a shell that is invoked without logging anybody in.

Is My Current Shell a Login Shell?

There are two ways to check whether your current shell is a login shell. First, you can check the output of echo $0: if it starts with a dash (like -bash), it’s a login shell. Be aware, however, that you can start a login shell with bash --login, in which case echo $0 will output just bash without the leading dash, so this is not a surefire way of finding out whether you are running a login shell.
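
On a typical system this looks something like:

$ echo $0
-bash    # leading dash: login shell
$ bash   # start a nonlogin subshell
$ echo $0
bash     # no leading dash: nonlogin shell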

Secondly, the Unix StackOverflow offers this way of finding out:

$ shopt -q login_shell && echo login || echo nonlogin

(-q suppresses the output of the shopt command.)

The Difference Between Login and Nonlogin That Actually Matters

Practically speaking, the difference between a login shell and a nonlogin shell is in the configuration files that Bash reads when it starts up. In particular, according to man bash:

[…] it first reads and executes commands from the file /etc/profile, if that file exists. After reading that file, it looks for ~/.bash_profile, ~/.bash_login, and ~/.profile, in that order, and reads and executes commands from the first one that exists and is readable.

You can observe this behavior by putting echo commands in /etc/profile, ~/.bash_profile, ~/.bash_login and ~/.profile. Upon invoking bash --login you should see:

echo from /etc/profile
echo from ~/.bash_profile
$

If the shell is a nonlogin shell, Bash reads and executes commands from ~/.bashrc. Since we are starting a nonlogin shell from within a login shell, it will inherit the environment. Sometimes this leads to confusion when we inadvertently get a login shell and find out that our configuration from ~/.bashrc is not loaded. This is why many people put something like the following in their .bash_profile:

[[ -r ~/.bashrc ]] && source ~/.bashrc

This tests whether .bashrc is readable and, if it is, sources it.

Why You Sometimes Want a Login Shell

When you switch users using su, you take the environment of the calling user with you. To prevent this, you should use su -, which is short for su --login. This acts like a clean login for the new user, so the environment will not be cluttered with values from the calling user. Just as before, a login shell will read /etc/profile and the .bash_profile of the user you are switching to, but not their .bashrc. This post on StackOverflow shows why you might want to prefer to start with a clean environment (spoiler: your $PATH might be “poisoned”).
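
Side by side (alice is a hypothetical user):

$ su alice     # nonlogin shell: inherits the calling user's environment
$ su - alice   # login shell: starts clean, reads /etc/profile and alice's ~/.bash_profile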

Conclusion

In this article we saw that the main difference between a login and a nonlogin shell is the set of configuration files that are read on startup. We then looked at the benefits of a login shell over a nonlogin shell.

Encrypt Device With Veracrypt From the Command Line

You have a drive that you want to encrypt and use in Linux and other OSes. Then VeraCrypt, the successor of TrueCrypt, is a good choice. The prerequisite for this tutorial is that you have already created a partition on the drive. See my previous blog post on how to accomplish that. Creating a volume on a partition with data on it will permanently destroy that data, so make sure you are encrypting the correct partition (fdisk -l is your friend).

Encrypt a volume interactively from the command line using VeraCrypt…

(The # sign at the beginning of the code examples indicates that the command should be executed as root. You can either use su - or sudo to accomplish this.)

# veracrypt -t --quick -c /dev/sdXX

-t is short for --text (meaning you don’t want the GUI) and should always be used first after the command name. The --quick option is explained in the docs:

If unchecked, each sector of the new volume will be formatted. This means that the new volume will be entirely filled with random data. Quick format is much faster but may be less secure because until the whole volume has been filled with files, it may be possible to tell how much data it contains (if the space was not filled with random data beforehand). If you are not sure whether to enable or disable Quick Format, we recommend that you leave this option unchecked. Note that Quick Format can only be enabled when encrypting partitions/devices.

So, using --quick is less secure, but not specifying it could take (a lot) longer, especially on traditional hard drives (we’re talking hours for 500GB).

Finally, the -c or --create command allows us to specify on which partition we want to create a veracrypt volume. Make sure you change the /dev/sdXX from the example above to the appropriate output of fdisk -l (for example, /dev/sdc1).

This command will interactively guide us to create a new volume:

Volume type:
 1) Normal
 2) Hidden
Select [1]: 1

Encryption Algorithm:
 1) AES
 2) Serpent
 3) Twofish
 4) Camellia
 5) Kuznyechik
 6) AES(Twofish)
 7) AES(Twofish(Serpent))
 8) Camellia(Kuznyechik)
 9) Camellia(Serpent)
 10) Kuznyechik(AES)
 11) Kuznyechik(Serpent(Camellia))
 12) Kuznyechik(Twofish)
 13) Serpent(AES)
 14) Serpent(Twofish(AES))
 15) Twofish(Serpent)
Select [1]: 1

Hash algorithm:
 1) SHA-512
 2) Whirlpool
 3) SHA-256
 4) Streebog
Select [1]: 1

Filesystem:
 1) None
 2) FAT
 3) Linux Ext2
 4) Linux Ext3
 5) Linux Ext4
 6) NTFS
 7) exFAT
Select [2]: 6

Enter password:
WARNING: Short passwords are easy to crack using brute force techniques!

We recommend choosing a password consisting of 20 or more characters. Are you sure you want to use a short password? (y=Yes/n=No) [No]: y

Re-enter password:

Enter PIM:

Enter keyfile path [none]:

Please type at least 320 randomly chosen characters and then press Enter:
Characters remaining: 4

Done: 100.000%  Speed: 61.8 GB/s  Left: 0 s

The VeraCrypt volume has been successfully created.

The volume is now created in the partition and is ready to be mounted.

… Or do it all in a one-liner

# veracrypt --text --quick                      \
        --non-interactive                       \
        --create /dev/sdXX                      \
        --volume-type=normal                    \
        --encryption=AES                        \
        --hash=SHA-512                          \
        --filesystem=NTFS                       \
        --password='Un$@f3'

Use --stdin to read the password from standard input instead of supplying it directly on the command line, which is considered insecure.
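
For example, a sketch that prompts for the password without echoing it and then feeds it to veracrypt (assuming the same options as the one-liner above):

# read -rs -p 'Volume password: ' password; echo
# printf '%s\n' "$password" | veracrypt --text --quick --non-interactive \
        --create /dev/sdXX --volume-type=normal --encryption=AES \
        --hash=SHA-512 --filesystem=NTFS --stdin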

Mounting the volume

# mkdir /tmp/vera
# veracrypt -t /dev/sdXX /tmp/vera

Unmounting the volume

# veracrypt -d /tmp/vera

More info

$ veracrypt -t -h

-h is short for --help and should be self-explanatory.

Make less Options Permanent (or: the Missing .lessrc)

The missing $HOME/.lessrc

I often wondered how I could make certain options for less permanent, like -I, for example, which makes searches case insensitive. In GNU/Linux, preferences are often stored in rc files. For Vim we have .vimrc, for Bash .bashrc, etc.:

$ find "$HOME" -maxdepth 1 -name '*rc'
./.vimrc
./.idlerc
./.xinitrc
./.lynxrc
./.old_netrc
./.inputrc
./.bashrc
./.rtorrent.rc
./.sqliterc
./.xdvirc

Environment variable LESS

So, it would make sense to expect a .lessrc. But there is none. Instead, we define an environment variable called LESS. From my .bashrc:

export LESS="-IFRSX"

Breakdown:

  • -I: ignore case when searching
  • -F: quit immediately when the entire file fits in one screen (in effect, mimic cat’s behavior)
  • -R: enable colored output (for example, when piping to less from diff --color=always)
  • -S: truncate long lines instead of wrapping them to the next line
  • -X: don’t clear screen on exit

See man 1 less for all options.
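
With that variable exported, every invocation of less behaves as if you had typed the options yourself:

$ less /var/log/pacman.log                     # / searches are now case insensitive
$ diff --color=always old.txt new.txt | less   # colors survive thanks to -R

If you need to undo one of these options for a single invocation, less lets you prefix it with -+ on the command line (for example, less -+I file to get a case-sensitive search again).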

Make a Backup With rsync

We want to make a backup of data, for example to an external hard drive.

The basic command

Assuming we are in someone’s home directory and we want to copy three source directories Music, Documents and Movies to a destination directory /mnt/external-hdd:

$ rsync -a Music Documents Movies /mnt/external-hdd

A word on slashes

Notice that we omit the trailing forward slash / on the source directories. This means the destination will look like:

/mnt/external-hdd
|-- Music
|   |-- a.mp3
|-- Documents
|   |-- b.txt
|-- Movies
|   |-- c.mp4

If we were to add trailing forward slashes, the upper-level source directories would not be copied and the result would look like:

/mnt/external-hdd
|-- a.mp3
|-- b.txt
|-- c.mp4

Read more about slashes in rsync.

The rsync -a command broken down

rsync -a is equal to rsync --archive and is a convenience option. According to the man page, it is equivalent to rsync -rlptgoD.

  • -r or --recursive: recursively copy data
  • -l or --links: copy symlinks as symlinks
  • -p or --perms: preserve permissions
  • -t or --times: preserve modification times
  • -g or --group: preserve the group
  • -o or --owner: preserve owner
  • -D is the same as --devices --specials:
    • --devices: preserve device files
    • --specials: preserve special files

[…] a device file or special file is an interface to a device driver that appears in a file system as if it were an ordinary file

Wikipedia

Device files in Linux are usually found under the /dev directory.

See the overall progress of rsync

By default, rsync will show the progress of the individual files that are being copied. If you want the overall progress, you have to add some flags:

$ rsync -a --info=progress2 --no-i-r src dst

--info=progress2 shows the total transfer progress. (To see all available options for --info, execute rsync --info=help.) --no-i-r is short for --no-inc-recursive and disables incremental recursion, forcing rsync to do a complete scan of all directories before starting the file transfer. This is needed to get an accurate progress report; otherwise rsync doesn’t know how much work is left.

Human-readable output can be obtained by passing the -h or --human-readable option.
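
Combining everything from this post, a typical backup invocation then becomes:

$ rsync -ah --info=progress2 --no-i-r Music Documents Movies /mnt/external-hdd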

For a discussion of these options, see also this StackOverflow post.