Commands in Pipelines Are Executed in Subshells

Let’s have a look at another one of those gotchas that can bite you while scripting in Bash: processes that are part of a pipeline all run in subshells.

What would be the result of executing the following Bash script?

var=foo
echo bar | read var
echo "$var"

If you said “foo”, you would be right. For me, this was really counter-intuitive. What is going on?

As always, bash(1) has the answer. It explains under “SHELL GRAMMAR” > “Pipelines”:

Each command in a multi-command pipeline, where pipes are created, is executed in a subshell, which is a separate process.

And also, under “COMMAND EXECUTION ENVIRONMENT”:

Changes made to the subshell environment cannot affect the shell’s execution environment.

That means a pipeline can actually be represented as:

<new, separate process> | <another new, separate process>
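The practical consequence is easiest to see by running the same read twice, once at the end of a pipeline and once without any pipeline. The following is a minimal sketch, assuming Bash with its default options (in particular, the lastpipe shell option is not enabled). The names piped_result and herestring_result are illustrative only.

```shell
#!/usr/bin/env bash
# Assumption: default Bash options, i.e. lastpipe is NOT enabled.

var=foo
echo bar | read -r var      # read runs in a subshell; its assignment is lost
piped_result=$var

var=foo
read -r var <<< bar         # no pipeline, so read runs in the current shell
herestring_result=$var

echo "after pipeline:    $piped_result"
echo "after here-string: $herestring_result"
```

Under those assumptions, the first line should still report foo, while the second reports bar, because only the here-string version lets read assign the variable in the shell that later prints it.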

Let’s prove this to ourselves. The obvious tool, $$, will not help here: under “PARAMETERS” > “Special Parameters”, Bash’s man page explains that Bash expands $$:

[…] to the process ID of the shell. In a subshell, it expands to the process ID of the current shell, not the subshell.

$ echo $$; >&2 echo $$ | echo $$
2443
2443
2443

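All three commands print the parent shell’s PID, so $$ tells us nothing about the subshells. (The first echo in the pipeline writes to stderr because its stdout goes into the pipe, which the second echo never reads.) To get the PID of the subshell itself, Bash provides BASHPID (documented under “PARAMETERS” > “Shell Variables”):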

Expands to the process ID of the current bash process. This differs from $$ under certain circumstances, such as subshells that do not require bash to be re-initialized.

$ echo Parent: $BASHPID; >&2 echo Sub 1: $BASHPID | echo Sub 2: $BASHPID
Parent: 2443
Sub 2: 10599
Sub 1: 10598

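Here, the parent shell has PID 2443, and the two subshells have PIDs 10598 (Sub 1) and 10599 (Sub 2). Sub 2’s line happens to appear first because the two commands run concurrently, so the order of their output is not guaranteed.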
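One way to get the lines in a predictable order is to let the right-hand side forward the left-hand side’s output before printing its own. This is only a sketch, assuming Bash 4 or later (where BASHPID is available). show_pids is a hypothetical helper introduced for illustration, not something Bash provides.

```shell
#!/usr/bin/env bash
# Sketch: compare $$ and BASHPID on each side of a pipeline.

# Hypothetical helper: prints "<label> <$$> <BASHPID>" for whichever
# process ends up executing it.
show_pids() {
    printf '%s %s %s\n' "$1" "$$" "$BASHPID"
}

show_pids parent

# The left side writes into the pipe. The right side first copies that
# line through with cat, then reports its own PIDs, so "left" is always
# printed before "right".
show_pids left | { cat; show_pids right; }
```

On each line, the second column is $$ and the third is BASHPID. The second column should be the same on all three lines, while the third should differ between parent, left, and right.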