Commands in Pipelines Are Executed in Subshells
Let’s have a look at another one of those gotchas that can bite you while scripting in Bash: processes that are part of a pipeline all run in subshells.
What would be the result of executing the following Bash script?
var=foo
echo bar | read var
echo $var
If you said “foo”, you would be right. For me, this was really counter-intuitive. What is going on?
As always, bash(1) has the answer. It explains under “SHELL GRAMMAR” > “Pipelines”:
Each command in a multi-command pipeline, where pipes are created, is executed in a subshell, which is a separate process.
And also, under “COMMAND EXECUTION ENVIRONMENT”:
Changes made to the subshell environment cannot affect the shell’s execution environment.
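To see that the assignment really does happen, just in the wrong process, here is a minimal sketch. It assumes Bash with default options (in particular, the lastpipe shell option is not set). The variable name other is only illustrative.

```shell
#!/usr/bin/env bash
# Sketch assuming Bash with default options (lastpipe not set).

var=foo

# Group read and echo so that both run in the same pipeline subshell.
echo bar | { read -r var; echo "inside the pipeline: $var"; }
echo "after the pipeline: $var"

# For contrast: a here-string is a redirection, not a pipe, so read runs
# in the current shell and the assignment survives.
other=foo
read -r other <<< "bar"
echo "after the here-string: $other"
```

Inside the braces, var is bar; once the pipeline ends, the parent shell still has foo. The here-string case keeps bar only because no pipe is created. In the pipeline case, both sides ran as separate processes.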
That means a pipeline can actually be represented as:
<new, separate process> | <another new, separate process>
Let’s prove this to ourselves. Under “PARAMETERS” > “Special Parameters”, Bash’s man page explains that Bash expands $$:
[…] to the process ID of the shell. In a subshell, it expands to the process ID of the current shell, not the subshell.
In the command below, the first command of the pipeline writes to stderr so that its output reaches the terminal instead of being swallowed by the pipe:
$ echo $$; >&2 echo $$ | echo $$
2443
2443
2443
To get the PID of the subshell, Bash provides BASHPID (documented under “PARAMETERS” > “Shell Variables”):
Expands to the process ID of the current bash process. This differs from $$ under certain circumstances, such as subshells that do not require bash to be re-initialized.
$ echo Parent: $BASHPID; >&2 echo Sub 1: $BASHPID | echo Sub 2: $BASHPID
Parent: 2443
Sub 2: 10599
Sub 1: 10598
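Here, the parent shell has PID 2443, and the two subshells have PIDs 10598 (Sub 1) and 10599 (Sub 2). The “Sub 2” line happens to appear first because both commands run concurrently, so the order of their output is not guaranteed.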
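The same comparison can be made inside a script rather than by reading PIDs off the terminal. This is only a sketch, assuming Bash 4 or later (where BASHPID is available). The variable names parent_pid, dollar_in_stage and bashpid_in_stage are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch assuming Bash 4 or later, where BASHPID is available.

parent_pid=$BASHPID

# Each pipeline runs inside a command substitution so that the value printed
# by its first stage can be captured; cat simply passes that value through.
dollar_in_stage=$(echo "$$" | cat)
bashpid_in_stage=$(echo "$BASHPID" | cat)

if [ "$dollar_in_stage" = "$parent_pid" ]; then
    echo "\$\$ in the pipeline stage still names the parent shell"
fi
if [ "$bashpid_in_stage" != "$parent_pid" ]; then
    echo "BASHPID in the pipeline stage names a different process"
fi
```

Both messages should be printed: $$ keeps pointing at the parent shell, while BASHPID reflects the process that actually ran the pipeline stage.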