Implicit Error Detection
Problem
Bash is a strange beast. It can be an interactive shell, in which case you’d like to be notified of failures but not for them to be catastrophic. And when you write scripts, it’s easy to be torn about what you want. Sometimes you want failures to be ignored, and other times you want to be very sure that a command doesn’t fail.
The standard bash answer to this question is that you should always check for errors if you care about them. But sometimes commands fail that you don’t expect to fail. What happens when you’ve left out a test in that case because that could never happen, right? Or perhaps you have taken extreme care to check for errors in all possible cases. Now your code is long, complex and very difficult to follow.
Solution
With ebash, we have decided to make scripts detect errors implicitly. If an error occurs, ebash will always detect it and your script will immediately exit with an informative stack trace with precise indication of the cause and location of the failure. You can specifically ignore such failures when you know it’s safe, but the default is to make failures as obvious as possible.
This is in opposition to all bash defaults. And, frankly, it’s a bit difficult to force bash into exiting on all failures. But we do our best with a multi-layered approach. As long as you use the typical ebash idioms and follow a small set of conventions, we believe that any time an error occurs in your ebash code, you’ll hear about it.
Multi-layered ebash solution
-
First, we use bash’s error trap functionality. This behaves just like
set -e
, except that we get to choose what runs when an error occurs. Basically, we call thedie
function on any error. This causes a stack trace to be generated which will have[UnhandledError]
in its message. - Second, we ensure that the error trap is inherited by all subshells, because sometimes you need them, and often it’s
difficult to know where there will actually be a subshell.
Sometimes bash really wants to ignore a return code. For instance, if command substitution isn’t part of the first
command on a line, as in both of the following lines of code, bash will ignore any return code of
cmd
.echo "$(cmd)" local var=$(cmd)
ebash enhances bash’s error detection to catch this case. The error trap is inherited by the shell inside the command substitution, and when it calls
die
, it will actually send a signal to its parent shell. When the parent shell receives that signal, it generates a stack trace and blows up, which is exactly what we were looking for. - Third, we establish coding conventions to avoid common pitfalls in bash that subvert error detection. These are in the next section.
Conventions
Avoid short-circuit ||
and &&
Typically, you can count on bash’s built in error detection (i.e. set -e
, which we enable by default) to blow up if a
command you run exits with a non-zero exit code. Sometimes that’s not what you want as some commands are expected to
fail. The most common is grep
which returns a non-zero error code if it has no matches. In that case, the right thing
to do is generally put an || true
after the command, as in:
grep pattern file || true
However, using ||
has an insidious effect when used around a bash function call. Specifically, it disables set -e
error detection for the entire call stack. For that reason, we avoid it whenever possible on bash functions. Notice
that this is perfectly save when calling external binaries.
Consider:
func()
{
command1
command2
}
func
When this code is executed, if command1
fails, then the script will blow up immediately and will not execute command2
.
In ebash, this is what we want and expect. But, given the same func
, the following commands all work very differently:
func && echo "success" || echo "failure"
if func; then ; something ; fi
while func; do ; something ; done
Because func
is part of a command list containing either &&
or ||
, bash turns off its error detection behavior for
all code that is called, not just the code written on this line. So what happens in this case is that func
runs just
as before and calls command1
, which fails. But since bash’s error detection behavior is off, it goes on to call
command2
(which would not be allowed with ebash). Suppose that command2
succeeds and returns 0
. The final return
value from func
would thus be 0
and the failure from command1
is masked!!
As you can see, this can dramatically affect the way code runs, and we’ve come across large issues that were masked by
this behavior. So our standard on these is to only use ||
or &&
on simple statements including only bash builtins
or external binaries or executables.
Use try
/ catch
to handle errors
Since ebash forces you to be explicit about when you’d like to ignore errors in your bash script, it also gives you ways
to do that. First off is the try
/ catch
construct. This looks frighteningly like the C++ construct.
try
{
cmd1
cmd2
}
catch
{
echo "Caught return code $?"
}
It also behaves a lot like you’d expect from familiarity with other languages. Code in the try block is executed until
the end of the block or until an error is encountered. At the point an error is detected, no further code in the block
will execute. So if cmd1
fails in the above block, cmd2
will never be executed.
The catch block is only executed if a failure is detected in the try block. So if either cmd1
or cmd2
fail, the catch
block will be executed. At the beginning of the catch block, the value of $?
will be the return code of the failing
command.
The most difficult caveat of using try
and catch
in ebash is that the code in the try
block is its own subshell.
This means that variable assignments that you make inside this block will not be seen outside. It’s impossible to
set a variable in a parent shell from its subshell. Note that this is only true of try
. Catch executes in the same
shell as surrounding code.
One workaround is to write data to a file inside the block and read it outside. Another alternative is tryrc
.
For more details about try
and catch
see the documentation in try and catch and the many examples in the
tests.
Use tryrc
to handle errors in simple commands
Sometimes you just need to run a single command and then perform different behavior depending on its return code. For
external commands, the bash if
statement or short circuit logical operators (||
and &&
) handle this nicely, but if
you’re calling bash functions it is a bad idea to use them in this way.
tryrc
is the ebash solution to this issue. It runs a function (or external command, basically any single statement bash
can execute) and captures its return code into a variable for later examination. Its simplest usage looks like this:
$(tryrc foo)
if [[ ${rc} -eq 0 ]] ; then
# Normal case
else
# Error case
fi
The tryrc
call will execute the foo
function or command, and save its return code in a new local variable (created by
tryrc
) named rc
. Error detection in code called by foo
is still enabled, but regardless of the return code, the
tryrc
statement will not fail. Then you’re able to handle the return code as you like.
tryrc
is also capable of capturing the output (stdout) and/or error (stderr) of the executed command into variables
that you specify at invocation time. Here’s are a couple more complicated examples:
$(tryrc -r=cmd_rc -o=cmd_out -e=cmd_err cmd)
$(tryrc --rc=cmd_rc --stdout=cmd_out --stderr=cmd_err cmd)
This invocation will execute cmd and create three local variables. cmd_rc will contain the return code of cmd. cmd_out and cmd_err will contain stdout and stderr, respectively. This gives you the functionality of command substitution along with the ability to choose what happens with errors.
This functionality is important, because it’s difficult to use redirection with commands that are executed in tryrc. If you need use redirection in way that can’t be solved by something like the following, it’s best to use try / catch instead of tryrc.
$(tryrc -o=out -e=err cmd)
echo "${out}"
echo "${err}" >&2
Further Details
- tryrc documentation
- Examples in tests