Dealing with errors

As programmers, we don’t really like to deal with errors. We like to focus on the happy path (the one that provides value) and deal with the errors later because … well, we have to!

However, dealing with failures is crucial if we don’t want our program to stop at the first error it encounters.

Still, we don’t want to mix the “clean” code of the happy path with the “dirty” error-handling code. In fact, this is what exceptions were supposed to bring: a clean happy path in a try block and the error-handling code in a catch block. Everything clean and separated.

In Scala we have a rich type system, which gives us more options to handle errors. But before we start, let’s be clear about what we mean by errors.

Not all errors are created equal

In a typical application that interacts with the external world (e.g. a user or another program or service) we can identify three different types of errors:

  1. Expected errors: We know this kind of error is likely to happen as part of the normal operation of our program, and it doesn’t prevent the program from functioning correctly. It can be invalid input (from a user or another system), a record not found in a database, a service we’re interacting with returning an error code, …
  2. Recoverable errors: These are errors that prevent our program from functioning correctly. They put our program into some sort of degraded mode with limited or no functionality, but as soon as the error resolves, the program recovers too. It can be a connection failure to a remote system (e.g. a database) or an unreachable service …
  3. Fatal errors: These are errors we don’t expect and thus can’t recover from (because we don’t know what to do when they occur). And if we don’t know what to do, it is preferable to kill the application and start a new healthy instance rather than leave the existing one in some unknown state. Think of an out-of-memory error or memory corruption …

Expected errors

Expected errors live in user-land. The requested functionality cannot be fulfilled because of the provided data. It’s like ordering an ice cream without having enough money to pay for it. In this case you’re not going to get your ice cream, and yet the ice-cream van functions perfectly. This kind of error handling can live very close to the happy path.

You certainly want to encode this kind of error in the return type and stay away from exceptions in this case.

// the requested record might not exist
Option[T]
// or, if you want to return the error reason
Either[E, T]
// validation falls into this category as well (Validated comes from Cats)
Validated[E, T]
// an ADT can also be used to model these errors
sealed trait Result[+T]
case object NotFound extends Result[Nothing]
final case class Found[T](value: T) extends Result[T]

Using a monadic type gives you the flatMap operation and the ability to use for-comprehensions, which always helps to write neat code:

val result: Either[Error, Record] =
  for {
    key <- validate(input)
    record <- find(key)
  } yield record

One caveat is that the error types need to align; otherwise you may end up using leftMap (or left.map) just to turn the errors into a common type, which obscures the logic a bit (unless your functions already return a common error type).

ADTs should be used with care: if you need different logic for each member of the ADT you’ll resort to pattern matching, which may not read as nicely as a for-comprehension (if you need to chain actions).
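
To make the alignment problem concrete, here is a minimal sketch in plain Scala. Every name in it (ValidationError, StoreError, AppError, Invalid, Store, Record, lookup and the tiny in-memory store) is hypothetical and only serves the illustration; with Cats you would write leftMap instead of left.map.

```scala
// Hypothetical error types: each step has its own error ADT.
sealed trait ValidationError
case object EmptyKey extends ValidationError

sealed trait StoreError
case object KeyNotFound extends StoreError

// A common application-level error that wraps both of them.
sealed trait AppError
final case class Invalid(cause: ValidationError) extends AppError
final case class Store(cause: StoreError) extends AppError

final case class Record(key: String, value: Int)

def validate(input: String): Either[ValidationError, String] =
  if (input.trim.isEmpty) Left(EmptyKey) else Right(input.trim)

// Stand-in for a real lookup: a one-entry in-memory store.
def find(key: String): Either[StoreError, Record] =
  Map("a" -> Record("a", 1)).get(key).toRight(KeyNotFound)

// left.map lifts each step's error into the common AppError type
// so that both steps fit in the same for-comprehension.
def lookup(input: String): Either[AppError, Record] =
  for {
    key    <- validate(input).left.map[AppError](Invalid(_))
    record <- find(key).left.map[AppError](Store(_))
  } yield record
```

The two left.map calls are exactly the noise mentioned above: they carry no business logic, they only make the types line up.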
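
As a rough illustration of that trade-off, the sketch below chains two lookups that return a custom ADT. The find and sum functions and the in-memory map are made up for the example; the Result type is a type-parameterised variant of the ADT shown earlier.

```scala
sealed trait Result[+T]
case object NotFound extends Result[Nothing]
final case class Found[T](value: T) extends Result[T]

// Hypothetical lookup backed by a tiny in-memory map.
def find(key: String): Result[Int] =
  Map("a" -> 1, "b" -> 2).get(key).fold[Result[Int]](NotFound)(Found(_))

// Without flatMap on Result, chaining two lookups means nested matches.
def sum(k1: String, k2: String): Result[Int] =
  find(k1) match {
    case NotFound => NotFound
    case Found(a) =>
      find(k2) match {
        case NotFound => NotFound
        case Found(b) => Found(a + b)
      }
  }
```

Each extra step adds another level of nesting, which is what a for-comprehension over Option or Either avoids.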

Recoverable errors

These errors prevent the application from working normally, but the functionality can resume once the faulty component recovers. It might be an unreachable host, a connection failure, a timeout, a third-party error, …

These errors often materialise as exceptions, but they shouldn’t be left uncaught. It’s preferable to catch them as early as possible and turn them into a proper error type that is part of the method signature, so that the rest of the application never sees them.

Most of the time we have to deal with exceptions because of some (Java) libraries that we rely on.

But exceptions are difficult to reason about because they’re an unexpected way to exit a function (like some sort of hidden control flow).

Reasoning about code is difficult enough with return types, and adding another mechanism makes things much harder. (This is especially true for unchecked exceptions such as RuntimeException, which don’t even require a throws clause in the method signature; in Scala no exception is checked at all.) Exceptions also propagate up the call stack, potentially much further than the calling code, which makes it difficult to locate all the error-handling code.

There’s a performance impact as well. Creating an exception is typically far more expensive than creating an ordinary object. Why? Because by default it needs to capture the whole call stack at that moment.

So stay away from exceptions as much as possible, and when you have to deal with them, catch them as soon as possible so that they don’t bubble up in your code.
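
A minimal sketch of such a boundary could look like this. The unsafeRead function stands in for a throwing Java library call, and StoreError / StoreUnavailable are hypothetical names; none of them come from a real library.

```scala
import java.io.IOException

sealed trait StoreError
final case class StoreUnavailable(reason: String) extends StoreError

// Stand-in for a Java library call that signals failure by throwing.
def unsafeRead(key: String): String =
  if (key == "down") throw new IOException("connection refused")
  else s"value-of-$key"

// The boundary: the known exception stops here and becomes part of the
// signature. Anything else is left to propagate as an unexpected error.
def read(key: String): Either[StoreError, String] =
  try Right(unsafeRead(key))
  catch {
    case e: IOException => Left(StoreUnavailable(e.getMessage))
  }
```

Only the exception we know how to interpret is caught. Catching Throwable here would also swallow the unexpected errors that should take the application down.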

So far so good, but how do we combine these recoverable errors with the expected errors in our return types? We can just stack them together into something like:

def get(key: K): Either[Error, Option[T]]

Option indicates that it might return a value or nothing (e.g. if the key doesn’t exist).
Either indicates that the call might fail.

Now if you need to combine this call with other actions in a monadic way (e.g. in a for-comprehension) and you’re only interested in the value T, it might quickly become unreadable (you’ll have to resort to monad transformers, tagless final, etc. to keep your code simple). Also keep in mind that quite often those calls are asynchronous, adding another layer on top of them:

def get(key: K): F[Either[Error, Option[T]]]
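
To give a feel for the noise, here is a small sketch that leaves out the asynchronous layer and only combines two nested results. The in-memory store and the sum function are hypothetical.

```scala
sealed trait Error
case object StoreError extends Error

// Hypothetical store: "boom" simulates a failing call.
def get(key: String): Either[Error, Option[Int]] =
  if (key == "boom") Left(StoreError)
  else Right(Map("a" -> 1, "b" -> 2).get(key))

// The outer for-comprehension works on Either, the inner one on Option.
def sum(k1: String, k2: String): Either[Error, Option[Int]] =
  for {
    maybeA <- get(k1)
    maybeB <- get(k2)
  } yield for {
    a <- maybeA
    b <- maybeB
  } yield a + b
```

Two layers already require two nested comprehensions; wrapping everything in an F adds a third.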

A way to simplify the stack might be to move the None case into the error space (if your logic allows it):

sealed trait Error
case object StoreError extends Error
case object KeyNotFound extends Error
def get(key: K): Either[Error, T]
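
As a sketch of that simplification, Option.toRight can fold the None case into the error channel. The getNested function and its in-memory store are hypothetical stand-ins for a real data store.

```scala
sealed trait Error
case object StoreError extends Error
case object KeyNotFound extends Error

// Hypothetical store with the nested signature; "boom" simulates a failure.
def getNested(key: String): Either[Error, Option[Int]] =
  if (key == "boom") Left(StoreError)
  else Right(Map("a" -> 1, "b" -> 2).get(key))

// Flattened variant: a missing key becomes an error value.
def get(key: String): Either[Error, Int] =
  getNested(key).flatMap(_.toRight(KeyNotFound))

// Callers can now chain calls in a single for-comprehension.
def sum(k1: String, k2: String): Either[Error, Int] =
  for {
    a <- get(k1)
    b <- get(k2)
  } yield a + b
```

The price is that callers who treat a missing key as a normal outcome now have to match on KeyNotFound, which is why this only works if your logic allows it.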

Fatal errors

We are now left with unforeseen errors. The ones we don’t know how to react to. In this case it’s preferable to kill the application and restart into a clean state.

In theory these errors materialise as uncaught exceptions that bubble up the call stack until they reach the main method and exit the program.

This is what happens if your code runs on the main thread, which is most likely not the case. Instead, the exception still propagates up the call stack but only reaches the top of the current thread (most probably a thread that belongs to a thread pool). This terminates the current thread but not the whole program.

You’re now left in an inconsistent state with one dead thread. The thread pool might start a new thread (or not). Depending on the error, it might also be difficult to recover from (e.g. OutOfMemoryError).

Note that the JVM has two options (-XX:+ExitOnOutOfMemoryError and -XX:+CrashOnOutOfMemoryError) that make it exit or crash as soon as this error occurs.

What about other uncaught exceptions? Well, Java provides an UncaughtExceptionHandler interface which can be registered on a thread:

Thread.currentThread().setUncaughtExceptionHandler(handler)

Note: Monix has a similar concept with UncaughtExceptionReporter available at the Scheduler level.

Such handlers/reporters are very useful for dealing with unexpected errors. This is the ideal place to log errors and shut down the application.
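
Here is a minimal sketch of such a handler on a single worker thread. The crashWithHandler function is made up for the demonstration, and the handler only records the exception so that the example stays observable; a real one would log and then exit, for instance with sys.exit(1).

```scala
import java.util.concurrent.atomic.AtomicReference

// Starts a worker thread that dies with an exception and returns
// what the handler received.
def crashWithHandler(): Throwable = {
  val seen = new AtomicReference[Throwable]()

  val handler = new Thread.UncaughtExceptionHandler {
    override def uncaughtException(t: Thread, e: Throwable): Unit = {
      // A real handler would log the error and shut the application down.
      seen.set(e)
    }
  }

  val worker = new Thread(new Runnable {
    override def run(): Unit = throw new IllegalStateException("unexpected")
  })
  worker.setUncaughtExceptionHandler(handler)
  worker.start()
  worker.join() // the handler runs on the dying thread before it terminates
  seen.get()
}
```

To cover every thread that has no handler of its own, Thread.setDefaultUncaughtExceptionHandler registers a process-wide fallback.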

Error channels

Now if your program interacts with some external resources you’re most likely using an asynchronous abstraction. It can be a Scala Future, a Monix Task or Cats IO, etc … which means you’re dealing with types like

F[Either[Error, T]]

where F is the asynchronous abstraction.

All these abstractions capture exceptions on their own. For instance a Future has 2 possible states when completed: Success or Failure. Similarly to Try, Success wraps a value while Failure wraps a Throwable.
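
The two completed states can be observed with a short sketch like the one below. FutureOutcomes and outcome are hypothetical names, and blocking with Await is only acceptable here because it is a demonstration.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.Try

object FutureOutcomes {
  // Waits for completion and exposes the Try stored inside the Future.
  def outcome[A](f: Future[A]): Try[A] =
    Await.ready(f, 5.seconds).value.get

  // Completes with Success(42).
  def succeeded: Try[Int] = outcome(Future(21 * 2))

  // The exception thrown in the body is captured as a Failure.
  def failed: Try[Int] =
    outcome(Future[Int](throw new IllegalStateException("boom")))
}
```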

It means that we’re now left with two error channels:

  • The Left part of the Either.
  • The failed state of F.

And we need to be clear about which kinds of errors go into which channel; otherwise our error handling will quickly turn messy and difficult to follow.

Of course these abstractions offer no distinction between the different kinds of errors we’re discussing here (how would they know?). It means that we have to recover from all recoverable errors and promote them into the user space (e.g. on the Left side of the Either) and leave the failed state only for unexpected errors.

getExternalResource(key) // returns a Future[T]
  .map(Right.apply)
  .recover {
    // promotes the error into the user space
    case e: ConnectException => Left(ResourceUnavailable)
    // etc ...
  }

Note that here the ConnectException doesn’t propagate through the rest of the code. Instead it is turned into our own error type (ResourceUnavailable), which hides implementation details from the upper layers.

Now that we’ve made sure that a failed async value contains only unexpected errors, we can safely re-throw them when running our program so that they reach our uncaught exception handler, which will deal with them.
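
A self-contained variant of this promotion, which can be run as is, might look like the sketch below. The getExternalResource client is faked with already-completed futures, and ErrorChannels, fetch and await are hypothetical names.

```scala
import java.net.ConnectException
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ErrorChannels {
  sealed trait Error
  case object ResourceUnavailable extends Error

  // Fake client: "down" simulates an outage, "bug" an unexpected error.
  def getExternalResource(key: String): Future[String] =
    if (key == "down") Future.failed(new ConnectException("refused"))
    else if (key == "bug") Future.failed(new IllegalStateException("unexpected"))
    else Future.successful(s"resource-$key")

  def fetch(key: String): Future[Either[Error, String]] =
    getExternalResource(key)
      .map[Either[Error, String]](Right(_))
      .recover {
        // promotes the recoverable error into the user space
        case _: ConnectException => Left(ResourceUnavailable)
      }

  // Blocking is only for the demo: Await.result re-throws a failed Future.
  def await[A](f: Future[A]): A = Await.result(f, 5.seconds)
}
```

With this, await(fetch("down")) yields a Left, whereas await(fetch("bug")) throws the IllegalStateException, because that error was deliberately left in the failed channel.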

One last thing worth mentioning is that so far we have considered all unexpected errors as fatal. (Remember, the idea is that if we don’t know how to recover from an error, it’s better to treat it as fatal.) However, this is not how Scala sees things. Scala has its own definition of what a fatal error is (see scala.util.control.NonFatal), and this is any of:

  • VirtualMachineError (including StackOverflowError since Scala 2.11; Scala 2.10 treated it as non-fatal)
  • ThreadDeath
  • InterruptedException
  • LinkageError
  • ControlThrowable

Why should you care? Because none of these errors is captured by a Future or a Try. Instead they stay uncaught and bubble up the call stack, so you need to place your uncaught exception handlers accordingly.
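
The difference is easy to observe with Try. In the sketch below, escapes is a hypothetical helper that reports whether a throwable thrown inside a Try makes it out of the Try.

```scala
import scala.util.Try
import scala.util.control.NonFatal

// An ordinary exception is captured and becomes a Failure.
val captured: Try[Int] = Try[Int](throw new IllegalArgumentException("bad input"))

// Returns true when the throwable is NOT captured by Try.
def escapes(t: => Throwable): Boolean =
  try {
    Try[Unit](throw t)
    false
  } catch {
    case _: Throwable => true
  }

val interruptedEscapes = escapes(new InterruptedException("stop"))
val linkageEscapes     = escapes(new LinkageError("missing class"))
val illegalArgEscapes  = escapes(new IllegalArgumentException("bad input"))
```

The NonFatal extractor used by Try is public, so your own catch blocks can apply the same rule with case NonFatal(e) => ….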

Conclusion

People tend to focus mainly on the happy path because this is where the value of a program lies. However, errors shouldn’t be treated as an afterthought, because a proper error-handling strategy is key to keeping the code organised and easy to maintain.

Hopefully this post has given you some ideas on how to structure your application in order to provide efficient error handling and a concise happy path.

And of course don’t hesitate to use the comments to share some other error handling strategies that you find relevant.