Making sense of SBT


SBT is probably the most popular (as in “most used”) build tool for Scala, yet people (including me) have a hard time figuring out why it doesn’t behave as they expect.

Originally known as the “Simple Build Tool”, it was later renamed to the “Scala Build Tool” (maybe to acknowledge the fact that it was not so simple?).

Part of the reason is that it’s not really intuitive. True, there is documentation available, but many developers don’t bother reading it (at least until the effort overcomes the pain of using SBT), and it’s much easier to copy/paste code snippets without completely understanding what they do.

That being said, SBT is not magic, so let’s try to understand how it works so that next time we’re not reduced to copy/pasting cryptic code snippets until something works.

A build DSL

If you look at a simple build.sbt file it may look something like this:

organization := ""
name := "demo"

licenses += "Apache-2.0" -> url("")

libraryDependencies += "org.typelevel" %% "cats-core" % "1.0.0-RC1"

At first glance it looks similar to a Maven POM file, but simpler, as it doesn’t use the more verbose XML syntax.
We can find similar elements:

  • declaration of the project name and organisation
  • declaration of the dependencies

So if it’s not XML, what language is it? Well, this is plain Scala code. In fact, it is a Scala DSL for describing a build and, of course, as it’s Scala code it has to be compiled before you can build your project.

This brings us to the lifecycle of the build:

  • Compile the build project
  • Execute the build project to create the build definition
  • Interpret the build definition to create a task graph for the build
  • Execute the task graph to actually build your project

Let’s get back to each of them in more details:

Compiling the build project

This is where the build.sbt file is compiled. (Remember that although it looks like a declarative language, this is actually plain Scala code and therefore has to be compiled.)

There is a special folder (called project) which can also contain some Scala code. The code in project is part of the build itself and not of your project code (it starts to get confusing, doesn’t it?).

It is typically in this folder that you add the plugins needed for the build. You can also declare the build’s dependencies or any custom tasks there (more on that later).
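For example, plugins are typically declared in a project/plugins.sbt file. (The plugin and version below are purely an illustration.)

```scala
// project/plugins.sbt
// This dependency belongs to the build itself, not to your project code
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.6")
```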

project is itself recursive: you can nest a project folder inside project and add some code (e.g. dependency declarations) to it. This will be compiled before compiling the parent project.

Basically the project is a build inside your build that knows how to build your build. And it’s recursive so project/project is a build that knows how to build the build of your build (as explained here).

A build compilation failure (not a project compilation failure) will prevent SBT from starting, with this kind of error:

[error] (*:update) sbt.ResolveException: unresolved dependency: xxx#yyy;x.y.z: not found
Project loading failed: (r)etry, (q)uit, (l)ast, or (i)gnore?

In this case it indicates a problem with the dependencies of the build itself (SBT failed because it couldn’t compile the build project).

This shouldn’t be mistaken for a problem with your project dependencies. A project dependency issue won’t prevent SBT from starting. Here you should have a look at the dependencies of the build itself (e.g. check the plugin declarations).

The difference between build.sbt and a Scala file is that build.sbt already imports the following:

import sbt._
import Process._
import Keys._

This allows you to start defining your build straight away. When using a Scala file you need to add these imports explicitly if you need them.
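For instance, a common pattern is to extract dependency declarations into a Scala file under project/ (the file and object names below are arbitrary), where you have to add the import yourself:

```scala
// project/Dependencies.scala
import sbt._

object Dependencies {
  // the same dependency as declared in build.sbt earlier
  val catsCore = "org.typelevel" %% "cats-core" % "1.0.0-RC1"
}
```

which can then be used in build.sbt as `libraryDependencies += Dependencies.catsCore`.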

Execute the build project to create the build definition

I believe this is the most confusing stage, as this phase just doesn’t exist in the Maven world. In Maven it’s simple: you parse the XML to create the build definition and then you execute the build.

Here it’s different: we need to execute the Scala code in the build to create the build definition, and only then can we run the build itself.

Moreover setting up the build is done in 2 steps:

  • Execute the Scala code to create the build definition (a set of projects containing some settings)
  • Interpret the build definition to create the “build execution plan” (a task graph that models dependencies between tasks)

Only then can the task graph be executed to run the build.

But let’s get back to what the build definition is. Simply put, the build definition is just a list of key-value pairs. And as SBT supports multi-project builds (a.k.a. submodules), these key-value pairs are grouped by project.

lazy val root = (project in file("."))
  .settings(
    name := "demo",
    scalaVersion := "2.12.4"
  )

If you have a single project there is no need to declare the root project: you can place your key-value pairs directly in build.sbt.

These key-value pairs are known as Settings. Setting keys are typed and there are 3 different types:

  • SettingKey[T] is evaluated once when SBT starts (or with the reload command)
  • TaskKey[T] is evaluated every time a command is run
  • InputKey[T] is for tasks taking arguments (e.g. testOnly *.SomeSpec)

The value itself is known as the task body and needs to return a value of type T (the same type as declared in the key). The body can contain any Scala code.

Of course there are many predefined keys but it’s also possible to define your own.
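As a sketch, here is one key of each type (the key names and bodies are made up for illustration):

```scala
import complete.DefaultParsers._

// SettingKey: evaluated once when SBT starts (or on reload)
lazy val author = settingKey[String]("The project author")
author := "me"

// TaskKey: evaluated every time the task is run
lazy val timestamp = taskKey[Long]("Current time in millis")
timestamp := System.currentTimeMillis()

// InputKey: a task taking arguments parsed from the command line
lazy val greet = inputKey[Unit]("Greets the given name")
greet := {
  val args = spaceDelimited("<name>").parsed
  println(s"Hello ${args.mkString(" ")}")
}
```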

You can find more information on this phase over here.

Interpret the build definition to create a task graph for the build

We’ve seen that SBT provides a DSL to define tasks (a Setting can be considered a Task that runs only once).

This makes SBT a Task engine but in order to run the tasks correctly (i.e. in the right order) SBT must analyse the dependencies between tasks.

Let’s take an example. Imagine a task buildInfo that generates Scala code (a Scala object) with some information about the build. (This is a very basic example; there is a real plugin for that: sbt-buildinfo.)

version := "1.0.3"

// define the task key
lazy val buildInfo = taskKey[Seq[File]]("Generates basic build information")

// define the task body
buildInfo := {
  val f = sourceManaged.value / "BuildInfo.scala"
  val v = version.value
  val i = java.time.Instant.now()
  IO.write(f,
    s"""|import java.time.Instant
        |object BuildInfo {
        |  val version: String = "$v"
        |  val time: Instant = Instant.ofEpochMilli(${i.toEpochMilli}L)
        |}""".stripMargin)
  // returns a Seq[File] as declared in the key
  f :: Nil
}

// add the task to the list of source generators
sourceGenerators in Compile += buildInfo

Inside the task body there are 2 values, f and v, that actually depend on other tasks (or settings). You can see that to retrieve the values for the version and the sourceManaged folder we couldn’t use them directly (because they are tasks, not values), so we have to call .value to retrieve the value of the task.

This .value is the trick used by SBT to build the dependency graph. Behind the scenes it triggers a macro that allows SBT to lift the dependencies outside of the task body.

You can query the dependency graph by using the inspect tree command.

sbt:demo> inspect tree buildInfo
[info] *:buildInfo = Task[scala.collection.Seq[java.io.File]]
[info]   +-*:sourceManaged = target/scala-2.12/src_managed
[info]   | +-*:crossTarget = target/scala-2.12
[info]   |   +-*/*:crossPaths = true
[info]   |   +-*:pluginCrossBuild::sbtBinaryVersion = 1.0
[info]   |   | +-*/*:pluginCrossBuild::sbtVersion = 1.0.3
[info]   |   |
[info]   |   +-*/*:sbtPlugin = false
[info]   |   +-*/*:scalaBinaryVersion = 2.12
[info]   |   +-*:target = target
[info]   |     +-*:baseDirectory =
[info]   |       +-*:thisProject = Project(id tmp, base: /private/tmp, configurations: List(compile, runtime, test, provided, optional), plugins: List(<none>), autoPlugins: List(sbt.plugins.CorePlugin,..
[info]   |
[info]   +-*:version = 1.0.3

We can see that buildInfo depends on sourceManaged and version.

The official documentation is here.

Execute the task graph to actually build the project

SBT is now ready to execute a task by running as many dependent tasks in parallel as it can (according to the graph). If task C depends on tasks A and B (and A and B don’t depend on each other), then SBT can run A and B in parallel, and then C.

So far we’ve covered quite a lot of ground. We’ve learned the lifecycle of a build, how to define tasks and settings, and how to check the dependencies of a task.

By now we should start to feel more comfortable working with SBT, but there is still something quite puzzling that we need to understand to leverage all the power of SBT: scopes.


Scopes

So far we’ve seen that a setting or task key is always linked to a single value or task body. In fact, that’s not entirely true. A key also depends on a context.

E.g. the project name (key name) is different inside every sub-project, and so can be the Scala version (scalaVersion), etc.

SBT calls these context dependencies scope axes. There are 3 different axes needed to fully qualify a task:

  • The project axis
  • The configuration axis
  • The task axis

The project axis

This is the axis we used as an example above. You can obviously use different values (tasks) across sub-projects, so the project is one of the axes.

The configuration axis

Similarly, inside a project the sources are different when you compile the project and when you compile the tests. The sources task depends on the “Configuration” (probably not the most suitable name, but this is the terminology used in SBT).

You can think of the configuration axis as something similar to the Maven scope (remember how we specify that a dependency is already provided or only used in tests).

In addition, there is also a relation between the configurations: e.g. the Test configuration extends the Runtime configuration, which extends the Compile configuration. (It kind of makes sense: when you run the tests you want everything available at runtime plus the things needed for testing.)
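This is, for example, how a test-only dependency is declared (scalatest used here purely as an illustration):

```scala
// only on the classpath of the Test configuration (and whatever extends it)
libraryDependencies += "org.scalatest" %% "scalatest" % "3.0.4" % Test
```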

The task axis

The last axis is the task axis. This is when the value depends on the currently running task.

E.g. there are 3 packaging tasks: packageBin, packageSrc and packageDoc. They all depend on packageOptions, but we can have different values of packageOptions for each of these 3 tasks. This is done by using the task axis.
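As a sketch, this adds a manifest attribute to the binary package only (the attribute value is made up):

```scala
// applies only when packageOptions is resolved from the packageBin task;
// packageSrc and packageDoc are unaffected
packageOptions in (Compile, packageBin) +=
  Package.ManifestAttributes("Built-By" -> "demo")
```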

Specifying scopes

When you create a task (or setting) in your build it is always scoped even when you don’t specify any axes. In this case SBT uses the following scopes by default:

  • the project axis is set to the current project
  • the configuration axis is set to Global
  • the task axis is set to Global

But what is this Global scope? Well, you can think of it as the default or fallback scope. If there is no value defined for the current scope SBT will try to find a value using a more general scope until it reaches the Global scope.

This brings us to the topic of scope resolution. Remember that we have 3 different axes available: project, configuration and task. In theory you can combine any possible values along these 3 axes to define a task. Think of it as a cube or 3D matrix. That’s a lot of possible values, but in practice many of them don’t make any sense at all, so there is no need to define a task body for these “impossible” combinations.

Then there are several scopes that actually need to use the same task body or value. In this case you want to define the task only once, typically for the most generic scope, and have the more specific scopes fall back to it when SBT resolves the task scope.

This is how the scope resolution (or scope delegation) works:

  • On the task axis: if the task is undefined under the current task, fall back to Global.
  • On the configuration axis: if the task is undefined under the current configuration, try the parent configuration (if there is no parent, fall back to Global).
  • On the project axis: if the task is undefined under the current project, try the special scope ThisBuild, then the Global scope.
  • If several scopes resolve, the project axis takes precedence over the configuration axis, which takes precedence over the task axis.

More information and examples can be found in the official documentation.

Something that might look like a recursive task is actually not recursive, but delegating to a more generic scope:

lazy val lines = settingKey[List[String]]("Demonstrate scope delegation")
// the initial list
lines in (Global) := "line in scope (*,*,*)" :: Nil
// prepend to the initial list
lines := "line in scope(ThisBuild,*,*)" :: lines.value

If you run this in SBT you’ll get:

sbt> lines
[info] * line in scope(ThisBuild,*,*)
[info] * line in scope (*,*,*)

lines.value on the last line resolved to the lines defined in the global scope (*,*,*) (* means Global and {.} means ThisBuild).

As you can see, resolving a task might be rather tricky. Fortunately SBT provides some help with the inspect command:

sbt:tmp> inspect lines
[info] Setting: scala.collection.immutable.List[java.lang.String] = List(line in scope(ThisBuild,*,*), line in scope (*,*,*))
[info] Description:
[info] 	Demonstrate scope delegation
[info] Provided by:
[info] 	{file:/tmp/}tmp/*:lines
[info] Defined at:
[info] 	/tmp/build.sbt:3
[info] Delegates:
[info] 	*:lines
[info] 	{.}/*:lines
[info] 	*/*:lines
[info] Related:
[info] 	*/*:lines

You can invoke a specifically scoped task or setting from the SBT command line by specifying the scope as follows:

sbt> project/Configuration:Task::command

Of course, you don’t have to specify a complete scope; it’s possible to omit any of the scope axes.
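For example, with the keys used earlier in this post (assuming a project named root):

```
sbt> compile:sourceManaged               // configuration axis only
sbt> root/compile:sourceManaged          // project + configuration axes
sbt> compile:packageBin::packageOptions  // all 3 axes (task axis after ::)
```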

Chaining tasks

So far we’ve seen that we can use the value returned by a task in another task. It works fine but it might be hard to manage. Consider the case where you have a task that depends on 2 other tasks:

lazy val A = taskKey[Unit]("Task A")
A in Global := { println("Task A") }

lazy val B = taskKey[Unit]("Task B")
B in Global := { println("Task B") }

lazy val C = taskKey[Unit]("Task C")
C := {
  A.value
  B.value
  println("Task C")
}
When you run task C, tasks A and B run too (because C depends on both of them). However A and B don’t depend on each other so SBT runs them in parallel. There is nothing wrong with it but it might not be what you want.

Sequential tasks

Let’s change the definition of A to make it fail.

lazy val A = taskKey[Unit]("Task A")
A in Global := {
  println("Task A")
  throw new Exception("Oh no!")
}
You may think that B runs only when A succeeds, but it’s not the case. For that we need to run the tasks in sequence. This is done by using Def.sequential:

C := Def.sequential(A, B).value

Rewiring with dynamic task

A dynamic task (Def.taskDyn) can also be used to chain tasks together. It allows you to return a task instead of a value.

C := (Def.taskDyn {
  val a = A.value
  Def.task {
    B.value
    a
  }
}).value
We execute task A and then a task that executes B and returns the result from A. What’s cool is that since we return the result from A we can rewire A directly:

A := (Def.taskDyn {
  val a = A.value
  Def.task {
    B.value
    a
  }
}).value
For a more concrete example you can check the documentation, where compile is rewired to run both compile and scalastyle (while still returning the result of compile).


Commands

If you find yourself having to type too many SBT commands to perform a single operation, you probably want to group them together using a Command.

Imagine that you’re using both scalastyle and scalafmt on your project and you want to run them across all configurations (compile, test, it). That’s 6 commands to type. You can easily group them together:

commands += Command.command("validate") { state =>
  "compile:scalastyle" ::
    "test:scalastyle" ::
    "it:scalastyle" ::
    "compile:scalafmt::test" ::
    "test:scalafmt::test" ::
    "it:scalafmt::test" ::
    "sbt:scalafmt::test" ::
    state
}

This pretty long post on SBT should have covered enough to let you understand most of SBT’s cryptic syntax. Hopefully you are now ready to understand most SBT files and no longer afraid to experiment for yourself.

SBT plugins are obviously missing, but they’re nothing more than some code that creates additional settings on your projects (and you should now be able to understand that too).