Our mission is to make Scala code compile fast. Thanks to Hydra, the world’s only multicore Scala compiler, we delivered on this promise. Speedups are enticing and range from 2x up to 7x (see our article about compiling Scala on different Amazon EC2 instances). This is great, but one nuance we realized over the years is that when developers are frustrated about slow compilation, they sometimes mean more than just compilation.
How can that be? The reason is that we never compile our code by invoking the Scala compiler directly. There is usually some other tool between us and the Scala compiler, and therefore compilation time is very often inflated by the specific tool at hand. In the Scala ecosystem, the tool we use to compile and build our projects is generally sbt (70%+ according to both the IntelliJ Scala 2019 survey and our State of Scala Compilation survey). Hence, understanding which inefficiencies are inherent to sbt and which depend on setup is key to minimizing build times.
In this series of three articles about sbt we are going to cover the following aspects:
- In this first installment we lay the foundation to understand how sbt works. We cover what a setting is, and we build an intuition of why it takes long for sbt to load a build in memory (on large projects loading the build can easily take more than 30 seconds!). On the way, we learn about a few utilities that are readily available at our fingertips and that can brilliantly help us debug problems (say goodbye to println statements!).
- In the second installment we compare build loading times on the CI versus local environment, with the goal of finally understanding why they are different. Armed with this knowledge, we can optimize our CI build times and obtain massive speedups!
- In the third and last installment we cover what happens after the build is loaded in memory, and we execute a task such as compile. You’ll learn how to profile sbt and understand what other tasks are triggered in reaction to compile in your build (the emphasis on your is key, as every build is different!).
Let’s get started by exploring the building blocks of an sbt build.
What’s in an sbt build?
In its very essence, an sbt build is just a collection of settings (we refer to both the sbt Task and Setting types as settings). Settings are usually contributed by plugins, declared either in the global or in the project plugins folder. A third possibility is that a build defines new settings, but this is somewhat less common in practice.
So, what’s in a setting?
Settings 101
The job of a setting is simply to generate a value. To generate a value a setting may depend on other settings. Therefore, to evaluate a setting its dependencies need to be evaluated first.
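To make the idea of one setting depending on another concrete, here is a minimal build.sbt sketch, assuming sbt 1.x. The keys greeting and banner are hypothetical and exist only for illustration; name is a built-in sbt key.

```scala
// build.sbt -- a minimal sketch for sbt 1.x.
// `greeting` and `banner` are hypothetical keys, invented for this example.
lazy val greeting = settingKey[String]("Hypothetical setting holding a plain value")
lazy val banner   = settingKey[String]("Hypothetical setting derived from other settings")

// A setting with no dependencies: it simply produces a value.
greeting := "hello"

// banner depends on greeting and on the built-in `name` setting.
// Calling .value declares the dependency, so sbt has to evaluate
// both of them before it can produce banner's value.
banner := s"${greeting.value}, ${name.value}!"
```

With a build like this loaded, typing show banner in the sbt shell evaluates the setting together with its dependencies.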
But how can we discover the dependencies of a setting? And how can we quickly pinpoint where in the code a specific setting is defined?
Turns out it’s very simple to answer both of these questions!
Dependencies of a setting
The dependencies of a setting can be easily found by leveraging the built-in inspect tree command. Next is an example showing how to find the dependencies of the sourceDirectory setting.
sbt:root> inspect tree sourceDirectory
[info] sourceDirectory = src
[info]   +-baseDirectory =
[info]     +-thisProject = Project(id root, base: /projects/scalaworld, …)
Pretty neat!
Where is a setting defined?
Quite conveniently, the very same inspect command can also help us find out where a setting is defined in the code. Doing so can be extremely useful when debugging a build. Let’s see where the sourceDirectory setting is defined.
sbt:root> inspect sourceDirectory
...
[info] Defined at:
[info] (sbt.Defaults.paths) Defaults.scala:328
...
We now know sourceDirectory is defined in the sbt/Defaults.scala source file. And it’s in good company, as most sbt settings are defined in the very same source.
By all means, the inspect command should be part of our sbt toolbox!
How many settings in a build?
Now that we know how to explore settings, we might be wondering how many settings are defined in a typical build. After all, we have the impression we use sbt just to compile, test or run the application, so there shouldn’t be too many, should there?
As a little experiment, I checked out a few open source projects that I believe are good representatives of a typical commercial project build, to see how many settings are defined in their builds. The first and second are built using Play Framework, while the third and fourth are well-known open-source projects. Here is the result:
| project@git-hash | settings |
|---|---|
| guardian/frontend@2ba8094 | 23793 |
| ornicar/lila@0647f7f | 29712 |
| akka/akka@bc4c6ba | 32047 |
| circe/circe@8eb2cd5 | 37007 |
There are about 30k settings defined in these projects! The immediate follow-up question is: why are there so many settings?
Before we delve into that, let’s take a brief detour and see how we can find how many settings are defined in our build.
Finding the settings size
When you start sbt, pay attention to its output. If there are more than 10k settings in your build, the exact number is printed on screen.
$ sbt
[info] Loading ...
[info] Resolving key references (32047 settings) ... // <-- Printed if 10k+ settings
...
But here I actually wanted to take the opportunity to advertise the sbt consoleProject task, which is unfortunately not widely known but can be a huge time saver when debugging a build. What does it do? It’s a Scala REPL with your build definition loaded in it. If you like using the Scala REPL for quick code exploration, you’ll love consoleProject.
For instance, we can use consoleProject to retrieve the number of settings in our build in case it’s not printed by sbt on start. Mind that eval is akin to calling .value to trigger evaluation of a setting inside a build.
$ sbt
...
akka > consoleProject
[info] Starting scala interpreter…
...
scala> buildStructure.eval.settings.size
res0: Int = 32047
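The same eval call works on ordinary keys too. The lines below are a small sketch of what could be typed at the scala> prompt opened by consoleProject, assuming sbt 1.x with slash syntax; they are REPL input rather than a standalone program, and the results depend entirely on your build, so no output is shown.

```scala
// Typed at the scala> prompt opened by consoleProject (not a standalone program).
// The sbt and Keys imports are already in place inside this REPL.

// Evaluate a setting of the current project.
val projectName = name.eval

// Evaluate a setting scoped to a specific configuration.
val testSources = (Test / sourceDirectory).eval

// Evaluate a task: this runs the task together with its dependencies.
val compilerOptions = (Compile / scalacOptions).eval
```

Each val is then an ordinary Scala value that can be inspected further in the REPL.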
Next time you are thinking of adding a println in the build to debug an issue, don’t! Instead, query the sbt internal state thanks to consoleProject!
Now, let’s get back on track and build an intuition of why there are so many settings in a build.
Why so many settings?
Turns out there are many settings in addition to just compile, test, and run. In fact, even an empty build with no global or project plugins already has around 700 settings defined in it! (If you don’t believe me, you now know how to check this out on your own ;-).)
The number of settings in a build quickly grows with the number of plugins and projects defined in it. The ultimate reason for the explosion is the many scoping axes a setting can be bound to.
In fact, a setting can return a different value depending on the Configuration/Project/Task it’s bound to. You can visualize this as a 3D matrix or a cube.
For instance, the sourceDirectory for the Compile and Test scopes are different:
sbt:root> show root / Compile / compile / sourceDirectory
[info] .../src/main
sbt:root> show root / Test / compile / sourceDirectory
[info] .../src/test
But not all tasks make sense for all scopes. As an example, the run task is never used on a library project, but it’s nevertheless always added to it.
The high degree of flexibility offered by sbt makes it easy to define settings. But this comes with a cost: the more settings we have, the longer it takes to load a build.
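The scoping axes described in this section can be illustrated with a minimal build.sbt sketch, assuming sbt 1.x slash syntax. The key label is hypothetical and exists only for illustration; the point is that a single key ends up bound several times, once per scope.

```scala
// build.sbt -- a minimal sketch for sbt 1.x.
// `label` is a hypothetical key, invented for this example.
lazy val label = settingKey[String]("Hypothetical setting used to illustrate scoping")

lazy val root = (project in file("."))
  .settings(
    // Scoped to the project only.
    label := "project-wide",
    // Scoped to the project and a configuration.
    Compile / label := "compile configuration",
    Test / label := "test configuration",
    // Scoped to the project, a configuration and a task.
    Test / compile / label := "compile task, test configuration"
  )
```

In the sbt shell, show Test / label and show Test / compile / label then return different values for the same key. Every such binding is a separate entry in the build, which gives a feel for how plugins and sub-projects multiply the total.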
Get ready for more
While settings play an important role in the build load time, there is more to it. We are going to fully cover the build load time (and how you can optimize it!) in the second installment of this series of articles about sbt.