CI builds are notorious for taking a long time and consuming a lot of resources. However, there are ways to speed up builds and use fewer resources. James Ward, Google Developer Advocate at the time, and Mark Vieira, a Principal Software Engineer at Elastic, co-presented a webinar on how Elastic saves time and spends less on their Google Cloud infrastructure by leveraging Develocity.
In the session, Mark details how the Elasticsearch team reduced their build times, resulting in a ~20% decrease in GCP compute resources and saving them hundreds of thousands of dollars per year.
Speedier local builds
The Elasticsearch team consists of about 70 developers. A local Java build typically takes 90-120 minutes to run the full suite of integration and backward-compatibility tests, and considerably longer when running the entire build matrix of supported platforms.
By taking advantage of the Develocity Build Cache on Google Cloud, builds run faster, use fewer resources, and give engineers feedback sooner. How? The Build Cache lets developers share and reuse unchanged build and test outputs across the team. This speeds up both local and CI builds, since cycles are not wasted rebuilding components that are unaffected by new code changes.
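To make this concrete, here is a minimal sketch of how a shared remote build cache is typically wired up in a Gradle build. It is not Elastic's actual configuration: the endpoint URL is a placeholder, and a real Develocity installation is configured through its own Gradle plugin rather than the generic HttpBuildCache shown here.

```kotlin
// settings.gradle.kts — minimal sketch of local + shared remote build caching.
// The cache URL is a hypothetical placeholder; Develocity provides its own
// cache backend and plugin configuration that may differ from this.
buildCache {
    local {
        isEnabled = true // reuse outputs from this machine's previous builds
    }
    remote<HttpBuildCache> {
        setUrl("https://build-cache.example.com/cache/") // placeholder endpoint
        isPush = false // developers read from the shared cache by default
    }
}
```

With a configuration like this, any task whose inputs are unchanged is restored from the cache instead of being re-executed, which is what shortens both local and CI builds.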
CI build agent hours saved
The Elastic core engineering team runs 32,200 CI jobs per week. The Develocity remote cache saves an average of 19 minutes and 37 seconds of build execution time per CI job. This translates into approximately 200,000 build agent hours saved per year.
Additionally, the Elastic core engineering team runs 5,970 Pull Request builds per week. The remote shared cache saves an average of 11 minutes and 38 seconds of build execution time per PR build. This translates into…
Fewer compute resources required on dev branch merges
Elastic uses the Gradle Build Cache for all builds triggered by merges to the dev branch. According to Mark: “On average, every time someone pushes something to our branch, those builds are roughly 39% faster because of the Build Cache. This directly translates to 39% less compute resources that would otherwise be used to run those builds.”
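The webinar does not show the exact setup, but a common pattern that matches this description is to let only CI builds (such as dev-branch merges) push to the shared cache while developer and PR builds only read from it. The sketch below assumes CI is detected via a `CI` environment variable, which is an assumption rather than a detail from the session.

```kotlin
// settings.gradle.kts — sketch: CI builds populate the shared cache,
// all other builds only read from it. Detecting CI via the CI environment
// variable is an assumption; adjust for the CI system in use.
val isCiBuild = System.getenv("CI") != null

buildCache {
    remote<HttpBuildCache> {
        setUrl("https://build-cache.example.com/cache/") // placeholder endpoint
        isPush = isCiBuild
    }
}
```

Because merge builds write their task outputs to the shared cache, later builds that touch unaffected components can skip that work, which is the mechanism behind the 39% reduction Mark describes.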
Shortened build execution times on CI
The Develocity remote shared cache saves Elastic an average of one hour of build execution time per build, and they run approximately 1,111 such builds each week.
Advantages of running CI jobs on dynamically allocated agents
Elasticsearch runs all CI jobs on dynamically allocated agents on Google Cloud. Mark suggests three key advantages to doing so:
- Scaling is flexible: the agent pool grows to match demand, whether 10, 100, or 1,000 concurrent builds are running. Elastic gets the benefit of Google Cloud’s effectively infinite scalability without paying for idle capacity, since agents are shut down as soon as they are no longer needed.
- Google Cloud enables workload-optimized performance: agents can be tailored to the configuration and type of workload a build presents (e.g., I/O-bound vs. CPU-bound vs. memory-bound), improving resource utilization.
- Agents can be made ephemeral: each build runs on a completely clean machine pre-loaded with dependencies and other artifacts, and the agent disappears afterward. This is useful for weeding out odd build failures and for tests that are difficult to clean up after, such as those that install services which modify the underlying machine.
In the session, Mark also provides details on how Elastic uses other features of Develocity including Build Scan®, Failure Analytics, and Management Reporting & Insights to improve the quality and reliability of their builds.
It’s exciting to see how Elasticsearch leverages a combination of Google Cloud services and Develocity to provide its developers with faster feedback cycles while also reducing costs. To dig even deeper, check out the webcast: Dev Cloud Acceleration at Elastic with Develocity and GCP.