Building a Monorepo with Bazel


Siddhant Varma

A monorepo is perhaps what you would expect from the name: a single code repository for your entire codebase.

Wikipedia describes it as a decade-old software development strategy for storing all your code in a single repository, but you can also think of it as a higher-level architecture pattern for governing loosely tied applications. For instance, if you have a full-stack web application stored in one repository and an Android client in another, a monorepo would essentially wrap them in the same repository codebase.

Who Uses Monorepos and Why?

Google is one of the most notable adopters of the monorepo pattern, and companies like Dropbox, LinkedIn, and Uber use monorepos to manage their large codebases. This is because large-scale projects with little or no dependency on each other can be developed, tested, and built together without being split across separate repositories.

If you’re from a JavaScript or npm background, you can think of a monorepo as a project having a single package.json file for managing all your project dependencies. It also allows you to easily share code between multiple environments using isolated modules as published packages. You can configure a single toolchain for running unit tests, integration tests, and builds without worrying about language- and ecosystem-specific configurations.

The Efficiency of Building a Monorepo with Bazel

Bazel is an open-source build tool developed by Google to give power and life to your monorepo. It’s similar to other build tools like Maven, Gradle, and Buck, but it has a number of advantages:

  1. Bazel supports multiple languages (Java, JavaScript, Go, C++, to name a few) and platforms (Linux, macOS, and Windows).
  2. Its build files and extensions are written in Starlark, a high-level language similar to Python, which lets you describe complex operations on binaries, scripts, and data sets.
  3. Even for large codebases, Bazel builds quickly because it caches previous work and rebuilds only what has changed.
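
The caching idea in the last point can be sketched in JavaScript. This is only an illustration under simplifying assumptions, not Bazel's implementation: the names cacheKey and makeBuilder are hypothetical, and Bazel's real cache keys cover more than file contents (for example, the command line and the tools used).

```javascript
const crypto = require("crypto");

// Illustrative sketch: re-run an action only when its inputs change.
// The cache key is a hash of the action name and the content of every input.
function cacheKey(actionName, inputs) {
  const hash = crypto.createHash("sha256");
  hash.update(actionName);
  for (const name of Object.keys(inputs).sort()) {
    hash.update("\0" + name + "\0" + inputs[name]);
  }
  return hash.digest("hex");
}

function makeBuilder(action) {
  const cache = new Map();
  let executions = 0;
  return {
    build(actionName, inputs) {
      const key = cacheKey(actionName, inputs);
      if (!cache.has(key)) {
        executions += 1;
        cache.set(key, action(inputs));
      }
      return cache.get(key);
    },
    get executions() {
      return executions;
    },
  };
}

// Hypothetical "compile" step: upper-case every input file.
const builder = makeBuilder((inputs) =>
  Object.values(inputs).map((text) => text.toUpperCase())
);
builder.build("compile", { "app.js": "console.log(1)" });
builder.build("compile", { "app.js": "console.log(1)" }); // cache hit
builder.build("compile", { "app.js": "console.log(2)" }); // input changed
console.log(builder.executions); // 2
```

The second call reuses the stored result because the inputs hash to the same key, so the action runs only twice for three build requests.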

This article will walk you through the core concepts of Bazel and set you up for building and compiling your own monorepo in JavaScript.

Bazel Basics

First, some vocabulary.

Workspace

Bazel calls your top-level source directory a workspace. It is marked by a WORKSPACE file at its root and contains your other source files in a nested fashion. Your workspace is what Bazel builds your software from, taking a set of inputs and generating the desired outputs.

Packages

A package is a directory that contains a file named BUILD (or BUILD.bazel) together with its related files and dependencies. A package nested inside another package's directory is called a subpackage.

Consider the following directory tree:

src/app/BUILD
src/app/core/input.txt
src/app/tests/BUILD

It has two packages: src/app, and a subpackage, src/app/tests, since both contain their own BUILD files. Note that src/app/core is not a package but a regular directory inside src/app.
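
The package rule can be sketched in a few lines of JavaScript. This is an illustrative model only, not how Bazel itself is implemented; the function names findPackages and owningPackage are hypothetical.

```javascript
// Illustrative sketch: derive Bazel packages from a list of file paths.
// A directory is a package when it directly contains a BUILD or BUILD.bazel file.
const BUILD_FILE_NAMES = new Set(["BUILD", "BUILD.bazel"]);

function findPackages(paths) {
  const packages = new Set();
  for (const path of paths) {
    const parts = path.split("/");
    const fileName = parts.pop();
    if (BUILD_FILE_NAMES.has(fileName)) {
      packages.add(parts.join("/"));
    }
  }
  return [...packages].sort();
}

// A file belongs to the nearest enclosing directory that is a package.
function owningPackage(path, packages) {
  const parts = path.split("/");
  parts.pop(); // drop the file name
  while (parts.length > 0) {
    const dir = parts.join("/");
    if (packages.includes(dir)) return dir;
    parts.pop();
  }
  return packages.includes("") ? "" : null;
}

const tree = [
  "src/app/BUILD",
  "src/app/core/input.txt",
  "src/app/tests/BUILD",
];
const packages = findPackages(tree);
console.log(packages); // [ 'src/app', 'src/app/tests' ]
console.log(owningPackage("src/app/core/input.txt", packages)); // src/app
```

The file under core has no BUILD file of its own, so it is attributed to the nearest package above it.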

Targets

Elements of a package are called targets, which can be categorized as files and rules.

Files are either source files written by developers or generated files that Bazel produces according to a specific set of rules. A rule specifies the relationship between a set of inputs and outputs along with the necessary steps to derive the latter from the former.

Labels

The name of a target is its label, which uniquely identifies it. A label starts with //, optionally preceded by a repository name such as @repo:

@repo//app/main:app_binary

This label has a repository name (@repo), a package name (app/main), and a target name (app_binary).
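
To make the anatomy of a label concrete, here is a minimal JavaScript sketch that splits one into its parts. The parseLabel function is hypothetical and covers only absolute labels; it also applies Bazel's shorthand in which an omitted target name defaults to the last component of the package path.

```javascript
// Illustrative sketch: split a Bazel label into repository, package, and target.
function parseLabel(label) {
  const match = /^(?:@([^/]*))?\/\/([^:]*)(?::(.+))?$/.exec(label);
  if (!match) {
    throw new Error(`Not an absolute label: ${label}`);
  }
  const [, repo = null, pkg, explicitTarget] = match;
  // When the target is omitted, it defaults to the last package component,
  // so //app/main is shorthand for //app/main:main.
  const target = explicitTarget ?? pkg.split("/").pop();
  return { repo, pkg, target };
}

console.log(parseLabel("@repo//app/main:app_binary"));
// { repo: 'repo', pkg: 'app/main', target: 'app_binary' }
console.log(parseLabel("//app/main"));
// { repo: null, pkg: 'app/main', target: 'main' }
```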

Dependencies

Target X is a dependency of target Y if Y needs X at build or execution time. The dependency relation forms a directed acyclic graph (DAG) called the dependency graph, which is used to classify these dependencies further. The Bazel documentation describes these types and their definitions.
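
Because the graph is acyclic, a build tool can always find an order in which every dependency is built before the targets that need it. The following JavaScript sketch shows one way to compute such an order with a depth-first traversal. The buildOrder function and the target labels are hypothetical, and this is not Bazel's actual scheduler.

```javascript
// Illustrative sketch: order targets so every dependency is built first.
// `deps` maps each target label to the labels it depends on.
function buildOrder(deps) {
  const order = [];
  const state = new Map(); // label -> "visiting" | "done"

  function visit(label, path) {
    if (state.get(label) === "done") return;
    if (state.get(label) === "visiting") {
      throw new Error(`Dependency cycle: ${[...path, label].join(" -> ")}`);
    }
    state.set(label, "visiting");
    for (const dep of deps[label] ?? []) {
      visit(dep, [...path, label]);
    }
    state.set(label, "done");
    order.push(label);
  }

  for (const label of Object.keys(deps)) visit(label, []);
  return order;
}

// Hypothetical targets, named only for this example.
const deps = {
  "//app:server": ["//lib:http", "//lib:log"],
  "//lib:http": ["//lib:log"],
  "//lib:log": [],
};
console.log(buildOrder(deps));
// [ '//lib:log', '//lib:http', '//app:server' ]
```

If two targets depended on each other, the traversal would revisit a target that is still in progress and report the cycle instead of returning an order.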

Build Files

BUILD files are short programs, written in Starlark, that declare the rules for a package. You update them whenever the package's sources or dependencies change.

Bazel Basic Commands

You can check if you have Bazel installed on your system by running the version command.

$ bazel --version
bazel 4.0.0

The build command builds the targets you specify, and analyze-profile analyzes build profile data. The clean command removes output files and optionally stops the Bazel server. You can get a list of all these basic commands along with their use cases by running:

$ bazel help 
Usage: bazel <command> <options> ...

Available commands:
  analyze-profile     Analyzes build profile data.
...

How to Build a Monorepo with Bazel

Once you have Node.js installed on your system, you can run the following command to globally install Bazelisk, a launcher that downloads and runs the appropriate Bazel version for you:

npm install -g @bazel/bazelisk 

You can also install iBazel, a file watcher that rebuilds your targets whenever their source files change, so you see your changes live. Install it either as a development dependency or globally:

npm install --save-dev @bazel/ibazel
npm install --global @bazel/ibazel

Configuring the .bazelrc File

You can write build options in a bazelrc file to apply them on every build. To share the same settings with everyone working on the project, check in a .bazelrc file at the root of your Bazel workspace (older Bazel releases read tools/bazel.rc instead).

If you don’t want to share these settings, add the .bazelrc file to your .gitignore list instead. You can also keep personal settings in a .bazelrc file in your home directory.

The following is a generic .bazelrc file that you can modify according to your needs:

###############################
# Directory structure         #
###############################

# Artifacts are typically placed in a directory called "dist"
# Be aware that this setup will still create a bazel-out symlink in
# your project directory, which you must exclude from version control and your
# editor's search path.
build --symlink_prefix=dist/

###############################
# Output                      #
###############################

# A more useful default output mode for bazel query, which
# prints "ng_module rule //foo:bar" instead of just "//foo:bar".
query --output=label_kind

# By default, failing tests don't print any output, it's logged to a
# file instead.
test --test_output=errors

###############################
# Typescript / Angular / Sass #
###############################
# Make TypeScript and Angular compilation fast by keeping a few
# copies of the compiler running as daemons, and cache SourceFile
# ASTs to reduce parse time.
build --strategy=TypeScriptCompile=worker --strategy=AngularTemplateCompile=worker

# Enable debugging tests with --config=debug
test:debug --test_arg=--node_options=--inspect-brk --test_output=streamed --test_strategy=exclusive --test_timeout=9999 --nocache_test_results
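
Each line in this file starts with the command it applies to, optionally followed by a config name such as debug that you enable with --config. The JavaScript sketch below models that selection. It is a simplification: optionsFor is a hypothetical helper, and it ignores details of the real parser such as the test command also inheriting options from build lines.

```javascript
// Illustrative sketch: which options from an rc file apply to a command.
// Lines look like "<command>[:<config>] <options...>"; "#" starts a comment.
function optionsFor(rcText, command, configs = []) {
  const options = [];
  for (const rawLine of rcText.split("\n")) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const [head, ...rest] = line.split(/\s+/);
    const [lineCommand, lineConfig] = head.split(":");
    if (lineCommand !== command) continue;
    // Lines with a config suffix apply only when that config is requested.
    if (lineConfig !== undefined && !configs.includes(lineConfig)) continue;
    options.push(...rest);
  }
  return options;
}

const rc = `
build --symlink_prefix=dist/
test --test_output=errors
test:debug --test_output=streamed --nocache_test_results
`;
console.log(optionsFor(rc, "test").join(" "));
// --test_output=errors
console.log(optionsFor(rc, "test", ["debug"]).join(" "));
// --test_output=errors --test_output=streamed --nocache_test_results
```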

Adding the buildifier Dependency to Your Project

Buildifier is a formatting tool that applies a standardized format to all your BUILD and .bzl files. It also has a linter out of the box to help you detect issues in your code and automatically fix them. You can add the buildifier dependency to your project either using npm:

npm install --save-dev @bazel/buildifier

or Yarn:

yarn add -D @bazel/buildifier

You will need the following scripts inside your package.json file to run buildifier:

"scripts": {  
        "bazel:format": "find . -type f \\( -name \"*.bzl\" -or -name WORKSPACE -or -name BUILD -or -name BUILD.bazel \\) ! -path \"*/node_modules/*\" | xargs buildifier -v --warnings=attr-cfg,attr-license,attr-non-empty,attr-output-default,attr-single-file,constant-glob,ctx-actions,ctx-args,depset-iteration,depset-union,dict-concatenation,duplicated-name,filetype,git-repository,http-archive,integer-division,load,load-on-top,native-build,native-package,out-of-order-load,output-group,package-name,package-on-top,positional-args,redefined-variable,repository-name,same-origin-load,string-iteration,unsorted-dict-items,unused-variable",  
        "bazel:lint": "yarn bazel:format --lint=warn",  
        "bazel:lint-fix": "yarn bazel:format --lint=fix"  
}  

Building/Compiling Code

In this example, you’ll start from a new empty directory and build and compile a simple Node.js application using Bazel. You’ll end up with the following structure:

  WORKSPACE     
  BUILD.bazel    
  es5.babelrc  
  app.js    
  package-lock.json  
  package.json    
  

Instead of manually configuring everything, you can use the following commands to get started:

npm init @bazel bazel_build_nodejs  

Or if you’re using Yarn:

yarn create @bazel bazel_build_nodejs  

The previous commands use @bazel/create under the hood to set up your monorepo with some minimal configurations. This means that it automatically creates package.json, WORKSPACE, and BUILD.bazel files for you.

The package.json looks just like the one you get when initializing any Node.js project with the npm init command. It contains some development-time dependencies and some scripts through which you can run your build.

Also notice how it automatically adds buildifier to your project so you can avoid manually setting it up. This is only a starting point though, and you would still need to configure buildifier manually depending on the requirements of your project.

{  
    "name": "bazel_build_nodejs",  
    "version": "0.1.0",  
    "private": true,  
    "devDependencies": {  
        "@bazel/bazelisk": "latest",  
        "@bazel/ibazel": "latest",  
        "@bazel/buildifier": "latest"  
    },  
    "scripts": {  
        "build": "bazel build //...",  
        "test": "bazel test //..."  
    }  
}  
  

Let’s install these packages and a few more like Babel to transpile your JavaScript code.

npm install @babel/core @babel/cli @babel/preset-env  

A package-lock.json file will also be automatically created for you. Your WORKSPACE.bazel file should look like this:

# Bazel workspace created by @bazel/create 3.4.1  
  
# Declares that this directory is the root of a Bazel workspace.  
# See https://docs.bazel.build/versions/master/build-ref.html#workspace  
workspace(  
    # How this workspace would be referenced with absolute labels from another workspace  
    name = "bazel_build_nodejs",  
    # Map the @npm bazel workspace to the node_modules directory.  
    # This lets Bazel use the same node_modules as other local tooling.  
    managed_directories = {"@npm": ["node_modules"]},  
)  
  
# Install the nodejs "bootstrap" package  
# This provides the basic tools for running and packaging Node.js programs in Bazel  
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")  
http_archive(  
    name = "build_bazel_rules_nodejs",  
    sha256 = "a160d9ac88f2aebda2aa995de3fa3171300c076f06ad1d7c2e1385728b8442fa",  
    urls = ["https://github.com/bazelbuild/rules_nodejs/releases/download/3.4.1/rules_nodejs-3.4.1.tar.gz"],  
)  
  
# The npm_install rule runs npm anytime the package.json or package-lock.json file changes.  
# It also extracts any Bazel rules distributed in an npm package.  
load("@build_bazel_rules_nodejs//:index.bzl", "npm_install")  
npm_install(  
    # Name this npm so that Bazel Label references look like @npm//package  
    name = "npm",  
    package_json = "//:package.json",  
    package_lock_json = "//:package-lock.json",  
)  
  

This file tells Bazel where to pull the tools for running your project and fetches all the required rules to create a build. You also need to tell Bazel to use the rules that are auto-generated from your npm packages. Add the following line to the top of your BUILD.bazel:

load("@npm//@babel/cli:index.bzl", "babel")  

Let’s add a simple console statement inside app.js:

console.log('NodeJS Built using Bazel!');  

Next add the following code inside es5.babelrc to configure Babel for transpiling JavaScript code:

{  
    "sourceMaps": "inline",  
    "presets": [  
      [  
        "@babel/preset-env",  
        {  
          "modules": "systemjs"  
        }  
      ]  
    ]  
  }  

Finally, you need to tell Bazel how to take JavaScript inputs and convert them to transpiled ES5 output. Add the following code inside the BUILD.bazel file after the previous load statement:

babel(  
    name = "compile",  
    data = [  
        "app.js",  
        "es5.babelrc",  
        "@npm//@babel/preset-env",  
    ],  
    outs = ["app.es5.js"],  
    args = [  
        "app.js",  
        "--config-file",  
        "./$(execpath es5.babelrc)",  
        "--out-file",  
        "$(execpath app.es5.js)",  
    ],  
)  

Run the following command to build and compile your JavaScript code:

npm run build  

In case you run into an error, try renaming your WORKSPACE.bazel file to simply WORKSPACE. If all goes well, you should see something similar to the following screenshot on your terminal:

[Screenshot: terminal log of the build command]

You will see bazel-out and a dist directory where your output files will be present.

[Screenshot: output directories]

If you check inside dist/bin/app.es5.js, you should see your transpiled ES5 JavaScript code as shown:

System.register([], function (_export, _context) {  
  "use strict";  
  
  return {  
    setters: [],  
    execute: function () {  
      console.log('NodeJS Built using Bazel!');  
    }  
  };  
});  
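
The System.register wrapper appears because es5.babelrc sets "modules" to "systemjs", so the file is meant to be run by a SystemJS-style loader. To show what such a loader does with it, here is a tiny stand-in written for this article. It is a sketch under heavy assumptions (no dependency resolution, a made-up context object) and not the real SystemJS API surface.

```javascript
// Illustrative sketch: a tiny stand-in for a System.register loader.
// It handles only modules without dependencies, like the output above.
const registry = [];
const System = {
  register(deps, declare) {
    registry.push({ deps, declare });
  },
};

function runRegistered() {
  for (const { deps, declare } of registry) {
    if (deps.length > 0) {
      throw new Error("This sketch does not resolve dependencies");
    }
    const exportsObject = {};
    const _export = (name, value) => {
      exportsObject[name] = value;
    };
    // The second argument is a hypothetical context object.
    const { execute } = declare(_export, { id: "app" });
    execute();
  }
}

// The transpiled module from dist/bin/app.es5.js:
System.register([], function (_export, _context) {
  "use strict";

  return {
    setters: [],
    execute: function () {
      console.log('NodeJS Built using Bazel!');
    }
  };
});

runRegistered(); // prints "NodeJS Built using Bazel!"
```

The module body runs only when the loader calls execute, which is why the original console statement ends up inside that function.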

Setting Up Continuous Integration

Bazel recommends using container environments like the ngcontainer Docker image for continuous integration (CI). You can add CI-specific settings to your bazelrc file by prefixing lines with build:ci or test:ci and passing --config=ci on the command line.

If you’re using CircleCI, you can use this example as a reference. If you’re using GitLab, you can set up CI in minutes using the following configuration:

variables:  
  BAZEL_DIGEST_VERSION: "f670e9aec235aa23a5f068566352c5850a67eb93de8d7a2350240c68fcec3b25" # Bazel 3.4.1  
  
build:  
  image:  
    name: gcr.io/cloud-marketplace-containers/google/bazel@sha256:$BAZEL_DIGEST_VERSION  
    entrypoint: [""]  
  stage: build  
  script:  
    - bazel --output_base output build //main/...  
  artifacts:  
    paths:  
      - bazel-bin/main/hello-world  
  cache:  
    key: $BAZEL_DIGEST_VERSION  
    paths:  
      - output  

The above configuration defines the build outputs and the cache directory, and pinning the image by digest ensures immutability. The GitLab team has a dedicated article on this that serves as a good reference.

Downsides of Bazel and the Monorepo Pattern

The monorepo pattern is trendy these days, but there are some trade-offs you should be aware of. For a large and diverse team working on a monorepo, it might not be a great idea to expose every ounce of that codebase to novice developers.

Besides someone messing things up accidentally, keeping open access to all your config files, API keys, and so on might pose an issue from a security standpoint. Along similar lines, you can understand why many open-source projects don’t live inside monorepos yet.

While Bazel definitely does some magic to ease this pain for developers, it doesn’t have a large open-source community backing it yet. Having all your source code in one place could also slow down the general process of approving pull requests and running the build scripts.

Bazel also promotes a strict demarcation between your dependencies and source code, while modern languages and frameworks have dedicated directories for bookkeeping dependencies. For instance, an npm project will always have its dependencies in a node_modules directory inside the root directory. Diverging away from that pattern can present a steep learning curve, or at minimum an uncomfortable change.

Conclusion

Thanks to its well-structured configuration files and multiple language support, Bazel is a viable option for your large multi-language project deployed on multiple platforms. It’s fast, and you can even optimize your slow builds using your own build cache. Google has tried and tested Bazel’s core features to validate its stability, and its extensive documentation is some compensation for the small community.

If you’d like to explore further, you can build your own React or Angular app using Bazel to see how it treats different environments of the same language. You can also try out their tutorials for different languages to get a bigger picture of how Bazel works. And if today’s the day you’re welcoming Bazel into your project, definitely take a moment to familiarize yourself with its documented best practices.

If the benefits of Bazel look promising but the downsides prevent you from adopting it, then take a look at Earthly. It supports both monorepos and polyrepos and has a gentler learning curve.


Siddhant Varma is a frontend engineer who loves writing about tech and educating the community.
