Building a Monorepo with Python
This article explains how to set up a monorepo in Python.
Many software organizations opt to create and maintain repositories based on individual projects, applications, or teams. While this approach allows for full autonomy over each project, it often results in isolated projects that impede cross-team collaboration, particularly as the organization grows and adds more projects or services.
That’s why many have begun to opt for a monorepo setup, where a single repository contains the entire codebase for the organization. Monorepos are beneficial for several reasons. They help increase collaboration across teams, ensure unified build pipelines, and help reduce duplication.
However, creating a monorepo can be complicated, specifically in Python. That’s why, in this article, you’ll learn more about monorepos in Python—including how to put one together using Earthly, a build tool designed for managing monorepos.
How to Build a Monorepo With Python
To help you better understand how to build a monorepo Python project, let’s consider a real-world use case where you, the developer, are building a health and fitness application that lets its users calculate their body mass index (BMI) and their daily calorie intake.
All the code for this article is available in this GitHub repository.
The monorepo setup for this application consists of multiple components, including services and packages that are developed by different teams as independent components but are still shared and managed within the same repository.
The application consists of two backend services and three shared packages, and the project structure is as follows:
.
├── README.md
├── health_fitness_app
│   ├── __init__.py
│   ├── bmi_service
│   │   ├── __init__.py
│   │   ├── bmi_service.py
│   │   └── test_bmi_service.py
│   ├── calorie_intake_service
│   │   ├── __init__.py
│   │   ├── calorie_intake_service.py
│   │   └── test_calorie_intake_service.py
│   ├── main.py
│   └── packages
│       ├── bmi
│       │   ├── __init__.py
│       │   ├── bmi_calculator.py
│       │   └── test_bmi_calculator.py
│       ├── bmr
│       │   ├── __init__.py
│       │   ├── bmr_calculator.py
│       │   └── test_bmr_calculator.py
│       └── calorie
│           ├── __init__.py
│           ├── calorie_calculator.py
│           └── test_calorie_calculator.py
└── requirements.txt
The two services and three packages are part of a single repo. The packages directory is a shared space for all custom-implemented packages that can be shared between the two services (or any number of services that are added to the application in the future).
The first service, bmi_service, calculates the BMI of the user with weight and height inputs. It uses the methods defined in the bmi package in the shared packages directory.
The code for bmi_service looks like this:
from health_fitness_app.packages.bmi.bmi_calculator import (
    calculate_bmi,
    get_bmi_category,
)


def cal_bmi(weight, height):
    """To get your bmi enter your weight(kg) and height(m)"""
    bmi = calculate_bmi(weight, height)
    bmi_category = get_bmi_category(bmi)
    return {"bmi_value": bmi, "bmi_category": bmi_category}
The second service, calorie_intake_service, calculates a user’s daily calorie intake requirements using the following input provided by the user: weight, height, age, sex, and activity_level. This service also uses two shared packages, bmr and calorie, to calculate the basal metabolic rate (BMR) value and calorie intake for the user.
Here’s the code for calorie_intake_service:
from health_fitness_app.packages.bmr.bmr_calculator import calculate_bmr
from health_fitness_app.packages.calorie.calorie_calculator import (
    calculate_calorie_intake,
)


def cal_calories(weight, height, age, sex, activity_level):
    """To get your bmr and daily calorie intake enter your: weight(lbs),
    height(in), age(years), sex(male/female), and
    activity_level(sedentary/lightly active/moderately active/very active)"""
    bmr = calculate_bmr(weight, height, age, sex)
    calories = calculate_calorie_intake(bmr, activity_level)

    return {"daily_calorie_intake": calories}
bmi, bmr, and calorie are custom packages residing in the packages directory. They contain bmi_calculator.py, bmr_calculator.py, and calorie_calculator.py, respectively.
The codebase for bmi_calculator looks like this:
# package containing methods for BMI calculations
def calculate_bmi(weight, height):
    try:
        weight = float(weight)
        height = float(height)
        bmi = weight / (height**2)
        return int(bmi)
    except ValueError:
        return None


def get_bmi_category(bmi):
    if bmi is None:
        return "Invalid input"
    elif bmi < 18.5:
        return "Underweight"
    elif 18.5 <= bmi < 25:
        return "Normal weight"
    elif 25 <= bmi < 30:
        return "Overweight"
    else:
        return "Obese"
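The repository's test_bmi_calculator.py is not shown in this article. As a rough sketch of what pytest-style tests for this package could look like, the block below repeats the two functions so that it runs on its own. The test names and input values are illustrative assumptions, not the repository's actual tests.

```python
# Sketch only: the two functions are copied from bmi_calculator.py above so this
# block is self-contained. Inside the monorepo you would import them instead.


def calculate_bmi(weight, height):
    try:
        weight = float(weight)
        height = float(height)
        bmi = weight / (height**2)
        return int(bmi)
    except ValueError:
        return None


def get_bmi_category(bmi):
    if bmi is None:
        return "Invalid input"
    elif bmi < 18.5:
        return "Underweight"
    elif 18.5 <= bmi < 25:
        return "Normal weight"
    elif 25 <= bmi < 30:
        return "Overweight"
    else:
        return "Obese"


# Hypothetical tests (names and values are assumptions for illustration).
def test_calculate_bmi_truncates_to_int():
    # 70 kg / (1.75 m ** 2) is roughly 22.86, which int() truncates to 22.
    assert calculate_bmi(70, 1.75) == 22


def test_calculate_bmi_rejects_non_numeric_input():
    # float("abc") raises ValueError, which the function turns into None.
    assert calculate_bmi("abc", 1.75) is None


def test_get_bmi_category():
    assert get_bmi_category(None) == "Invalid input"
    assert get_bmi_category(17) == "Underweight"
    assert get_bmi_category(22) == "Normal weight"
    assert get_bmi_category(27) == "Overweight"
    assert get_bmi_category(31) == "Obese"
```

Note that calculate_bmi only catches ValueError, so a height of 0 raises ZeroDivisionError and a None input raises TypeError rather than returning None. Tests like these are a natural place to pin down whichever behavior you want.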
The codebase for bmr_calculator looks like this:
# package containing methods for BMR calculations
def calculate_bmr(weight, height, age, sex):
    try:
        weight = float(weight)
        height = float(height)
        age = int(age)
        sex = str(sex)
        if sex == "male":
            bmr = 66 + (6.3 * weight) + (12.9 * height) - (6.8 * age)
        else:
            bmr = 655 + (4.3 * weight) + (4.7 * height) - (4.7 * age)
        return int(bmr)
    except ValueError:
        return None
The following is the codebase for calorie_calculator:
# package containing methods for daily calorie intake calculations
def calculate_calorie_intake(bmr, activity_level):
    try:
        activity_level = str(activity_level)
        if activity_level == "sedentary":
            calories = bmr * 1.2
        elif activity_level == "lightly active":
            calories = bmr * 1.375
        elif activity_level == "moderately active":
            calories = bmr * 1.55
        else:
            calories = bmr * 1.725
        return int(calories)
    except ValueError:
        return None
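To make the arithmetic concrete, here is a small worked example that chains the two packages the way calorie_intake_service does. The functions are copied from above so the block runs on its own, and the inputs (150 lbs, 65 in, 30 years, female, moderately active) are made-up values chosen for illustration.

```python
# Sketch only: functions copied from bmr_calculator.py and calorie_calculator.py.


def calculate_bmr(weight, height, age, sex):
    try:
        weight = float(weight)
        height = float(height)
        age = int(age)
        sex = str(sex)
        if sex == "male":
            bmr = 66 + (6.3 * weight) + (12.9 * height) - (6.8 * age)
        else:
            bmr = 655 + (4.3 * weight) + (4.7 * height) - (4.7 * age)
        return int(bmr)
    except ValueError:
        return None


def calculate_calorie_intake(bmr, activity_level):
    try:
        activity_level = str(activity_level)
        if activity_level == "sedentary":
            calories = bmr * 1.2
        elif activity_level == "lightly active":
            calories = bmr * 1.375
        elif activity_level == "moderately active":
            calories = bmr * 1.55
        else:
            calories = bmr * 1.725
        return int(calories)
    except ValueError:
        return None


# Illustrative inputs (assumptions): weight in lbs, height in inches.
# 655 + (4.3 * 150) + (4.7 * 65) - (4.7 * 30) = 655 + 645 + 305.5 - 141 = 1464.5
bmr = calculate_bmr(150, 65, 30, "female")
print(bmr)  # 1464 (int() truncates 1464.5)

# 1464 * 1.55 = 2269.2
calories = calculate_calorie_intake(bmr, "moderately active")
print(calories)  # 2269
```

Two details of the code as written are worth keeping in mind: any sex value other than "male" uses the second formula, and any activity_level that is not one of the first three strings falls through to the else branch and is treated as very active (multiplier 1.725).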
At this point, you can already start to see some of the advantages that come with a monorepo setup, including reusable code, consistent tooling, easier integration, and increased collaboration opportunities.
In the upcoming sections, you’ll learn how to use a couple of build tools (Pants and Earthly) to help you with monorepo management.
Monorepo Management With Pants
A build tool helps you run tests, fix linting issues, containerize your application, and create builds that would otherwise be challenging and time-consuming. There are several popular build tools available, including Pants, Bazel, Buck, and Earthly.
Pants is a popular monorepo management tool that is fast, user-friendly, and scalable. It supports Python, Java, Scala, Kotlin, Go, and Docker.
For a more in-depth tutorial on using Pants for Python projects, check out this Earthly article.
Initializing Pants
To initialize Pants as a project, navigate to your project root directory and run the following command:
pants
Executing this command creates hidden folders that Pants uses and a pants.toml file in which the configuration of the projects is defined.
Paste the following into your project’s pants.toml file:
[GLOBAL]
pants_version = "2.18.1"
backend_packages.add = [
    "pants.backend.python",
    "pants.backend.python.lint.black",
    "pants.backend.build_files.fmt.black",
    "pants.backend.python.lint.docformatter",
    "pants.backend.python.lint.flake8",
    "pants.backend.python.typecheck.mypy",
]

[anonymous-telemetry]
enabled = false

[source]
root_patterns = ["/"]

[python]
interpreter_constraints = [">=3.9"]

[python-bootstrap]
search_path = [
    "/usr/bin/python3",
]
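Make sure you create a .flake8 file in your project’s root directory with the following code:
[flake8]
# E203: whitespace before ':'
# E231: missing whitespace after ',', ';', or ':'
# E501: line too long
extend-ignore = E203, E231, E501
This prevents conflicts between the linters and formatters you’ll be using later on; for example, Black’s output can otherwise trigger Flake8’s E203 and E501 checks. The comments sit on their own lines because Flake8 does not support inline comments in option values.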
Setting Up BUILD Files
Pants uses BUILD files to store metadata for each application or module that’s created in each directory within the project.
To initialize the BUILD files, run the following command:
pants tailor ::
This initializes a BUILD file in each of the directories within the project, including the root.
The project structure after initialization looks like this:
.
├── BUILD
├── README.md
├── health_fitness_app
│   ├── BUILD
│   ├── __init__.py
│   ├── bmi_service
│   │   ├── BUILD
│   │   ├── __init__.py
│   │   ├── bmi_service.py
│   │   └── test_bmi_service.py
│   ├── calorie_intake_service
│   │   ├── BUILD
│   │   ├── __init__.py
│   │   ├── calorie_intake_service.py
│   │   └── test_calorie_intake_service.py
│   ├── main.py
│   └── packages
│       ├── bmi
│       │   ├── BUILD
│       │   ├── __init__.py
│       │   ├── bmi_calculator.py
│       │   └── test_bmi_calculator.py
│       ├── bmr
│       │   ├── BUILD
│       │   ├── __init__.py
│       │   ├── bmr_calculator.py
│       │   └── test_bmr_calculator.py
│       └── calorie
│           ├── BUILD
│           ├── __init__.py
│           ├── calorie_calculator.py
│           └── test_calorie_calculator.py
├── .flake8
├── pants.toml
└── requirements.txt
As you can see, seven BUILD files are created. Each BUILD file contains targets for both non-test and test files:
# This target sets the metadata for all the Python non-test files
# in this directory.
python_sources(
    name="lib",
)

# This target sets the metadata for all the Python test files
# in this directory.
python_tests(
    name="tests",
)
You can refer to the GitHub repo to see how each of the BUILD files should be set up for this project.
Checking for Build Errors
Before you move on to the next step, make sure you run the following command to see if there are any errors in the setup of your project:
pants tailor --check ::
If you don’t get an output, your project setup is ready to go.
Running Project Tests
Use the following command to run the unit tests defined for your project:
pants test ::
Fixing Linting and Formatting Issues
Pants supports many of the popular linting and formatting tools for Python, including Flake8, Black, and docformatter.
Each linter or formatter is activated by adding its backend to backend_packages in your pants.toml file, as was done in the configuration above. To run all of the enabled linters, execute the following command:
pants lint ::
This lists all the linting issues in your project.
To automatically fix the formatting issues (those reported by Black and docformatter), execute the following command:
pants fmt ::
To confirm that the issues are resolved, run pants lint :: again. Anything Flake8 still reports has to be fixed by hand, since Flake8 only reports problems and does not rewrite code.
Creating a Pants Package and Running the Application
Even though Python is not a compiled language and does not require a build, you can still package and build your project to easily maintain your code, isolate dependencies, and effectively share it with others.
To create a Pants build, run the following command:
pants package health_fitness_app/main.py ::
This creates a pex_binary.pex file under dist/health_fitness_app:
├── dist
│   ├── health_fitness_app
│   │   └── pex_binary.pex
│   ├── health_fitness_app.bmi.bmi_calculator-0.0.1-py3-none-any.whl
│   ├── health_fitness_app.bmi.bmi_calculator-0.0.1.tar.gz
│   ├── health_fitness_app.bmr.bmr_calculator-0.0.1-py3-none-any.whl
│   ├── health_fitness_app.bmr.bmr_calculator-0.0.1.tar.gz
│   ├── health_fitness_app.calorie.calorie_calculator-0.0.1-py3-none-any.whl
│   └── health_fitness_app.calorie.calorie_calculator-0.0.1.tar.gz
Execute the following command to run your application:
pants run health_fitness_app/main.py
Your output should look like this:
{'bmi_value': 24, 'bmi_category': 'Normal weight'}
{'daily_calorie_intake': 1808}
Monorepo Management With Earthly
Now that you know how to use Pants for monorepo management, it’s time to see how Earthly differs. As you now know, Pants supports Python, making it a suitable choice for large Python-based monorepo projects. It offers features like fine-grained caching for accelerated builds and static analysis for dependency resolution. However, it lacks support for JavaScript and Rust and primarily focuses on build and test steps within workflows.
On the other hand, Earthly supports a wide range of languages, including JavaScript, Python, Java, C++, Go, and Rust, making it well-suited for multilanguage monorepos. Embracing a containerized model often likened to “Docker for builds,” Earthly enables the execution of various build tools compatible with Linux environments.
Setting Up Your Monorepo with Earthly
Earthly uses an Earthfile to manage each service or package. The following is a list of the various components in the application:
.
├── Earthfile
├── health_fitness_app
│   ├── __init__.py
│   ├── bmi_service
│   │   ├── __init__.py
│   │   ├── bmi_service.py
│   │   └── test_bmi_service.py
│   ├── calorie_intake_service
│   │   ├── __init__.py
│   │   ├── calorie_intake_service.py
│   │   └── test_calorie_intake_service.py
│   ├── main.py
│   └── packages
│       ├── bmi
│       │   ├── __init__.py
│       │   ├── bmi_calculator.py
│       │   └── test_bmi_calculator.py
│       ├── bmr
│       │   ├── __init__.py
│       │   ├── bmr_calculator.py
│       │   └── test_bmr_calculator.py
│       └── calorie
│           ├── __init__.py
│           ├── calorie_calculator.py
│           └── test_calorie_calculator.py
└── requirements.txt
An Earthfile has a Docker-like syntax, so if you’re familiar with Docker, using it is easy.
Setting Up the Earthfile
The Earthfile for your health and fitness app looks like this:
VERSION 0.7
FROM python:3
WORKDIR /code

deps:
    RUN pip install --upgrade pip
    RUN pip install wheel
    COPY requirements.txt ./
    RUN pip wheel -r requirements.txt --wheel-dir=wheels
    SAVE ARTIFACT wheels /wheels

build:
    FROM +deps
    COPY health_fitness_app health_fitness_app
    SAVE ARTIFACT health_fitness_app /health_fitness_app

unit-tests:
    COPY +deps/wheels wheels
    COPY +build/health_fitness_app health_fitness_app
    COPY requirements.txt ./
    RUN pip install --no-index --find-links=wheels -r requirements.txt
    RUN pytest health_fitness_app

docker:
    COPY +deps/wheels wheels
    COPY +build/health_fitness_app health_fitness_app
    COPY requirements.txt ./
    ARG tag='latest'
    RUN pip install --no-index --find-links=wheels -r requirements.txt
    ENTRYPOINT ["python3", "health_fitness_app/main.py"]
    SAVE IMAGE python-earthly-monorepo:$tag
This Earthfile contains four different sections, or targets: deps, build, unit-tests, and docker. Each of these targets can be executed independently via the command earthly +<target>.
If you want to resolve your project dependencies, you can execute the following:
earthly +deps
Creating the Project Build
You can create your project build via the following command:
earthly +build
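This command creates your project’s build. Any artifacts it saves with SAVE ARTIFACT can be used in other targets, which is how the unit-tests and docker targets obtain the application code via COPY +build/health_fitness_app.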
Executing Unit Tests
To execute unit tests for your services and packages, you can run the following command:
earthly +unit-tests
Containerizing Your Project
Finally, if you want to containerize your project, you can do so with the following command:
earthly +docker
If you navigate to Docker Desktop, you can see that Earthly successfully created a Docker image for the project, named python-earthly-monorepo and tagged latest by default.
Earthly provides a simpler way to manage a Python monorepo when compared to Pants. Earthly’s Dockerized approach, which utilizes an Earthfile, allows you to define the project dependencies and individual build or test steps to easily containerize the application.
Conclusion
In this article, you learned all about monorepos and why you’d want to use one. You also learned how to build a monorepo in Python and how you can simplify monorepo management with two popular build tools: Pants and Earthly.
If your projects deal with containerized microservices, Earthly is an ideal tool, as it offers extensive capabilities through its Docker-like syntax and container-based approach. This facilitates the effortless creation of distinct builds for each service within your application, providing flexibility, quick build creation, and caching functionalities.