Backstage catalog-info.yaml Cataloger
Augments existing Lunar components with owner, domain, and tag metadata read from each repo's `catalog-info.yaml`. Fetches files via the GitHub Contents API on a schedule. Runs per component. Use when Backstage data lives in repo files, not in a Backstage server.
Add backstage-catalog-info to your lunar-config.yml:
uses: github://earthly/lunar-lib/catalogers/backstage-catalog-info@v1.0.5
What This Integration Syncs
This integration includes 1 cataloger that syncs data from your systems.
augment
For each existing Lunar component, fetches catalog-info.yaml from the component's GitHub repo via the Contents API, parses it (supports multi-document YAML), picks the matching Component entity, and writes its owner, domain, and tags to .components["$LUNAR_COMPONENT_ID"] in the Catalog JSON. Repos with no catalog-info.yaml, an unparseable file, or no matching Component are skipped silently.
Schedule: 0 3 * * * (daily at 03:00 UTC)
How Catalogers Fit into Lunar
Lunar catalogers sync component metadata into your Lunar catalog from external systems or source code. They can run on a schedule or be triggered by code changes to keep your service registry up-to-date.
By automatically discovering components from GitHub organizations, service registries, or by detecting technology usage in source code, catalogers ensure your guardrails apply to all relevant services without manual configuration.
Example Catalog Entry
This cataloger syncs component metadata into your Lunar catalog. Here's an example of the catalog entries it writes:
{
  "components": {
    "github.com/acme/payment-api": {
      "owner": "group:default/team-payments",
      "domain": "platform.payments",
      "tags": ["bs-payments", "bs-tier1", "bs-type-service", "bs-lifecycle-production"]
    },
    "github.com/acme/web-app": {
      "owner": "group:default/team-web",
      "domain": "platform.frontend",
      "tags": ["bs-frontend", "bs-type-website", "bs-lifecycle-production"]
    }
  }
}
Configuration
Configure this cataloger in your lunar-config.yml.
Inputs
| Input | Required | Default | Description |
|---|---|---|---|
| `paths` | Optional | `catalog-info.yaml,catalog-info.yml` | Comma-separated list of file paths to try in the component's repo. First match wins. Defaults match the conventional Backstage locations. |
| `branch` | Optional | (empty) | Git ref to read `catalog-info.yaml` from. Empty means the repo's default branch. |
| `component_id_annotation` | Optional | `github.com/project-slug` | Annotation key on a Backstage Component used to match it against the current Lunar component. The cataloger picks the entity whose `metadata.annotations[<key>]` value, prefixed with `component_id_prefix`, equals `$LUNAR_COMPONENT_ID`. If the file has exactly one Component entity and no annotation, that single entity is used (the single-Component-per-repo case). |
| `component_id_prefix` | Optional | `github.com/` | String prepended to the annotation value when matching against `$LUNAR_COMPONENT_ID`. |
| `domain_annotation` | Optional | (empty) | Annotation key on a Backstage Component used to source the domain. Useful for orgs that model domains via a custom annotation rather than the canonical Backstage `spec.domain` field. When set and the matched Component carries this annotation, its value wins over `spec.domain` / `spec.system`. Leave empty to use the canonical Backstage fields only. |
| `tag_prefix` | Optional | `bs-` | Prefix added to Backstage `metadata.tags` when mapped to Lunar tags. Also applied to derived tags like `type-<spec.type>` and `lifecycle-<spec.lifecycle>`. Empty string disables the prefix. |
| `include_derived_tags` | Optional | `true` | When `true`, emits derived tags from `spec.type` (e.g. `bs-type-service`) and `spec.lifecycle` (e.g. `bs-lifecycle-production`) in addition to `metadata.tags`. |
| `owner_format` | Optional | `as-is` | How to write `spec.owner` from the catalog-info file into the Lunar `owner` field. Backstage entity refs typically look like `group:default/team-payments` or `user:default/jane`. `as-is`: pass the value through verbatim. Matches what the existing `policies/backstage/*` checks accept (`team-payments`, `group:infra`, `user:alice` are all valid). `bare-name`: strip the `<kind>:<namespace>/` prefix and write only the trailing name (e.g. `team-payments`). |
| `default_owner` | Optional | (empty) | Fallback owner applied (verbatim) to components whose matched Backstage entity has no `spec.owner`. Leave empty to skip. |
Secrets
This cataloger requires the following secrets to be configured in Lunar:
| Secret | Description |
|---|---|
| `GH_TOKEN` | GitHub token used to fetch `catalog-info.yaml` from component repos via the Contents API. Needs `Contents: Read` on the target repos (`repo` scope on a classic PAT, or `contents: read` on a fine-grained PAT / GitHub App installation token). |
Documentation
Backstage catalog-info.yaml Cataloger
Augments existing Lunar components with metadata read from each repo's catalog-info.yaml, fetched directly via the GitHub Contents API.
Overview
Augments existing Lunar components with owner, domain, and tag metadata from each repo's catalog-info.yaml. Runs per component via the component-cron hook, fetches the file directly from the component's GitHub repo via the Contents API, picks the matching Component entity, and writes owner / domain / tags to that component's catalog entry.
Because component-cron cannot create new components, pair this with a component-defining cataloger — typically github-org.
Synced Data
This cataloger writes to the following Catalog JSON paths on each run:
| Path | Type | Description |
|---|---|---|
| `.components["$LUNAR_COMPONENT_ID"].owner` | string | spec.owner of the matched Backstage Component (or default_owner fallback) |
| `.components["$LUNAR_COMPONENT_ID"].domain` | string | metadata.annotations[<domain_annotation>] of the matched Component when domain_annotation is configured and the annotation is present; otherwise spec.domain, falling back to spec.system when spec.domain is not set |
| `.components["$LUNAR_COMPONENT_ID"].tags[]` | array | metadata.tags plus derived type-* / lifecycle-* tags, all prefixed with tag_prefix |
| `.domains["<domain>"]` | object | Stub entry ({}) for each domain a component references. Hub catalog validation rejects components that reference unknown domains, so the cataloger writes the stub before the component entry. When the same catalog-info.yaml declares a matching kind: Domain or kind: System entity, its metadata.description and spec.owner are propagated into the stub. |
This cataloger does not define new components — that's out of scope for component-cron. Pair with a component-defining cataloger (see Layering). Domain entries are written as stubs only; for a richer global domain catalog, layer with the backstage cataloger.
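The domain stub behavior described in the table can be sketched roughly as follows. This is a minimal illustration, not the cataloger's actual code: the function name `build_domain_stub` is hypothetical, and matching a Domain or System entity by its `metadata.name` is an assumption, since the source only says "matching".

```python
def build_domain_stub(domain, entities):
    """Return the .domains[<domain>] entry for a referenced domain.

    Assumption: a Domain/System entity "matches" when its metadata.name
    equals the domain string. The real matching rule may differ.
    """
    for entity in entities:
        if entity.get("kind") not in ("Domain", "System"):
            continue
        metadata = entity.get("metadata") or {}
        if metadata.get("name") != domain:
            continue
        stub = {}
        description = metadata.get("description")
        owner = (entity.get("spec") or {}).get("owner")
        if description:
            stub["description"] = description
        if owner:
            stub["owner"] = owner
        return stub
    # No matching entity in the file: write an empty stub so the
    # component's domain reference still passes validation.
    return {}
```

With no matching Domain or System entity in the file, the function returns `{}`, which corresponds to the bare stub case.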
Example Catalog JSON output (across multiple component runs)
{
  "components": {
    "github.com/acme/payment-api": {
      "owner": "group:default/team-payments",
      "domain": "platform.payments",
      "tags": ["bs-payments", "bs-tier1", "bs-type-service", "bs-lifecycle-production"]
    },
    "github.com/acme/web-app": {
      "owner": "group:default/team-web",
      "domain": "platform.frontend",
      "tags": ["bs-frontend", "bs-type-website", "bs-lifecycle-production"]
    }
  },
  "domains": {
    "platform.payments": {
      "description": "Payments platform — billing, ledger, settlement",
      "owner": "group:default/team-payments"
    },
    "platform.frontend": {}
  }
}
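To show where tag lists like the ones in this example come from, here is a rough sketch of the tag mapping driven by `tag_prefix` and `include_derived_tags`. The function name `build_tags` is hypothetical, and the ordering (declared tags first, then type, then lifecycle) is an assumption inferred from the example output rather than a documented guarantee.

```python
def build_tags(entity, tag_prefix="bs-", include_derived_tags=True):
    """Map a parsed Backstage Component entity to Lunar tags."""
    metadata = entity.get("metadata") or {}
    spec = entity.get("spec") or {}

    tags = list(metadata.get("tags") or [])
    if include_derived_tags:
        # Derived tags follow the type-<spec.type> / lifecycle-<spec.lifecycle> shape.
        if spec.get("type"):
            tags.append(f"type-{spec['type']}")
        if spec.get("lifecycle"):
            tags.append(f"lifecycle-{spec['lifecycle']}")

    # The prefix applies to declared and derived tags alike;
    # an empty prefix leaves the tags unchanged.
    return [f"{tag_prefix}{tag}" for tag in tags]
```

A Component with `metadata.tags: [payments, tier1]`, `spec.type: service`, and `spec.lifecycle: production` would map to the four tags shown for `payment-api` in the example.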
Catalogers
| Cataloger | Description |
|---|---|
| augment | Fetches catalog-info.yaml from the current component's GitHub repo via the Contents API, parses the YAML (multi-document files supported), picks the matching Component entity, and writes its owner / domain / tags to .components["$LUNAR_COMPONENT_ID"] in the Catalog JSON |
Hook Type
| Hook | Schedule | Description |
|---|---|---|
| component-cron | 0 3 * * * | Runs daily at 03:00 UTC, once per existing component |
component-cron invokes the cataloger separately for each Lunar component currently in the catalog, exposing the component identifier as $LUNAR_COMPONENT_ID. See the cataloger-hooks reference for the full contract.
Daily at 03:00 is a conservative default — catalog-info.yaml changes typically land on the order of hours-to-days, and the schedule is offset by an hour from the standard 0 2 * * * so it lands after component-defining catalogers populate the catalog. Tighten the cadence by overriding hook.schedule in a fork.
Installation
Add to your lunar-config.yml:
catalogers:
  - uses: github.com/earthly/lunar-lib/catalogers/backstage-catalog-info@v1.0.0
Set the GitHub token used to fetch catalog-info.yaml from each repo:
lunar secret set GH_TOKEN <your-github-token>
The token needs Contents: Read on every repo this cataloger will read (repo scope on a classic PAT; contents: read on a fine-grained PAT or GitHub App installation token). Many lunar-lib plugins reuse the same GH_TOKEN, so if you've already set it for github-org or any of the GitHub-API collectors, this cataloger picks it up automatically.
Because component-cron only augments existing components, a component-defining cataloger must run first (see the Layering section below).
Layering with a Component-Defining Cataloger
component-cron requires components to already exist. Run github-org first so this cataloger has something to augment:
catalogers:
  - uses: github.com/earthly/lunar-lib/catalogers/github-org@v1.0.0
    with:
      org_name: "acme"
  - uses: github.com/earthly/lunar-lib/catalogers/backstage-catalog-info@v1.0.0
Pick This or the Live Backstage Cataloger — Not Both
The data source is the same Backstage metadata; the difference is where you read it from. Pick one based on whether you run a Backstage server:
| Use this cataloger (backstage-catalog-info) when… | Use the live-API backstage cataloger when… |
|---|---|
| You don't run a Backstage server; catalog-info.yaml files in repos are the source of truth | You run a Backstage instance and want its server-side processing (group hierarchy resolution, namespace defaults, relations) |
| You want repo-file fidelity (whatever is committed is what shows up) | You want a single global pull at fixed cadence against a central API |
Running both would write to the same .components keys with the same data via different paths — wasteful and the last-declared cataloger silently wins. Don't layer them; pick the one that matches your Backstage setup.
Mapping Components to Backstage Entities
A catalog-info.yaml may declare more than one entity (monorepos commonly ship a Component + a System in one file, or several Components for sub-packages). The cataloger picks which Component corresponds to the current Lunar component using two rules:
- Annotation match (preferred). If any Component entity in the file has the configured annotation, only annotated entries participate in matching: the cataloger picks the one whose metadata.annotations[<component_id_annotation>] value, prefixed with component_id_prefix, equals $LUNAR_COMPONENT_ID. Defaults assume the standard github.com/project-slug annotation:

      with:
        component_id_annotation: "github.com/project-slug"  # value: "acme/payment-api"
        component_id_prefix: "github.com/"                  # → "github.com/acme/payment-api"

  If no annotated entry matches, the cataloger skips silently: it refuses to guess for a repo that already uses annotations to disambiguate.
- Single-Component fallback. If no Component entity has the annotation and the file contains exactly one Component, that entity is used. This covers the common single-Component-per-repo case where the annotation isn't worth maintaining.
If the file has multiple Component entities and none are annotated, the cataloger skips silently — the YAML needs annotations to disambiguate.
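The two matching rules and the skip cases can be sketched as a single selection function. This is an illustrative reading of the rules above, not the cataloger's source: `pick_component` is a hypothetical name, and `entities` is assumed to be the list of YAML documents already parsed into dictionaries.

```python
def pick_component(entities, component_id,
                   annotation="github.com/project-slug",
                   prefix="github.com/"):
    """Return the Component entity for component_id, or None to skip."""
    components = [e for e in entities
                  if isinstance(e, dict) and e.get("kind") == "Component"]

    def annotation_value(entity):
        annotations = (entity.get("metadata") or {}).get("annotations") or {}
        return annotations.get(annotation)

    annotated = [c for c in components if annotation_value(c) is not None]
    if annotated:
        # Rule 1: once any Component is annotated, only annotated
        # entries participate, and a miss means skip (no guessing).
        for candidate in annotated:
            if prefix + str(annotation_value(candidate)) == component_id:
                return candidate
        return None

    # Rule 2: single-Component fallback when nothing is annotated.
    if len(components) == 1:
        return components[0]

    # Zero Components, or several with no annotations to disambiguate.
    return None
```

Returning `None` stands in for "skip silently" in every case where the file cannot be mapped unambiguously.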
Restricting Synced Kinds
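This cataloger only maps kind: Component entities to Lunar components. Domain, System, API, Resource, User, Group, Location, etc. are not synced as catalog entries of their own: they're either container-level concepts (handled by a global cataloger like backstage) or not Lunar catalog concerns. The one exception is a Domain or System entity in the same file that matches the component's domain, whose description and owner enrich the domain stub (see Synced Data).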
Sourcing the Domain from a Custom Annotation
Some orgs model component domains via a custom annotation rather than the canonical Backstage spec.domain field — for example, to express a hierarchical name like engineering.tooling.observability that Backstage's flat spec.domain doesn't model well. Set domain_annotation to that key and the cataloger reads it from metadata.annotations[<key>]:
catalogers:
  - uses: github.com/earthly/lunar-lib/catalogers/backstage-catalog-info@v1.0.0
    with:
      domain_annotation: "yourorg.example.com/domain"
When set and the matched Component has that annotation, its value wins over spec.domain / spec.system. When the annotation is absent on a given entity, the cataloger falls back to spec.domain then spec.system as usual. Leave domain_annotation empty (the default) to use only the canonical Backstage fields.
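The precedence described here (custom annotation, then spec.domain, then spec.system) can be written as a small lookup. A minimal sketch, assuming the entity is already parsed into a dictionary; the name `resolve_domain` is hypothetical.

```python
def resolve_domain(entity, domain_annotation=""):
    """Pick the domain for a matched Component, or None if nothing is set."""
    metadata = entity.get("metadata") or {}
    spec = entity.get("spec") or {}

    if domain_annotation:
        value = (metadata.get("annotations") or {}).get(domain_annotation)
        if value:
            # A configured annotation that is present wins over spec fields.
            return value

    # Canonical Backstage fields: spec.domain first, then spec.system.
    return spec.get("domain") or spec.get("system") or None
```

Leaving `domain_annotation` empty skips the first branch entirely, so only the canonical fields are consulted.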
Owner Format
Backstage spec.owner is typically an entity reference like group:default/team-payments or user:default/jane, not an email. By default this cataloger passes the value through verbatim — matching what the existing policies/backstage/owner-set policy already accepts (team-payments, group:infra, user:alice are all valid).
If you'd rather store bare names, set owner_format: bare-name to strip the <kind>:<namespace>/ prefix. default_owner is also written verbatim, regardless of owner_format.
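As a rough illustration of the two formats, the sketch below applies `owner_format` to a `spec.owner` value and falls back to `default_owner` verbatim. The name `format_owner` is hypothetical. How `bare-name` treats a reference with a kind but no namespace (such as `group:infra`) is not stated in the source; stripping the kind there is an assumption.

```python
def format_owner(spec_owner, owner_format="as-is", default_owner=""):
    """Return the value to write to the Lunar owner field, or None to skip."""
    if not spec_owner:
        # default_owner is written verbatim, regardless of owner_format.
        return default_owner or None

    if owner_format == "bare-name":
        # Drop "<kind>:<namespace>/" by keeping what follows the last "/".
        name = spec_owner.rsplit("/", 1)[-1]
        # Assumption: also drop a bare "<kind>:" when no namespace is present.
        return name.rsplit(":", 1)[-1]

    # "as-is": pass the entity reference through unchanged.
    return spec_owner
```

For example, `format_owner("group:default/team-payments", "bare-name")` yields `team-payments`, while the default `as-is` returns the reference unchanged.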
Source System
GitHub — the cataloger calls the Contents API once per component invocation to fetch catalog-info.yaml from each repo. Requirements:
- GH_TOKEN secret with read access to every repo this cataloger will read (Contents: Read on a fine-grained PAT, repo on a classic PAT, or contents: read on a GitHub App installation token).
- Component IDs match <component_id_prefix><owner>/<repo> (default github.com/<owner>/<repo>). Non-GitHub component IDs are skipped silently; this cataloger is GitHub-specific.
The cataloger makes no other external calls. YAML parsing and entity selection happen in-process; the only outbound traffic is the GitHub fetch.
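For orientation, the fetch step might look something like the sketch below, which uses only the Python standard library and GitHub's "get repository content" REST endpoint. The helper names (`parse_repo`, `contents_url`, `fetch_catalog_info`) are hypothetical. Requiring exactly two path segments after the prefix and treating a 404 as "try the next path" are assumptions about behavior the source describes only in prose.

```python
import os
import urllib.error
import urllib.parse
import urllib.request


def parse_repo(component_id, prefix="github.com/"):
    """Split "<prefix><owner>/<repo>" into (owner, repo); None if it doesn't fit."""
    if not component_id.startswith(prefix):
        return None  # non-GitHub component IDs are skipped
    parts = component_id[len(prefix):].split("/")
    if len(parts) != 2 or not all(parts):
        return None
    return parts[0], parts[1]


def contents_url(owner, repo, path, ref=""):
    """Build the Contents API URL; an empty ref means the default branch."""
    url = (f"https://api.github.com/repos/{owner}/{repo}"
           f"/contents/{urllib.parse.quote(path)}")
    if ref:
        url += "?" + urllib.parse.urlencode({"ref": ref})
    return url


def fetch_catalog_info(component_id,
                       paths=("catalog-info.yaml", "catalog-info.yml"),
                       ref="", token=None):
    """Return the raw file text from the first path that exists, else None."""
    repo = parse_repo(component_id)
    if repo is None:
        return None
    token = token or os.environ.get("GH_TOKEN", "")
    # The raw media type asks GitHub for the file body instead of base64 JSON.
    headers = {"Accept": "application/vnd.github.raw+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    for path in paths:
        request = urllib.request.Request(contents_url(*repo, path, ref),
                                         headers=headers)
        try:
            with urllib.request.urlopen(request, timeout=30) as response:
                return response.read().decode("utf-8")
        except urllib.error.HTTPError as err:
            if err.code == 404:
                continue  # file not at this path; first match wins
            raise
    return None
```

The returned text would still need YAML parsing (multi-document aware) before entity selection; that step is left out here to keep the sketch dependency-free.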
Open Source
This cataloger is open source and available on GitHub. Contribute improvements, report issues, or fork it for your own use.