Backstage catalog-info.yaml Cataloger - Lunar Cataloger

Backstage catalog-info.yaml Cataloger

Cataloger · Beta · Service Catalog · VCS

Augments existing Lunar components with owner, domain, and tag metadata read from each repo's `catalog-info.yaml`. Fetches files via the GitHub Contents API on a schedule. Runs per component. Use when Backstage data lives in repo files, not in a Backstage server.

Add backstage-catalog-info to your lunar-config.yml:
uses: github://earthly/lunar-lib/catalogers/backstage-catalog-info@v1.0.5

What This Integration Syncs

This integration includes 1 cataloger that syncs data from your systems.

augment (Cataloger, component-cron)

For each existing Lunar component, fetches catalog-info.yaml from the component's GitHub repo via the Contents API, parses it (supports multi-document YAML), picks the matching Component entity, and writes its owner, domain, and tags to .components["$LUNAR_COMPONENT_ID"] in the Catalog JSON. Repos with no catalog-info.yaml, an unparseable file, or no matching Component are skipped silently.

Schedule: 0 3 * * *
service catalog backstage catalog-info components ownership augment

How Catalogers Fit into Lunar

Lunar catalogers sync component metadata into your Lunar catalog from external systems or source code. They can run on a schedule or be triggered by code changes to keep your service registry up-to-date.

By automatically discovering components from GitHub organizations, service registries, or by detecting technology usage in source code, catalogers ensure your guardrails apply to all relevant services without manual configuration.

Learn How Lunar Works

1. Catalogers Sync Context (this integration): Sync component metadata from service catalogs, ownership systems, and external APIs.
2. Guardrails Engine: Once cataloged, components are automatically analyzed by collectors and evaluated against your guardrails.

Example Catalog Entry

This cataloger syncs component metadata into your Lunar catalog. Here's an example of the catalog entries it writes:

{
  "components": {
    "github.com/acme/payment-api": {
      "owner": "group:default/team-payments",
      "domain": "platform.payments",
      "tags": ["bs-payments", "bs-tier1", "bs-type-service", "bs-lifecycle-production"]
    },
    "github.com/acme/web-app": {
      "owner": "group:default/team-web",
      "domain": "platform.frontend",
      "tags": ["bs-frontend", "bs-type-website", "bs-lifecycle-production"]
    }
  }
}

Configuration

Configure this cataloger in your lunar-config.yml.

Inputs

| Input | Required | Default | Description |
| --- | --- | --- | --- |
| paths | Optional | catalog-info.yaml,catalog-info.yml | Comma-separated list of file paths to try in the component's repo. First match wins. Defaults match the conventional Backstage locations. |
| branch | Optional | (empty) | Git ref to read `catalog-info.yaml` from. Empty means the repo's default branch. |
| component_id_annotation | Optional | github.com/project-slug | Annotation key on a Backstage Component used to match it against the current Lunar component. The cataloger picks the entity whose `metadata.annotations[<key>]` value, prefixed with `component_id_prefix`, equals `$LUNAR_COMPONENT_ID`. If the file has exactly one Component entity and no annotation, that single entity is used (the single-Component-per-repo case). |
| component_id_prefix | Optional | github.com/ | String prepended to the annotation value when matching against `$LUNAR_COMPONENT_ID`. |
| domain_annotation | Optional | (empty) | Annotation key on a Backstage Component used to source the domain in preference to `spec.domain`. Useful for orgs that model domains via a custom annotation rather than the canonical Backstage `spec.domain` field. When set and the matched Component carries this annotation, its value wins over `spec.domain` / `spec.system`; when the annotation is absent, the canonical fields are used. Leave empty to use the canonical Backstage fields only. |
| tag_prefix | Optional | bs- | Prefix added to Backstage `metadata.tags` when mapped to Lunar tags. Also applied to derived tags like `type-<spec.type>` and `lifecycle-<spec.lifecycle>`. Empty string disables the prefix. |
| include_derived_tags | Optional | true | When `true`, emits derived tags from `spec.type` (e.g. `bs-type-service`) and `spec.lifecycle` (e.g. `bs-lifecycle-production`) in addition to `metadata.tags`. |
| owner_format | Optional | as-is | How to write `spec.owner` from the catalog-info file into the Lunar `owner` field. Backstage entity refs typically look like `group:default/team-payments` or `user:default/jane`. `as-is`: pass the value through verbatim; matches what the existing `policies/backstage/*` checks accept (`team-payments`, `group:infra`, `user:alice` are all valid). `bare-name`: strip the `<kind>:<namespace>/` prefix and write only the trailing name (e.g. `team-payments`). |
| default_owner | Optional | (empty) | Fallback owner applied (verbatim) to components whose matched Backstage entity has no `spec.owner`. Leave empty to skip. |

Secrets

This cataloger requires the following secrets to be configured in Lunar:

| Secret | Description |
| --- | --- |
| GH_TOKEN | GitHub token used to fetch `catalog-info.yaml` from component repos via the Contents API. Needs `Contents: Read` on the target repos (`repo` scope on a classic PAT, or `contents: read` on a fine-grained PAT / GitHub App installation token). |

Documentation


Backstage catalog-info.yaml Cataloger

Augments existing Lunar components with metadata read from each repo's catalog-info.yaml, fetched directly via the GitHub Contents API.

Overview

Augments existing Lunar components with owner, domain, and tag metadata from each repo's catalog-info.yaml. Runs per component via the component-cron hook, fetches the file directly from the component's GitHub repo via the Contents API, picks the matching Component entity, and writes owner / domain / tags to that component's catalog entry.

Because component-cron cannot create new components, pair this with a component-defining cataloger — typically github-org.

Synced Data

This cataloger writes to the following Catalog JSON paths on each run:

| Path | Type | Description |
| --- | --- | --- |
| .components["$LUNAR_COMPONENT_ID"].owner | string | spec.owner of the matched Backstage Component (or default_owner fallback) |
| .components["$LUNAR_COMPONENT_ID"].domain | string | metadata.annotations[<domain_annotation>] of the matched Component when domain_annotation is configured and the annotation is present; otherwise spec.domain, falling back to spec.system when spec.domain is unset |
| .components["$LUNAR_COMPONENT_ID"].tags[] | array | metadata.tags plus derived type-* / lifecycle-* tags, all with tag_prefix |
| .domains["<domain>"] | object | Stub entry ({}) for each domain a component references. Hub catalog validation rejects components that reference unknown domains, so the cataloger writes the stub before the component entry. When the same catalog-info.yaml declares a matching kind: Domain or kind: System entity, its metadata.description and spec.owner are propagated into the stub. |

This cataloger does not define new components — that's out of scope for component-cron. Pair with a component-defining cataloger (see Layering). Domain entries are written as stubs only; for a richer global domain catalog, layer with the backstage cataloger.
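The domain stub behavior described in the table above can be sketched in Python. This is an illustration only, not the cataloger's actual code: the function name `build_domain_stub` is hypothetical, entities are assumed to be already-parsed YAML documents held as plain dicts, and matching a Domain or System entity by `metadata.name` is an assumption of this sketch.

```python
from typing import Any


def build_domain_stub(domain: str, entities: list[dict[str, Any]]) -> dict[str, Any]:
    """Return the .domains["<domain>"] value for one referenced domain.

    Starts from an empty stub and copies description/owner from the first
    kind: Domain or kind: System entity whose metadata.name equals the
    domain. Name equality is an assumption of this sketch.
    """
    stub: dict[str, Any] = {}
    for entity in entities:
        if entity.get("kind") not in ("Domain", "System"):
            continue
        metadata = entity.get("metadata") or {}
        if metadata.get("name") != domain:
            continue
        description = metadata.get("description")
        owner = (entity.get("spec") or {}).get("owner")
        if description:
            stub["description"] = description
        if owner:
            stub["owner"] = owner
        break  # first matching entity wins in this sketch
    return stub
```

With no matching Domain or System entity in the file, the function returns `{}`, which corresponds to the bare stub case.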

Example Catalog JSON output (across multiple component runs)
{
  "components": {
    "github.com/acme/payment-api": {
      "owner": "group:default/team-payments",
      "domain": "platform.payments",
      "tags": ["bs-payments", "bs-tier1", "bs-type-service", "bs-lifecycle-production"]
    },
    "github.com/acme/web-app": {
      "owner": "group:default/team-web",
      "domain": "platform.frontend",
      "tags": ["bs-frontend", "bs-type-website", "bs-lifecycle-production"]
    }
  },
  "domains": {
    "platform.payments": {
      "description": "Payments platform — billing, ledger, settlement",
      "owner": "group:default/team-payments"
    },
    "platform.frontend": {}
  }
}
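The tag values in the example above follow from `tag_prefix` and `include_derived_tags`. The Python sketch below shows one way that mapping could look. The function name `build_tags` is hypothetical, the entity is assumed to be a parsed Component held as a plain dict, and the ordering (metadata tags first, then type, then lifecycle) is inferred from the example output rather than confirmed by the source.

```python
from typing import Any


def build_tags(
    entity: dict[str, Any],
    tag_prefix: str = "bs-",
    include_derived_tags: bool = True,
) -> list[str]:
    """Map a Backstage Component to Lunar tags (illustrative sketch)."""
    metadata = entity.get("metadata") or {}
    spec = entity.get("spec") or {}
    tags = list(metadata.get("tags") or [])
    if include_derived_tags:
        # Derived tags come from spec.type and spec.lifecycle when present.
        if spec.get("type"):
            tags.append(f"type-{spec['type']}")
        if spec.get("lifecycle"):
            tags.append(f"lifecycle-{spec['lifecycle']}")
    # The prefix applies to both metadata tags and derived tags;
    # an empty prefix leaves the tags unchanged.
    return [f"{tag_prefix}{tag}" for tag in tags]


payment_api = {
    "kind": "Component",
    "metadata": {"tags": ["payments", "tier1"]},
    "spec": {"type": "service", "lifecycle": "production"},
}
print(build_tags(payment_api))
# ['bs-payments', 'bs-tier1', 'bs-type-service', 'bs-lifecycle-production']
```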

Catalogers

| Cataloger | Description |
| --- | --- |
| augment | Fetches catalog-info.yaml from the current component's GitHub repo via the Contents API, parses the YAML (multi-document files supported), picks the matching Component entity, and writes its owner / domain / tags to .components["$LUNAR_COMPONENT_ID"] in the Catalog JSON |

Hook Type

| Hook | Schedule | Description |
| --- | --- | --- |
| component-cron | 0 3 * * * | Runs daily at 03:00 UTC, once per existing component |

component-cron invokes the cataloger separately for each Lunar component currently in the catalog, exposing the component identifier as $LUNAR_COMPONENT_ID. See the cataloger-hooks reference for the full contract.

Daily at 03:00 is a conservative default — catalog-info.yaml changes typically land on the order of hours-to-days, and the schedule is offset by an hour from the standard 0 2 * * * so it lands after component-defining catalogers populate the catalog. Tighten the cadence by overriding hook.schedule in a fork.

Installation

Add to your lunar-config.yml:

catalogers:
  - uses: github.com/earthly/lunar-lib/catalogers/backstage-catalog-info@v1.0.0

Set the GitHub token used to fetch catalog-info.yaml from each repo:

lunar secret set GH_TOKEN <your-github-token>

The token needs Contents: Read on every repo this cataloger will read (repo scope on a classic PAT; contents: read on a fine-grained PAT or GitHub App installation token). Many lunar-lib plugins reuse the same GH_TOKEN, so if you've already set it for github-org or any of the GitHub-API collectors, this cataloger picks it up automatically.

Because component-cron only augments existing components, a component-defining cataloger must run first (see the Layering section below).

Layering with a Component-Defining Cataloger

component-cron requires components to already exist. Run github-org first so this cataloger has something to augment:

catalogers:
  - uses: github.com/earthly/lunar-lib/catalogers/github-org@v1.0.0
    with:
      org_name: "acme"

  - uses: github.com/earthly/lunar-lib/catalogers/backstage-catalog-info@v1.0.0

Pick This or the Live Backstage Cataloger — Not Both

The data source is the same Backstage metadata; the difference is where you read it from. Pick one based on whether you run a Backstage server:

| Use this cataloger (backstage-catalog-info) when… | Use the live-API backstage cataloger when… |
| --- | --- |
| You don't run a Backstage server; catalog-info.yaml files in repos are the source of truth | You run a Backstage instance and want its server-side processing (group hierarchy resolution, namespace defaults, relations) |
| You want repo-file fidelity (whatever is committed is what shows up) | You want a single global pull at fixed cadence against a central API |

Running both would write the same data to the same .components keys via different paths. That is wasteful, and the last-declared cataloger silently wins. Don't layer them; pick the one that matches your Backstage setup.

Mapping Components to Backstage Entities

A catalog-info.yaml may declare more than one entity (monorepos commonly ship a Component + a System in one file, or several Components for sub-packages). The cataloger picks which Component corresponds to the current Lunar component using two rules:

  1. Annotation match (preferred). If any Component entity in the file has the configured annotation, only annotated entries participate in matching: the cataloger picks the one whose metadata.annotations[<component_id_annotation>] value, prefixed with component_id_prefix, equals $LUNAR_COMPONENT_ID. Defaults assume the standard github.com/project-slug annotation:

    with:
      component_id_annotation: "github.com/project-slug"  # value: "acme/payment-api"
      component_id_prefix: "github.com/"                    # → "github.com/acme/payment-api"
    

    If no annotated entry matches, the cataloger skips silently — it refuses to guess for a repo that already uses annotations to disambiguate.

  2. Single-Component fallback. If no Component entity has the annotation and the file contains exactly one Component, that entity is used. This covers the common single-Component-per-repo case where the annotation isn't worth maintaining.

If the file has multiple Component entities and none are annotated, the cataloger skips silently — the YAML needs annotations to disambiguate.
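The two rules and the skip cases above can be summarized in a short Python sketch. It is a hedged illustration, not the cataloger's source: `pick_component` is a hypothetical name, and entities are assumed to be the already-parsed YAML documents from one catalog-info.yaml, held as plain dicts.

```python
from typing import Any, Optional

Entity = dict[str, Any]


def pick_component(
    entities: list[Entity],
    component_id: str,
    annotation_key: str = "github.com/project-slug",
    prefix: str = "github.com/",
) -> Optional[Entity]:
    """Select the Component entity for one Lunar component, or None to skip."""
    components = [e for e in entities if e.get("kind") == "Component"]

    def annotations(entity: Entity) -> dict[str, Any]:
        return (entity.get("metadata") or {}).get("annotations") or {}

    annotated = [e for e in components if annotation_key in annotations(e)]
    if annotated:
        # Rule 1: once any Component is annotated, only annotated entries
        # participate, and a miss means skip rather than guess.
        for entity in annotated:
            if prefix + str(annotations(entity)[annotation_key]) == component_id:
                return entity
        return None
    # Rule 2: single-Component fallback when nothing is annotated.
    if len(components) == 1:
        return components[0]
    # Several unannotated Components (or none at all): skip.
    return None
```

For example, a file holding one unannotated Component plus a System resolves to that Component, while a file with two unannotated Components resolves to `None`.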

Restricting Synced Kinds

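This cataloger only syncs kind: Component entities into .components. Domain, System, API, Resource, User, Group, Location, etc. are not synced as catalog entries of their own: they're either container-level concepts (handled by a global cataloger like backstage) or not Lunar catalog concerns. The one exception is a kind: Domain or kind: System entity in the same file that matches the component's domain; its metadata.description and spec.owner are copied into the domain stub (see Synced Data).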

Sourcing the Domain from a Custom Annotation

Some orgs model component domains via a custom annotation rather than the canonical Backstage spec.domain field — for example, to express a hierarchical name like engineering.tooling.observability that Backstage's flat spec.domain doesn't model well. Set domain_annotation to that key and the cataloger reads it from metadata.annotations[<key>]:

catalogers:
  - uses: github.com/earthly/lunar-lib/catalogers/backstage-catalog-info@v1.0.0
    with:
      domain_annotation: "yourorg.example.com/domain"

When set and the matched Component has that annotation, its value wins over spec.domain / spec.system. When the annotation is absent on a given entity, the cataloger falls back to spec.domain then spec.system as usual. Leave domain_annotation empty (the default) to use only the canonical Backstage fields.
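The precedence described above (custom annotation, then spec.domain, then spec.system) can be written as a small Python sketch. The function name `resolve_domain` is hypothetical, the entity is assumed to be a parsed Component held as a plain dict, and treating an empty annotation value as absent is an assumption of this sketch.

```python
from typing import Any, Optional


def resolve_domain(entity: dict[str, Any], domain_annotation: str = "") -> Optional[str]:
    """Return the domain for a matched Component, or None if none is declared."""
    annotations = (entity.get("metadata") or {}).get("annotations") or {}
    spec = entity.get("spec") or {}
    # 1. The custom annotation wins when configured and present.
    if domain_annotation and annotations.get(domain_annotation):
        return annotations[domain_annotation]
    # 2. Otherwise the canonical fields: spec.domain, then spec.system.
    return spec.get("domain") or spec.get("system")


component = {
    "kind": "Component",
    "metadata": {"annotations": {"yourorg.example.com/domain": "engineering.tooling.observability"}},
    "spec": {"system": "observability"},
}
print(resolve_domain(component, "yourorg.example.com/domain"))  # engineering.tooling.observability
print(resolve_domain(component))                                # observability
```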

Owner Format

Backstage spec.owner is typically an entity reference like group:default/team-payments or user:default/jane, not an email. By default this cataloger passes the value through verbatim — matching what the existing policies/backstage/owner-set policy already accepts (team-payments, group:infra, user:alice are all valid).

If you'd rather store bare names, set owner_format: bare-name to strip the <kind>:<namespace>/ prefix. default_owner is also written verbatim, regardless of owner_format.
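A minimal Python sketch of the two owner formats and the default_owner fallback follows. The function name `format_owner` is hypothetical. How `bare-name` treats shortened refs such as `group:infra` (no namespace) is not stated in the source; this sketch assumes the `<kind>:` part is dropped in that case too.

```python
from typing import Optional


def format_owner(
    spec_owner: Optional[str],
    owner_format: str = "as-is",
    default_owner: str = "",
) -> Optional[str]:
    """Return the Lunar owner value, or None when there is nothing to write."""
    if not spec_owner:
        # default_owner is written verbatim, regardless of owner_format.
        return default_owner or None
    if owner_format == "bare-name":
        # Drop an optional "<kind>:" and then an optional "<namespace>/".
        without_kind = spec_owner.split(":", 1)[-1]
        return without_kind.rsplit("/", 1)[-1]
    return spec_owner  # "as-is": pass through unchanged


print(format_owner("group:default/team-payments"))               # group:default/team-payments
print(format_owner("group:default/team-payments", "bare-name"))  # team-payments
```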

Source System

GitHub — the cataloger calls the Contents API once per component invocation to fetch catalog-info.yaml from each repo. Requirements:

  • GH_TOKEN secret with read access to every repo this cataloger will read (Contents: Read on a fine-grained PAT, repo on a classic PAT, or contents: read on a GitHub App installation token).
  • Component IDs match <component_id_prefix><owner>/<repo> (default github.com/<owner>/<repo>). Non-GitHub component IDs are skipped silently — this cataloger is GitHub-specific.

The cataloger makes no other external calls. YAML parsing and entity selection happen in-process; the only outbound traffic is the GitHub fetch.
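For orientation, here is a hedged Python sketch of how a component ID might be turned into a GitHub Contents API request. It only builds the request and sends nothing. The function name `build_contents_request` is hypothetical, and asking for the raw media type is a choice made for this sketch; the source does not say how the cataloger decodes the response. The endpoint shape (`/repos/{owner}/{repo}/contents/{path}` with an optional `ref` query parameter) comes from the public GitHub REST API.

```python
import urllib.parse
import urllib.request
from typing import Optional


def build_contents_request(
    component_id: str,
    path: str,
    token: str,
    branch: str = "",
    prefix: str = "github.com/",
) -> Optional[urllib.request.Request]:
    """Build a Contents API request, or return None for non-GitHub component IDs."""
    if not component_id.startswith(prefix):
        return None  # not a GitHub component: skip
    parts = component_id[len(prefix):].split("/")
    if len(parts) != 2 or not all(parts):
        return None  # this sketch expects exactly <owner>/<repo>
    owner, repo = parts
    url = (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/contents/{urllib.parse.quote(path)}"
    )
    if branch:
        # An empty branch leaves ref unset, so GitHub uses the default branch.
        url += "?" + urllib.parse.urlencode({"ref": branch})
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github.raw+json",  # raw file body
        },
    )
```

A caller could try each entry of `paths` in order and treat a 404 response as "try the next path", which would match the "first match wins" behavior described for that input.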

Open Source

This cataloger is open source and available on GitHub. Contribute improvements, report issues, or fork it for your own use.
