License Origins Collector - Lunar Collector

License Origins Collector

Collector Experimental Security

Scan dependency license files for geographic origin signals — country names in copyright lines, governing law clauses, and author addresses. Results are cached in Postgres for fast repeat scans across projects.

Add license-origins to your lunar-config.yml:
uses: github://earthly/lunar-lib/collectors/license-origins@v1.0.0

What This Integration Collects

This integration includes one collector that gathers metadata from your systems.

Collector code

scan

Fetches dependencies (Rust, Go, Node.js, Python) and generates an internal SBOM to enumerate them, then scans each dependency's license files for country name mentions. Scan results are cached in a configurable Postgres database keyed by PURL@version. Writes origin results to .sbom.license_origins in Component JSON. Use alongside the syft collector for full SBOM + origin coverage.
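The scanning step described above can be sketched roughly as follows. This is a minimal illustration, not the collector's actual implementation: the `COUNTRIES` list, the `scan_license_text` function, and the line-based excerpt logic are all assumptions made for the example.

```python
import re

# Hypothetical, deliberately tiny country list. A real scanner would need a
# much fuller list plus aliases (for example "USA" or "United Kingdom").
COUNTRIES = ["Germany", "France", "Japan", "Canada"]

# One case-insensitive pattern with word boundaries, so "Germany" matches as a
# whole word but not as part of a longer word.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(c) for c in COUNTRIES) + r")\b",
    re.IGNORECASE,
)
_CANONICAL = {c.lower(): c for c in COUNTRIES}


def scan_license_text(text):
    """Return (countries, excerpts) for country names mentioned in text."""
    countries = []
    excerpts = []
    for line in text.splitlines():
        found = [_CANONICAL[m.lower()] for m in _PATTERN.findall(line)]
        if not found:
            continue
        # Keep the whole matching line as the excerpt.
        excerpts.append(line.strip())
        for country in found:
            if country not in countries:
                countries.append(country)
    return countries, excerpts


license_text = """MIT License

Copyright 2024 Hans Mueller, Berlin, Germany

Permission is hereby granted, free of charge, to any person obtaining a copy
"""
countries, excerpts = scan_license_text(license_text)
print(countries)
print(excerpts)
```

Plain name matching like this can produce false positives (a country name inside an unrelated sentence), so the excerpts are worth reviewing alongside the country list.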

license origins, country of origin, geographic origin, license scanning, compliance, supply chain, export control, sbom

How Collectors Fit into Lunar

Lunar watches your code and CI/CD systems to collect SDLC data from config files, test results, IaC, deployment configurations, security scans, and more.

Collectors are the automatic data-gathering layer. They extract structured metadata from your repositories and pipelines, feeding it into Lunar's centralized database where guardrails evaluate it to enforce your engineering standards.

1. Collectors Gather Data (this integration): Triggered by code changes or CI pipelines, collectors extract metadata from config files, tool outputs, test results, and scans.
2. Centralized as JSON: All data is merged into each component's unified metadata document.
3. Guardrails Enforce Standards: Real-time feedback in PRs and AI workflows.

Example Collected Data

This collector writes structured metadata to the Component JSON. Here's an example of the data it produces:

component.json
{
  "sbom": {
    "license_origins": {
      "source": {
        "tool": "license-origins",
        "integration": "code",
        "version": "0.1.0"
      },
      "packages": [
        {
          "purl": "pkg:npm/scheduler-lib@2.1.0",
          "name": "scheduler-lib",
          "license_file": "node_modules/scheduler-lib/LICENSE",
          "countries": ["Germany"],
          "excerpts": ["Copyright 2024 Hans Mueller, Berlin, Germany"],
          "cached": false
        }
      ],
      "summary": {
        "files_scanned": 185,
        "packages_with_mentions": 1,
        "countries_found": ["Germany"],
        "cache_hits": 140,
        "cache_misses": 45
      }
    }
  }
}
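As a rough illustration of how the example document above might be consumed, the sketch below groups packages by country and computes a cache hit ratio. The helper names `packages_by_country` and `cache_hit_ratio` are hypothetical and not part of Lunar. The JSON is a trimmed copy of the example, kept inline so the snippet runs on its own.

```python
import json
from collections import defaultdict

# Trimmed copy of the example Component JSON shown above.
component_json = """
{
  "sbom": {
    "license_origins": {
      "packages": [
        {
          "purl": "pkg:npm/scheduler-lib@2.1.0",
          "countries": ["Germany"],
          "cached": false
        }
      ],
      "summary": {
        "files_scanned": 185,
        "cache_hits": 140,
        "cache_misses": 45
      }
    }
  }
}
"""


def packages_by_country(component):
    """Map each country to the PURLs whose license files mention it."""
    origins = component.get("sbom", {}).get("license_origins", {})
    grouped = defaultdict(list)
    for package in origins.get("packages", []):
        for country in package.get("countries", []):
            grouped[country].append(package["purl"])
    return dict(grouped)


def cache_hit_ratio(component):
    """Fraction of cache lookups that were hits, or None if there were none."""
    summary = component["sbom"]["license_origins"]["summary"]
    lookups = summary["cache_hits"] + summary["cache_misses"]
    return summary["cache_hits"] / lookups if lookups else None


component = json.loads(component_json)
print(packages_by_country(component))
print(round(cache_hit_ratio(component), 3))
```

A check like `packages_by_country` is the kind of lookup a policy might build on, for instance to flag any dependency whose license mentions a country on a review list.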

Configuration

Configure this collector in your lunar-config.yml.

Inputs

| Input | Required | Default | Description |
|---|---|---|---|
| cache_enabled | Optional | true | Enable Postgres caching of scan results (set to "false" to scan fresh every time) |
| cache_db_host | Optional | postgres | Postgres host |
| cache_db_port | Optional | 5432 | Postgres port |
| cache_db_name | Optional | hub | Postgres database name |
| cache_db_user | Optional | lunar | Postgres user (needs CREATE TABLE, INSERT, SELECT) |

Secrets

This collector uses the following secret, configured in Lunar. The secret is optional, but caching is disabled without it:

| Secret | Description |
|---|---|
| CACHE_DB_PASSWORD | Postgres password for the cache database user. Required only for caching; if not provided, caching is disabled. |
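The interaction between the inputs and the secret can be summarized in a small sketch. The defaults are taken from the Inputs table. The function name `resolve_cache_settings`, the returned dictionary keys, and the exact parsing of `cache_enabled` are assumptions for illustration, since the collector's own parsing is not documented here.

```python
# Defaults copied from the Inputs table.
DEFAULTS = {
    "cache_enabled": "true",
    "cache_db_host": "postgres",
    "cache_db_port": "5432",
    "cache_db_name": "hub",
    "cache_db_user": "lunar",
}


def resolve_cache_settings(inputs, secrets):
    """Return Postgres connection settings, or None when caching is off."""
    merged = {**DEFAULTS, **inputs}
    # Assumption: only the literal string "false" (any case) disables caching.
    if str(merged["cache_enabled"]).lower() == "false":
        return None
    # Without the password secret, caching is disabled.
    password = secrets.get("CACHE_DB_PASSWORD")
    if not password:
        return None
    return {
        "host": merged["cache_db_host"],
        "port": int(merged["cache_db_port"]),
        "dbname": merged["cache_db_name"],
        "user": merged["cache_db_user"],
        "password": password,
    }


# No secret configured: caching is off and every scan runs fresh.
print(resolve_cache_settings({}, {}))
# Secret configured, host overridden, everything else left at its default.
print(resolve_cache_settings({"cache_db_host": "db.internal"},
                             {"CACHE_DB_PASSWORD": "example-password"}))
```

Here `db.internal` and `example-password` are placeholder values, not real endpoints or credentials.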

Documentation


License Origins Collector

Scan dependency license files for country-of-origin mentions.

Overview

This collector fetches dependencies per language ecosystem (Rust, Go, Node.js, Python), generates an internal SBOM to enumerate them, then scans each dependency's license files for country-of-origin mentions. Results are cached in Postgres keyed by PURL@version. Use alongside the syft collector for full SBOM + origin coverage. Once Lunar supports dependencies between collectors, this collector will run after the syft collector and reuse its SBOM instead of generating its own.
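The caching behavior can be sketched as a lookup keyed by the versioned PURL, with a scan performed only on a miss. In this sketch `sqlite3` stands in for Postgres so that it runs without a server. The table name, column names, and the `cached_scan` helper are hypothetical and do not reflect the collector's real schema.

```python
import json
import sqlite3

# sqlite3 stands in for Postgres here. Table and column names are
# hypothetical, not the collector's actual schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE IF NOT EXISTS license_origin_cache (
        purl TEXT PRIMARY KEY,
        countries TEXT NOT NULL
    )
    """
)


def cached_scan(purl, scan):
    """Return (countries, was_cached); call scan() only on a cache miss."""
    row = conn.execute(
        "SELECT countries FROM license_origin_cache WHERE purl = ?", (purl,)
    ).fetchone()
    if row is not None:
        # Cache hit: decode the stored JSON list.
        return json.loads(row[0]), True
    # Cache miss: run the scan and store the result under the versioned PURL.
    countries = scan()
    conn.execute(
        "INSERT INTO license_origin_cache (purl, countries) VALUES (?, ?)",
        (purl, json.dumps(countries)),
    )
    conn.commit()
    return countries, False


purl = "pkg:npm/scheduler-lib@2.1.0"
print(cached_scan(purl, lambda: ["Germany"]))  # first call runs the scan
print(cached_scan(purl, lambda: ["unused"]))   # second call reads the cache
```

The sketch uses CREATE TABLE, INSERT, and SELECT, the same three privileges the cache database user is documented as needing. Including the version in the key means a new release of a package is scanned again rather than inheriting an older result.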

Collected Data

This collector writes to the following Component JSON paths:

| Path | Type | Description |
|---|---|---|
| .sbom.license_origins.source | object | License origins source metadata |
| .sbom.license_origins.packages[] | array | Packages with country mentions (purl, countries, excerpts) |
| .sbom.license_origins.summary | object | Scan statistics (files scanned, cache hits/misses, countries found) |

Collectors

This integration provides the following collectors (use include to select a subset):

| Collector | Description |
|---|---|
| scan | Scans dependency license files for country name mentions (code hook) |

Installation

Add to your lunar-config.yml:

collectors:
  - uses: github://earthly/lunar-lib/collectors/license-origins@main
    on: ["domain:engineering"]

To disable caching (scan fresh every time):

collectors:
  - uses: github://earthly/lunar-lib/collectors/license-origins@main
    on: ["domain:engineering"]
    with:
      cache_enabled: "false"

Optional secrets (for Postgres caching):

  • CACHE_DB_PASSWORD: Postgres password for the cache database. If not set, caching is disabled and every scan runs fresh.

The connection defaults (postgres:5432/hub, user lunar) match the standard Lunar hub database. Override them with the cache_db_host, cache_db_port, cache_db_name, and cache_db_user inputs if needed.

Open Source

This collector is open source and available on GitHub. Contribute improvements, report issues, or fork it for your own use.

