License Origins Collector
Scan dependency license files for geographic origin signals — country names in copyright lines, governing law clauses, and author addresses. Results are cached in Postgres for fast repeat scans across projects.
Add license-origins to your lunar-config.yml:
uses: github://earthly/lunar-lib/collectors/license-origins@v1.0.0
What This Integration Collects
This integration includes 1 collector that gathers metadata from your systems.
scan
Fetches dependencies (Rust, Go, Node.js, Python) and generates an internal SBOM to enumerate them, then scans each dependency's license files for country name mentions. Scan results are cached in a configurable Postgres database keyed by PURL@version. Writes origin results to .sbom.license_origins in Component JSON. Use alongside the syft collector for full SBOM + origin coverage.
How Collectors Fit into Lunar
Lunar watches your code and CI/CD systems to collect SDLC data from config files, test results, IaC, deployment configurations, security scans, and more.
Collectors are the automatic data-gathering layer. They extract structured metadata from your repositories and pipelines, feeding it into Lunar's centralized database where guardrails evaluate it to enforce your engineering standards.
Example Collected Data
This collector writes structured metadata to the Component JSON. Here's an example of the data it produces:
{
  "sbom": {
    "license_origins": {
      "source": {
        "tool": "license-origins",
        "integration": "code",
        "version": "0.1.0"
      },
      "packages": [
        {
          "purl": "pkg:npm/scheduler-lib@2.1.0",
          "name": "scheduler-lib",
          "license_file": "node_modules/scheduler-lib/LICENSE",
          "countries": ["Germany"],
          "excerpts": ["Copyright 2024 Hans Mueller, Berlin, Germany"],
          "cached": false
        }
      ],
      "summary": {
        "files_scanned": 185,
        "packages_with_mentions": 1,
        "countries_found": ["Germany"],
        "cache_hits": 140,
        "cache_misses": 45
      }
    }
  }
}
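As a rough illustration of how this output might be consumed downstream, the sketch below parses a trimmed copy of the example above, computes the cache hit rate, and groups packages by country. The function name `summarize_origins` and the shape of the report are assumptions made for illustration; they are not part of Lunar or of this collector.

```python
import json

# Trimmed copy of the example Component JSON shown above.
component_json = """
{
  "sbom": {
    "license_origins": {
      "packages": [
        {
          "purl": "pkg:npm/scheduler-lib@2.1.0",
          "countries": ["Germany"],
          "cached": false
        }
      ],
      "summary": {
        "files_scanned": 185,
        "cache_hits": 140,
        "cache_misses": 45
      }
    }
  }
}
"""


def summarize_origins(component):
    """Hypothetical helper: derive a small report from .sbom.license_origins."""
    origins = component["sbom"]["license_origins"]
    summary = origins["summary"]

    # Share of cache lookups that were hits; guard against an empty scan.
    lookups = summary["cache_hits"] + summary["cache_misses"]
    hit_rate = summary["cache_hits"] / lookups if lookups else 0.0

    # Map each country to the PURLs whose license files mention it.
    by_country = {}
    for pkg in origins["packages"]:
        for country in pkg["countries"]:
            by_country.setdefault(country, []).append(pkg["purl"])

    return {"cache_hit_rate": hit_rate, "by_country": by_country}


report = summarize_origins(json.loads(component_json))
print(report)
```

In the example data, cache hits plus cache misses (140 + 45) equal the 185 files scanned, so the hit rate here is 140/185.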
Configuration
Configure this collector in your lunar-config.yml.
Inputs
| Input | Required | Default | Description |
|---|---|---|---|
| cache_enabled | Optional | true | Enable Postgres caching of scan results (set to "false" to scan fresh every time) |
| cache_db_host | Optional | postgres | Postgres host |
| cache_db_port | Optional | 5432 | Postgres port |
| cache_db_name | Optional | hub | Postgres database name |
| cache_db_user | Optional | lunar | Postgres user (needs CREATE TABLE, INSERT, SELECT) |
Secrets
This collector uses the following secret, configured in Lunar. It is needed only for caching:
| Secret | Description |
|---|---|
| CACHE_DB_PASSWORD | Postgres password for the cache database user. Required for caching; if not provided, caching is disabled. |
Documentation
License Origins Collector
Scan dependency license files for country-of-origin mentions.
Overview
This collector fetches dependencies per language ecosystem (Rust, Go, Node.js, Python), generates an internal SBOM to enumerate them, then scans each dependency's license files for country-of-origin mentions. Results are cached in Postgres keyed by PURL@version. Use alongside the syft collector for full SBOM + origin coverage. When collector dependency features become available, this collector will run after the syft collector instead of generating its own SBOM.
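The scan-and-cache flow described above can be sketched roughly as follows. This is a minimal illustration, not the collector's actual implementation: the country list, the regular expression, and the function names are all assumptions, and a plain dictionary stands in for the Postgres cache. The versioned PURL (for example `pkg:npm/scheduler-lib@2.1.0`) is assumed to serve as the cache key.

```python
import re

# Hypothetical, deliberately short country list for illustration only.
COUNTRIES = ["Germany", "France", "Japan", "Canada"]
COUNTRY_RE = re.compile(r"\b(" + "|".join(map(re.escape, COUNTRIES)) + r")\b")

# In-memory stand-in for the Postgres cache, keyed by versioned PURL.
_cache = {}


def scan_license_text(text):
    """Return country names found in the text and the lines mentioning them."""
    countries, excerpts = [], []
    for line in text.splitlines():
        found = COUNTRY_RE.findall(line)
        if not found:
            continue
        excerpts.append(line.strip())
        for country in found:
            if country not in countries:
                countries.append(country)
    return {"countries": countries, "excerpts": excerpts}


def scan_package(purl, license_text):
    """Scan one package, reusing a cached result for the same PURL@version."""
    if purl in _cache:
        return {**_cache[purl], "purl": purl, "cached": True}
    result = scan_license_text(license_text)
    _cache[purl] = result
    return {**result, "purl": purl, "cached": False}


license_text = "MIT License\nCopyright 2024 Hans Mueller, Berlin, Germany\n"
first = scan_package("pkg:npm/scheduler-lib@2.1.0", license_text)
second = scan_package("pkg:npm/scheduler-lib@2.1.0", license_text)
print(first["countries"], first["cached"], second["cached"])
```

Keying on the versioned PURL means a new release of a dependency is scanned again, while an unchanged version is served from the cache across projects.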
Collected Data
This collector writes to the following Component JSON paths:
| Path | Type | Description |
|---|---|---|
| .sbom.license_origins.source | object | License origins source metadata |
| .sbom.license_origins.packages[] | array | Packages with country mentions (purl, countries, excerpts) |
| .sbom.license_origins.summary | object | Scan statistics (files scanned, cache hits/misses, countries found) |
Collectors
This integration provides the following collectors (use include to select a subset):
| Collector | Description |
|---|---|
| scan | Scans dependency license files for country name mentions (code hook) |
Installation
Add to your lunar-config.yml:
collectors:
  - uses: github://earthly/lunar-lib/collectors/license-origins@main
    on: ["domain:engineering"]
To disable caching (scan fresh every time):
collectors:
  - uses: github://earthly/lunar-lib/collectors/license-origins@main
    on: ["domain:engineering"]
    with:
      cache_enabled: "false"
Optional secrets (for Postgres caching):
CACHE_DB_PASSWORD: Postgres password for the cache database. If not set, caching is disabled and every scan runs fresh. The connection defaults (postgres:5432/hub, user lunar) match the standard Lunar hub database. Override with the cache_db_host, cache_db_port, cache_db_name, and cache_db_user inputs if needed.
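The interaction between the inputs and the secret can be summarized in a short sketch. It assumes the inputs arrive as a dictionary of strings and the secret as an environment-style mapping; how Lunar actually passes these to the collector is not described here, and `resolve_cache_settings` is a hypothetical name.

```python
# Documented defaults for the cache inputs.
DEFAULTS = {
    "cache_enabled": "true",
    "cache_db_host": "postgres",
    "cache_db_port": "5432",
    "cache_db_name": "hub",
    "cache_db_user": "lunar",
}


def resolve_cache_settings(inputs, env):
    """Hypothetical helper: merge inputs over defaults and decide if caching runs."""
    merged = {**DEFAULTS, **inputs}
    password = env.get("CACHE_DB_PASSWORD", "")

    # Caching runs only when it is not switched off AND a password is present.
    enabled = merged["cache_enabled"].lower() != "false" and bool(password)

    return {
        "enabled": enabled,
        "host": merged["cache_db_host"],
        "port": int(merged["cache_db_port"]),
        "dbname": merged["cache_db_name"],
        "user": merged["cache_db_user"],
    }


no_secret = resolve_cache_settings({}, {})
with_secret = resolve_cache_settings({}, {"CACHE_DB_PASSWORD": "example"})
switched_off = resolve_cache_settings(
    {"cache_enabled": "false"}, {"CACHE_DB_PASSWORD": "example"}
)
print(no_secret["enabled"], with_secret["enabled"], switched_off["enabled"])
```

Either condition alone is enough to disable caching: a missing password or `cache_enabled` set to "false".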
Open Source
This collector is open source and available on GitHub. Contribute improvements, report issues, or fork it for your own use.