The compliance tax: what it actually costs to ship software to the U.S. government


Vlad A. Ionescu

A platform engineering lead at a mid-size defense technology company recently walked us through their release process. Forty-seven compliance items, each verified by hand. Ten hours of platform engineering time per release, forty hours per month — just confirming that developers did what the security and compliance teams already told them to do.

This is not unusual. In conversations with engineering organizations that ship software to the Department of War (DoW, formerly DoD) and the broader federal government, the pattern is remarkably consistent: small platform teams, large compliance surface areas, and a growing gap between what’s required and what’s actually enforceable.

Government buying is shifting. Federal customers increasingly want products and outcomes, not development teams on retainer. Companies that serve the DoW — whether they build defense technology or deliver consulting at scale — face a shared mandate: ship faster, deliver consistently, and prove compliance continuously rather than retroactively.

This is compounded by a structural shift across the industry. Many defense-focused companies are transitioning from pure services and forward-deployed engineering toward productized, repeatable offerings built on shared platforms. That transition changes everything about compliance. A prototype built in 48 hours for a single customer doesn’t need an ATO. A product that ships to multiple government customers across classification levels does — and the compliance burden arrives all at once.

TL;DR

  • Federal compliance frameworks (FedRAMP, CMMC, STIG, ITAR, EO 14028) create dense, overlapping requirements that engineering teams must satisfy continuously — not just at audit time.
  • Organizations routinely spend 40+ hours/month on manual compliance verification, with single artifact issues causing multi-month program delays.
  • The structural challenge is the gap between “we know the standards” and “we can enforce them at scale.” That gap widens with every new service, repo, and compliance framework.
  • Most commercial DevOps tooling is disqualified by data residency and air-gap requirements before the evaluation starts.
  • Automated SDLC instrumentation — collecting compliance evidence continuously at build time — can reduce audit prep from weeks to hours.

The regulatory landscape

If you’re in this space, you already know the frameworks: FedRAMP, CMMC, STIG, ITAR, EO 14028, NIST 800-171, FISMA. Each has its own scope, audit cadence, and evidence requirements — and they overlap in ways that multiply the engineering burden rather than simplifying it.

What’s changed recently is the direction of travel. CMMC is moving from self-attestation to third-party certification. FedRAMP now requires continuous monitoring, not annual snapshots. EO 14028 mandated SBOMs and NIST SSDF attestation for all software sold to the federal government. The theme is the same everywhere: prove it continuously, not just at audit time.

The engineering teams responsible for satisfying these requirements are typically a fraction of the size of the development organizations they serve. Most defense technology companies operate at IL4–IL5, which means every tool in the SDLC — CI/CD platforms, scanners, SBOM generators, enforcement engines — must deploy in GovCloud or air-gapped environments, or not touch the data at all.

Where engineering organizations break down

The regulatory landscape is demanding but well-documented. The harder problem is operational: how do you actually satisfy these requirements at engineering scale, continuously, without drowning in toil?

These are productization pains. During prototyping and early forward-deployed work, compliance is manageable — or deferred entirely. The moment a prototype becomes a product shipping to government customers at scale, SBOMs matter, dependency provenance matters, STIG verification matters, and the platform team needs to prove it all continuously. The failure modes below recur across company sizes, contract types, and technology stacks.

Compliance is discovered at release time

The most common failure mode is late discovery. Developers skip required steps during normal development — SBOM uploads, security scans, approved base images, license checks — and nobody notices until the release train assembles.

At one organization — roughly fifty engineers total — the platform team spends ten hours per release on manual compliance checks, about forty hours per month. And this is a small org. The checks themselves aren’t complex. The problem is that there’s no mechanism to verify compliance continuously, so it accumulates as a pre-release fire drill. At organizations with hundreds or thousands of engineers, the same pattern plays out at a scale that manual processes can’t absorb.
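The fire drill exists because nothing checks compliance at merge time. A minimal sketch of the alternative, with hypothetical artifact names and paths, is a per-build check that fails the pull request the moment a required artifact is missing:

```python
# Sketch: a continuous per-build compliance check, run on every merge
# instead of at release time. Artifact filenames here are hypothetical.
from pathlib import Path

REQUIRED_ARTIFACTS = {
    "sbom": "sbom.cyclonedx.json",    # SBOM uploaded?
    "scan": "trivy-report.json",      # security scan ran?
    "license": "license-report.txt",  # license check ran?
}

def check_build(artifact_dir: str) -> list[str]:
    """Return the compliance items missing from one build's artifacts."""
    root = Path(artifact_dir)
    return [name for name, filename in REQUIRED_ARTIFACTS.items()
            if not (root / filename).exists()]
```

Failing the build here surfaces the gap on the pull request that introduced it, weeks before the release train assembles.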

In government contexts, the consequences of late discovery are qualitatively different from commercial software. Government deliveries have 18-month sales cycles. A compliance miss doesn’t delay a sprint — it can delay a contract by months and jeopardize recompetes.

Audit evidence is assembled by hand

Government compliance frameworks require demonstrable evidence that practices are followed. SOC 2, NIST, FISMA, CMMC, STIG, ATO — all demand proof, not promises. Without automated enforcement, that evidence is assembled retroactively. And hand-assembled evidence is easy to game.

One organization discovered during a SOC 2 audit that developers were gaming Jira ticket linking. A team had created a catch-all ticket named “Change” and linked all commits to it — technically satisfying the traceability requirement while providing zero actual traceability. If auditors sample 30 tickets and find 10 like that, the audit fails.
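Gaming like this is detectable mechanically. One possible sketch, with an assumed commit-to-ticket mapping and an illustrative threshold, flags any ticket that absorbs a disproportionate share of commits:

```python
# Sketch: flag "catch-all" tickets that absorb a disproportionate share
# of commits. The 20% threshold and the commit->ticket input format are
# illustrative assumptions, not a prescribed policy.
from collections import Counter

def find_catchall_tickets(commit_tickets: list[str],
                          max_share: float = 0.2) -> list[str]:
    """Return ticket IDs linked to more than max_share of all commits."""
    counts = Counter(commit_tickets)
    total = len(commit_tickets)
    return [ticket for ticket, n in counts.items() if n / total > max_share]
```

A ticket named “Change” that is linked to 80% of commits gets flagged long before an auditor samples it.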

ATO evidence packaging is another bottleneck. We’ve heard estimates of weeks of manual effort per DoW delivery, with STIG alone requiring verification of hundreds of requirements. The evidence exists somewhere in the pipeline — scattered across CI logs, scan reports, and artifact registries. But assembling it into auditable form is labor-intensive, error-prone, and repeated before every delivery.
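Because the evidence already exists in the pipeline, assembly is scriptable. A minimal sketch, with hypothetical source paths, gathers scattered files into one bundle with a checksummed manifest so the packaging step is repeatable per delivery rather than redone by hand:

```python
# Sketch: gather scattered evidence files (CI logs, scan reports, SBOMs)
# into one audit bundle with a SHA-256 manifest. Source paths are
# hypothetical; the point is that assembly can be scripted and rerun.
import hashlib
import json
import shutil
from pathlib import Path

def build_evidence_bundle(sources: list[str], out_dir: str) -> dict:
    """Copy evidence files into out_dir and write a checksum manifest."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    manifest = {}
    for src in sources:
        src_path = Path(src)
        dest = out / src_path.name
        shutil.copy2(src_path, dest)
        manifest[src_path.name] = hashlib.sha256(dest.read_bytes()).hexdigest()
    (out / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return manifest
```

The manifest doubles as tamper evidence: the same hashes can be re-verified at delivery time.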

One bad artifact can derail a program

In commercial software, a compliance miss usually means a delayed release. In defense, the consequences are an order of magnitude larger.

One organization described an incident where a third-party operating system’s SBOM contained a foreign country name hundreds of times, triggering government red flags about foreign-controlled software in a defense system. The result: a three-month project delay, active legal exposure, and an issue that remains unresolved.
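A pre-delivery scan of the SBOM itself would have caught this in CI rather than in a government review. A minimal sketch, assuming a CycloneDX-style JSON SBOM and an illustrative flag list maintained by program policy:

```python
# Sketch: scan a CycloneDX-style SBOM for flagged terms before delivery.
# The flag list and the fields checked are illustrative assumptions.
import json

FLAGGED_TERMS = {"example-flagged-term"}  # populated from program policy

def scan_sbom(sbom_json: str) -> list[tuple[str, str]]:
    """Return (component name, matched term) pairs for flagged entries."""
    sbom = json.loads(sbom_json)
    hits = []
    for comp in sbom.get("components", []):
        text = json.dumps(comp).lower()
        for term in FLAGGED_TERMS:
            if term in text:
                hits.append((comp.get("name", "?"), term))
    return hits
```

Run against every third-party SBOM at ingest, a check like this turns a three-month program delay into a failed build.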

The asymmetry matters. In commercial software, you can hotfix and move on. In defense contracting, a single artifact anomaly — an unexpected SBOM entry, an unapproved license, a misconfigured container — triggers investigation, legal review, and contract risk that extends well beyond the engineering organization.

No central enforcement at scale

Platform teams at defense companies provide reusable CI components — GitHub Actions, pipeline templates, approved configurations — but have no way to verify that developers actually use them. Standards are broadcast via Slack, documentation, and training, then hoped for.

The structural challenge is the ratio. A platform team of five engineers serving fifty developers. A platform team of ten serving three thousand. In every conversation, the numbers are lopsided. The platform team can build the right path. They cannot police whether anyone walks it.

We’ve heard engineering leaders at large defense organizations describe “a lot of trauma in the standardization space.” Previous attempts at enforcement — mandated tooling, required stacks, top-down process changes — created backlash without solving the underlying compliance problem. But without enforcement, standards are optional. And optional standards, in regulated environments, are a liability.

The downstream effect is duplication. Without enforced practices, every team independently figures out CI/CD setup, security scanning, SBOM generation, and compliance reporting. We’ve heard estimates of teams spending two or more years establishing delivery practices that could have been solved once and reused across the organization.

Policy layers drift apart

Defense environments require guardrails at multiple layers: in CI/CD pipelines and at runtime (Kubernetes admission control, cloud SCPs, network policies). Maintaining these independently creates two sources of truth that inevitably diverge.

This is one of the hardest unsolved problems we hear about. You define a policy in Kyverno or OPA for deployment gating, then manually recreate the same check in CI. When they disagree — and they will — deployments fail and teams spend hours determining which layer is wrong. The toil compounds with every new standard, every new service, and every new enforcement layer.

Defense environments are “defense in depth” by design. Multiple enforcement layers aren’t optional. But maintaining them in sync is engineering overhead that grows superlinearly with the compliance surface area.
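The drift problem shrinks when both layers evaluate one shared definition instead of two hand-synced copies. A minimal sketch, with an illustrative policy and field names mirroring a Kubernetes container spec:

```python
# Sketch: one policy definition evaluated from both CI and an
# admission-style hook, so the two layers cannot drift apart.
# The policy rules themselves are illustrative assumptions.

def violations(container: dict) -> list[str]:
    """Shared policy: the single source of truth for both layers."""
    problems = []
    if container.get("image", "").endswith(":latest"):
        problems.append("image must be pinned, not :latest")
    if not container.get("securityContext", {}).get("runAsNonRoot", False):
        problems.append("container must run as non-root")
    return problems

# CI calls violations() against the rendered manifest and fails the
# build; the admission hook calls the same function against the live
# request. One definition, two enforcement points.
```

This is the same idea behind sharing OPA policy bundles across CI and admission control: the rules live once, and each layer is only an evaluation site.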

Self-hosted deployment is table stakes

Any tooling that touches the SDLC — code, build artifacts, metadata, logs — must deploy in the customer’s VPC, in GovCloud, or in air-gapped environments. This requirement is non-negotiable and universal across the organizations we’ve spoken with.

SaaS deployment is off the table for production work. Even evaluations with SaaS tooling require legal review to confirm that source code and build artifacts aren’t exposed to third parties. At IL4–IL5, the hard requirement is self-hosted and GovCloud-compatible. At IL6 and above — SIPR, JWICS — there’s no network connectivity at all. Artifacts cross air gaps via data diode or physical media, and any tooling must operate fully disconnected.

This constraint alone eliminates most commercial DevOps tooling from consideration before the technical evaluation begins.

The compounding effect

These failure modes don’t exist in isolation — they compound. No central enforcement means developers skip compliance steps. Late discovery means the platform team scrambles before every release. Manual evidence assembly means audits become high-risk events. Artifact fragility means one unexpected entry can trigger months of delay. Policy layers drift because they’re maintained independently. And deployment constraints mean the pool of tooling that could help is far smaller than it appears.

The result: defense-focused engineering organizations spend a disproportionate share of platform engineering time on compliance toil — time that could be spent improving developer experience, delivery speed, and system reliability. And the penalties for getting it wrong aren’t sprints lost. They’re contracts lost.

What a solution requires

Based on these conversations, any tooling that credibly addresses the compliance gap needs several properties:

Continuous, not periodic. Compliance data must be collected at build time, not assembled retroactively before audits. The shift from annual audits to continuous compliance — FedRAMP continuous monitoring, CMMC third-party certification, EO 14028 attestation — makes periodic, manual approaches structurally inadequate.

Central, not per-repo. Enforcement that requires repo-by-repo integration doesn’t scale when the ratio of platform engineers to developers is 1:50 or worse. The mechanism must be deployable once and effective everywhere.

Gradual, not all-or-nothing. Organizations that have been burned by top-down mandates need a path from visibility to reporting to blocking — bringing teams along without creating the backlash that derailed previous standardization attempts.

Self-hosted and air-gap capable. Any solution that can’t deploy in GovCloud or air-gapped environments is disqualified before evaluation begins.

Evidence as a byproduct. The ideal state is that compliance evidence — SBOMs, scan results, tool versions, configuration state — is generated automatically as a side effect of normal development, without manual intervention or per-pipeline changes.

Composable across products. As organizations extract shared infrastructure into platform layers, compliance evidence should compose. A platform component that’s already verified shouldn’t require re-verification in every product that inherits from it. This matters especially for companies managing multiple ATOs across different customers and classification levels — the evidence base needs to be reusable, not rebuilt from scratch each time.

How Lunar fits in

This is the problem space Earthly Lunar was built for. Lunar is a guardrails engine that centrally instruments repositories and CI/CD pipelines, collects structured SDLC posture data, and continuously evaluates engineering and compliance standards directly in pull requests.

For organizations shipping software to the DoW, the most relevant capability is automated compliance data collection. Lunar’s collectors capture SBOMs, security scan results, dependency graphs, tool versions, container configurations, and other compliance-relevant metadata at build time — without requiring changes to individual pipelines. This data is structured, stored centrally, and evaluated against guardrails that map to requirements in NIST 800-171, STIG, and EO 14028.

The practical effect is that audit evidence is generated continuously as a byproduct of normal development, rather than assembled by hand before each delivery. Platform teams that routinely spend more than half their time on compliance verification can redirect that capacity toward engineering work that actually improves their systems.

Lunar is self-hosted and deploys in customer VPCs, including GovCloud and air-gapped environments.

If the compliance tax described here sounds familiar, let’s talk.

Turn your engineering standards into automated guardrails that provide feedback directly in pull requests, with 100+ guardrails included out of the box and support for the tools and CI/CD systems you already have.

Vlad A. Ionescu
Founder of Earthly. Founder of ShiftLeft. Ex Google. Ex VMware. Co-author RabbitMQ Erlang Client.
