Best Practices

Threat modeling the pipeline, key management, false positive handling, compliance, common pitfalls, scaling

The operational realities of running supply chain security at scale.

Threat Model Your Pipeline

Before adopting tools, understand what you're defending against:

Threat	What it looks like
Source compromise	Attacker pushes to your repo (stolen creds, malicious PR)
Dependency injection	Malicious update to a dep you use (typosquat, hijacked maintainer)
Build compromise	CI runner runs attacker code (poisoned GitHub Action, escaped sandbox)
Artifact tampering	Image swapped between build and deploy (compromised registry, MITM)
Key theft	Long-lived signing key stolen; arbitrary signing
Deploy bypass	Direct `kubectl apply` skips signed pipeline
Runtime injection	`kubectl exec` or container escape modifies running pod

Match each threat to a control. The defense should be layered — single defenses fail under sophisticated attack.
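One way to keep the threat-to-control mapping honest is to store it as data and flag any threat that has fewer than two layers. The following is a minimal sketch: the threat names come from the table above, while the control names and the `weakly_covered` helper are illustrative assumptions, not a prescribed set.

```python
# Threat names come from the table above; the controls listed for each are
# illustrative assumptions, not a complete or authoritative mapping.
CONTROLS = {
    "Source compromise": ["branch protection", "required PR review"],
    "Dependency injection": ["lock files", "vulnerability scan in CI"],
    "Build compromise": ["ephemeral runners", "Actions pinned to commit SHA"],
    "Artifact tampering": ["image signing", "admission verification"],
    "Key theft": ["keyless signing", "KMS-held keys"],
    "Deploy bypass": ["admission policy"],
    "Runtime injection": [],
}


def weakly_covered(controls, minimum=2):
    """Return threats that have fewer than `minimum` controls, sorted by name."""
    return sorted(t for t, c in controls.items() if len(c) < minimum)


if __name__ == "__main__":
    for threat in weakly_covered(CONTROLS):
        print(f"needs more layers: {threat}")
```

Reviewing this mapping alongside the threat model keeps gaps visible as controls are added or retired.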

Key Management

If you must have keys:

No human-readable keys on disk beyond bootstrap. Use a KMS or HSM (AWS KMS, GCP KMS, Azure Key Vault, YubiHSM).
Short-lived where possible. Sigstore keyless is the gold standard — no key to manage.
Separate keys per environment. Dev signing key compromise shouldn't affect prod.
Rotation procedure tested before you need it.
Threshold (M-of-N) signing for irreplaceable keys such as the root key, so no single person can sign alone.
Audit log: every signing operation logged with identity, artifact digest, timestamp.

Cosign supports KMS providers directly:

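# $DIGEST should hold a full image reference pinned by digest (registry/repo@sha256:...), not a tag.
cosign sign --key awskms:///alias/cosign-key "$DIGEST"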

The private key never leaves the KMS; cosign sends it the digest to sign and gets back only the signature.

Scaling Beyond One Image

For a few images, copy-pasted workflows are manageable. For hundreds, you need templates. Common patterns:

Reusable GitHub Actions workflow (uses:) that builds + signs + SBOMs in a standard way. All services call it.
Make / Justfile / Bazel target that wraps the chain locally and in CI.
IDP golden path (Internal Developer Platforms) that scaffolds a new service with signing wired up by default.
Compliance dashboards (Backstage / Dependency-Track) that show coverage: signed % / SBOM'd % / SLSA level per service.

The metric: percentage of production images that pass full verification. Trend this; teams gravitate up when they see they're outliers.
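The coverage metric is simple to compute once each image's verification results are collected in one place. A minimal sketch, assuming a hypothetical `ImageStatus` record per production image; where the three booleans come from (admission logs, registry queries, a dashboard export) is left open.

```python
from dataclasses import dataclass


@dataclass
class ImageStatus:
    """Verification results for one production image (hypothetical record shape)."""
    service: str
    signed: bool
    has_sbom: bool
    has_provenance: bool

    @property
    def fully_verified(self) -> bool:
        # "Full verification" here means all three checks pass.
        return self.signed and self.has_sbom and self.has_provenance


def coverage_percent(images):
    """Percentage of images that pass every check; 0.0 for an empty list."""
    if not images:
        return 0.0
    passing = sum(1 for img in images if img.fully_verified)
    return 100.0 * passing / len(images)
```

Recording the value per team as well as fleet-wide is what makes outliers visible.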

Handling False Positives

Vulnerability scanners are noisy. A real production scan finds:

Critical CVEs in code paths you don't use
Vulnerabilities in test-only dependencies
Issues in base image that the distro hasn't backported yet
Disputed CVEs (security researchers and vendors disagreeing)

Without triage, alert fatigue takes over and real issues get missed. Process:

Triage rules: auto-suppress findings in development-only deps and in accepted base images.
VEX statements for "vulnerable but not exploitable" cases — signed declarations.
SLA on critical: 14 days to fix; document if not.
Block on new critical in PRs (Renovate/Dependabot policy).
Quarterly review of suppressions: still valid? Still vulnerable?

The goal: every open finding has a status (fixed, in-progress, accepted with rationale). None should be "ignored because there are too many."
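The "every finding has a status" rule and the 14-day SLA can both be checked mechanically. A minimal sketch, assuming findings are exported as dictionaries with hypothetical `id`, `severity`, `status`, `rationale`, and `opened` fields; real scanners and trackers use their own schemas.

```python
from datetime import date, timedelta

VALID_STATUSES = {"fixed", "in-progress", "accepted"}
CRITICAL_SLA = timedelta(days=14)  # the SLA stated above


def untriaged(findings):
    """IDs of findings with no valid status, or accepted without a rationale."""
    bad = []
    for f in findings:
        status = f.get("status")
        if status not in VALID_STATUSES:
            bad.append(f["id"])
        elif status == "accepted" and not f.get("rationale"):
            bad.append(f["id"])
    return bad


def critical_sla_breaches(findings, today):
    """IDs of critical findings past the SLA that are neither fixed nor documented."""
    return [
        f["id"]
        for f in findings
        if f.get("severity") == "critical"
        and f.get("status") != "fixed"
        and today - f["opened"] > CRITICAL_SLA
        and not f.get("rationale")
    ]
```

Running both checks on a schedule turns "ignored because there are too many" into a visible, countable list.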

CI Hardening

If your CI is compromised, so is your supply chain:

Ephemeral runners: fresh runner per job, no state carries between jobs.
OIDC for cloud auth: no long-lived secrets in CI. GitHub Actions / GitLab CI both support OIDC to AWS/GCP/Azure.
Pin Actions / Plugins to a full commit SHA, not a version tag. Tags are mutable: actions/checkout@v4 can be repointed at different code if the upstream repo is compromised; actions/checkout@abc123... cannot.
permissions: read-all by default, write only where needed.
Branch protection: require reviewed PRs to merge to main.
Required workflows: GitHub Enterprise can enforce that certain workflows run.
Forks treated as untrusted: PRs from forks don't get secrets access by default. Verify this, and note that workflows triggered by pull_request_target do run with access to secrets.

The CI is now a production system. Apply production-grade controls to it.
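The SHA-pinning rule above is easy to lint. A minimal sketch that scans workflow text for `uses:` references whose ref is not a full 40-character commit SHA; it uses a regular expression rather than a YAML parser, so quoted values and unusual formatting are not handled.

```python
import re

# Matches `uses: owner/repo[/path]@ref` lines, with or without a leading "- ".
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s@#]+)@([^\s#]+)", re.MULTILINE)
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_text):
    """Return (action, ref) pairs whose ref is not a full commit SHA."""
    return [
        (action, ref)
        for action, ref in USES_RE.findall(workflow_text)
        if not FULL_SHA_RE.match(ref)
    ]


if __name__ == "__main__":
    # The SHA below is a placeholder, not a real commit.
    sample = """\
steps:
  - uses: actions/checkout@v4
  - uses: actions/setup-python@0123456789abcdef0123456789abcdef01234567  # v5
"""
    for action, ref in unpinned_actions(sample):
        print(f"unpinned: {action}@{ref}")
```

Local actions referenced as `./path` carry no `@ref` and are skipped. Keeping the human-readable version in a trailing comment preserves readability after pinning.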

Registry Hygiene

The container registry is high-value. Protect:

Authentication required for push; pull may be public if intentional.
Image scanning at registry level (for example Harbor or Amazon ECR, which have built-in scanners); if your registry has no scanner, scan in CI before push.
Immutable tags for releases — v1.2.3 can never be re-pushed.
Quarantine new images until scan completes; only verified images move to production paths.
Cleanup policy: delete old, untagged images; reduces attack surface and storage cost.
Replicate to secondary registry for DR. Don't lose the ability to deploy because GitHub Container Registry is down.

Documentation

Auditors will ask:

How are images signed? Point to the workflow file.
How is signing verified? Point to the Kyverno / Sigstore policy.
Show the chain of custody for v1.2.3. Run cosign tree and cosign verify-attestation, and store the output.
Show the vulnerability triage process. SLA doc; example tickets.
Show the response to Log4j (Log4Shell). SBOM query; affected services; patch timeline.

Document each. The documents are the evidence of process, separate from the technical controls.
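The "SBOM query" an auditor asks for in a Log4j-style incident can be a short script. A minimal sketch, assuming each service's SBOM has already been loaded as a CycloneDX-style JSON document (a top-level `components` list whose entries have `name` and `version`); the function names are illustrative.

```python
def find_component(sbom, name_fragment):
    """Yield (name, version) for components whose name contains the fragment.

    Assumes a CycloneDX-style dict with a top-level "components" list;
    nested "components" lists are searched too.
    """
    needle = name_fragment.lower()
    stack = list(sbom.get("components", []))
    while stack:
        comp = stack.pop()
        if needle in comp.get("name", "").lower():
            yield comp.get("name", ""), comp.get("version", "")
        stack.extend(comp.get("components", []))


def affected_services(sboms_by_service, name_fragment):
    """Map service name -> sorted list of matching (name, version) pairs."""
    hits = {}
    for service, sbom in sboms_by_service.items():
        matches = sorted(find_component(sbom, name_fragment))
        if matches:
            hits[service] = matches
    return hits
```

Saving the query output with a timestamp gives you the "affected services" evidence without reconstructing it later.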

Compliance Mapping

Match controls to frameworks:

Framework	Control	Supply chain answer
SOC 2 CC6.6	Logical access	Signed images + admission policy
SOC 2 CC8.1	Change management	SLSA provenance + GitOps
NIST 800-218 PS.3.1	Archive software	SBOM + immutable tags + signed
NIST 800-218 PW.4.1	Approved deps	Lock files + scan in CI
EU CRA	Vulnerability handling	Dependency-Track + VEX + patch SLA
EO 14028	SBOM for federal	Syft-generated SBOM, signed, retained

A single supply chain practice answers many controls. Document the mapping; auditors are grateful.

What to Bypass and What Not To

The pipeline must remain operable under stress. Decide upfront:

Emergency hotfix: still goes through CI + signing, even if expedited.
Off-hours deploy without approver: explicit break-glass procedure, logged, reviewed next business day.
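Signing infrastructure down: keep deploying artifacts that were already signed and verified, and hold new releases until signing is back; don't route around signing entirely.
Sigstore public infrastructure down: plan for it in advance, either with a private Rekor/Fulcio deployment or a vendor fallback.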

Decide before you need to. "We'll figure it out when it breaks" produces unsigned production deploys.

Common Pitfalls

Theater without enforcement. Every image is signed, but admission doesn't verify the signatures. Verify before celebrating.

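Identity not pinned correctly. A valid signature proves that someone signed the image, not that your pipeline did. Cosign 2.x requires identity flags for keyless verification, but an overly broad pattern accepts any signer. Always pin to a specific OIDC issuer + identity regexp.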
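A small wrapper can make unpinned verification hard to write by accident. A minimal sketch that builds a `cosign verify` argument list using the `--certificate-oidc-issuer` and `--certificate-identity-regexp` flags and rejects obviously permissive patterns; the function name and the list of rejected patterns are assumptions for illustration.

```python
import re

# Patterns that would accept any signer (illustrative, not exhaustive).
PERMISSIVE = {"", ".*", ".+", "^.*$", "^.+$"}


def cosign_verify_args(image_ref, issuer, identity_regexp):
    """Build a `cosign verify` argument list that pins issuer and identity."""
    if not issuer.startswith("https://"):
        raise ValueError("OIDC issuer must be an https URL")
    if identity_regexp.strip() in PERMISSIVE:
        raise ValueError("identity regexp matches any signer")
    # Rough syntax check only: cosign uses Go regexp syntax, which differs
    # from Python's in some details.
    re.compile(identity_regexp)
    return [
        "cosign", "verify",
        "--certificate-oidc-issuer", issuer,
        "--certificate-identity-regexp", identity_regexp,
        image_ref,
    ]
```

The returned list can be passed to `subprocess.run(args, check=True)`. For GitHub Actions keyless signing the issuer is `https://token.actions.githubusercontent.com`, and the identity is the workflow URL of the repository that is allowed to sign.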

SBOM stale or wrong tool. Generating a fresh SBOM is fast; one from last quarter says nothing about today's image. Generate the SBOM in the same build that produces the image.

Block on every CVE. Critical & high might block; medium should rate-limit; low should track. Otherwise you block production for cosmetic issues.
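A severity gate along these lines might look like the following. A minimal sketch under one reading of "rate-limit": medium findings are tolerated up to a budget and block only beyond it. The `gate` function and the default budget of 5 are assumptions, not values from any particular tool.

```python
from collections import Counter

BLOCKING = {"critical", "high"}


def gate(severities, medium_budget=5):
    """Return "block" or "pass" for a list of finding severities.

    Critical and high findings always block. Medium findings block only once
    their count exceeds `medium_budget` (an assumed interpretation of
    "rate-limit"). Low findings never block; they are only tracked.
    """
    counts = Counter(s.lower() for s in severities)
    if any(counts[s] for s in BLOCKING):
        return "block"
    if counts["medium"] > medium_budget:
        return "block"
    return "pass"
```

Whatever thresholds you choose, keeping them in one reviewed place avoids per-team drift.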

Forgetting base images. Your code is clean. The base image has 30 CVEs. Plan base image refresh as a continuous task.

Treating signing as a one-time effort. Signing infrastructure breaks (cert expiry, registry change, etc.). Monitor signing success rate the way you monitor build success rate.

Hidden direct-pulls. A CI step that runs `pip install` straight against PyPI bypasses your proxy and signing chain. Audit the actual network calls.

Verification-only in CI, not on deploy. Verifying in CI proves the build is clean. Verifying at admission proves what's deployed is clean. Do both.

Continuous Improvement

The threat landscape evolves. Treat supply chain security as a continuous practice:

Quarterly threat model review
Pen tests targeting the build pipeline
Subscribe to OSV and GitHub Security Advisories
Monitor Sigstore project security advisories
Keep tooling (cosign, syft, grype) updated; old versions miss new vuln formats

What's Next

You have a supply chain practice. Connect it to:

CI/CD — signing belongs in the pipeline
Policy as Code — admission policies enforce signatures
GitOps — Git is the trusted source; signatures bind artifacts to source
Secrets — Vault holds CI tokens and signing keys (if any)
Internal Developer Platforms — golden paths emit signed-by-default services
