Best Practices
Threat modeling the pipeline, key management, false positive handling, compliance, common pitfalls, scaling
The operational realities of running supply chain security at scale.
Threat Model Your Pipeline
Before adopting tools, understand what you're defending against:
| Threat | What it looks like |
|---|---|
| Source compromise | Attacker pushes to your repo (stolen creds, malicious PR) |
| Dependency injection | Malicious update to a dep you use (typosquat, hijacked maintainer) |
| Build compromise | CI runner runs attacker code (poisoned GitHub Action, escaped sandbox) |
| Artifact tampering | Image swapped between build and deploy (compromised registry, MITM) |
| Key theft | Long-lived signing key stolen; arbitrary signing |
| Deploy bypass | Direct kubectl apply skips signed pipeline |
| Runtime injection | kubectl exec or container escape modifies running pod |
Match each threat to a control. The defense should be layered — single defenses fail under sophisticated attack.
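One way to keep the threat-to-control matching honest is to write it down as data and flag any threat with too few layers. The sketch below is illustrative only: the threat names come from the table above, while the controls listed, the `THREAT_CONTROLS` name, and the two-control minimum are assumptions chosen for the example.

```python
# Hypothetical mapping. Threat names come from the table above; the controls
# are illustrative examples, not a complete or prescribed set.
THREAT_CONTROLS: dict[str, list[str]] = {
    "Source compromise": ["branch protection", "signed commits"],
    "Dependency injection": ["lockfiles with hashes", "caching proxy"],
    "Build compromise": ["ephemeral runners", "Actions pinned by SHA"],
    "Artifact tampering": ["image signing", "admission verification"],
    "Key theft": ["keyless signing"],
    "Deploy bypass": ["admission verification"],
    "Runtime injection": [],  # nothing assigned yet
}


def under_defended(mapping: dict[str, list[str]], minimum: int = 2) -> list[str]:
    """Return threats that have fewer than `minimum` controls, sorted by name."""
    return sorted(t for t, controls in mapping.items() if len(controls) < minimum)


if __name__ == "__main__":
    for threat in under_defended(THREAT_CONTROLS):
        print(f"needs more layers: {threat}")
```

A list like this can be revisited in each threat model review; an empty result means every threat has at least the chosen number of layers, not that the layers are sufficient.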
Key Management
If you must have keys:
- No human-readable keys on disk beyond bootstrap. Use an HSM or a managed KMS (AWS KMS, GCP KMS, Azure Key Vault, YubiHSM).
- Short-lived where possible. Sigstore keyless is the gold standard — no key to manage.
- Separate keys per environment. Dev signing key compromise shouldn't affect prod.
- Rotation procedure tested before you need it.
- Witness signatures for irreplaceable keys: M-of-N signing for the root key.
- Audit log: every signing operation logged with identity, artifact digest, timestamp.
Cosign supports KMS providers directly:
    cosign sign --key awskms:///alias/cosign-key $DIGEST
The private key never leaves the HSM.
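The audit-log requirement listed above (identity, artifact digest, timestamp) could be met with one structured record per signing operation. The following is a minimal sketch, not part of cosign: the function name `signing_audit_record`, the `key_ref` field, and the JSON-lines format are assumptions made for illustration.

```python
import json
from datetime import datetime, timezone


def signing_audit_record(identity: str, artifact_digest: str, key_ref: str) -> str:
    """Return one JSON line describing a signing operation.

    The artifact must be identified by digest, because a tag can be moved
    to point at a different image after the fact.
    """
    if not artifact_digest.startswith("sha256:"):
        raise ValueError("expected a sha256 digest, not a tag")
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "identity": identity,
        "artifact_digest": artifact_digest,
        "key_ref": key_ref,  # hypothetical field: which key or KMS alias signed
    }
    return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    # Placeholder values; a real pipeline would pass its own identity and digest.
    print(signing_audit_record("ci@example.com", "sha256:" + "a" * 64,
                               "awskms:///alias/cosign-key"))
```

Records like this are most useful when written to append-only storage outside the CI runner, so a compromised build cannot edit its own history.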
Scaling Beyond One Image
A few images: copy-paste workflows. Hundreds: template. Patterns:
- Reusable GitHub Actions workflow (uses:) that builds + signs + SBOMs in a standard way. All services call it.
- Make / Justfile / Bazel target that wraps the chain locally and in CI.
- IDP golden path (Internal Developer Platforms) that scaffolds a new service with signing wired up by default.
- Compliance dashboards (Backstage / Dependency-Track) that show coverage: signed % / SBOM'd % / SLSA level per service.
The metric: percentage of production images that pass full verification. Trend this; teams gravitate up when they see they're outliers.
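The coverage metric is simple enough to compute from whatever inventory the dashboard already holds. This sketch assumes "full verification" means signed, SBOM attached, and provenance attached; that definition, the `ImageStatus` type, and its field names are hypothetical and should be replaced by your own criteria.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ImageStatus:
    """Verification state of one production image (hypothetical schema)."""
    service: str
    signed: bool
    has_sbom: bool
    has_provenance: bool

    @property
    def fully_verified(self) -> bool:
        # Assumption: all three checks are required to count as passing.
        return self.signed and self.has_sbom and self.has_provenance


def verification_coverage(images: Iterable[ImageStatus]) -> float:
    """Percentage of images that pass full verification; 0.0 for no images."""
    items = list(images)
    if not items:
        return 0.0
    passing = sum(1 for image in items if image.fully_verified)
    return 100.0 * passing / len(items)
```

Computing the same number per team as well as overall is what makes outliers visible.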
Handling False Positives
Vulnerability scanners are noisy. A real production scan finds:
- Critical CVEs in code paths you don't use
- Vulnerabilities in test-only dependencies
- Issues in base image that the distro hasn't backported yet
- Disputed CVEs (security researchers and vendors disagreeing)
Without triage, alert fatigue takes over and real issues get missed. Process:
- Triage rules: auto-suppress development-only deps, accepted base images.
- VEX statements for "vulnerable but not exploitable" cases — signed declarations.
- SLA on critical: 14 days to fix; document if not.
- Block on new critical in PRs (Renovate/Dependabot policy).
- Quarterly review of suppressions: still valid? Still vulnerable?
The goal: every open finding has a status (fixed, in-progress, accepted with rationale). None should be "ignored because there are too many."
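The rule that every finding carries a status can be checked mechanically. Below is a minimal sketch under stated assumptions: the `Finding` schema and status strings are hypothetical, and the 14-day window is the critical SLA from the list above.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

CRITICAL_SLA = timedelta(days=14)  # from the SLA bullet above
VALID_STATUSES = {"fixed", "in-progress", "accepted"}


@dataclass
class Finding:
    """One scanner finding (hypothetical schema)."""
    cve_id: str
    severity: str
    opened: date
    status: Optional[str] = None
    rationale: str = ""


def triage_problems(findings: list[Finding], today: date) -> list[str]:
    """Return one message per finding that violates the triage process."""
    problems = []
    for f in findings:
        if f.status not in VALID_STATUSES:
            problems.append(f"{f.cve_id}: no triage status")
        elif f.status == "accepted" and not f.rationale.strip():
            problems.append(f"{f.cve_id}: accepted without rationale")
        elif (f.severity == "critical" and f.status == "in-progress"
              and today - f.opened > CRITICAL_SLA):
            problems.append(f"{f.cve_id}: critical past 14-day SLA")
    return problems
```

Running a check like this on a schedule turns "ignored because there are too many" into an explicit list someone has to own.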
CI Hardening
If your CI is compromised, so is your supply chain:
- Ephemeral runners: fresh runner per job, no state carries between jobs.
- OIDC for cloud auth: no long-lived secrets in CI. GitHub Actions / GitLab CI both support OIDC to AWS/GCP/Azure.
- Pin Actions / Plugins to commit SHA, not version tags (mutable). actions/checkout@v4 can be hijacked; actions/checkout@abc123... cannot.
- permissions: read-all by default, write only where needed.
- Branch protection: require reviewed PRs to merge to main.
- Required workflows: GitHub Enterprise can enforce certain workflows run.
- Forks treated as untrusted: PRs from forks don't get secrets access (default behavior; verify).
The CI is now a production system. Apply production-grade controls to it.
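The SHA-pinning rule is easy to lint for. The sketch below scans workflow text for `uses:` references that do not end in a full 40-character commit SHA. It is a text-level check rather than a YAML parser, the function name is hypothetical, and it deliberately skips local `./` actions, which have no ref to pin.

```python
import re

# Matches "uses: <ref>" with or without a leading list dash.
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s#]+)", re.MULTILINE)
# A pinned reference ends in "@" followed by a full 40-hex commit SHA.
FULL_SHA_RE = re.compile(r"@[0-9a-f]{40}$")


def unpinned_actions(workflow_text: str) -> list[str]:
    """Return action references that are not pinned to a full commit SHA."""
    unpinned = []
    for raw in USES_RE.findall(workflow_text):
        ref = raw.strip("'\"")
        if ref.startswith("./"):
            continue  # local action in the same repository
        if not FULL_SHA_RE.search(ref):
            unpinned.append(ref)
    return unpinned


if __name__ == "__main__":
    import sys
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as handle:
            for ref in unpinned_actions(handle.read()):
                print(f"{path}: not pinned to a SHA: {ref}")
```

Run against the workflow files in a repository, it could serve as a pre-merge check; a trailing comment with the human-readable version keeps pinned references reviewable.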
Registry Hygiene
The container registry is high-value. Protect:
- Authentication required for push; pull may be public if intentional.
- Image scanning at registry level where the registry offers it (Harbor and ECR have built-in scanners); for registries without one, such as GHCR, scan in CI before push.
- Immutable tags for releases: v1.2.3 can never be re-pushed.
- Quarantine new images until scan completes; only verified images move to production paths.
- Cleanup policy: delete old, untagged images; reduces attack surface and storage cost.
- Replicate to secondary registry for DR. Don't lose the ability to deploy because GitHub Container Registry is down.
Documentation
Auditors will ask:
- How are images signed? Point to the workflow file.
- How is signing verified? Point to the Kyverno / Sigstore policy.
- Show the chain of custody for v1.2.3. cosign tree, cosign verify-attestation, output stored.
- Show vulnerability triage process. SLA doc; example tickets.
- Show response to log4j. SBOM query; affected services; patch timeline.
Document each. The documents are the evidence of process, separate from the technical controls.
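The log4j question ("which services are affected?") reduces to a query over stored SBOMs. This sketch assumes SBOMs in CycloneDX JSON, already parsed into dictionaries and keyed by service name; it only inspects the top-level `components` array, so nested components would need a recursive walk. The function name and the service names are hypothetical.

```python
def services_using(sboms: dict[str, dict], component_name: str) -> dict[str, list[str]]:
    """Map each service to the versions of `component_name` found in its SBOM.

    `sboms` maps a service name to a parsed CycloneDX JSON document.
    Services that do not contain the component are omitted from the result.
    """
    affected: dict[str, list[str]] = {}
    for service, sbom in sboms.items():
        versions = sorted({
            component.get("version", "unknown")
            for component in sbom.get("components", [])
            if component.get("name") == component_name
        })
        if versions:
            affected[service] = versions
    return affected


if __name__ == "__main__":
    example = {
        "billing": {"components": [{"name": "log4j-core", "version": "2.14.1"}]},
        "frontend": {"components": [{"name": "react", "version": "18.2.0"}]},
    }
    print(services_using(example, "log4j-core"))
```

An SBOM hub such as Dependency-Track answers the same question through its own interface; the point of the sketch is that the answer should take minutes, and the stored output is the audit evidence.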
Compliance Mapping
Match controls to frameworks:
| Framework | Control | Supply chain answer |
|---|---|---|
| SOC 2 CC6.6 | Logical access | Signed images + admission policy |
| SOC 2 CC8.1 | Change management | SLSA provenance + GitOps |
| NIST 800-218 PS.3.1 | Archive software | SBOM + immutable tags + signed |
| NIST 800-218 PW.4.1 | Approved deps | Lock files + scan in CI |
| EU CRA | Vulnerability handling | Dependency-Track + VEX + patch SLA |
| EO 14028 | SBOM for federal | Syft-generated SBOM, signed, retained |
A single supply chain practice answers many controls. Document the mapping; auditors are grateful.
What to Bypass and What Not To
The pipeline must remain operable under stress. Decide upfront:
- Emergency hotfix: still goes through CI + signing, even if expedited.
- Off-hours deploy without approver: explicit break-glass procedure, logged, reviewed next business day.
- Signing infrastructure down: cache last good signatures; don't route around signing entirely.
- Sigstore public infrastructure down: decide in advance whether you run your own Rekor/Fulcio instance or rely on a vendor fallback.
Decide before you need to. "We'll figure it out when it breaks" produces unsigned production deploys.
Common Pitfalls
Theater without enforcement. Signing every image, but admission doesn't verify. Verify before celebrating.
Identity not pinned correctly. A check that only confirms "an image is signed" says nothing about who signed it, and an overly broad identity pattern has the same effect. Always pin to a specific OIDC issuer + identity regexp.
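The difference between a loose and a tight identity pattern is easiest to see side by side. The sketch below only illustrates regexp tightness in plain Python; it is not how cosign or an admission controller evaluates identities. The organisation name `example-org`, the workflow file name, and both patterns are hypothetical; the issuer URL is the one GitHub Actions uses for OIDC tokens.

```python
import re

EXPECTED_ISSUER = "https://token.actions.githubusercontent.com"

# Too loose: matches any identity that merely contains the org name.
LOOSE_PATTERN = r".*example-org.*"
# Tighter: anchored, one org, one workflow file, release tags only.
TIGHT_PATTERN = (
    r"^https://github\.com/example-org/[^/]+/"
    r"\.github/workflows/release\.yml@refs/tags/v.*$"
)


def identity_ok(issuer: str, identity: str, pattern: str) -> bool:
    """Accept a signer only if the issuer matches exactly and the identity fits."""
    return issuer == EXPECTED_ISSUER and re.search(pattern, identity) is not None


if __name__ == "__main__":
    attacker = ("https://github.com/attacker/example-org-fork/"
                ".github/workflows/release.yml@refs/heads/main")
    print("loose accepts attacker:", identity_ok(EXPECTED_ISSUER, attacker, LOOSE_PATTERN))
    print("tight accepts attacker:", identity_ok(EXPECTED_ISSUER, attacker, TIGHT_PATTERN))
```

The same reasoning applies to whatever pattern goes into the verification command or admission policy: anchor it, escape the dots, and name the workflow and ref that are allowed to sign.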
SBOM stale or wrong tool. Building a fresh SBOM is fast; using one from last quarter for today's image is meaningless.
Block on every CVE. Critical & high might block; medium should rate-limit; low should track. Otherwise you block production for cosmetic issues.
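The severity tiers described here can be written as a small policy table. This is a sketch of the tiers exactly as stated (critical and high block, medium rate-limits, low tracks); the names are hypothetical, what "rate-limit" means operationally is left to your process, and treating unknown severities as track-only is an assumption you may want to reverse.

```python
from enum import Enum


class Action(Enum):
    BLOCK = "block"
    RATE_LIMIT = "rate-limit"
    TRACK = "track"


# Severity tiers as described in the paragraph above.
POLICY: dict[str, Action] = {
    "critical": Action.BLOCK,
    "high": Action.BLOCK,
    "medium": Action.RATE_LIMIT,
    "low": Action.TRACK,
}


def action_for(severity: str) -> Action:
    """Look up the action for a severity; unknown values are tracked (assumption)."""
    return POLICY.get(severity.strip().lower(), Action.TRACK)


def should_block(severities: list[str]) -> bool:
    """True if any finding in the scan result warrants blocking."""
    return any(action_for(s) is Action.BLOCK for s in severities)
```

Keeping the table in version control makes changes to the blocking threshold reviewable like any other policy change.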
Forgetting base images. Your code is clean. The base image has 30 CVEs. Plan base image refresh as a continuous task.
Treating signing as a one-time effort. Signing infrastructure breaks (cert expiry, registry change, etc.). Monitor signing success rate the way you monitor build success rate.
Hidden direct-pulls. A CI step that does pip install from PyPI bypasses your proxy and signing chain. Audit the actual network calls.
Verification-only in CI, not on deploy. Verifying in CI proves the build is clean. Verifying at admission proves what's deployed is clean. Do both.
Continuous Improvement
The threat landscape evolves. Treat supply chain security as a continuous practice:
- Quarterly threat model review
- Pen tests targeting the build pipeline
- Subscribe to OSV, GitHub Security advisories
- Monitor Sigstore project security advisories
- Keep tooling (cosign, syft, grype) updated; old versions miss new vuln formats
Checklist
Supply chain security production readiness:
- Every production image is signed
- Admission policy verifies signatures (Kyverno / Sigstore Policy Controller)
- Identity expectations pinned (OIDC issuer + identity regex)
- SBOM generated per build, attached as attestation
- SBOM hub (Dependency-Track / Anchore) ingesting all SBOMs
- Vulnerability scan in CI; critical/high blocks merge
- VEX statements for accepted-risk CVEs
- SLSA provenance attestations generated (Level 2+)
- Caching proxy for upstream registries
- Lockfile + integrity hashes for all package managers
- CI: ephemeral runners, OIDC, pinned Actions by SHA
- Keyless signing or HSM-backed keys (no long-lived plaintext keys)
- Registry: immutable tags, scanning, replication
- Documented incident response: "we found CVE X, how do we know who's affected?"
- Quarterly threat model review and metric review
What's Next
You have a supply chain practice. Connect it to:
- CI/CD — signing belongs in the pipeline
- Policy as Code — admission policies enforce signatures
- GitOps — Git is the trusted source; signatures bind artifacts to source
- Secrets — Vault holds CI tokens and signing keys (if any)
- Internal Developer Platforms — golden paths emit signed-by-default services