Server hardening, audit-grade vs reality
The CIS Benchmarks for Linux run to 200+ pages. Most of the content is genuinely useful; some of it is dated or counterproductive in modern containerised environments. Below is the prioritised list.
Layer 1 — the must-do
- No password SSH; key auth only
- No root SSH; sudo with logging
- Default-deny firewall, on-host
- Patches automatic, reboots scheduled
- Disk encryption where physical access is plausible
Without these, the rest doesn't matter.
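The first two items can be checked mechanically. Below is a minimal sketch, assuming the server's sshd configuration is available as text; the `REQUIRED` policy table and the function names are illustrative, not part of any tool. It reads only global settings and ignores `Include` files and `Match` blocks, so treat a clean result as a hint rather than proof.

```python
def effective_sshd_options(config_text):
    """Collect global sshd options; sshd keeps the first value it sees per keyword."""
    options = {}
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        key, value = parts[0].lower(), parts[1].strip()
        if key == "match":
            break  # Match blocks are conditional; this sketch stops at the first one
        options.setdefault(key, value)
    return options


# Hypothetical policy for this sketch: keyword -> required value
REQUIRED = {
    "passwordauthentication": "no",
    "permitrootlogin": "no",
}


def ssh_findings(config_text):
    """Return a list of human-readable deviations from REQUIRED."""
    options = effective_sshd_options(config_text)
    findings = []
    for key, wanted in REQUIRED.items():
        actual = options.get(key)
        if actual is None:
            findings.append(f"{key}: not set explicitly (wanted {wanted})")
        elif actual.lower() != wanted:
            findings.append(f"{key}: {actual} (wanted {wanted})")
    return findings


sample = """
# sample only
PermitRootLogin prohibit-password
PasswordAuthentication no
"""
print(ssh_findings(sample))
```

Feeding it the output of `sshd -T`, which prints the effective configuration, sidesteps most of the parsing caveats.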
Layer 2 — process and runtime
- AppArmor / SELinux — enabled, in enforcing mode
- Namespaces / cgroups — for any service running unprivileged user-space code. Containerisation gives this for free
- capabilities — drop all, add only what's needed
- seccomp — syscall filtering. Default profiles from runc / podman cover most cases
- No-new-privs flag — defangs setuid attacks
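Several of these settings are visible per process in `/proc/<pid>/status` on Linux, via the `CapEff`, `NoNewPrivs` and `Seccomp` fields. The sketch below parses that text and summarises the posture; the sample input and function names are made up for illustration, and mapping capability bit numbers to names is left out.

```python
SECCOMP_MODES = {"0": "disabled", "1": "strict", "2": "filter"}


def parse_proc_status(status_text):
    """Turn 'Key:<whitespace>value' lines into a dict."""
    fields = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def runtime_posture(status_text):
    """Summarise capabilities, no-new-privs and seccomp for one process."""
    fields = parse_proc_status(status_text)
    cap_eff = int(fields.get("CapEff", "0"), 16)  # hex bitmask of effective capabilities
    return {
        "effective_capability_bits": [b for b in range(64) if (cap_eff >> b) & 1],
        "no_new_privs": fields.get("NoNewPrivs") == "1",
        "seccomp": SECCOMP_MODES.get(fields.get("Seccomp"), "unknown"),
    }


# Illustrative sample; on a real host read /proc/<pid>/status instead.
sample_status = "\n".join([
    "Name:\tdemo",
    "CapEff:\t0000000000000400",  # only bit 10 set (CAP_NET_BIND_SERVICE)
    "NoNewPrivs:\t1",
    "Seccomp:\t2",
])
print(runtime_posture(sample_status))
```

A service that should hold no capabilities at all shows an all-zero `CapEff`; anything else is worth a look.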
Layer 3 — supply chain
Every binary on the production server got there from somewhere. The chain:
- OS packages from upstream repos with signature verification
- Container images from a private registry, signed (cosign / sigstore)
- Application binaries from your own CI, signed
- Build provenance recorded (SLSA, in-toto)
This is the area attackers increasingly target: supply chain compromise via a malicious dependency, a compromised maintainer, or a typosquatted package.
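The simplest link in that chain is refusing to deploy an artifact whose digest differs from the one recorded at build time. The sketch below shows only that digest comparison; it is not signature verification of the kind cosign performs, and the function names are illustrative.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the artifact bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_pinned_digest(artifact: bytes, pinned_hex: str) -> bool:
    """True only if the artifact hashes to the digest recorded at build time."""
    return hmac.compare_digest(sha256_hex(artifact), pinned_hex.lower())


# Illustrative data: in a real pipeline the pinned digest comes from CI, not from the artifact itself.
artifact = b"example build output"
pinned = sha256_hex(artifact)

print(matches_pinned_digest(artifact, pinned))                # True
print(matches_pinned_digest(artifact + b" tampered", pinned))  # False
```

A digest proves the bytes are unchanged; it says nothing about who produced them, which is what signing and provenance add.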
Layer 4 — runtime detection
- Falco — kernel-level rule-based behavioural detection. Open-source, mature
- Tetragon — eBPF-based, similar role, more performant
- auditd — kernel-level audit logging. Lower-level signal, requires interpretation
Layer 5 — segmentation
- Network segmentation — production servers in a VPC / subnet that's not directly internet-reachable
- Bastion / jump-box for admin access, with strong authentication
- Service-to-service auth — every internal call authenticated
- Database isolation — only application servers reach the DB
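The database isolation rule is easy to state and easy to let drift. As a rough sketch, assuming ingress rules have been exported as a list of source CIDR and port pairs (a made-up format, not any cloud provider's API), the check below flags any rule that opens the database port to a source outside the application subnet.

```python
import ipaddress


def unexpected_db_sources(ingress_rules, app_subnet, db_port):
    """Return source CIDRs that can reach db_port but are not inside app_subnet."""
    allowed = ipaddress.ip_network(app_subnet)
    findings = []
    for rule in ingress_rules:
        if rule["port"] != db_port:
            continue
        source = ipaddress.ip_network(rule["source"])
        # subnet_of() raises on mixed IPv4/IPv6, so compare versions first
        if source.version != allowed.version or not source.subnet_of(allowed):
            findings.append(rule["source"])
    return findings


# Hypothetical rule export for a PostgreSQL host
rules = [
    {"source": "10.0.1.0/24", "port": 5432},  # application subnet: expected
    {"source": "0.0.0.0/0", "port": 5432},    # open to the world: finding
    {"source": "10.0.9.7/32", "port": 22},    # not the DB port: ignored here
]
print(unexpected_db_sources(rules, "10.0.1.0/24", 5432))
```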
Things often included that don't move the needle much
- Disabling unused TCP / IPv6 features
- Custom hardening of /tmp permissions
- Aggressive sysctl tweaks (some are dated)
- Hiding kernel version banners
If you have a CIS Benchmark mandate from compliance, run the benchmark script in full. If not, prioritise the layers above.
Container hardening
- Read-only root filesystem
- Drop all capabilities; add only what's needed
- Run as non-root
- Resource limits (cpu, memory)
- Image scanning at build + at registry pull
- Pin image digest, not :latest
- Distroless or minimal base images
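Most of this list can be linted before deployment. The sketch below assumes a simplified, hypothetical container spec expressed as a dict; the field names (`read_only_root`, `cap_drop`, `cpu_limit` and so on) are invented for illustration and would need mapping to whatever your orchestrator actually uses.

```python
def container_findings(spec):
    """Return hardening gaps for one container spec (hypothetical schema)."""
    findings = []
    if "@sha256:" not in spec.get("image", ""):
        findings.append("image is not pinned by digest")
    if not spec.get("read_only_root", False):
        findings.append("root filesystem is writable")
    # user may be "uid" or "uid:gid"; missing means the image default, assumed root here
    if str(spec.get("user", "0")).split(":")[0] in ("", "0", "root"):
        findings.append("container runs as root")
    if "ALL" not in spec.get("cap_drop", []):
        findings.append("capabilities are not dropped by default")
    for limit in ("cpu_limit", "memory_limit"):
        if not spec.get(limit):
            findings.append(f"{limit} is not set")
    return findings


# Hypothetical spec with three gaps
spec = {
    "image": "registry.example.com/app:latest",
    "user": "1000:1000",
    "cap_drop": ["ALL"],
    "cap_add": ["NET_BIND_SERVICE"],
    "memory_limit": "256m",
}
for finding in container_findings(spec):
    print(finding)
```

Run as a CI step, a check like this catches regressions before they reach the registry.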
Cloud-specific
- IAM principle-of-least-privilege — most cloud breaches start with over-permissioned IAM
- No long-lived API credentials in code; OIDC federation or short-lived tokens
- VPC flow logs enabled
- CloudTrail / Cloud Audit Logs / Activity Logs enabled with retention
- No publicly-readable storage buckets unless explicitly intentional
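Over-permissioned IAM is often visible in the policy document itself. The sketch below assumes an AWS-style JSON policy already loaded into a dict and flags `Allow` statements that pair a wildcard action with a wildcard resource. It deliberately ignores `NotAction`, `NotResource` and conditions, so it is a first-pass filter, not an analyser.

```python
def as_list(value):
    """Policy fields may be a single value or a list; normalise to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def broad_statements(policy):
    """Return (statement index, wildcard actions) for Allow statements on Resource '*'."""
    findings = []
    for index, stmt in enumerate(as_list(policy.get("Statement"))):
        if stmt.get("Effect") != "Allow":
            continue
        actions = as_list(stmt.get("Action"))
        resources = as_list(stmt.get("Resource"))
        wildcard_actions = [a for a in actions if a == "*" or a.endswith(":*")]
        if wildcard_actions and "*" in resources:
            findings.append((index, wildcard_actions))
    return findings


# Hypothetical policy: the second statement is the problem
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:GetObject",
         "Resource": "arn:aws:s3:::example-bucket/*"},
        {"Effect": "Allow", "Action": ["iam:*"], "Resource": "*"},
    ],
}
print(broad_statements(policy))
```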
The "compromise" tabletop
Every six months, run a tabletop exercise:
- "An attacker has shell on one production server. What can they reach?"
- "An attacker has compromised one IAM user. What can they do?"
- "An attacker has a developer's laptop. What credentials do they have?"
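Each of these questions is a reachability question, so a rough access graph makes the exercise concrete. The sketch below is a plain breadth-first search over a hand-written graph; every node name and edge is hypothetical, and in practice the edges would come from firewall rules, IAM bindings and credential inventories.

```python
from collections import deque


def reachable_from(start, edges):
    """Return every node an attacker at `start` can reach by following edges."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)
    return seen


# Hypothetical access graph: "A can use or reach B"
edges = {
    "web-1": ["app-db-credentials", "internal-api"],
    "app-db-credentials": ["orders-db"],
    "internal-api": ["queue"],
    "dev-laptop": ["ci-token"],
    "ci-token": ["artifact-registry"],
}
print(sorted(reachable_from("web-1", edges)))
print(sorted(reachable_from("dev-laptop", edges)))
```

If the answer for a single web server includes the orders database, that is the finding to take into the next planning cycle.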
One pattern we'd warn about
Hardening that the team can't operate. A locked-down box that engineers can't troubleshoot is a box that gets unlocked during the next incident.
One pattern that always pays off
Documenting what's hardened and how to interact with it safely. Saves hours during incidents.
What hardening item has actually saved you in a real incident?