The IaC question, untangled
"What infrastructure-as-code should we use" is asked as one question but it's actually three: which provisioning tool (creates resources), which configuration tool (configures already-created resources), and which orchestration tool (sequences both). The wars over Terraform vs Pulumi vs CloudFormation are mostly the first; the wars over Ansible vs Chef vs Puppet are the second.Below is the practical structure.
Provisioning: Terraform, Pulumi, OpenTofu, CloudFormation
- Terraform (HCL) — the de-facto industry standard, broad provider coverage, mature ecosystem. After the licence change, OpenTofu (the open-source fork) is increasingly the choice for teams uncomfortable with HashiCorp's BSL.
- OpenTofu — drop-in Terraform replacement, same configuration language, now governed by the Linux Foundation. The path forward for many teams.
- Pulumi — IaC in real programming languages (Python, TypeScript, Go). For teams that find HCL constraining, especially with complex logic. State / provider model similar to Terraform.
- AWS CloudFormation / CDK — AWS-native, deep integration. CDK lets you write CloudFormation in real languages. Use when AWS-only and the team prefers AWS-native patterns.
- Azure Bicep / ARM templates — same idea for Azure.
- Crossplane — declares cloud resources via Kubernetes CRDs. Niche but increasingly relevant in k8s-centric shops.
The pick mostly tracks the team's preference and existing investment. None of them is wrong; all of them have caveats.
Configuration: Ansible, Salt, Chef, Puppet
- Ansible — the lightweight default. Agentless, YAML-based, broad module coverage. Most "configure these servers" tasks fit.
- Salt — fast at scale, pull or push model. Less common in 2026 than at its peak.
- Chef / Puppet — the older agent-based options. Still common in enterprise; less common in greenfield.
For most modern infrastructure (containerised, immutable), the configuration management layer is shrinking. The OS image is built once, the application runs in a container, and the "configure existing servers" use case is mostly bootstrap plus the rare day-2 task. Ansible covers it.
The state file: the thing that surprises new teams
Terraform / OpenTofu / Pulumi all maintain state. State is:
- Sensitive (contains resource configurations, including secrets with some providers)
- Concurrent (multiple developers can apply against it; lock or chaos)
- Versioned (you want to be able to recover from a corrupted state)
Default state on local disk is wrong for any production use. Patterns that work:
- HCP Terraform (formerly Terraform Cloud) — managed state with locking and audit
- Self-hosted backend (S3 with a DynamoDB lock table or, in newer releases, native S3 lockfile locking; GCS; Azure Storage) — common, well-documented, requires careful setup
- Spacelift / env0 / Terragrunt — workflow tools that wrap state management
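To make the locking requirement concrete, here is a minimal Python sketch of the exclusive-lock idea that a remote backend provides for you. It is an illustration under stated assumptions, not how any particular backend works internally: the `state_lock` helper, the `StateLockedError` class, and the lock-file path are all hypothetical.

```python
import os
import tempfile
from contextlib import contextmanager


class StateLockedError(RuntimeError):
    """Raised when another run already holds the lock (hypothetical error type)."""


@contextmanager
def state_lock(lock_path: str, owner: str):
    """Hold an exclusive lock file for the duration of an apply (hypothetical helper)."""
    try:
        # O_CREAT | O_EXCL makes creation atomic: it fails if the file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        with open(lock_path) as f:
            holder = f.read().strip()
        raise StateLockedError(f"state is locked by {holder!r}") from None
    # Record who holds the lock so a blocked run can report it.
    with os.fdopen(fd, "w") as f:
        f.write(owner)
    try:
        yield
    finally:
        # Release the lock even if the apply fails.
        os.remove(lock_path)


if __name__ == "__main__":
    lock_file = os.path.join(tempfile.mkdtemp(), "prod.tfstate.lock")
    with state_lock(lock_file, "alice"):
        try:
            with state_lock(lock_file, "bob"):
                pass
        except StateLockedError as err:
            print(err)
```

The point of the sketch is the failure mode: the second run is refused rather than silently overwriting the first. A local lock file like this only protects one machine, which is why shared state needs a backend that locks centrally.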
Modular architecture
A flat IaC repo at scale becomes unmaintainable. The pattern:
- Modules — reusable units, each with a clear interface (inputs, outputs, side-effects)
- Stacks / environments — instances of modules per environment
- Composition — environment configs reference modules at pinned versions
- Module testing — Terratest or equivalent. Modules are software, they need tests.
The 30-environment monorepo with one main.tf is the anti-pattern most teams have. Refactoring out of it is a meaningful project.
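The "pinned versions" rule can be checked mechanically in CI. Below is a small Python sketch, assuming internal modules are referenced through git sources with a `?ref=` query and that release tags look like `v1.2.3`. The function name `unpinned_sources` and the example URLs are hypothetical; adapt the pattern to whatever tagging scheme your modules use.

```python
import re

# Assumption: a "pinned" source ends in ?ref=vMAJOR.MINOR.PATCH (an exact release tag).
PINNED_REF = re.compile(r"[?&]ref=v\d+\.\d+\.\d+$")


def unpinned_sources(sources: dict[str, str]) -> list[str]:
    """Return the names of modules whose git source is not pinned to an exact tag."""
    return sorted(
        name for name, src in sources.items() if not PINNED_REF.search(src)
    )


if __name__ == "__main__":
    # Hypothetical module sources for one environment.
    env_modules = {
        "network": "git::https://example.com/infra/network.git?ref=v1.4.2",
        "database": "git::https://example.com/infra/database.git?ref=main",
        "cache": "git::https://example.com/infra/cache.git",
    }
    print(unpinned_sources(env_modules))  # ['cache', 'database']
```

A branch name such as `main` counts as unpinned here on purpose: it moves, so two applies on different days can pull different module code.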
Drift: the leak that becomes a flood
"Drift" = the actual cloud state diverges from what IaC says it should be. Causes:- Manual console changes by an engineer in a hurry
- External automation (e.g. autoscaling) that the IaC doesn't model
- Provider bugs that miscompute the diff
The discipline:
- Plan + apply via pipelines, not local. Local applies are auditable only if the engineer remembers to record them.
- Drift detection job — runs nightly, alerts on diffs.
- Console access tightly limited; "break-glass" only.
- Anything autoscaling or cloud-managed is documented in the IaC as ignored attributes.
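A nightly drift job can be quite small. The sketch below assumes the Terraform or OpenTofu CLI is on the PATH and relies on the `-detailed-exitcode` flag of `plan`, which exits with 0 for no changes, 1 for an error, and 2 when changes are present. The function names and the status labels are hypothetical, and alerting is left out.

```python
import subprocess

# Exit codes for `plan -detailed-exitcode`:
# 0 = no changes, 1 = error, 2 = changes present.
# Note: "changes present" also covers merged code that has not been applied yet.
STATUS_BY_EXIT_CODE = {0: "clean", 1: "error", 2: "drift"}


def classify_plan(exit_code: int) -> str:
    """Map a plan exit code to a status label; unknown codes count as errors."""
    return STATUS_BY_EXIT_CODE.get(exit_code, "error")


def check_drift(workdir: str, binary: str = "terraform") -> str:
    """Run a plan in `workdir` and classify the result. `binary` may be "tofu"."""
    result = subprocess.run(
        [binary, "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return classify_plan(result.returncode)
```

A scheduled pipeline could call `check_drift(path)` for each stack and alert on anything other than `"clean"`. Treating errors as alert-worthy matters: a drift job that fails quietly looks the same as one that found nothing.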
Secret handling in IaC
- Don't put secrets in tfvars or HCL files committed to git
- Pull from secret managers (AWS Secrets Manager, Vault, GCP Secret Manager) at apply time
- The state file does contain secrets — the backend storage must be encrypted at rest and access-controlled
- Rotate secrets, including provider credentials. The "static credentials in CI" pattern is increasingly being replaced with OIDC-federated short-lived credentials. Adopt it.
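The first rule is easy to enforce with a pre-commit or CI check. Here is a heuristic Python sketch that flags string literals assigned to secret-looking variable names in tfvars-style text. The key pattern and the `find_literal_secrets` name are assumptions for illustration; this is not a substitute for a dedicated secret scanner.

```python
import re

# Assumption: variables whose names contain these words should never carry literals in git.
SUSPECT_KEY = re.compile(r"(password|secret|token|api_key|private_key)", re.IGNORECASE)
# Matches simple `name = "literal"` assignments as they appear in .tfvars files.
ASSIGNMENT = re.compile(r'^\s*([A-Za-z_][A-Za-z0-9_-]*)\s*=\s*"([^"]*)"')


def find_literal_secrets(tfvars_text: str) -> list[tuple[int, str]]:
    """Return (line number, variable name) for suspicious non-empty string literals."""
    hits = []
    for lineno, line in enumerate(tfvars_text.splitlines(), start=1):
        match = ASSIGNMENT.match(line)
        if match and SUSPECT_KEY.search(match.group(1)) and match.group(2):
            hits.append((lineno, match.group(1)))
    return hits


if __name__ == "__main__":
    sample = 'region = "eu-west-1"\ndb_password = "hunter2"\napi_token = ""\n'
    print(find_literal_secrets(sample))  # [(2, 'db_password')]
```

Empty strings are allowed through so that placeholder variables can stay in the file while the real value is pulled from a secret manager at apply time.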
One pattern we'd warn about
"Code golf" in HCL or Pulumi. A clever 5-line generator producing 50 resources is a debugging nightmare. Verbose, explicit, boring IaC is the maintainable IaC.One pattern that pays off
Versioning modules properly. Tag releases of internal modules. Environments pin to specific versions. Upgrades are deliberate, not "whoever ran apply most recently".
What's your IaC stack? And, for the OpenTofu folks, has the migration from Terraform been smoother than expected, or are there gotchas to know?