My job involves working with customers who are adopting (or increasing their adoption of) the AWS Cloud. Often I encounter confusion about how to do Infrastructure as Code (IaC) well.

It’s not uncommon for developers to have adopted an IaC approach for part of their infrastructure, but to have reverted to clicking in the console to create other infrastructure because they couldn’t figure out how to do everything using IaC.

This article aims to give you a way of thinking about IaC that lets you use it to build every part of your application’s infrastructure.

Some background

Infrastructure as Code (IaC) has been around for more than a decade and has its origins in tools like CFEngine, Puppet and Chef. Over time, IaC’s value as the key that unlocks many of the benefits of Cloud computing has been increasingly understood.

[Figure: Google Trends result for “Infrastructure as Code” over 5 years to Sept 2020]

IaC is the practice of defining the resources needed for an IT system (eg virtual servers, databases, etc) in version-controlled, human- and machine-readable files. IaC files can be written in either a declarative data format (like YAML or JSON) or a programming language (like Python or C#).

When it’s time to create the infrastructure, these IaC files are processed by a system that turns the definitions within them into commands that are issued to the target platform, which create the requested infrastructure. In AWS CloudFormation, this creates what’s referred to as a stack.

[Figure: The CloudFormation workflow]

Everything in a stack is managed as a unit. Typically an application environment in AWS will be built using multiple stacks. Information about the resources created in each stack can be exported for other stacks to refer to if required, eg you can export a VPC ID created in a networking stack so subsequent stacks can create resources in that VPC.
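To make the export/import idea concrete, here is a minimal sketch that builds two CloudFormation templates as Python dictionaries and prints them as JSON. The logical IDs (AppVpc, AppSecurityGroup), the CIDR range and the export name network-VpcId are made up for illustration. Only the Outputs/Export section and the Fn::ImportValue function are the parts that matter.

```python
import json

# Hypothetical networking stack: creates a VPC and exports its ID by name.
network_template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "AppVpc": {
            "Type": "AWS::EC2::VPC",
            "Properties": {"CidrBlock": "10.0.0.0/16"},
        }
    },
    "Outputs": {
        "VpcId": {
            "Value": {"Ref": "AppVpc"},
            # The export name must be unique within an account and region.
            "Export": {"Name": "network-VpcId"},
        }
    },
}

# Hypothetical subsequent stack: refers to the exported VPC ID by its export name.
app_template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "AppSecurityGroup": {
            "Type": "AWS::EC2::SecurityGroup",
            "Properties": {
                "GroupDescription": "Security group for the app tier",
                "VpcId": {"Fn::ImportValue": "network-VpcId"},
            },
        }
    },
}

if __name__ == "__main__":
    print(json.dumps(network_template, indent=2))
    print(json.dumps(app_template, indent=2))
```

The networking stack has to be deployed first, and CloudFormation will refuse to delete it while another stack still imports the value.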

CloudFormation helps us manage IaC projects by doing pre-flight checks for invalid syntax, generating change sets and enabling rollback on failure. CloudFormation linters can be run before you deploy your templates to catch a variety of other problems. These tools help us perform IaC deployments reliably and safely by helping us to avoid errors and mishaps along the way.

When to use IaC

The value of IaC goes beyond giving us a way to conveniently orchestrate the creation of infrastructure. Because IaC is reusable, we can reliably redeploy the same stack whenever it’s needed (think regional expansion or disaster recovery). As it’s version-controlled, changes can be managed (eg pull requests) and audited (eg via tools like git blame).

IaC is self-documenting, as the document that created the infrastructure is also the canonical record of what has been created — much more accurate than asking someone to write down what they did to set up an environment.

IaC gives us a reliable and safe way to continuously evolve our infrastructure. Updated IaC can be used to create change sets that can be reviewed to see the impact of the changes before the change is applied.

For all of the above reasons, you should always use IaC to create any infrastructure you plan to use in a production context. Even for prototyping, IaC can help you by making it easy to tear down your experiments once you’re done.

Get into the habit of always building your infrastructure using IaC and you’ll be setting yourself up for success.

Things to be aware of when using IaC

Developers who are new to IaC are sometimes confused about when and how to use IaC. Often, they only adopt it for parts of their infrastructure and end up clicking around in the console to create the other services they need. This can leave you with a mish-mash of undocumented stuff in your account that’s hard to manage.

You should be especially careful not to use the console to manually modify resources that were created using IaC. Always update them by updating your IaC definition and applying it properly through CloudFormation to avoid drift issues rendering your stack unmanageable.

Some kinds of resource updates will cause the previously created resources to be replaced. This can be a nasty surprise if, for instance, you make a change to an RDS instance and that change requires the existing instance to be replaced. It’s important to be aware of the update behaviours of different resources and to understand how to use a DeletionPolicy (and its counterpart for replacements, UpdateReplacePolicy) to protect key resources. Change sets can also help you to manage this risk.
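As a rough sketch of both safeguards, the fragment below marks a database with DeletionPolicy and UpdateReplacePolicy, and then scans a change set description for resources that would be replaced or removed. The sample data imitates the shape of the output of `aws cloudformation describe-change-set`. The logical IDs and the helper name risky_changes are my own invention, and the resource properties are deliberately incomplete.

```python
# Template fragment: keep a snapshot if the database is deleted or replaced.
# Other required properties (credentials and so on) are omitted for brevity.
protected_database = {
    "AppDatabase": {
        "Type": "AWS::RDS::DBInstance",
        "DeletionPolicy": "Snapshot",
        "UpdateReplacePolicy": "Snapshot",
        "Properties": {
            "Engine": "postgres",
            "DBInstanceClass": "db.t3.micro",
            "AllocatedStorage": "20",
        },
    }
}


def risky_changes(change_set):
    """Return logical IDs of resources the change set would (or might) replace or remove."""
    risky = []
    for change in change_set.get("Changes", []):
        rc = change.get("ResourceChange", {})
        # Replacement is reported as the string "True", "False" or "Conditional".
        replaced = rc.get("Action") == "Modify" and rc.get("Replacement") in ("True", "Conditional")
        if replaced or rc.get("Action") == "Remove":
            risky.append(rc["LogicalResourceId"])
    return risky


# Hand-written sample in the shape of a describe-change-set response.
sample_change_set = {
    "Changes": [
        {"Type": "Resource", "ResourceChange": {
            "Action": "Modify", "LogicalResourceId": "AppDatabase",
            "ResourceType": "AWS::RDS::DBInstance", "Replacement": "True"}},
        {"Type": "Resource", "ResourceChange": {
            "Action": "Modify", "LogicalResourceId": "AppQueue",
            "ResourceType": "AWS::SQS::Queue", "Replacement": "False"}},
    ]
}

if __name__ == "__main__":
    print(risky_changes(sample_change_set))
```

A check like this could be run in a pipeline before a change set is executed, so that a replacement of a stateful resource needs a human to approve it.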

Thinking of your infrastructure in terms of its longevity leads us to IaC modalities.

IaC modalities

So how do we build a holistic approach to IaC? The first step is to understand that IaC works in two key modalities, and that you therefore need to design an approach for each.

The first modality is Persistent IaC. This is used for creating long-running infrastructure (such as databases, queues, networks & IAM roles). You will tend to change these things infrequently once they have been created.

The second is Ephemeral IaC. It is used for building temporary (and preferably immutable) infrastructure (eg virtual machines, containers, serverless functions) that supports the current version of an application. This infrastructure will be created, updated and destroyed regularly.

By having separate stacks for persistent and ephemeral infrastructure, you minimise the “blast radius” of any failed changes you make. It can also help to provide a logical structure to your infrastructure and make it easier to support.
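One way to picture the split is as a simple sorting exercise over resource types. The sketch below is only an illustration: the two sets reflect the examples given above, but which bucket a given resource belongs in is a judgment call for your own project, and the function name split_by_modality and the sample logical IDs are hypothetical.

```python
# Illustrative groupings based on the examples in this article, not an official list.
PERSISTENT_TYPES = {
    "AWS::EC2::VPC",
    "AWS::IAM::Role",
    "AWS::RDS::DBInstance",
    "AWS::SQS::Queue",
}
EPHEMERAL_TYPES = {
    "AWS::EC2::Instance",
    "AWS::ECS::TaskDefinition",
    "AWS::Lambda::Function",
}


def split_by_modality(resources):
    """Group a template's Resources section into persistent and ephemeral stacks."""
    stacks = {"persistent": {}, "ephemeral": {}, "unclassified": {}}
    for logical_id, resource in resources.items():
        if resource["Type"] in PERSISTENT_TYPES:
            stacks["persistent"][logical_id] = resource
        elif resource["Type"] in EPHEMERAL_TYPES:
            stacks["ephemeral"][logical_id] = resource
        else:
            # Anything we have not thought about yet needs a deliberate decision.
            stacks["unclassified"][logical_id] = resource
    return stacks


sample_resources = {
    "AppVpc": {"Type": "AWS::EC2::VPC"},
    "AppDatabase": {"Type": "AWS::RDS::DBInstance"},
    "ApiHandler": {"Type": "AWS::Lambda::Function"},
    "AssetsBucket": {"Type": "AWS::S3::Bucket"},
}

if __name__ == "__main__":
    for name, members in split_by_modality(sample_resources).items():
        print(name, sorted(members))
```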

By understanding how to separate your IaC requirements into these two modes, it will be easier for you to pick the right path to 100% IaC coverage for your next project.

Persistent IaC

Persistent IaC defines the infrastructure that is foundational to your account and your applications. It will include shared resources, such as VPCs, KMS keys and IAM roles. It will also typically cover resources such as databases and S3 buckets that will store data throughout the lifetime of your applications. It may define services used to move data around (like SQS) or for messaging (like SNS). Baseline security (eg WAF) and monitoring could logically be established in your persistent IaC stacks.

To build long-running IaC for AWS there are a number of options you could choose, but I recommend using the AWS Cloud Development Kit (CDK) to compose it in a programming language of your choice. The CDK provides a number of high-level constructs that make it quicker and more efficient to build the infrastructure you are likely to need.

Don’t succumb to hyperbolic discounting

For long-running infrastructure, people are sometimes tempted to “avoid the extra work” of defining IaC, thinking the pay-off for something you’ll “only create once” doesn’t justify the effort. This preference for a lesser payoff today compared to one you know will be more valuable later is called hyperbolic discounting, a cognitive bias from which most of us suffer.

[Figure: Put down that mouse and step away from the console…]

If you catch yourself thinking this way, consider this — it’s rare to create infrastructure only once. Test systems, disaster recovery and regional expansion are three cases where being able to spin up the exact same infrastructure more than once will be very useful to you.

In addition, every environment should be evolving over time. Database version upgrades, adding new services, making networking, security or permissions changes — these are all things that can (and should) be safely managed via IaC and you won’t be able to do that if you start off on the wrong foot.

Ephemeral IaC

Ephemeral IaC defines the infrastructure that is tied to your application logic and is deployed via your CI/CD pipelines when you make application changes. It’s handled this way because it is more likely to change in response to application changes, and because it will often create infrastructure that is destroyed and recreated with each new deployment.

There are a number of ways to create infrastructure using IaC as part of your application deployment process.

Elastic Beanstalk

The first, and often simplest, way to start with Ephemeral IaC on AWS is to use Elastic Beanstalk. This is AWS’s platform as a service (PaaS) offering.

Elastic Beanstalk (EB) provides you with a large number of preconfigured compute environments (Java, .NET, PHP, Node.js, Python, Ruby, Go, and Docker) that you can deploy to without having to know much about what’s going on under the hood. EB silently orchestrates your ephemeral application infrastructure for you using pre-defined CloudFormation templates for common configurations.

While the environments have sensible defaults, EB also allows you to customise that infrastructure in a safe, best-practice way. This is achieved through the use of ebextensions: configuration files that allow you to modify the default environment at instance creation time.
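As an example of what such a file can contain, the sketch below generates a small ebextensions configuration in JSON (the files can be written in YAML or JSON and live in a .ebextensions folder with a .config extension). The option values, the LOG_LEVEL variable and the file name mentioned in the comment are assumptions for illustration. The two namespaces are standard Elastic Beanstalk option namespaces.

```python
import json

# Hypothetical content for a file such as .ebextensions/options.config
ebextension = {
    "option_settings": [
        {
            # Sets an environment variable that the application can read.
            "namespace": "aws:elasticbeanstalk:application:environment",
            "option_name": "LOG_LEVEL",
            "value": "info",
        },
        {
            # Keeps at least two instances in the Auto Scaling group.
            "namespace": "aws:autoscaling:asg",
            "option_name": "MinSize",
            "value": "2",
        },
    ]
}


def render_config(config):
    """Serialise the configuration as JSON text ready to be written to a .config file."""
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    print(render_config(ebextension))
```

Because the file is checked in alongside your application, these customisations are version-controlled like the rest of your IaC.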

How EB interacts with Persistent IaC stacks

Your persistent IaC should create the instance of Elastic Beanstalk, including defining the networking configuration, security group settings and instance role that should be attached to the EC2 instances that will be spun up. EB will create ephemeral infrastructure and deploy your application package to it as required. If you want to import a value from a persistent IaC stack, you can compose an ebextension file to do that.

SAM (Serverless Application Model)

If you are looking to go Serverless (and you should), take a look at SAM. SAM is a framework for defining serverless applications.

With SAM you can build Lambda functions, DynamoDB tables, API Gateways and more, all as part of your application and all version controlled in the same repository as your application logic.

When you want to deploy your app, SAM expands your SAM-defined infrastructure into standard CloudFormation and creates it as a stack. You can continue to modify any infrastructure resources that are defined as part of your SAM app and CloudFormation will manage the updates too.

How SAM interacts with Persistent IaC stacks

With SAM, all the infrastructure related to your app is managed in the SAM project. Persistent infrastructure you manage via CDK can be referenced in your SAM project by exporting information from your persistent IaC stack and using the ImportValue intrinsic function to reference those resources in your SAM template.
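Here is a minimal sketch of that pattern, with the SAM template expressed as a Python dictionary (SAM templates are more commonly written in YAML, but JSON works too). The function name, handler, code path and the export name persistent-OrdersQueueUrl are hypothetical. The helper imported_names is my own addition for listing the exports a template depends on.

```python
import json

sam_template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    # The Transform line is what tells CloudFormation to expand SAM resources.
    "Transform": "AWS::Serverless-2016-10-31",
    "Resources": {
        "OrdersFunction": {
            "Type": "AWS::Serverless::Function",
            "Properties": {
                "Handler": "app.handler",
                "Runtime": "python3.12",
                "CodeUri": "src/",
                "Environment": {
                    "Variables": {
                        # Value exported by a (hypothetical) persistent stack.
                        "ORDERS_QUEUE_URL": {"Fn::ImportValue": "persistent-OrdersQueueUrl"}
                    }
                },
            },
        }
    },
}


def imported_names(node):
    """Recursively collect every export name referenced via Fn::ImportValue."""
    names = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "Fn::ImportValue" and isinstance(value, str):
                names.append(value)
            else:
                names.extend(imported_names(value))
    elif isinstance(node, list):
        for item in node:
            names.extend(imported_names(item))
    return names


if __name__ == "__main__":
    print(json.dumps(sam_template, indent=2))
    print(imported_names(sam_template))
```

Listing the imports this way makes the dependency on the persistent stack explicit, which is handy when you need to work out deployment order.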

Amplify

Amplify is an open-source mobile/web app framework developed by AWS. It offers simple integrations with services in AWS, such as storage, authentication (Cognito) and others. In each case, AWS does the IaC “heavy lifting” for you by generating CloudFormation to build those services in AWS, and gives you code snippets to integrate them with your web/mobile front end.

Amplify is a self-contained ecosystem and most users will never need to touch the CloudFormation it generates. You can, in theory, manually edit the CloudFormation files, but unlike with SAM it’s outside the scope of normal operation and some Amplify CLI operations can overwrite changes.

How Amplify interacts with Persistent IaC stacks

Amplify can use the resources created by your persistent IaC, but this is best done using traditional methods like DNS endpoints rather than trying to do so at the IaC level.

ECS, EKS and Lambda

ECS & EKS enable you to deploy onto ephemeral infrastructure by virtue of being managed container hosting platforms. The IaC in this scenario is baked into the service itself, which will manage the creation and termination of ephemeral resources as required to support your container tasks.

Similarly, Lambda will manage all the ephemeral infrastructure required to run your functions.

How ECS, EKS and Lambda interact with Persistent IaC stacks

ECS and EKS would be defined as persistent infrastructure in a persistent IaC stack. Lambda can be defined similarly, but also via SAM, which might be a better way to go in most cases. In either case, you can import values from other persistent stacks as required.

AWS Code tools: CodePipeline, CodeDeploy

Finally, we have the Code tools in AWS that let us build and deploy our app to AWS infrastructure.

CodeDeploy is a deployment service that can orchestrate the deployment of your code to EC2 instances, ECS, Lambda or even on-prem instances. In AWS, it can orchestrate the creation of ephemeral infrastructure to support blue-green deployments. It uses appspec files to define the commands CodeDeploy will execute to deploy your application at different stages in the deployment lifecycle.
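To give a feel for an appspec file, the sketch below generates one for a Lambda deployment in JSON. I've chosen the Lambda flavour because, as far as I know, it can be written in JSON, whereas the EC2/on-prem flavour is a YAML file whose hooks point at scripts rather than functions. All of the function, alias and version values here are placeholders.

```python
import json

# Hypothetical appspec for shifting a Lambda alias from version 1 to version 2.
lambda_appspec = {
    "version": 0.0,
    "Resources": [
        {
            "OrdersFunction": {
                "Type": "AWS::Lambda::Function",
                "Properties": {
                    "Name": "orders-function",
                    "Alias": "live",
                    "CurrentVersion": "1",
                    "TargetVersion": "2",
                },
            }
        }
    ],
    # For Lambda deployments, each hook names a validation function to invoke.
    "Hooks": [
        {"BeforeAllowTraffic": "orders-pre-traffic-check"},
        {"AfterAllowTraffic": "orders-post-traffic-check"},
    ],
}

if __name__ == "__main__":
    print(json.dumps(lambda_appspec, indent=2))
```

The hooks are where you would plug in smoke tests that decide whether the deployment proceeds or rolls back.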

CodePipeline can be used to orchestrate a multi-stage pipeline of steps that can include build, test and deploy stages. There are various CodePipeline deploy actions, including CloudFormation, Elastic Beanstalk, ECS, Service Catalog and CodeDeploy. These are AWS-provided integrations that do the heavy lifting for you to manage the creation of infrastructure within these target services.

How the Code tools interact with Persistent IaC stacks

Using the CloudFormation deploy action in CodePipeline allows you to deploy CloudFormation templates that can import values from other persistent infrastructure stacks. Elsewhere, eg in shell scripts that get executed during build or deployment stages in a pipeline, you can access exported values using the AWS CLI.
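For the scripting case, the sketch below shows one way to pick a value out of the JSON that `aws cloudformation list-exports` prints. The sample output is hand-written to match the shape of that command's response, and the export name, stack ARN and VPC ID in it are invented. The helper export_value is hypothetical, and it ignores pagination (a NextToken in the response), which a real script with many exports would need to handle.

```python
import json

# Hand-written sample in the shape of `aws cloudformation list-exports` output.
SAMPLE_OUTPUT = """
{
  "Exports": [
    {
      "ExportingStackId": "arn:aws:cloudformation:us-east-1:111122223333:stack/network/abc",
      "Name": "network-VpcId",
      "Value": "vpc-0123456789abcdef0"
    }
  ]
}
"""


def export_value(list_exports_json, name):
    """Return the value of the named export, or raise KeyError if it is missing."""
    exports = json.loads(list_exports_json).get("Exports", [])
    for export in exports:
        if export["Name"] == name:
            return export["Value"]
    raise KeyError(f"No CloudFormation export named {name!r}")


if __name__ == "__main__":
    print(export_value(SAMPLE_OUTPUT, "network-VpcId"))
```

In a pipeline you would feed the function the real command output rather than the sample string.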

Summary

Understanding how to use Infrastructure as Code to build out your complete environment in AWS is a worthy goal. Hopefully this article has given you some ideas as to how you can logically group your infrastructure using Persistent and Ephemeral IaC modalities.

In addition, I hope a greater understanding of how AWS uses IaC within its own services will help you to glue all the pieces together into a cohesive IaC strategy.