I was recently giving a friend some advice on how to set up a minimal stack for a starter project, and somewhere along the line realized that my MVP was approaching a dozen different AWS services. Time to stop and start reevaluating life decisions, people.
So, I went back to the drawing board, and set up a basic stack with close to the bare minimum. This ended up being a little more complicated than I expected, so I wrapped it all up in some terraform files and put it on GitHub.
Here’s what the stack looks like:
Let’s walk through the different pieces.
Virtual Private Cloud
Before you can do much of anything else in AWS, you’ll need to set up a VPC. For this, you’ll need a subnet, an internet gateway, and a route table. All of this put together amounts to a big virtual router for all of your servers, and a pipe to the outside world (with iptables thrown into the bargain).
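To make the pieces concrete, here is a rough sketch of a VPC with one public subnet, an internet gateway, and a route table. It is not copied from the repo: the resource names and CIDR ranges are placeholders I picked for illustration, and it assumes Terraform 0.12 or later syntax with the AWS provider already configured.

```hcl
# Hypothetical names and CIDR ranges; adjust to your own network layout.
resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
}

# A single public subnet inside the VPC.
resource "aws_subnet" "public" {
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.0.1.0/24"
  map_public_ip_on_launch = true
}

# The pipe to the outside world.
resource "aws_internet_gateway" "gw" {
  vpc_id = aws_vpc.main.id
}

# Send all non-local traffic through the internet gateway.
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.gw.id
  }
}

# Attach the route table to the subnet.
resource "aws_route_table_association" "public" {
  subnet_id      = aws_subnet.public.id
  route_table_id = aws_route_table.public.id
}
```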
At every step, we want to restrict access to resources as much as possible. So, we only allow ssh from whitelisted IP addresses, http(s) requests to ports 443 and 80 on the load balancer, and http requests forwarded from the load balancer to ports 80 and 81 on the web server. This last bit is a little sleight of hand to differentiate between requests with and without SSL; it can be avoided, but I find it more straightforward than forwarding everything to port 80 and depending on headers to tell them apart.
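The access rules above could be expressed as two security groups, roughly like this. Treat it as a sketch under assumptions: the variable names, group names, and the example address `203.0.113.10/32` are all made up, and egress is left wide open to keep the example short.

```hcl
# Hypothetical inputs; the default is a documentation-range address, not a real one.
variable "vpc_id" {
  type = string
}

variable "ssh_allowed_cidrs" {
  type    = list(string)
  default = ["203.0.113.10/32"]
}

# Load balancer: http(s) from anywhere on 80 and 443.
resource "aws_security_group" "lb" {
  name   = "lb"
  vpc_id = var.vpc_id

  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# Web server: ssh from the whitelist, ports 80-81 only from the load balancer.
resource "aws_security_group" "web" {
  name   = "web"
  vpc_id = var.vpc_id

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = var.ssh_allowed_cidrs
  }

  ingress {
    from_port       = 80
    to_port         = 81
    protocol        = "tcp"
    security_groups = [aws_security_group.lb.id]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```

Referencing the load balancer's security group (rather than a CIDR range) on the web server's rule means nothing else in the subnet can reach ports 80 and 81.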
Identity and Access Management
IAM is one of the subtler pieces of AWS’s infrastructure. You use it to allow servers and users different levels of access to each of your resources. So, for instance, the release server can upload tarballs to the release S3 bucket, and the application servers (and only the application servers) can download them. You can also use IAM and S3 to make a secret (e.g., puppet key, database credentials) available to a limited set of servers, without having to store it in a source code repository (which might need to be accessible to a wider audience). In this case, we use IAM to manage access to the various S3 buckets, and assign roles on a server-by-server basis.
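As an illustration of the application-server side of this, the following sketch creates a role that EC2 instances can assume, grants it read-only access to objects in a release bucket, and wraps it in an instance profile. The role, policy, and variable names are hypothetical, and the real policies in the repo may well differ.

```hcl
# Hypothetical bucket name input.
variable "release_bucket" {
  type = string
}

# Role that EC2 instances are allowed to assume.
resource "aws_iam_role" "app" {
  name = "app-server"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "ec2.amazonaws.com" }
    }]
  })
}

# Application servers may download release tarballs, and nothing more.
resource "aws_iam_role_policy" "app_read_releases" {
  name = "read-releases"
  role = aws_iam_role.app.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["s3:GetObject"]
      Resource = "arn:aws:s3:::${var.release_bucket}/*"
    }]
  })
}

# Instance profile that gets attached to the EC2 instance.
resource "aws_iam_instance_profile" "app" {
  name = "app-server"
  role = aws_iam_role.app.name
}
```

An `aws_instance` picks this up through its `iam_instance_profile` argument. A release server would get a separate role with `s3:PutObject` on the same bucket.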
We use CloudFront and an S3 bucket to create a dead simple CDN. Just put your files into the S3 bucket with the correct permissions, and they’re available to the world.
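A bare-bones version of that CDN might look something like the sketch below. The variable name and the `s3-cdn` origin ID are placeholders, it uses the default CloudFront certificate and domain, and it deliberately leaves out the bucket permissions, which you would still need to set so the objects are readable.

```hcl
# Hypothetical bucket name input.
variable "cdn_bucket_name" {
  type = string
}

resource "aws_s3_bucket" "cdn" {
  bucket = var.cdn_bucket_name
}

resource "aws_cloudfront_distribution" "cdn" {
  enabled = true

  # The S3 bucket is the only origin.
  origin {
    domain_name = aws_s3_bucket.cdn.bucket_regional_domain_name
    origin_id   = "s3-cdn"
  }

  # Serve read-only requests and cache them; forward nothing extra to S3.
  default_cache_behavior {
    target_origin_id       = "s3-cdn"
    allowed_methods        = ["GET", "HEAD"]
    cached_methods         = ["GET", "HEAD"]
    viewer_protocol_policy = "allow-all"

    forwarded_values {
      query_string = false

      cookies {
        forward = "none"
      }
    }
  }

  restrictions {
    geo_restriction {
      restriction_type = "none"
    }
  }

  # Use the default *.cloudfront.net certificate.
  viewer_certificate {
    cloudfront_default_certificate = true
  }
}
```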
EC2 Instances / Load Balancer
Although we don’t have multiple EC2 instances behind the load balancer, it’s useful to set one up from the start for three reasons. First, it allows us to terminate SSL, which saves us some annoying configuration (and processor time) on the application server. Second, when we do want to add additional EC2 instances later, we won’t have to futz around with migrating over to the load balancer. Lastly, this gives us a chance to see how to set up DNS correctly, which is a little different for load balancers.
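Here is a sketch of what the load balancer and its DNS record could look like, using a classic ELB. All the variables are hypothetical inputs, and the port mapping is my assumption for illustration: plain http goes to port 80 on the instance, and SSL-terminated traffic goes to port 81.

```hcl
# Hypothetical inputs for the pieces defined elsewhere.
variable "subnet_id" {
  type = string
}

variable "lb_security_group_id" {
  type = string
}

variable "web_instance_id" {
  type = string
}

variable "ssl_certificate_arn" {
  type = string
}

variable "zone_id" {
  type = string
}

variable "domain" {
  type = string
}

resource "aws_elb" "web" {
  name            = "web"
  subnets         = [var.subnet_id]
  security_groups = [var.lb_security_group_id]
  instances       = [var.web_instance_id]

  # Plain http is passed straight through to port 80.
  listener {
    lb_port           = 80
    lb_protocol       = "http"
    instance_port     = 80
    instance_protocol = "http"
  }

  # SSL is terminated here and forwarded as http to port 81 (assumed mapping).
  listener {
    lb_port            = 443
    lb_protocol        = "https"
    instance_port      = 81
    instance_protocol  = "http"
    ssl_certificate_id = var.ssl_certificate_arn
  }

  health_check {
    target              = "HTTP:80/"
    interval            = 30
    timeout             = 5
    healthy_threshold   = 2
    unhealthy_threshold = 2
  }
}

# Load balancers get an alias record rather than a plain A record with an IP.
resource "aws_route53_record" "web" {
  zone_id = var.zone_id
  name    = var.domain
  type    = "A"

  alias {
    name                   = aws_elb.web.dns_name
    zone_id                = aws_elb.web.zone_id
    evaluate_target_health = false
  }
}
```

The alias block is the part that differs from ordinary DNS setup: an ELB's IP addresses can change, so the record points at the load balancer's own DNS name and hosted zone instead of a fixed address.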
In addition to the web server, I’ve included a release server config to demonstrate how you can set up a machine with an identical build environment and an S3 bucket for managing build artifacts.
This stack makes a reasonably good MVP, but there are plenty more things you’ll want as your system evolves.
- Release script. The first thing you’ll need is a script to automate the build/deploy cycle. You want this to be 100% automated right off the bat. Speaking of which…
- CI server. Whether you choose Jenkins or something else, you’re going to want to set up continuous integration sooner rather than later. You should also set up at least one node, and start version controlling your configuration from the beginning. If you use Jenkins, my experience is that you’ll want at least
- Configuration management. Whether you choose to use Puppet, Chef (through OpsWorks or on your own), Ansible, Salt, or something more exotic, getting configuration management set up is a major milestone in your site’s evolution. This will also let you get rid of the “remote-exec” code in the terraform scripts.
- Auto-scaling. Adding an auto-scaling group is easy, but will make instance set-up a little more complicated (not much). This would be a good preparatory step for moving to configuration management.
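For the auto-scaling item, a minimal sketch might pair a launch template with an auto-scaling group registered against the existing classic load balancer. Everything named here (the variables, the `web-` prefix, the instance type, the group sizes) is an assumption for illustration, and instance set-up such as user data is left out.

```hcl
# Hypothetical inputs for the pieces defined elsewhere.
variable "ami_id" {
  type = string
}

variable "subnet_id" {
  type = string
}

variable "web_security_group_id" {
  type = string
}

variable "elb_name" {
  type = string
}

# Template describing how each web instance is launched.
resource "aws_launch_template" "web" {
  name_prefix            = "web-"
  image_id               = var.ami_id
  instance_type          = "t3.micro"
  vpc_security_group_ids = [var.web_security_group_id]
}

# Keep between one and three instances behind the load balancer.
resource "aws_autoscaling_group" "web" {
  name                = "web"
  min_size            = 1
  max_size            = 3
  desired_capacity    = 1
  vpc_zone_identifier = [var.subnet_id]
  load_balancers      = [var.elb_name]
  health_check_type   = "ELB"

  launch_template {
    id      = aws_launch_template.web.id
    version = "$Latest"
  }
}
```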
Final word (kind of)
There’s a certain sense of power you get in having your infrastructure’s configuration laid out in terraform – you can create and nuke your infrastructure at will, completely removing the uncertainty inherent in hand-tuned stacks (“am I forgetting a step?”). In the past, I found it extremely easy to generate stacks using CloudFormation – as I gain experience with terraform, I’m curious to see whether I’ll settle on it for managing everything, or if it’ll make more sense to use it for relatively static elements (VPC, subnets, internet gateway, routing tables, security groups, IAM roles and policies, S3 buckets), and rely on CloudFormation for more dynamic pieces (ELB, auto-scaling groups, instances). It’s a neat tool – I’m looking forward to playing around with it some more.
Thanks to Ted Haining for pointing out that one of the subnets was unnecessary!
Picking up a comment or two on Reddit. I mentioned that it would be best to comment here, not there.