Each company has its own journey to the cloud. Some companies are born in the cloud; others migrate there over time. Whether your company is cloud-native or a cloud-newbie, your AWS journey has probably gone something like this:
Feel familiar? Don’t despair! You can fix it with some time and effort. It’s all about getting the right foundation in place on AWS. This is definitely a case where an ounce of prevention is worth a pound of cure—or, more practically, hundreds of hours fixing stuff.
While there are many things you can do to set yourself up for success, here are five must-dos to get the foundation right the first time.
Plan your IP space
It is really easy to change or remediate most things on AWS. VPC CIDR ranges, however, are the exception. Once you’ve provisioned a VPC, you cannot change the CIDR range it was created with (e.g. 10.0.0.0/16). You can add secondary CIDR blocks, but to change the original range you have to delete the VPC and create a new one. If you have resources in that VPC (say, EC2 or RDS instances), you can’t delete it until they are gone. The upshot? If you don’t plan ahead, you can find yourself in trouble.
To avoid these problems, take time to map out all of your IP spaces. Each VPC needs a CIDR range that doesn’t overlap with other VPCs or your on-prem network. While it is possible to have networks with the same CIDR ranges talk to each other, it gets expensive and adds additional complexity to your networking layer. This is one headache you really want to avoid; do yourself a favor and avoid overlapping IP ranges from the beginning.
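One way to keep the plan honest is to check it with code before anything is provisioned. The sketch below uses Python’s standard `ipaddress` module to flag overlapping ranges and to split a VPC range into subnets. The network names and CIDR values in `PLAN` are hypothetical, chosen only to illustrate a collision.

```python
import ipaddress
from itertools import combinations

# Hypothetical address plan: every name and range here is illustrative.
PLAN = {
    "on-prem": "10.0.0.0/16",
    "vpc-dev": "10.1.0.0/16",
    "vpc-stage": "10.2.0.0/16",
    "vpc-prod": "10.2.128.0/17",  # deliberately collides with vpc-stage
}


def find_overlaps(plan):
    """Return sorted pairs of network names whose CIDR ranges overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in plan.items()}
    return sorted(
        (a, b)
        for a, b in combinations(sorted(nets), 2)
        if nets[a].overlaps(nets[b])
    )


def carve_subnets(cidr, new_prefix):
    """Split one VPC range into equally sized subnets of the given prefix length."""
    network = ipaddress.ip_network(cidr)
    return [str(subnet) for subnet in network.subnets(new_prefix=new_prefix)]


if __name__ == "__main__":
    print("Overlapping ranges:", find_overlaps(PLAN))
    print("Subnets for vpc-dev:", carve_subnets("10.1.0.0/16", 18))
```

Running a check like this whenever a new VPC is proposed catches collisions while they are still cheap to fix.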
HERE is a great article that will help you properly break up your network.
When you first start playing on AWS and you have 10 or so servers, it’s pretty easy to keep track of which servers are which—which servers are dev machines, who provisioned them, which environment they are part of, etc., etc. When there are 10,000 servers, it’s not quite as easy (read: impossible). This governance nightmare will not only run up your AWS bill, but will also cause your team endless heartburn. Who should I talk to when something needs to be addressed? Who owns which resources? Who should be paying for what? Is this my production DB or my testing DB? These are the day-to-day questions that tags and tagging strategies answer.
Our recommendation is to develop a tagging standard early on and stick to it. There are a myriad of different tagging strategies you can employ; which one you choose doesn’t matter as long as you’re tagging everything in a way that’s meaningful for you and your company. This article documents some tagging strategies. Here are some standard tags that show up in most strategies:
- Cost Center
- Application name
- Data Classification
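A tagging standard is easier to stick to when it can be checked mechanically. As a minimal sketch, assuming the three tags above are the required keys and that a resource’s tags have already been loaded into a plain dictionary, a helper like the hypothetical `missing_tags` below reports what is absent or blank.

```python
# Assumption: the required keys mirror the standard tags listed above.
REQUIRED_TAGS = ("Cost Center", "Application name", "Data Classification")


def missing_tags(tags, required=REQUIRED_TAGS):
    """Return the required tag keys that are absent or blank, in order."""
    return [key for key in required if not str(tags.get(key, "")).strip()]


if __name__ == "__main__":
    # Hypothetical resource tags, for illustration only.
    resource_tags = {"Cost Center": "1234", "Application name": " "}
    print(missing_tags(resource_tags))
```

The same check can run in a deployment pipeline or in a scheduled audit, so untagged resources are caught early.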
Additionally, pick a standard naming convention for your AWS resources. Common naming conventions are:
- TeamNameAppNameResourceName (PascalCase)
- team-application-resource-name (kebab-case)
- application_purpose_environment (snake_case)
For your sanity, please do not name your resources This_is-aTerriblenaming/Abomination. Seriously, DO NOT DO THIS.
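Naming conventions can be checked in the same spirit. The sketch below classifies a name against the three conventions listed above using regular expressions. The patterns are one reasonable reading of each convention, not an official definition, and `detect_convention` is a hypothetical helper name.

```python
import re

# Assumed patterns for the three conventions; adjust to your own standard.
CONVENTIONS = {
    "PascalCase": re.compile(r"^(?:[A-Z][a-z0-9]+)+$"),
    "kebab-case": re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$"),
    "snake_case": re.compile(r"^[a-z0-9]+(?:_[a-z0-9]+)*$"),
}


def detect_convention(name):
    """Return the first convention the name satisfies, or None if it fits none."""
    for label, pattern in CONVENTIONS.items():
        if pattern.match(name):
            return label
    return None


if __name__ == "__main__":
    for candidate in (
        "TeamNameAppNameResourceName",
        "team-application-resource-name",
        "application_purpose_environment",
        "This_is-aTerriblenaming/Abomination",
    ):
        print(candidate, "->", detect_convention(candidate))
```

In practice you would pick one convention and reject everything else, rather than accept all three.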
Grant appropriate permissions
I’m sure you, dear reader, would never accidentally break a system you shouldn’t have had access to in the first place, but it happens to other people all the time. In no particular order, I’ve seen lots of accidents happen:
- Production servers torn down
- VPNs that connect back to on-prem deleted (resulting in half of the environment going down)
- Running scripts without the correct --profile (e.g. pushing dev code to prod instead of test)
- Terminating the wrong server (which belonged to an entirely different team, oops)
- A new team-member mucking around with services they don’t understand
When it comes to AWS permissions, you want to be smart, not draconian. Except in production—by all means, be draconian in prod. Here are some examples:
- Use AWS Organizations: AWS Organizations is a service that helps wrangle permissions and services across all of your AWS accounts. For example, you can use Service Control Policies (SCP) to limit the permissions of entire AWS accounts. If you don’t want people creating IAM users willy-nilly in a certain account, create an SCP that denies creation of IAM users for a given account.
- Provide guardrails: It’s OK to give developers freedom to explore, but that doesn’t mean giving them free rein. Below is an example of an IAM policy that allows developers the freedom to create EC2 instances, but only the cheap ones.
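A minimal sketch of such a policy, built in Python so it can be printed as JSON and attached with whatever tooling you prefer. It allows EC2 actions in general and denies `ec2:RunInstances` for any instance type outside an allow-list, using the `ec2:InstanceType` condition key. Which types count as “cheap” is an assumption here, and the function name is hypothetical.

```python
import json

# Assumption: these instance types stand in for "the cheap ones".
ALLOWED_INSTANCE_TYPES = ["t3.micro", "t3.small"]


def cheap_ec2_policy(allowed_types):
    """Build an IAM policy that allows EC2 but denies launching other instance types."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowEc2",
                "Effect": "Allow",
                "Action": "ec2:*",
                "Resource": "*",
            },
            {
                # An explicit Deny always wins over the Allow above.
                "Sid": "DenyOtherInstanceTypes",
                "Effect": "Deny",
                "Action": "ec2:RunInstances",
                "Resource": "arn:aws:ec2:*:*:instance/*",
                "Condition": {
                    "StringNotEquals": {"ec2:InstanceType": list(allowed_types)}
                },
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(cheap_ec2_policy(ALLOWED_INSTANCE_TYPES), indent=2))
```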
Using simple, thoughtful policies (like the one above) gives your team the freedom to solidify their AWS skills, while preventing things from quickly going off the rails.
Another common guardrail is to keep people in selected AWS regions. Besides being convenient for developers, it means you don’t have to go hunting across all the regions to find resources. Use a policy like this:
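As a sketch of a region guardrail, the policy below denies any request whose `aws:RequestedRegion` is not on an allow-list. The exempted services under `NotAction` are an assumption: they are global services that are commonly excluded so the deny does not lock people out of them. Review that list against your own needs before using anything like it.

```python
import json

# Assumption: global services exempted from the region check.
GLOBAL_SERVICE_ACTIONS = [
    "iam:*",
    "organizations:*",
    "route53:*",
    "cloudfront:*",
    "support:*",
]


def region_guardrail_policy(allowed_regions):
    """Build a policy that denies requests made outside the allowed regions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideAllowedRegions",
                "Effect": "Deny",
                "NotAction": list(GLOBAL_SERVICE_ACTIONS),
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {"aws:RequestedRegion": list(allowed_regions)}
                },
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical choice of regions, for illustration only.
    print(json.dumps(region_guardrail_policy(["us-west-2", "us-east-1"]), indent=2))
```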
- Grant access based on job function: reuse permission sets based on the tasks someone needs to accomplish on AWS. Start with the AWS managed job-function IAM policies, such as:
Whatever you do, don’t just give everyone AdministratorAccess. You won’t remember to rein it in later, and you will end up with a mess on your hands.
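The Service Control Policy idea mentioned under AWS Organizations can be sketched as a policy document too. The example below denies IAM user creation; including `iam:CreateAccessKey` alongside `iam:CreateUser` is an assumption about what “creating IAM users” should cover. An SCP only takes effect once it is attached to an account or organizational unit in AWS Organizations.

```python
import json


def deny_iam_user_creation_scp():
    """Build an SCP that blocks creating IAM users and their access keys."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyIamUserCreation",
                "Effect": "Deny",
                # Assumption: access keys are blocked together with users.
                "Action": ["iam:CreateUser", "iam:CreateAccessKey"],
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(deny_iam_user_creation_scp(), indent=2))
```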
Separate what should be separate
Whether it’s your dev/stage/prod environments, PCI/HIPAA applications, or development teams, separate what needs to be separated. Having clean lines lessens the permissioning and governance burden on your organization. Don’t want to write complex IAM policies preventing teams from stepping on each other’s resources? Don’t want to write complex firewall rules? Don’t want dev applications to be able to access production databases? Don’t want your PCI auditors to have to go through your entire infrastructure? Separate them!
There are many ways to go about this. Some companies use different VPCs; others use different AWS regions or accounts. However you go about separating resources, you’ll encounter some tradeoffs. In general, the more isolated your resources are, the more complicated your tooling and monitoring become. These graphs show this relationship:
It’s important, however, not to let these tradeoffs keep you from separating your resources. As the old adage goes, “good fences make good neighbors.”
The best set of cloud maxims I’ve ever heard was from a 2015 AWS re:Invent talk by Soofi Safavi:
- If it moves, measure it
- If it’s not monitored, it doesn’t exist
- If it’s not automated, it’s not finished
The last point is the one I want to focus on. Turns out that computers are more consistent than people, don’t require sleep, and have a much faster response time. If you really want to harness the power of AWS, automation is where the money is. You can automate just about everything and you SHOULD!
Example 1: Automatically tag your EC2 instances
There are lots of different ways to automatically tag AWS resources. One approach is to use CloudFormation. Here, we run a command that tags the stack, and every resource in it that supports tagging, with a single --tags flag:
aws cloudformation deploy --template-file /path_to_template/template.json --stack-name my-new-stack --parameter-overrides Key1=Value1 Key2=Value2 --tags Key1=Value1 Key2=Value2
Another approach is to create your own Lambda function to tag EC2 instances. Then, set that Lambda to be triggered by a CloudWatch event that fires when an EC2 instance is created. For instructions on how to do this, check out this blog.
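A minimal sketch of what such a Lambda function might look like, assuming the event rule forwards the CloudTrail record for `RunInstances`, so the instance IDs sit under `detail.responseElements.instancesSet.items`. The `CreatedBy` tag key is a hypothetical choice, and the function needs permission to call `ec2:CreateTags`.

```python
def instance_ids_from_event(event):
    """Pull the new instance IDs out of a RunInstances CloudTrail event."""
    detail = event.get("detail") or {}
    response = detail.get("responseElements") or {}  # null when the call failed
    items = (response.get("instancesSet") or {}).get("items") or []
    return [item["instanceId"] for item in items if "instanceId" in item]


def build_tags(event):
    """Build the tag list; CreatedBy is a hypothetical tag key."""
    detail = event.get("detail") or {}
    creator = (detail.get("userIdentity") or {}).get("arn", "unknown")
    return [{"Key": "CreatedBy", "Value": creator}]


def lambda_handler(event, context):
    """Tag every instance created by the triggering RunInstances call."""
    instance_ids = instance_ids_from_event(event)
    if not instance_ids:
        return {"tagged": []}
    import boto3  # imported lazily so the helpers above can run without AWS

    boto3.client("ec2").create_tags(Resources=instance_ids, Tags=build_tags(event))
    return {"tagged": instance_ids}
```

Keeping the event parsing in small pure functions means most of the logic can be tested locally with a sample event, without touching an AWS account.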
Don’t want to have to write a bunch of Lambda functions by hand? Use an automation framework like Cloud Custodian. Cloud Custodian provides a layer of abstraction over Lambda functions, CloudWatch event rules, AWS Config rules, and more. You specify your environment’s “rules” (tagging strategies, instance on-hour/off-hour policies, security remediations) in a yaml format, and Cloud Custodian creates and manages the associated resources for you. For an introduction to Cloud Custodian, check out our demo project.
Example 2: Automate deployments with CI/CD and Infrastructure as Code
CI/CD (Continuous Integration and Continuous Delivery/Deployment) is a large topic of its own and deserving of a separate blog. It’s worth touching on here, though, because CI/CD gives you a standard process for creating deployment artifacts (code files, Docker images, and the like), deploying from those artifacts, and promoting those artifacts through different environments.
You can (and should) create CI/CD pipelines to deploy application code and infrastructure code. By using CloudFormation to package your infrastructure as code, you can now use the best practices developers have been using with app code for years: version control, code review, and automating the build/test/deployment process. Applying these practices to your infrastructure gives repeatable, predictable deployments and greatly improves your quality of life.
A Smoother Journey
With a little bit of forethought, patience, and proper governance, your AWS journey can go much more smoothly:
To achieve that smoother AWS journey, start with the five practices above. If you need hands-on assistance with these or other AWS governance practices, reach out to firstname.lastname@example.org and one of our AWS experts will be there to help! In addition, AWS offers several programs that are available through 1Strategy; contact us if you think a Well-Architected Review might be right for you.