"Automate all the things!" – Every (Tech)Ops Person, ever
Most sane people will think twice about automating an incredibly destructive process. It's common, however, for businesses to need to "cleanse" their AWS accounts, especially given the resources engineers are constantly provisioning. They're creating roles, testing automation, teaching themselves about a new service, or working on any number of other activities. It's easy to lose track of just how many resources get created, how quickly they're forgotten, and how much they quietly cost the business.
This post will talk through how to utilize AWS-Nuke in an AWS CodeBuild Project, scheduled to run automatically via AWS CloudWatch Events. It’s also worth noting that this whole project is defined as an AWS CloudFormation template, which means it is:
- Quick to provision
- Reusable
- Configurable
- Easily managed with version control!
If you'd like to skip all the text and get right to the goods, complete code and provisioning instructions can be found in our GitHub repo.
The Tools
AWS-Nuke
AWS-Nuke is a powerful tool that lets you programmatically destroy any resources in an AWS account that are not considered "Default" or "AWS-Managed." In short, it takes your account back to Day 1, with few exceptions. If you're not shaking a little bit out of fear, take a step back and consider what it would cost your business to accidentally run this on your production account. It is always best to be very aware of which account you're working with and which resources in that account should be retained (whitelisted).
At its core, AWS-Nuke is a series of programmatic API calls made against the AWS resources in your account(s) on your behalf. You can review the full list of supported resources by running the aws-nuke resource-types command in your terminal. If your business uses an obscure or newer AWS offering, it's worth keeping an eye on what's currently supported and what's not; the contributors of AWS-Nuke tend to stay current, but there will likely be a lag in implementing new service offerings.
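For example, to check whether a particular service is covered, you can list the supported resource types and search them. This is a quick sketch; the exact output depends on the aws-nuke version you've installed:

# List every resource type this aws-nuke build knows how to delete
aws-nuke resource-types

# Check whether a specific service (e.g. S3) is covered
aws-nuke resource-types | grep -i s3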
Additionally, AWS-Nuke requires configuration via a YAML file. This file lets you identify (whitelist) specific resources to exclude from the cleansing process, either partially via filters or completely. For example, your organization may have a master account to which all users belong and a separate sandbox account where users assume a role to develop, test, and learn. When cleaning the sandbox account, you would filter that role in the configuration file so your users don't lose their permissions to work in the sandbox. Filters are versatile, and you can learn more about them in the documentation on GitHub.
It’s important to spend time defining and refining this configuration file to suit your purposes. Running the automation, checking logs, updating the filters, and then verifying your updates should be done as many times as necessary before you allow the automation to run with the ability to destroy resources.
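As a rough illustration, a minimal aws-nuke-config.yaml might look something like the following. The account IDs, region list, and role name are purely placeholders, and the exact keys (for example, account-blacklist vs. account-blocklist) vary between aws-nuke versions, so treat this as a sketch rather than a drop-in config:

regions:
  - us-east-1
  - global

account-blacklist:
  - "111111111111" # master/production account that must never be nuked

accounts:
  "000000000000": # placeholder ID; the script later substitutes each target account
    filters:
      IAMRole:
        - "SandboxAccessRole" # keep the role your users assume to work in the sandbox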
AWS CodeBuild and CloudWatch Events
AWS CodeBuild and AWS CloudWatch Events are two separate, well-integrated tools within the AWS ecosystem that let us automate this whole process and run it on a regular schedule without a dedicated, always-on box configured with a scheduled job. The scheduled event triggers a build, much like a deployment pipeline build, to kick off the nuke process in each account.
The Script
The bash script is the core of the automation for this project. The first few actions set up the environment for the nuke process: we install jq to parse JSON response data on the command line, download the aws-nuke binary from GitHub, and make it executable. Once the tools are in place, we retrieve the account numbers from the Organizational Unit in which the accounts are housed. This portion of the script is where customization is most likely needed for other problem domains, so a firm understanding of your account structure, and specifically how to target the accounts you want cleaned, is critical to the automation. The last step before looping through the accounts is to retrieve the aws-nuke-config.yaml file from S3. Note: you will need to modify that config file yourself and store it in your own S3 bucket.
With configuration complete, we step into a scripted loop. For each account number retrieved, we will:
- Assume a role within that account
- Cache the AWS Credentials for that role
- Update a copy of the aws-nuke-config.yaml with the current account number
- Run the nuke command on that account
# Install jq for parsing JSON on the command line
apt-get install -y jq
# Download the aws-nuke binary from GitHub and make it executable
wget https://github.com/rebuy-de/aws-nuke/releases/download/v2.10.0/aws-nuke-v2.10.0-linux-amd64
mv aws-nuke-v2.10.0-linux-amd64 /bin/aws-nuke
chmod +x /bin/aws-nuke
# Collect every account ID under the target OU into accounts.txt
aws organizations list-accounts-for-parent --parent-id ${ParentOuId} | jq -r '.Accounts | map(.Id) | join("\n")' | tee -a accounts.txt
# Pull the whitelist/filter configuration from S3
aws s3 cp s3://${BucketName}/aws-nuke-config.yaml .
while read -r line; do
echo "Assuming Role for Account $line";
# Assume the nuke role in the target account and capture the temporary credentials
aws sts assume-role --role-arn arn:aws:iam::$line:role/${AssumeRoleName} --role-session-name account-$line --query "Credentials" > $line.json;
cat $line.json
# Parse the temporary credentials so they can be passed to aws-nuke
ACCESS_KEY_ID=$(cat $line.json | jq -r .AccessKeyId);
SECRET_ACCESS_KEY=$(cat $line.json | jq -r .SecretAccessKey);
SESSION_TOKEN=$(cat $line.json | jq -r .SessionToken);
# Copy the base config and replace the placeholder account ID (000000000000) with the current account
cp aws-nuke-config.yaml $line.yaml;
sed -i -e "s/000000000000/$line/g" $line.yaml;
echo "Configured aws-nuke-config.yaml";
echo "Running Nuke on Account $line";
# NOTE: Add --no-dry-run flag for Production
aws-nuke -c $line.yaml --force --access-key-id $ACCESS_KEY_ID --secret-access-key $SECRET_ACCESS_KEY --session-token $SESSION_TOKEN | tee -a aws-nuke.log;
done < accounts.txt
echo "Completed Nuke Process for all accounts"
cat aws-nuke.log
I've chosen to pipe the output for all accounts to a single log file and then print that log file at the end of the process. Note: each account you intend to clean will need a role defined that provides broad access (essentially administrator access if you're feeling lazy, but I would recommend following the principle of least privilege for security purposes). A rough sketch of such a role is shown below.
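As an illustration only, a cross-account role of that shape could be defined in CloudFormation roughly like this. The role name, trusted account ID, and attached policy here are assumptions for the sketch, not values from the repo:

NukeAccessRole:
  Type: AWS::IAM::Role
  Properties:
    RoleName: NukeAccessRole # must match the AssumeRoleName the script uses
    AssumeRolePolicyDocument:
      Version: "2012-10-17"
      Statement:
        - Effect: Allow
          Principal:
            AWS: arn:aws:iam::111111111111:root # the account that runs the CodeBuild project (placeholder)
          Action: sts:AssumeRole
    ManagedPolicyArns:
      - arn:aws:iam::aws:policy/AdministratorAccess # scope this down per least privilege where you can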
Dry Runs vs. Production
By default, this script will not take any destructive action against any resources in your account(s). It will produce a log of the "dry run" output as if it had actually completed the actions specified. When you've thoroughly tested this and whitelisted any resources in your own aws-nuke-config.yaml, you need to add the --no-dry-run flag to the aws-nuke command in this script to force a destructive run.
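To make the difference concrete, here is roughly what the two invocations look like, assuming a per-account config file generated by the script (credential flags omitted for brevity):

# Dry run (the default): logs what would be removed, deletes nothing
aws-nuke -c 123456789012.yaml --force

# Destructive run: only after you have reviewed the dry-run output
aws-nuke -c 123456789012.yaml --force --no-dry-run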
The Project
CodeBuild provides a super-handy way for us to spin up a container and run a script without having to worry about provisioning and maintaining resources. We're using CloudFormation to define the whole project, so the code sample below shows the bulk of the Project resource configuration for a CloudFormation template. In short, we're building a standard AWS Linux Docker container, configuring the log output channel, assigning an AWS role that can write logs and assume the account roles mentioned earlier, and defining the source code (the bash script) to run.
NukeScriptProject:
  Type: AWS::CodeBuild::Project
  Properties:
    Artifacts:
      Type: NO_ARTIFACTS
    BadgeEnabled: false
    Description: Builds a container to run AWS-Nuke for all accounts within the specified OU
    Environment:
      ComputeType: BUILD_GENERAL1_SMALL
      Image: aws/codebuild/docker:18.09.0
      ImagePullCredentialsType: CODEBUILD
      PrivilegedMode: true
      Type: LINUX_CONTAINER
    LogsConfig:
      CloudWatchLogs:
        GroupName: !Sub "AccountNuker-${AWS::StackName}"
        Status: ENABLED
    Name: !Sub "AccountNuker-${AWS::StackName}"
    ServiceRole: !GetAtt NukeScriptProjectRole.Arn
    Source:
      ...
The Schedule
AWS CloudWatch Events provides both event-based and scheduled triggers for automated actions in other services. This is basically a fancy cron job to schedule the build project. As with the CodeBuild project, the example below is a resource defined within a CloudFormation template. In short, it defines the schedule expression for the cron job (07:00 UTC, five days a week), a role that allows the event trigger to run the build project, and a target for the trigger: the build project resource we reviewed above.
CloudWatchNukeScriptSchedule:
  Type: AWS::Events::Rule
  Properties:
    Name: !Sub NukeScriptCloudWatchSchedule-${AWS::StackName}
    Description: Scheduled Event for running AWS Nuke on all accounts within the specified OU
    ScheduleExpression: cron(0 7 ? * 1-5 *)
    State: ENABLED
    RoleArn: !GetAtt CloudWatchNukeScriptScheduleRole.Arn
    Targets:
      - Arn: !GetAtt NukeScriptProject.Arn
        RoleArn: !GetAtt CloudWatchNukeScriptScheduleRole.Arn
        Id: NukeScriptId
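Once the template is assembled, provisioning the whole stack is a single CloudFormation deploy. The stack name, template file name, and parameter names below are illustrative; check the repo's instructions for the exact parameters it expects:

aws cloudformation deploy \
  --stack-name account-nuker \
  --template-file nuke-codebuild.yaml \
  --capabilities CAPABILITY_NAMED_IAM \
  --parameter-overrides \
      BucketName=my-nuke-config-bucket \
      ParentOuId=ou-xxxx-xxxxxxxx \
      AssumeRoleName=NukeAccessRole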
Outcomes
Depending on the state of your accounts, your ROI may be substantial or minimal. Either way, this mechanism minimizes the opportunity for engineers to leave expensive resources running for long periods and cost the business unnecessarily. It also keeps your accounts clean and forces you to be deliberate about which resources should be whitelisted within each account!
We're here to help, too! If you are interested in learning about how 1Strategy can help you optimize security, manage your resources, or control costs on AWS, we're just an email away at info@1strategy.com.