I was inspired to write this blog post after seeing many AWS accounts put 'Under Review' for failing to meet healthy email reputation standards or, on rare occasions, because leaked credentials ended up in the hands of a phishing ‘spam bot’ that wreaked havoc on their SES Reputation.
One option to keep your SES Reputation healthy is to manually create CloudWatch alarms to monitor it, but who wants to do things manually?
In my opinion, where there is a manual task, there is a way for it to be automated!
In this blog, I will be describing both manual and automated methods to solve this problem.
Additionally, I will not only list techniques that can be implemented to keep your SES Sender Reputation healthy but will (hopefully) illustrate the ever-growing importance of adopting an automation mindset and DevOps practices.
These techniques will include examples of what you would see in a typical CloudOps environment, and since no DevOps environment is ‘typical’, we will be implementing my take on a 'DevOps solution' to this problem.
What is Amazon Simple Email Service?
SES is an email platform within AWS Cloud that provides an easy, cost-effective way for you to send and receive email using your own email addresses and domains.
How do I start using and set up SES in my account?
When setting up SES for the first time, you will have to verify either a domain / subdomain or a specific email address to send from.
After one of these 'identities' has been verified, you will need to assign the ‘ses:SendEmail’ permission to the AWS Service Role or IAM User you will be using to send emails.
If your instance or service has a static IP (Elastic IP in AWS terms), you can restrict sending to that specific IP in the IAM policy. This gives you an additional layer of security if your IAM credentials are compromised.
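As a rough illustration of the IP restriction described above, the sketch below builds such an IAM policy document in Python. The function name `build_ses_send_policy` and the address `203.0.113.10` (taken from the documentation IP range) are placeholders invented for this example; `aws:SourceIp` is the standard IAM global condition key.

```python
def build_ses_send_policy(allowed_ip: str) -> dict:
    """Return an IAM policy document that allows ses:SendEmail from one IP only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["ses:SendEmail"],
                "Resource": "*",
                # Requests from any other source address are not allowed by this statement.
                "Condition": {"IpAddress": {"aws:SourceIp": f"{allowed_ip}/32"}},
            }
        ],
    }
```

The resulting dictionary can be serialised with `json.dumps` and attached to the role or user. If your application calls `SendRawEmail` rather than `SendEmail`, the `ses:SendRawEmail` action would need to be listed as well.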
You can review the ‘Setting up Amazon Simple Email Service’ documentation to get started.
Why is it important to have a healthy Sender Reputation?
AWS tracks two main metrics that can affect your SES health: complaints and bouncebacks. If these metrics frequently breach a certain threshold, your account may be placed under review, and sending may eventually be blocked if no action is taken.
An account under review
[Screenshots: Account Dashboard, Reputation Metrics]
CloudOps
Standard Environment
Improved Environment
Generally, to keep your SES Reputation healthy, you should use CloudWatch alarms that monitor the following statistics and send Warning and Critical alerts when certain thresholds are met:
- Bounce Count – AWS requires you to stay below a threshold of 10%
- Complaint Count – AWS requires you to stay below a threshold of 0.5%
- Send Count – your AWS account has a set ‘Daily Quota’ limit per region.
If you are close to hitting your daily send limit, you can request an increase by submitting an AWS Support ‘Service Limit Increase’ request.
A general rule of thumb is to give yourself a buffer so that, if you exceed your average send volume, you are not left unable to send emails until the next day or until the service limit increase has been actioned. (I have seen customers in this situation!)
Creating Alarms for Monitoring
You can create the alarms manually using the console, but in this example we will create them with the AWS CLI instead.
Let’s try to create the following Alarms:
- Daily Sending Count: Warning: 60% Alert: 80%
- Daily Bounce Count: Warning: 5% Alert: 10%
- Daily Complaint Count: Warning: 0.1% Alert: 0.5%
Steps
- Retrieve your daily sending quota rate
- View available SES CloudWatch Metrics
- Create Alarms
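The first two steps can also be sketched with boto3 rather than the AWS CLI. The helper names below are invented for this example. `get_send_quota` and `list_metrics` are existing boto3 calls, and the 60% / 80% defaults mirror the Daily Sending Count thresholds listed above. The clients are passed in so the helpers can be exercised without AWS access.

```python
def fetch_daily_quota(ses_client) -> float:
    """Read the 24-hour sending quota from a boto3 'ses' client."""
    quota = float(ses_client.get_send_quota()["Max24HourSend"])
    if quota <= 0:
        # SES reports -1 for an unlimited quota, where percentage alarms do not apply.
        raise ValueError("daily quota is unlimited or unavailable")
    return quota


def send_count_thresholds(daily_quota: float,
                          warning_pct: float = 60.0,
                          alert_pct: float = 80.0) -> dict:
    """Convert percentage thresholds into absolute daily send counts."""
    return {
        "warning": daily_quota * warning_pct / 100,
        "alert": daily_quota * alert_pct / 100,
    }


def list_ses_metric_names(cloudwatch_client) -> list:
    """Return the names of the SES metrics CloudWatch has seen in this region."""
    response = cloudwatch_client.list_metrics(Namespace="AWS/SES")
    return sorted({metric["MetricName"] for metric in response["Metrics"]})
```

In a real account you would call `fetch_daily_quota(boto3.client("ses"))` and feed the result into `send_count_thresholds`. Note that `list_metrics` only returns metrics that have received data recently, so a quiet account may show fewer names than expected.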
Daily Send Count
Daily Bounceback Count
Daily Complaint Count
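As a hedged sketch of the three alarm pairs, the code below builds the arguments for boto3's `put_metric_alarm` instead of using the AWS CLI. The alarm names and the `topic_arn` parameter are placeholders. It also makes one substitution worth naming plainly: for the percentage thresholds it uses the `Reputation.BounceRate` and `Reputation.ComplaintRate` metrics, which SES publishes as fractions (0.05 for 5%), rather than raw bounce and complaint counts.

```python
SES_NAMESPACE = "AWS/SES"


def build_alarm(name, metric, statistic, period, threshold, topic_arn) -> dict:
    """Build keyword arguments for cloudwatch.put_metric_alarm."""
    return {
        "AlarmName": name,
        "Namespace": SES_NAMESPACE,
        "MetricName": metric,
        "Statistic": statistic,
        "Period": period,  # seconds
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "TreatMissingData": "notBreaching",
        "AlarmActions": [topic_arn],
    }


def build_ses_alarms(daily_quota: float, topic_arn: str) -> list:
    """Warning and Alert alarms for sends, bounces and complaints."""
    day, hour = 86400, 3600
    specs = [
        ("ses-send-warning", "Send", "Sum", day, daily_quota * 60 / 100),
        ("ses-send-alert", "Send", "Sum", day, daily_quota * 80 / 100),
        ("ses-bounce-warning", "Reputation.BounceRate", "Average", hour, 0.05),
        ("ses-bounce-alert", "Reputation.BounceRate", "Average", hour, 0.10),
        ("ses-complaint-warning", "Reputation.ComplaintRate", "Average", hour, 0.001),
        ("ses-complaint-alert", "Reputation.ComplaintRate", "Average", hour, 0.005),
    ]
    return [build_alarm(*spec, topic_arn) for spec in specs]


def apply_alarms(cloudwatch_client, alarms: list) -> None:
    """Create or update each alarm through a boto3 'cloudwatch' client."""
    for alarm in alarms:
        cloudwatch_client.put_metric_alarm(**alarm)
```

A one-day CloudWatch period is a fixed window rather than a rolling 24 hours, so the send alarms only approximate the way SES measures the daily quota.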
Environmental Result
Your environment now has not only the Bounceback and Complaint topics, but also alarms that will alert you when you reach a ‘Warning’ or ‘Alert’ threshold for your total daily sends and daily complaints / bouncebacks.
DevOps
There is no ‘standard’ DevOps Environment, so in this example we will attempt to do the following:
- Create the base alarms using Infrastructure as Code (Using AWS CDK)
- Create a Lambda function (deployed via CDK) that will be triggered by the SES ‘Success’ and ‘Bounceback’ SNS topics. The purpose of the Lambda function will be to:
  - Integrate with your application database to ensure that emails are sent to verified accounts.
  - Log all emails sent to the database under the ‘email_logs’ table.
  - Monitor and log all sent emails for a potential instance or credential breach, e.g. if a large number of emails are sent to addresses not within the database in a short time, we can alert the CloudOps + Cyber Security teams and temporarily disable SES sending within the account.
  - Automate updating the CloudWatch alarm % values if an increase in the Daily Send Quota is detected.
You can view / deploy the CDK Stack example from GitHub.
What does the Lambda Function do?
The Lambda function will:
- Check that the email exists within the database
- If the email is registered to a user, we need to log the details of the email including the result.
- If the email is not registered to a user, increase the value of a custom metric ‘EmailNotRegisteredCount’ and send an alert if this goes above a certain threshold.
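The checks above might look roughly like the following. This is a sketch rather than the code in the CDK stack: it uses `sqlite3` as a stand-in for the application database, the function names are invented, and it assumes the standard SES notification layout delivered through SNS (`notificationType`, `mail.messageId`, `mail.source`, `mail.destination`).

```python
import json
import sqlite3


def parse_ses_notification(sns_record: dict) -> dict:
    """Extract the fields we care about from one SNS record carrying an SES notification."""
    message = json.loads(sns_record["Sns"]["Message"])
    mail = message["mail"]
    return {
        "status": message["notificationType"],  # e.g. "Delivery" or "Bounce"
        "message_id": mail["messageId"],
        "source": mail["source"],
        "recipients": mail["destination"],
    }


def split_recipients(conn: sqlite3.Connection, recipients: list) -> tuple:
    """Separate recipients into those present in the users table and those that are not."""
    registered, unregistered = [], []
    for address in recipients:
        row = conn.execute(
            "SELECT user_id FROM users WHERE email_address = ?", (address,)
        ).fetchone()
        (registered if row else unregistered).append(address)
    return registered, unregistered


def not_registered_metric(count: int) -> dict:
    """Build one datum for the custom 'EmailNotRegisteredCount' metric."""
    return {"MetricName": "EmailNotRegisteredCount", "Value": count, "Unit": "Count"}
```

Inside the Lambda, the datum would be published with something like `cloudwatch.put_metric_data(Namespace="Custom/SES", MetricData=[...])`, where the namespace is a hypothetical choice, and a CloudWatch alarm on that metric would raise the alert.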
Database Tables
users
- user_id
- email_address
- validated
user_email_log
- user_email_id
- user_id
- aws_sns_message_id
- source_address
- subject
- source_ip
- source_arn
- timestamp
- status
- diagnostic_code
unvalidated_email_log
- unvalidated_email_log_id
- aws_sns_message_id
- source_address
- destination_address
- subject
- source_ip
- source_arn
- timestamp
- status
- diagnostic_code
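For readers who want to experiment locally, the three tables can be sketched as SQLite DDL driven from Python. The column names come from the lists above. The column types, keys and constraints are assumptions, since the post does not specify them, and a production database would likely differ.

```python
import sqlite3

# Column types and constraints are illustrative assumptions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS users (
    user_id       INTEGER PRIMARY KEY,
    email_address TEXT NOT NULL UNIQUE,
    validated     INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE IF NOT EXISTS user_email_log (
    user_email_id      INTEGER PRIMARY KEY,
    user_id            INTEGER NOT NULL REFERENCES users(user_id),
    aws_sns_message_id TEXT,
    source_address     TEXT,
    subject            TEXT,
    source_ip          TEXT,
    source_arn         TEXT,
    timestamp          TEXT,
    status             TEXT,
    diagnostic_code    TEXT
);
CREATE TABLE IF NOT EXISTS unvalidated_email_log (
    unvalidated_email_log_id INTEGER PRIMARY KEY,
    aws_sns_message_id       TEXT,
    source_address           TEXT,
    destination_address      TEXT,
    subject                  TEXT,
    source_ip                TEXT,
    source_arn               TEXT,
    timestamp                TEXT,
    status                   TEXT,
    diagnostic_code          TEXT
);
"""


def create_tables(conn: sqlite3.Connection) -> None:
    """Create the three tables if they do not already exist."""
    conn.executescript(SCHEMA)


def column_names(conn: sqlite3.Connection, table: str) -> list:
    """Return the column names of a table, in definition order."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
```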
For simplicity, we will exclude some of the irrelevant columns for these scenarios.
Scenarios:
Scenario 1:
- 3 users have registered to our application, and we have sent out 1 validation email each
- 2 users have validated, and 1 has not due to a bounceback
users
user_email_log
Summary:
We have logged our 3 email sends, 1 resulting in a bounce, which has been logged in the ‘user_email_log’ table under ‘user_email_id’ 3
Scenario 2:
- 3 users have registered to our application and are already validated.
- Over the course of a month, we are going to send out 5 marketing emails
- The user ‘user3@example.com’ has a full mailbox
users
user_email_log
Summary:
We can see that over the last month, there have been 5 bounceback emails to user3@example.com
What are the benefits of logging successful and bounceback emails for validated users?
- Having an audit trail for ‘email sends’ is good to have in any application. If auditing or troubleshooting is required, you can query the table without having to search through SNS Topic event logs.
- If a user signed up with a validated email address that can no longer be reached (DNS issues, a full mailbox or the address no longer exists), your bouncebacks will be logged with the diagnostic code, which can be reviewed when monitoring your SES Reputation.
How can we stop sending to this email until it has been revalidated?
You can run queries against the ‘user_email_log’ table to stop sending emails to users who signed up with a validated email address but have had 5 or more bounces in the last month:
- Retrieve list of users who have had 5 or more bounces in the last 30 days
Result:
- Set 'validated' to 0 for these users
Result:
Summary:
Our query result shows that user3@example.com has had 5 bouncebacks over the last month, and our update has invalidated the email address by setting the ‘validated’ column to 0.
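The two queries described in this scenario can be sketched as follows, again using `sqlite3` as a stand-in for the application database. The function names are invented, and the sketch assumes that `timestamp` is stored as an ISO 8601 string and that bounced sends are logged with a `status` of `'Bounce'`.

```python
import sqlite3
from datetime import datetime, timedelta


def users_with_repeated_bounces(conn: sqlite3.Connection, now: datetime,
                                min_bounces: int = 5, window_days: int = 30) -> list:
    """Return (user_id, email_address, bounce_count) for users at or over the bounce limit."""
    cutoff = (now - timedelta(days=window_days)).isoformat()
    return conn.execute(
        """
        SELECT u.user_id, u.email_address, COUNT(*) AS bounce_count
        FROM user_email_log AS l
        JOIN users AS u ON u.user_id = l.user_id
        WHERE l.status = 'Bounce' AND l.timestamp >= ?
        GROUP BY u.user_id, u.email_address
        HAVING COUNT(*) >= ?
        ORDER BY u.user_id
        """,
        (cutoff, min_bounces),
    ).fetchall()


def invalidate_users(conn: sqlite3.Connection, user_ids: list) -> None:
    """Set validated = 0 so the application stops sending to these users."""
    conn.executemany(
        "UPDATE users SET validated = 0 WHERE user_id = ?",
        [(user_id,) for user_id in user_ids],
    )
    conn.commit()
```

Passing `now` in explicitly keeps the 30-day window easy to test. The application would then skip any user whose `validated` column is 0 until they confirm their address again.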
Scenario 3:
- Our instance or credentials have been compromised by a hacker who has sent out 10 phishing emails before we disabled the credentials
- We have logged 10 email sends that were sent to addresses not within the database. The Lambda function has sent 10 data points to our ‘Email Sends Not in Database’ metric, which has triggered our warning alarm:
unvalidated_email_log
10 emails have been sent to users not within our database, and thus have triggered our custom alarm. (Sent via SMS)
What does it mean when my SES identity is sending emails to addresses that aren’t within my application?
If you are using hard-coded IAM credentials in your application, they are likely compromised. Review the ‘source_ip’ column to confirm this: if the IP is not associated with your infrastructure, you should disable your credentials and rotate them once the source of the leak has been confirmed.
If you are sending from an EC2 instance that uses an EC2 Role, the instance is likely compromised.
Or, it could be a team member innocently testing SES in a live environment.
If your ‘Email Sends Not In Database’ alarms go off and there is a genuine concern, you can disable SES sending on the account, for example via a Lambda function that is triggered by the alarm through an SNS topic and calls ‘update_account_sending_enabled’ through the SDK, or via the equivalent AWS CLI command.
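A minimal sketch of such a Lambda function is shown below. It assumes the alarm publishes the standard CloudWatch alarm JSON (`AlarmName`, `NewStateValue`) to the SNS topic. The optional `ses_client` argument is an addition for testing and is not part of the Lambda handler contract; `update_account_sending_enabled` is the boto3 SES call named above.

```python
import json


def alarms_in_alarm_state(event: dict) -> list:
    """Return the names of alarms in the SNS event that have entered the ALARM state."""
    names = []
    for record in event.get("Records", []):
        message = json.loads(record["Sns"]["Message"])
        if message.get("NewStateValue") == "ALARM":
            names.append(message.get("AlarmName"))
    return names


def lambda_handler(event, context, ses_client=None):
    """Pause SES sending for the account in this region when an alarm fires."""
    triggered = alarms_in_alarm_state(event)
    if not triggered:
        return {"sending_disabled": False, "alarms": []}
    if ses_client is None:
        import boto3  # available in the Lambda Python runtime

        ses_client = boto3.client("ses")
    ses_client.update_account_sending_enabled(Enabled=False)
    return {"sending_disabled": True, "alarms": triggered}
```

The function's role needs the `ses:UpdateAccountSendingEnabled` permission. Sending stays paused until someone re-enables it, so this is best paired with a notification to the CloudOps and Cyber Security teams.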
Final Result / Alarms:
In our month of sending emails, we also triggered the following alarms:
Additional alarms have been sent out throughout testing (Sent via SMS)
I hope this post has provided insight into how to keep your SES Sender Reputation healthy, as well as into the use of automation / DevOps techniques within an AWS environment.