AWS Macie is a tool developed by Amazon that can automatically scan log files, CloudWatch events and other information streams inside AWS and detect anomalies, attacks and information leaks. While Macie does come with some caveats (below), our experiences so far have been encouraging and we recommend considering Macie as part of your security tooling.
Macie uses machine learning, with a model built by Amazon, to scan these information streams and identify threats to your sensitive data.
This tool comes with some considerable advantages:
- Very easy to set up and use.
- Monitoring/alerting performed through CloudWatch alarms.
- Easy to import your own application log data as well as AWS data (for things like IAM privilege escalations).
- Very easy to drill down to understand what issues Macie has found.
- Extremely powerful at detecting leakage of sensitive information such as individuals' names, addresses, dates of birth, card information, and similar.
Macie is a very new tool, though, and it has some real disadvantages that currently make long-term use frustrating:
- There is no easy way to export all alerts into other systems; alerts that need to be exported must be saved in batches of 500 to a CSV file through Macie's interface. Alternatively, you can have Macie send all alerts to CloudWatch alarms on your behalf. Macie can also export to incidents inside AWS Security Console (which does have an export function via AWS CLI), but Security Console incidents typically contain less information than Macie alerts.
- No automatic scanning of CloudWatch logs; logs must be exported to S3 and then scanned.
- Because Macie is an AI tool, if it detects something it considers PII or unusual behaviour, it will tell you that it has found it, but not how it found it. This is especially frustrating if Macie has detected you are writing PII to log data exported from CloudWatch, as it tells you the log file it's in, but it does not tell you where in that log file the event was found.
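Until Macie reports positions itself, one workaround for the last point is to search the flagged log file yourself. The sketch below is our own illustration and not a Macie feature: it uses `grep -n` with two deliberately crude patterns (an email-like string and a 16-digit card-like number) to print the line numbers worth inspecting. The function name and both patterns are assumptions, and they will produce false positives as well as misses.

```shell
#!/usr/bin/env bash
# Sketch: print "line-number:line" for lines that look like they contain PII.
# The patterns are rough assumptions, not what Macie itself uses.

find_pii_lines() {
  # First alternative: something shaped like an email address.
  # Second alternative: 16 digits in groups of four, optionally separated
  # by a space or a hyphen (a card-like number).
  # Reads the files given as arguments, or standard input if there are none.
  grep -nE '[[:alnum:]._%+-]+@[[:alnum:].-]+\.[[:alpha:]]{2,}|([0-9]{4}[ -]?){3}[0-9]{4}' "$@"
}
```

Objects exported from CloudWatch are typically gzip-compressed, so in practice this would be used along the lines of `gunzip -c flagged-log-file.gz | find_pii_lines`, where `flagged-log-file.gz` stands for whichever file Macie reported.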
Macie in practice
Currently, if you want to use Macie in practice to (for example) scan your application log files, you will need to complete a few steps up front.
Assuming AWS CloudWatch
This guide assumes your application logs to AWS CloudWatch. If it does not, use only the sections that apply to your setup.
Our goal here is to get our log data into an S3 bucket in the AWS us-east-1 region so Macie can scan it. To do this, we first need to get our data out of AWS CloudWatch and into an S3 bucket in the same region where our CloudWatch logs are stored.
Amazon has written a great guide on how to export log data, so go ahead and follow that.
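For orientation, the export step in that guide comes down to a single `aws logs create-export-task` call. The sketch below is only an illustration and does not replace the guide. The log group name, bucket name and prefix are placeholders, the helper names are ours, and the bucket is assumed to already carry a policy that lets CloudWatch Logs write to it. By default it prints the command instead of running it, so you can check the time window first.

```shell
#!/usr/bin/env bash
# Sketch: export a recent window of a CloudWatch log group to S3.
# All names below are placeholders (assumptions), not real resources.

LOG_GROUP="my-application-log-group"     # hypothetical log group
EXPORT_BUCKET="original-export-bucket"   # bucket in the same region as the logs
EXPORT_PREFIX="macie-export"             # hypothetical key prefix

# create-export-task expects --from and --to in milliseconds since the epoch.
to_millis() {
  echo $(( $1 * 1000 ))
}

# export_logs HOURS END_EPOCH_SECONDS
# Exports the HOURS-long window ending at END_EPOCH_SECONDS.
# DRY_RUN defaults to 1 (print the command); set DRY_RUN=0 to run it.
export_logs() {
  local hours="$1" end_s="$2"
  local start_s=$(( end_s - hours * 3600 ))
  local runner="echo"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    runner=""
  fi
  $runner aws logs create-export-task \
    --log-group-name "$LOG_GROUP" \
    --from "$(to_millis "$start_s")" \
    --to "$(to_millis "$end_s")" \
    --destination "$EXPORT_BUCKET" \
    --destination-prefix "$EXPORT_PREFIX"
}

# Example: show the command for the last 24 hours.
export_logs 24 "$(date +%s)"
```

Running it with `DRY_RUN=0` would submit the export task for real, so it is worth reading the printed command once before doing so.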
Once you have your log data stored in an S3 bucket, you will need to set up a new bucket in us-east-1 (assuming that your application is not already hosted there) and copy the data in your export bucket into the Macie bucket.
This can be done by an AWS CLI command as simple as:
aws s3 sync s3://original-export-bucket s3://destination-macie-bucket --source-region=source-region-here --region=us-east-1
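If the destination bucket does not exist yet, it has to be created before the sync will work. As a rough sketch, the full sequence could look like the following. The bucket names are the same placeholders as in the command above, `eu-west-1` is an assumed source region, and the `plan_copy` helper is our own. It prints the commands rather than running them, and the second command uses `--dryrun` so that S3 lists what it would copy without copying anything.

```shell
#!/usr/bin/env bash
# Sketch: the commands needed to get exported logs into a us-east-1 bucket.
# Bucket names and the source region are placeholders (assumptions).

SOURCE_BUCKET="original-export-bucket"
MACIE_BUCKET="destination-macie-bucket"
SOURCE_REGION="eu-west-1"   # assumed region of the export bucket
MACIE_REGION="us-east-1"

# Print the three commands in the order they would be run.
plan_copy() {
  # 1. Create the destination bucket in us-east-1.
  echo "aws s3 mb s3://${MACIE_BUCKET} --region ${MACIE_REGION}"
  # 2. Preview the copy; --dryrun lists operations without performing them.
  echo "aws s3 sync s3://${SOURCE_BUCKET} s3://${MACIE_BUCKET} --source-region ${SOURCE_REGION} --region ${MACIE_REGION} --dryrun"
  # 3. Perform the copy for real.
  echo "aws s3 sync s3://${SOURCE_BUCKET} s3://${MACIE_BUCKET} --source-region ${SOURCE_REGION} --region ${MACIE_REGION}"
}

plan_copy
```

Because `aws s3 sync` only transfers objects that are new or changed, the final command can be repeated after later exports to bring the Macie bucket up to date.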
Once you have performed these actions, you can log into the AWS Macie console, select "Settings", choose the correct AWS account for the data ingest (to which the Macie service role has access) and select your created Macie bucket. Macie will then (usually very quickly) scan your data, and present the findings to you on the dashboard.