Having used AWS for quite some time (since 2008), I have witnessed and benefited from many of the innovations AWS has rolled out over the years. In this post I want to discuss a couple of specific benefits of VPC Endpoints (for S3 specifically – technically, Gateway endpoints for Amazon S3), including leveraging them for considerable cost savings.
To set the stage a little, I have been running one particular workload on AWS for a buddy of mine for over 12 years now. I use this as a bit of a test bed for various services, often finding ways to run things better and more efficiently while reducing costs (not just AWS costs but labor as well – hey, I’m a little bit lazy and want to find quicker and easier ways to do things…).
For this workload we perform backups in a few different ways. First, we run automated daily snapshots of the volumes containing critical data. We also perform nightly file-based backups of that same critical data to S3. We do the file-based backups because they are easy, convenient, and cost effective. They also cover the case where a user overwrites a file or a file goes “missing” (yes, users delete or lose track of files periodically): rather than trying to figure out what the user may have done, it is far easier to restore that particular file from this backup.
Egress Traffic from File-based Backups
This file-based backup is where this post is going to focus. Although I had set up an S3 endpoint for this particular VPC quite some time ago, I stopped using it for some reason…. This meant that any traffic to services like S3 traversed my NAT. For years I had used a NAT instance, primarily because that was initially the only way to direct outbound traffic from instances on a private subnet. Then a few years ago I started up a NAT Gateway, directed all outbound traffic to it, and let it run. This worked great with no complaints. Well, there was one nagging issue with it….
The nagging issue was that my VPC costs had increased a bit, which I ignored for some months. Then, around the beginning of 2023, I dug a bit deeper into our VPC costs. In that digging I went back to the beginning of our usage on AWS for this account and broke down the cost of each service on a monthly basis, then built graphs for the services that interested me most. As you can see from graph 1 (below), my VPC costs more-or-less doubled during 2020-2021.
As you can also see in the VPC costs graph, my costs dropped considerably in 2022. This is because near the beginning of 2022 I updated my VPC route tables to route traffic to S3 through our S3 endpoint, therefore bypassing NAT. Instead of egress traffic (in this case my file-based backups, which are by far the heaviest usage) going through NAT (and getting billed by AWS), that traffic now uses the VPC S3 endpoint. And although this post is focused on cost savings, there are other (mainly security-related) reasons to use an S3 endpoint.
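For readers who want to make the same routing change programmatically, here is a minimal sketch in Python using boto3. It assumes the S3 gateway endpoint already exists (as mine did) and only needs to be associated with the private subnets' route tables. The function names and the IDs in the usage note are hypothetical, not taken from my account. Associating a route table with a gateway endpoint makes AWS add a route to the S3 prefix list that targets the endpoint; because that route is more specific than the default route to the NAT, S3 traffic stops going through NAT.

```python
def add_route_tables_request(endpoint_id, route_table_ids):
    """Build the arguments for the EC2 ModifyVpcEndpoint call.

    Kept separate from the API call so the request can be inspected
    (or logged) before anything is changed.
    """
    if not route_table_ids:
        raise ValueError("at least one route table ID is required")
    return {
        "VpcEndpointId": endpoint_id,
        "AddRouteTableIds": list(route_table_ids),
    }


def associate_route_tables(endpoint_id, route_table_ids, region):
    """Associate route tables with an existing gateway endpoint.

    Assumes AWS credentials are already configured in the environment.
    """
    import boto3  # imported here so the helper above works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    request = add_route_tables_request(endpoint_id, route_table_ids)
    response = ec2.modify_vpc_endpoint(**request)
    return response["Return"]
```

A call would look something like `associate_route_tables("vpce-0123456789abcdef0", ["rtb-0123456789abcdef0"], "us-east-1")`, with your own endpoint ID, route table IDs, and region substituted for these placeholders.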
Thanks to CloudWatch enhancements, AWS now provides great detail about various services, like traffic exiting/entering your VPC through NAT (yes, NAT allows traffic to enter, assuming the request originated inside the VPC). Graph 2 shows that when I made the routing switch to reach S3 through my S3 endpoint rather than the NAT Gateway, my egress bandwidth dropped. Considerably!
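If you would rather pull these numbers yourself than read them off a console graph, the sketch below shows one way to do it in Python with boto3. It assumes the relevant metric is `BytesOutToDestination` in the `AWS/NATGateway` namespace, which is my reading of "egress through NAT" and may not be exactly what the graph above plots. The function names and the NAT Gateway ID are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def nat_bytes_out_request(nat_gateway_id, end, days=30):
    """Build arguments for CloudWatch GetMetricStatistics:
    one summed datapoint per day for bytes sent out through the NAT Gateway."""
    return {
        "Namespace": "AWS/NATGateway",
        "MetricName": "BytesOutToDestination",
        "Dimensions": [{"Name": "NatGatewayId", "Value": nat_gateway_id}],
        "StartTime": end - timedelta(days=days),
        "EndTime": end,
        "Period": 86400,  # seconds in a day
        "Statistics": ["Sum"],
    }


def daily_gigabytes(datapoints):
    """Turn CloudWatch datapoints into (date, GB) pairs, oldest first."""
    ordered = sorted(datapoints, key=lambda point: point["Timestamp"])
    return [
        (point["Timestamp"].date().isoformat(), point["Sum"] / 1e9)
        for point in ordered
    ]


def fetch_daily_nat_egress(nat_gateway_id, region, days=30):
    """Query CloudWatch; assumes AWS credentials are already configured."""
    import boto3  # imported here so the helpers above work without boto3

    cloudwatch = boto3.client("cloudwatch", region_name=region)
    request = nat_bytes_out_request(
        nat_gateway_id, datetime.now(timezone.utc), days
    )
    response = cloudwatch.get_metric_statistics(**request)
    return daily_gigabytes(response["Datapoints"])
```

Running `fetch_daily_nat_egress` with your own NAT Gateway ID before and after the routing change should show the same kind of drop, provided the backups were the bulk of the NAT traffic as they were for me.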
Admittedly the cost picture is a little off, because sometime over the past couple of years Amazon created a new cost category that breaks out “data transfer” costs separately from VPC costs. Looking at those costs, they are down considerably too.
Any which way I look at it, we are benefiting considerably (primarily from reduced egress traffic through NAT, and therefore lower costs) by using an S3 Endpoint to direct traffic from our VPC to S3.
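To put a rough number on that benefit for your own workload, here is a small back-of-the-envelope sketch in Python. The per-GB price is an assumption (the NAT Gateway data processing list price I know of for us-east-1), so check current pricing for your region, and the backup size in the usage note is made up rather than taken from my bill. Gateway endpoints for S3 carry no additional charge, so the data processing fee avoided is the saving; the NAT Gateway's hourly charge is not affected.

```python
# Assumed NAT Gateway data processing price in USD per GB (us-east-1 list
# price as I understand it); verify against current AWS pricing.
NAT_PROCESSING_USD_PER_GB = 0.045


def monthly_nat_processing_cost(gb_per_night, nights=30,
                                usd_per_gb=NAT_PROCESSING_USD_PER_GB):
    """Estimate the monthly NAT data processing charge for nightly backups.

    Routing the same traffic through an S3 gateway endpoint avoids this
    charge entirely, so the returned value is also the estimated saving.
    """
    if gb_per_night < 0 or nights < 0:
        raise ValueError("gb_per_night and nights must be non-negative")
    return gb_per_night * nights * usd_per_gb
```

For example, `monthly_nat_processing_cost(50)` estimates the monthly charge for a hypothetical 50 GB nightly backup sent through NAT.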