The pay-as-you-go model is one of the major advantages of using a public cloud like AWS, especially when you can leverage it extensively. That is usually the case when you need a certain amount of computing power only for a specific time rather than for the entire day or month.
We recently joined our colleagues from the HOB for a project that needed a central network management system to operate a fleet of several hundred network and security devices distributed all over Germany.
Usually, you would buy such an appliance together with a license from the vendor and put it somewhere in a traditional IT rack. That would not only have us running a device 24 hours a day, but also mean buying a license on a yearly basis.
When we heard that one of the project requirements was a management software that is only needed for a few hours on a few days a month, we immediately figured that this would be a huge waste of space, cooling, power, and money. What if we could bring both worlds together, run a traditional network management system like this in the AWS cloud, and use the pay-as-you-go model?
When we looked at the different options for running such an appliance in the cloud, we found that the vendor also offers its software via the AWS Marketplace. This not only made it very easy to deploy but also allowed us to purchase the license on a per-hour basis, which opens up a large potential for cost savings.
The only thing we needed to ensure was that the instance running the software only gets started when needed and is stopped when it’s no longer needed.
We all know the phrase “Just stop the instance after the work is done.” In general, this is a great plan. But as we also know, we are all human, and from time to time we get distracted, interrupted, or simply forget things. So what else could be done to help our colleagues avoid spending money when it’s not absolutely necessary?
So far we have implemented two technical solutions.
The first one is based on the idea that if the instance is idle for a while, we can take that as an indication that it’s no longer in use. We define a metric to determine the utilization, create an Amazon CloudWatch alarm, and configure an email notification. A diagram of the setup is shown in the picture below:
The alarm itself is based on the metric ‘NetworkOut’, which is averaged over periods of 5 minutes. As soon as 24 out of 24 data points (i.e. 2 hours) fall below the previously defined idle value, the alarm is triggered.
So whenever the instance is left running but not used, the CloudWatch alarm gets triggered and sends a friendly email reminder to the team via Amazon SNS.
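To make the alarm settings more concrete, here is a minimal sketch of how they could be expressed with boto3. The instance ID, the SNS topic ARN, the alarm name, and the idle threshold of 50,000 bytes are made-up values for illustration, and treating missing data as "not breaching" (so that a stopped instance does not raise the alarm) is our assumption rather than part of the setup described above.

```python
def idle_alarm_params(instance_id, sns_topic_arn, idle_threshold_bytes):
    """Build the arguments for CloudWatch's put_metric_alarm call."""
    return {
        "AlarmName": f"idle-{instance_id}",        # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "NetworkOut",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,                             # 5 minutes per data point
        "EvaluationPeriods": 24,                   # look at the last 24 data points
        "DatapointsToAlarm": 24,                   # all 24 must be below the threshold
        "Threshold": float(idle_threshold_bytes),
        "ComparisonOperator": "LessThanThreshold",
        "TreatMissingData": "notBreaching",        # assumption: ignore stopped instances
        "AlarmActions": [sns_topic_arn],           # SNS topic that sends the email
    }


def create_idle_alarm(params):
    """Create the alarm; needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the sketch can be read without AWS access

    boto3.client("cloudwatch").put_metric_alarm(**params)


params = idle_alarm_params(
    "i-0123456789abcdef0",                                  # made-up instance ID
    "arn:aws:sns:eu-central-1:123456789012:idle-reminder",  # made-up topic ARN
    50_000,                                                 # made-up idle threshold
)
```

With a period of 300 seconds and 24 evaluation periods, the alarm only fires after two full hours of low outbound traffic, which keeps short pauses in the work from triggering a reminder.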
But what to do if the manual shutdown is forgotten and the additional email notification gets overlooked? We wanted another backup with a slightly more aggressive approach. One that would shut down the instance every night, no matter what.
One option for this is the AWS Instance Scheduler. This is a full-featured solution created and maintained by AWS itself to control the cost of your EC2 and RDS instances. It’s well documented and can easily be deployed via CloudFormation. The AWS Instance Scheduler leverages tags and Lambda functions to automatically stop or restart instances across accounts and regions based on a schedule you define. After the deployment, you will find the following architecture inside your AWS account:
In general, we would recommend that solution for any bigger environment. It’s super easy from a user’s perspective and once set up, all you have to do is to tag your resources accordingly and everything else is handled fully automatically.
But for our environment with only a single instance, this AWS Solution would be a bit too complex and also too expensive. As soon as you are out of your Lambda free tier it adds roughly $10 to your monthly AWS bill.
So we started looking for something simpler.
Amazon EventBridge is an often overlooked and underestimated service. The service and its API were first introduced in 2016 as part of CloudWatch under the name CloudWatch Events. In July 2019, AWS released it as a standalone service that offers advanced features and is now the preferred way of managing events.
So what’s Amazon EventBridge and how can it help us here? EventBridge is a fully serverless event bus that is well suited for building loosely coupled, distributed, event-driven architectures. An event is a small blob of JSON which is sent by one of a variety of sources to either the EventBridge default bus or a custom bus you have created. When an event arrives at the bus, EventBridge evaluates it against the set of rules associated with that bus. When a rule matches an incoming event, EventBridge routes the event to one or more defined targets in near real time. Almost all AWS services can act as a source and have a built-in integration with EventBridge, for example EC2 Auto Scaling events, CodeBuild or CodeDeploy events, EC2 Spot interruption events, or ECS container instance state change events.
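To illustrate what an event and a rule pattern look like, here is a small sketch. The event mimics the general shape of an EC2 instance state-change notification with made-up IDs and timestamps. The `matches` function is our own, heavily simplified stand-in for the rule evaluation and only handles exact value matching; the real service supports considerably more (prefix matching, numeric comparisons, and so on).

```python
# A made-up event in the general shape EventBridge delivers to a bus.
event = {
    "version": "0",
    "id": "00000000-0000-0000-0000-000000000000",
    "detail-type": "EC2 Instance State-change Notification",
    "source": "aws.ec2",
    "account": "123456789012",
    "time": "2021-01-01T17:00:05Z",
    "region": "eu-central-1",
    "resources": [
        "arn:aws:ec2:eu-central-1:123456789012:instance/i-0123456789abcdef0"
    ],
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "stopped"},
}

# An event pattern: every listed field must hold one of the allowed values.
pattern = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["stopped"]},
}


def matches(pattern, event):
    """Simplified exact-value matcher (illustration only, not the real engine)."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: descend into the nested part of the event.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        else:
            # A list of allowed values; an array in the event matches
            # if any of its elements is allowed.
            values = actual if isinstance(actual, list) else [actual]
            if not any(value in expected for value in values):
                return False
    return True


print(matches(pattern, event))
```

Fields that the pattern does not mention, such as `region` or `time`, are simply ignored, which is why patterns can stay much smaller than the events they match.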
Rules either match on an event pattern or run on a schedule and get triggered at certain times. A schedule can be defined as a cron-style expression or as a fixed rate in minutes, hours, or days.
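The two kinds of schedule expressions can be sketched with two small helpers. The function names are ours and purely illustrative. The sketch relies on two details of the expression syntax: rate expressions use the singular unit for a value of 1 and the plural otherwise, and cron expressions have six fields (minutes, hours, day of month, month, day of week, year) and are evaluated in UTC.

```python
def rate_expression(value, unit):
    """Return a fixed-rate schedule such as 'rate(5 minutes)'."""
    if unit not in ("minute", "hour", "day"):
        raise ValueError("unit must be 'minute', 'hour' or 'day'")
    if value < 1:
        raise ValueError("value must be a positive integer")
    suffix = "" if value == 1 else "s"  # singular for 1, plural otherwise
    return f"rate({value} {unit}{suffix})"


def daily_cron(hour, minute=0):
    """Return a cron schedule that fires once a day at hour:minute UTC."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute 0-59")
    # Fields: minutes hours day-of-month month day-of-week year.
    # Day-of-month and day-of-week cannot both be '*', so one of them is '?'.
    return f"cron({minute} {hour} * * ? *)"


print(rate_expression(1, "hour"))
print(rate_expression(5, "minute"))
print(daily_cron(17))
```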
For our use case, we created a simple cron-style schedule that runs every evening at 17:00 UTC. The scheduled event is put onto the default event bus, and the rule associated with it sends an EC2 StopInstances API call for our instance, identified by its instance ID.
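A rule like this could be sketched with boto3 roughly as follows. The rule name, instance ID, account ID, and role ARN are made-up values. The target ARN of the form `arn:aws:events:<region>:<account>:target/stop-instance` and passing the instance ID as a JSON string in `Input` reflect our understanding of how the built-in "EC2 StopInstances API call" target is addressed; please treat both as assumptions and verify them against the EventBridge documentation before relying on them.

```python
import json


def nightly_stop_rule(instance_id, region, account_id, role_arn,
                      name="stop-instance-every-evening"):
    """Build the arguments for EventBridge's put_rule and put_targets calls."""
    rule = {
        "Name": name,                                 # hypothetical rule name
        "ScheduleExpression": "cron(0 17 * * ? *)",   # every day at 17:00 UTC
        "State": "ENABLED",
        "Description": "Stop the instance every evening, no matter what",
    }
    targets = {
        "Rule": name,
        "Targets": [{
            "Id": "stop-instance",
            # Assumption: ARN format of the built-in StopInstances target.
            "Arn": f"arn:aws:events:{region}:{account_id}:target/stop-instance",
            # Role that EventBridge assumes to call ec2:StopInstances.
            "RoleArn": role_arn,
            # Assumption: the instance ID is passed as a JSON string.
            "Input": json.dumps(instance_id),
        }],
    }
    return rule, targets


def apply_rule(rule, targets):
    """Create the rule and its target; needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the sketch can be read without AWS access

    events = boto3.client("events")
    events.put_rule(**rule)       # no EventBusName given, so the default bus is used
    events.put_targets(**targets)


rule, targets = nightly_stop_rule(
    "i-0123456789abcdef0",                                   # made-up instance ID
    "eu-central-1",
    "123456789012",                                          # made-up account ID
    "arn:aws:iam::123456789012:role/eventbridge-stop-ec2",   # made-up role ARN
)
```

Because the schedule and the target live in a single rule, there is no Lambda function to write or maintain, which is what makes this approach attractive for a single instance.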
When you use a public cloud like AWS with its pay-as-you-go model, spending some time thinking about what you really need and implementing measures to avoid waste can save you a lot of money. Running compute resources only when needed is one major aspect of it.
In this blog post, we discussed some easy-to-implement technical solutions which help to support the idea of taking control of your cloud expenses and ensure that your EC2 instance does not run much longer than necessary.
The result can be seen in this cost overview. Even though it is started regularly, the instance never runs for more than a few hours.