This article provides an overview of the auto-scaling feature and managed services architecture. The article also describes the steps to overcome certain challenges faced after scale-down of additional VMs as part of auto-scaling.
Auto-scaling is an essential feature of cloud computing that automatically adds or removes compute resources based on actual usage. However, after auto-scaling scales down the additional VMs, you might face a few challenges when de-registering those VMs from a monitoring dashboard such as Nagios.
This article provides the steps to overcome these issues related to VM de-registration after scale-down. The steps provided here are in the context of VMs deployed on Amazon Web Services (AWS) with managed services.
To get a better understanding of the challenges, you must first familiarize yourself with the auto-scaling feature and managed services architecture. A brief overview is provided in the following sections.
Some of the terms used in this article are explained here.
|CloudWatch|Monitors the AWS workloads and collects relevant statistics that can be used in conjunction with the VM metrics to initiate the deployment or removal of a VM.|
|Custom AMI|A custom Amazon Machine Image (AMI) shortens provisioning times when the instances launched in an environment need a lot of software that is not included in the standard AMIs.|
|Lambda|Used for several predefined tasks, including adding elastic network interfaces (ENIs) to newly deployed VMs, monitoring VM-Series traffic metrics, and communicating with Amazon CloudWatch.|
Auto-scaling ensures that you have the correct number of instances available to handle the application load. A collection of these instances is called an auto-scaling group.
You can specify the following characteristics for a group:
- Minimum, maximum, and desired number of instances: Auto-scaling ensures that the number of instances in the group always stays within the specified minimum and maximum.
- Group capacity: Auto-scaling ensures that the group has the specified number of instances.
- Auto-scaling policies: Auto-scaling launches or terminates instances as demand on the application changes.
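The relationship between the minimum, maximum, and requested group size can be pictured with a short Python sketch. This is a simplified model and not AWS code; the `GroupLimits` class and the `effective_capacity` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class GroupLimits:
    """Hypothetical container for the sizes configured on a group."""
    min_size: int
    max_size: int
    desired: int


def effective_capacity(limits: GroupLimits, requested: int) -> int:
    """Clamp a requested instance count into [min_size, max_size]."""
    if limits.min_size > limits.max_size:
        raise ValueError("min_size must not exceed max_size")
    return max(limits.min_size, min(requested, limits.max_size))


limits = GroupLimits(min_size=2, max_size=6, desired=3)
print(effective_capacity(limits, 10))  # scale-out request above the maximum
print(effective_capacity(limits, 0))   # scale-in request below the minimum
print(effective_capacity(limits, 4))   # request already inside the range
```

Whatever a policy asks for, the resulting group size in this model never leaves the configured range.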
The following diagram depicts an Auto Scaling group.
VM registration during auto-scaling
The following steps describe how a VM is registered during auto-scaling.
- Define an Amazon Machine Image (AMI) and create an Auto Scaling group to launch instances.
- Use CloudWatch to monitor the servers; when certain configurable events occur, launch more instances based on the defined AMI template.
- Once new EC2 (Elastic Compute Cloud) instances are launched, they are bootstrapped with the Chef server, and the managed services tools are installed and configured.
- After the Chef bootstrap, the new server is registered with the central tools and becomes part of the application cluster in the auto-scaling group.
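The monitoring step above can be sketched as a threshold check over recent metric samples. This is a simplified illustration of an alarm-style rule, not CloudWatch's actual evaluation logic; the function name, the 70% CPU threshold, and the three-period window are assumptions chosen for the example.

```python
from typing import Sequence


def should_scale_out(cpu_samples: Sequence[float],
                     threshold: float = 70.0,
                     periods: int = 3) -> bool:
    """Return True when the last `periods` samples all exceed `threshold`.

    Requiring several consecutive breaches avoids launching instances
    in response to a single short spike.
    """
    if periods < 1:
        raise ValueError("periods must be at least 1")
    if len(cpu_samples) < periods:
        return False  # not enough data to decide yet
    return all(sample > threshold for sample in cpu_samples[-periods:])


print(should_scale_out([40.0, 85.0, 90.0, 95.0]))  # sustained load
print(should_scale_out([40.0, 95.0, 50.0, 95.0]))  # isolated spikes
```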
The following diagram illustrates the VM registration process during auto-scaling.
Managed services architecture
When the load is stable, a fixed number of instances run in an auto-scaling group, and these instances are registered with the managed services tools.
Automation tools such as Chef, Puppet, or Ansible are used to register these VMs with the managed services tool set.
The following diagram illustrates the initial architecture.
When the demand for resources increases, the load balancer detects the increase and triggers auto-scaling to deploy additional VMs according to the policy defined in the auto-scaling group.
These newly deployed VMs are registered using automation tools, in the same way as the initially deployed VMs.
The following diagram illustrates the scale-up architecture.
Once the demand is met and the additional resources are no longer needed, auto-scaling terminates the additional VMs, which reduces the number of instances.
The additional VMs terminated by auto-scaling are not de-registered from the managed services, because there is no means to run a script or automation tool at the client end to de-register the deleted VMs.
The following diagram illustrates the scale-down architecture.
The following challenges are faced after scale-down of additional VMs:
- There is no means to de-register the deleted VMs from the managed services before they terminate.
- False alerts are raised if terminated VMs are not de-registered from the managed services.
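The source of the false alerts can be illustrated by comparing the hosts that the monitoring tool still knows about with the instances that are actually running. The sketch below is a minimal illustration; the function name and the instance IDs are made up for the example.

```python
from typing import Iterable, List


def stale_hosts(registered: Iterable[str], running: Iterable[str]) -> List[str]:
    """Return hosts that are still registered but no longer running."""
    return sorted(set(registered) - set(running))


# Hypothetical state after a scale-down from four instances to two.
registered_in_monitoring = ["i-aaa", "i-bbb", "i-ccc", "i-ddd"]
running_in_group = ["i-aaa", "i-bbb"]

# Each stale host would be reported as down although it was
# terminated on purpose.
print(stale_hosts(registered_in_monitoring, running_in_group))
```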
The solution described here uses the monitoring and computing services of the public cloud provider to de-register VMs based on the termination event. On AWS, these services are CloudWatch, Lambda, and Systems Manager (SSM).
The following diagram illustrates the process for de-registering VMs.
Follow these steps to de-register the VMs:
- Set rules in CloudWatch to capture the instance details for any termination.
- Configure a Lambda function that the rule triggers and that captures the instance details.
- Use the same Lambda function to transfer the information to SSM.
- The SSM agent runs the shell script on the Nagios relay instance.
- The Nagios relay server then processes the files and de-registers the instance.
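The steps above can be sketched as a Lambda function in Python. This is a minimal sketch under stated assumptions, not the implementation used in this solution: the relay instance ID and the script path are hypothetical placeholders, and the event pattern assumes that the rule matches the standard EC2 instance state-change notification. `AWS-RunShellScript` is the AWS-provided SSM document for running shell commands, and `send_command` is the boto3 SSM client call that invokes it.

```python
import shlex

# Event pattern for the CloudWatch rule: match EC2 instances that
# reach the "terminated" state.
TERMINATION_EVENT_PATTERN = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["terminated"]},
}

# Hypothetical values: replace with the real relay instance and script.
RELAY_INSTANCE_ID = "i-0123456789abcdef0"
DEREGISTER_SCRIPT = "/opt/nagios-relay/deregister.sh"


def build_command(event):
    """Turn a termination event into SSM send_command arguments.

    Returns None for events that are not terminations.
    """
    detail = event.get("detail", {})
    if detail.get("state") != "terminated":
        return None
    instance_id = detail["instance-id"]
    command = f"{DEREGISTER_SCRIPT} {shlex.quote(instance_id)}"
    return {
        "InstanceIds": [RELAY_INSTANCE_ID],  # the script runs on the relay
        "DocumentName": "AWS-RunShellScript",
        "Parameters": {"commands": [command]},
    }


def lambda_handler(event, context, ssm_client=None):
    """Lambda entry point; ssm_client can be injected for local testing."""
    request = build_command(event)
    if request is None:
        return {"status": "ignored"}
    if ssm_client is None:
        import boto3  # provided by the AWS Lambda Python runtime
        ssm_client = boto3.client("ssm")
    response = ssm_client.send_command(**request)
    return {"status": "sent", "command_id": response["Command"]["CommandId"]}
```

Keeping `build_command` separate from the AWS call makes the event handling testable without AWS credentials. The Lambda execution role would also need permission to call `ssm:SendCommand` on the relay instance.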
The following diagram illustrates the architecture when this solution is implemented.