ECS autoscaling

Created: 2019-01-22 10:29:44 -0800 Modified: 2019-09-23 10:43:21 -0700

  • There are three general ways to scale (reference):
    • Target tracking scaling - scale based on a CloudWatch metric, e.g. average CPU utilization.
      • For example, if you want an average CPU utilization of 70%, then the CloudWatch metric is watched across the tasks in your service, and tasks are started or stopped to keep the average near that value.
    • Step scaling - scale based on scaling adjustments that vary based on the size of the alarm breach
    • Scheduled scaling - scale based on date/time
  • To find this information in the console, go to ECS → Clusters → <pick a cluster> → <pick a service> → Update → on page 3, you’ll see “Set Auto Scaling (optional)”.
  • “Scaling in” is what AWS calls removing instances, and “scaling out” is adding instances.
  • You can combine multiple target tracking scaling policies as long as you use different metrics. When the policies disagree, the one that calls for the largest capacity wins.
  • Scaling out happens rapidly. Scaling in happens more slowly (reference).
  • Amazon suggests scaling based on metrics with a one-minute frequency to get the fastest response times (reference).
  • At least for step scaling, the policy will not respect any manually set desired_count for your services (reference).
  • Step scaling requires IAM policies to be set up (reference).
  • When setting up a target tracking policy, you can choose the scale-in and scale-out cooldown periods. During these times, you will not end up scaling in or out again. I believe the scale-out cooldown should be set to however long it takes for a container to be up and running in your system and accepting traffic, plus about 10 seconds so that the load can actually equalize. (reference)
  • You can create an autoscaling policy for a service after the service’s creation. Also, if you have a policy already, then you can update the parameters.
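
To make the target tracking and cooldown notes above concrete, here is a minimal sketch that builds the arguments for a CPU-based target tracking policy on an ECS service. The cluster name, service name, policy name, 30-second startup time, and 60-second scale-in cooldown are all hypothetical values chosen for illustration; the 70% target and the "startup time plus about 10 seconds" rule come from the notes above. The sketch only builds a dictionary and does not call AWS.

```python
import math


def scale_out_cooldown_seconds(startup_seconds: float, settle_seconds: float = 10.0) -> int:
    """Container startup time plus a margin for load to equalize, in whole seconds."""
    return math.ceil(startup_seconds + settle_seconds)


def target_tracking_policy(cluster: str, service: str, target_cpu_percent: float,
                           scale_out_cooldown: int, scale_in_cooldown: int) -> dict:
    """Build keyword arguments for an Application Auto Scaling target tracking policy."""
    return {
        "PolicyName": f"{service}-cpu-target-tracking",  # hypothetical naming scheme
        "ServiceNamespace": "ecs",
        "ResourceId": f"service/{cluster}/{service}",
        "ScalableDimension": "ecs:service:DesiredCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": float(target_cpu_percent),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ECSServiceAverageCPUUtilization",
            },
            "ScaleOutCooldown": scale_out_cooldown,
            "ScaleInCooldown": scale_in_cooldown,
        },
    }


# Assumption: a container takes about 30 seconds to start accepting traffic.
policy = target_tracking_policy(
    cluster="game-cluster",   # hypothetical cluster name
    service="game-server",    # hypothetical service name
    target_cpu_percent=70,
    scale_out_cooldown=scale_out_cooldown_seconds(30),
    scale_in_cooldown=60,     # hypothetical value
)
print(policy)
```

With credentials configured, a dictionary like this could be passed to `boto3.client("application-autoscaling").put_scaling_policy(**policy)` once the service has been registered with `register_scalable_target`. Because the policy is identified by its name and resource, calling it again with different parameters should update the existing policy rather than create a second one.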

As of 1/22/2019, game servers perform 50 trial games on their own to prime V8 optimizations. This takes about 10 seconds. Because that is a short time, I don’t think it should greatly affect the average CPU utilization.
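
A rough back-of-the-envelope check of that intuition, as a sketch. It assumes a simplified model that is not taken from AWS documentation: the metric is a plain average over a 60-second period and across tasks, the new task runs at 100% CPU during its 10-second warm-up, and it then behaves like the other tasks. The 40% baseline and the count of four existing tasks are hypothetical.

```python
def service_average_cpu(baseline_cpu: float, existing_tasks: int,
                        warmup_cpu: float, warmup_seconds: float,
                        period_seconds: float = 60.0) -> float:
    """Average CPU across a service when one new task spends part of the period warming up."""
    # The new task's own average over the period: warm-up burst, then baseline load.
    new_task_avg = (
        warmup_cpu * warmup_seconds
        + baseline_cpu * (period_seconds - warmup_seconds)
    ) / period_seconds
    # Simple (unweighted) average across the existing tasks and the new one.
    return (baseline_cpu * existing_tasks + new_task_avg) / (existing_tasks + 1)


# Hypothetical numbers: four tasks at 40% CPU, one new task at 100% for 10 seconds.
avg = service_average_cpu(baseline_cpu=40.0, existing_tasks=4,
                          warmup_cpu=100.0, warmup_seconds=10.0)
print(avg)  # 42.0
```

Under these assumptions the warm-up lifts the service average from 40% to 42% for that one period, which is well short of a 70% target. The effect would be larger for a service with only one or two tasks.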

However, I don’t know how long the servers need to be online before they’re able to scale up; what would happen if I had 100% CPU utilization immediately? I assume I would need to set the desired_count in Terraform to something higher than 1 and wait for Fargate to scale down rather than up.