Autoscale Machines
You have an app whose services are configured to automatically start and stop Machines based on traffic demand. But traffic to your app changes significantly during the day, and you don’t want to keep a lot of stopped Machines around during periods of low traffic.
This blueprint will guide you through the process of configuring the fly-autoscaler app in conjunction with Fly Proxy autostop/autostart to always keep a fixed number of Machines ready to be quickly started by Fly Proxy.
Configure autostop/autostart
First, if you haven’t already done so, configure the app to allow Fly Proxy to automatically start and stop or suspend Machines based on traffic demand. The autostop/autostart settings apply per service, so you set them within the [[services]] or [http_service] sections of fly.toml:
...
[[services]]
...
auto_stop_machines = "stop"
auto_start_machines = true
min_machines_running = 0
...
With these settings, Fly Proxy will start an additional Machine if all the running Machines are above their concurrency soft_limit, and will stop running Machines when the traffic decreases. You can set Machines to "suspend" rather than "stop" for even faster start-up, but with some limitations on the type of Machine.
In the next section you’ll configure and deploy fly-autoscaler to ensure that the app always has a spare stopped Machine for Fly Proxy to start.
Configuring and deploying fly-autoscaler
fly-autoscaler is a metrics-based autoscaler that scales an app’s Machines based on any metric. You can configure it to ensure that there is always an additional Machine available for Fly Proxy to start if the traffic increases.
First, create a new Fly.io app that will run the autoscaler.
$ fly apps create my-autoscaler
Create a deploy token so that the autoscaler app has permissions to scale your target app up and down:
$ fly tokens create deploy -a my-target-app
$ fly secrets set -a my-autoscaler --stage FAS_API_TOKEN="FlyV1 ..."
Create a read-only token so that the autoscaler app has access to a Prometheus instance:
$ fly tokens create readonly -o my-org
$ fly secrets set -a my-autoscaler --stage FAS_PROMETHEUS_TOKEN="FlyV1 ..."
Configure your autoscaler fly.toml like this:
app = "my-autoscaler"
[build]
image = "flyio/fly-autoscaler:0.3.1"
[env]
FAS_PROMETHEUS_ADDRESS = "https://api.fly.io/prometheus/my-org"
FAS_PROMETHEUS_METRIC_NAME = "running_machines"
FAS_PROMETHEUS_QUERY = "count(fly_instance_up{app='$APP_NAME'})"
FAS_APP_NAME = "my-target-app"
FAS_CREATED_MACHINE_COUNT = "min(running_machines + 1, 10)"
FAS_INITIAL_MACHINE_STATE = "stopped"
[metrics]
port = 9090
path = "/metrics"
With this configuration, the autoscaler will create a new stopped Machine as soon as all available Machines are running (but never more than 10), and will destroy extra stopped Machines if more than one Machine is stopped.
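To make the arithmetic behind FAS_CREATED_MACHINE_COUNT concrete, the sketch below evaluates min(running_machines + 1, 10) for a few running-Machine counts. It is only an illustration of the expression, not how fly-autoscaler evaluates it internally; the target_created function and its variable names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper that mirrors the expression
#   min(running_machines + 1, 10)
# from FAS_CREATED_MACHINE_COUNT. It does not call fly-autoscaler.
target_created() {
  fas_running=$1   # number of Machines currently running
  fas_max=$2       # upper bound on created Machines (10 in the config above)
  fas_want=$((fas_running + 1))   # keep one spare stopped Machine
  if [ "$fas_want" -gt "$fas_max" ]; then
    fas_want=$fas_max             # never ask for more than the cap
  fi
  echo "$fas_want"
}

# Show the target number of created Machines for a few load levels.
for count in 0 3 9 10; do
  echo "running=$count -> created=$(target_created "$count" 10)"
done
```

With 3 Machines running, the target is 4 created Machines, which leaves one stopped spare. Once 9 or more Machines are running, the target stays at the cap of 10.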
Make sure you are using autoscaler version 0.3.1 or newer for the FAS_INITIAL_MACHINE_STATE configuration option to work.
And finally, deploy the autoscaler, using the --ha=false option to deploy only one Machine:
$ fly deploy --ha=false
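After deploying, you may want to confirm that the autoscaler is running and that the target app has a spare stopped Machine. The sketch below wraps a few standard flyctl commands in a helper; the check_autoscaler function name is hypothetical, and the app names are the example names used in this blueprint.

```shell
#!/bin/sh
# Hypothetical helper: inspect the autoscaler app and the target app.
# Arguments: $1 = autoscaler app name, $2 = target app name.
check_autoscaler() {
  autoscaler_app=$1
  target_app=$2
  # The autoscaler should have a single Machine in the started state.
  fly status -a "$autoscaler_app"
  # Print recent autoscaler logs without following them.
  fly logs -a "$autoscaler_app" --no-tail
  # List the target app's Machines and their states.
  fly machine list -a "$target_app"
}
```

Run it as check_autoscaler my-autoscaler my-target-app. If the configuration above is working as described, the Machine list for the target app should include one stopped Machine in addition to the running ones.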