Autoscaling is thrashing instances with AWS plugin
Summary
Steps to reproduce
- register new runner (go run . register), configured as instance executor
- set config.toml as shown below, with your own credentials
- create ASG in AWS with desired/min/max set to 0/0/20
- start runner (go run . run)
- runner will begin thrashing, repeatedly increasing and decreasing instances, despite being idle with no jobs in queue
Actual behavior
Instances repeatedly increase and decrease.
Expected behavior
Instances should scale to desired capacity then remain there until job load increases.
Relevant logs and/or screenshots
job log
2022-11-15T13:00:31.191-0700 [INFO] decreasing instances: amount=1 group=aws/us-east-1/[REDACTED]
2022-11-15T13:00:31.628-0700 [INFO] instance update: group=aws/us-east-1/[REDACTED] id=i-[REDACTED] state=deleting
2022-11-15T13:00:32.344-0700 [INFO] increasing instances: amount=1 group=aws/us-east-1/[REDACTED]
2022-11-15T13:00:44.335-0700 [INFO] instance discovery: group=aws/us-east-1/[REDACTED] id=i-[REDACTED] state=creating cause=requested
2022-11-15T13:00:45.443-0700 [INFO] instance update: group=aws/us-east-1/[REDACTED] id=i-[REDACTED] state=running
2022-11-15T13:00:47.576-0700 [INFO] ready: instance=i-[REDACTED] took=2.131872917s
Checking for jobs...nothing runner=[REDACTED]
2022-11-15T13:00:48.205-0700 [INFO] decreasing instances: amount=1 group=aws/us-east-1/[REDACTED]
2022-11-15T13:00:48.619-0700 [INFO] instance update: group=aws/us-east-1/[REDACTED] id=i-[REDACTED] state=deleting
2022-11-15T13:00:49.319-0700 [INFO] increasing instances: amount=1 group=aws/us-east-1/[REDACTED]
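To quantify the thrashing in a longer log, the scale-up and scale-down messages can be tallied. The following is a minimal sketch, not part of the runner: the names countScaleEvents and scaleEvents are hypothetical, and it only assumes the log contains the "increasing instances:" and "decreasing instances:" messages shown above.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// scaleEvents tallies scale-up and scale-down messages found in a runner log.
// Hypothetical helper type for illustration only.
type scaleEvents struct {
	Increases int
	Decreases int
}

// countScaleEvents scans the log text line by line and counts the lines that
// contain the "increasing instances:" or "decreasing instances:" messages.
func countScaleEvents(logText string) scaleEvents {
	var ev scaleEvents
	sc := bufio.NewScanner(strings.NewReader(logText))
	for sc.Scan() {
		line := sc.Text()
		switch {
		case strings.Contains(line, "increasing instances:"):
			ev.Increases++
		case strings.Contains(line, "decreasing instances:"):
			ev.Decreases++
		}
	}
	return ev
}

func main() {
	// Excerpt of the log above, covering roughly 18 seconds of an idle runner.
	excerpt := `2022-11-15T13:00:31.191-0700 [INFO] decreasing instances: amount=1 group=aws/us-east-1/[REDACTED]
2022-11-15T13:00:32.344-0700 [INFO] increasing instances: amount=1 group=aws/us-east-1/[REDACTED]
Checking for jobs...nothing runner=[REDACTED]
2022-11-15T13:00:48.205-0700 [INFO] decreasing instances: amount=1 group=aws/us-east-1/[REDACTED]
2022-11-15T13:00:49.319-0700 [INFO] increasing instances: amount=1 group=aws/us-east-1/[REDACTED]`

	ev := countScaleEvents(excerpt)
	fmt.Printf("increases=%d decreases=%d\n", ev.Increases, ev.Decreases)
	// prints: increases=2 decreases=2
}
```

A steady stream of matching increases and decreases while "Checking for jobs...nothing" is reported is the signature of the problem described here.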
Environment description
config.toml contents
concurrent = 15
check_interval = 0
[session_server]
session_timeout = 1800
[[runners]]
name = "taskscaler 1"
url = "https://gitlab.com/"
id = [REDACTED]
token = "[REDACTED]"
executor = "instance"
shell = "bash"
[runners.autoscaler]
capacity_per_instance = 1
max_use_count = 1
max_instances = 20
plugin = "fleeting-plugin-aws"
[runners.autoscaler.plugin_config]
credentials_file = "[REDACTED]"
name = "[REDACTED]"
project = "[REDACTED]"
region = "us-east-1"
[runners.autoscaler.connector_config]
username = "ubuntu"
[[runners.autoscaler.policy]]
idle_count = 5
idle_time = 0
scale_factor = 0.0
scale_factor_limit = 0
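For reference, the steady state I would expect from this policy with no jobs queued can be worked out as follows. This is a simplified model based on my reading of the settings, not taskscaler's actual logic: expectedIdleInstances is a hypothetical function, and it assumes idle_count is the idle job capacity to keep available, capacity_per_instance is the number of jobs per instance, scale_factor is 0, and a max_instances of 0 or less means no cap.

```go
package main

import "fmt"

// expectedIdleInstances returns the number of instances needed to provide
// idleCount idle job slots when each instance offers capacityPerInstance
// slots, capped at maxInstances.
// Simplified, hypothetical model; it is NOT the taskscaler implementation.
func expectedIdleInstances(idleCount, capacityPerInstance, maxInstances int) int {
	if idleCount <= 0 || capacityPerInstance <= 0 {
		return 0
	}
	// Ceiling division: a partially used instance still counts as one instance.
	n := (idleCount + capacityPerInstance - 1) / capacityPerInstance
	// Assumption: maxInstances <= 0 means "no cap".
	if maxInstances > 0 && n > maxInstances {
		n = maxInstances
	}
	return n
}

func main() {
	// Values from the config above: idle_count = 5,
	// capacity_per_instance = 1, max_instances = 20.
	fmt.Println(expectedIdleInstances(5, 1, 20)) // prints 5
}
```

Under this model the group should settle at 5 instances and stay there while idle, rather than cycling as the log shows.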
Used GitLab Runner version
Possible fixes
It seems the behavior may have begun as a result of this fix.
Edited by Davis Bickford