Add functionality to rotate the log file when a `SIGHUP` signal is received.
Also add a doc with a few details about logging.
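A minimal sketch of the reopen-on-SIGHUP pattern, for illustration only. The
names reopenWriter, watchSIGHUP and rotateDemo are hypothetical and are not
taken from this change; the sketch assumes an external tool (for example
logrotate) renames the old file before sending the signal.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"path/filepath"
	"sync"
	"syscall"
)

// reopenWriter is a hypothetical io.Writer that can reopen its file on demand.
type reopenWriter struct {
	mu   sync.Mutex
	path string
	f    *os.File
}

func newReopenWriter(path string) (*reopenWriter, error) {
	w := &reopenWriter{path: path}
	if err := w.Reopen(); err != nil {
		return nil, err
	}
	return w, nil
}

// Reopen opens the configured path again and then closes the previous handle.
// If opening fails, the writer keeps using the old handle.
func (w *reopenWriter) Reopen() error {
	w.mu.Lock()
	defer w.mu.Unlock()
	f, err := os.OpenFile(w.path, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
	if err != nil {
		return err
	}
	old := w.f
	w.f = f
	if old != nil {
		return old.Close()
	}
	return nil
}

func (w *reopenWriter) Write(p []byte) (int, error) {
	w.mu.Lock()
	defer w.mu.Unlock()
	return w.f.Write(p)
}

func (w *reopenWriter) Close() error {
	w.mu.Lock()
	defer w.mu.Unlock()
	return w.f.Close()
}

// watchSIGHUP reopens the writer every time the process receives SIGHUP.
func watchSIGHUP(w *reopenWriter) {
	ch := make(chan os.Signal, 1)
	signal.Notify(ch, syscall.SIGHUP)
	go func() {
		for range ch {
			if err := w.Reopen(); err != nil {
				fmt.Fprintln(os.Stderr, "log rotation failed:", err)
			}
		}
	}()
}

// rotateDemo simulates one rotation without sending a real signal: it writes,
// renames the file as a rotation tool would, reopens, and writes again.
func rotateDemo() (string, string, error) {
	dir, err := os.MkdirTemp("", "logrotate-demo")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "app.log")
	w, err := newReopenWriter(path)
	if err != nil {
		return "", "", err
	}
	defer w.Close()
	fmt.Fprintln(w, "before")
	if err := os.Rename(path, path+".1"); err != nil {
		return "", "", err
	}
	if err := w.Reopen(); err != nil {
		return "", "", err
	}
	fmt.Fprintln(w, "after")
	rotated, err := os.ReadFile(path + ".1")
	if err != nil {
		return "", "", err
	}
	current, err := os.ReadFile(path)
	if err != nil {
		return "", "", err
	}
	return string(rotated), string(current), nil
}

func main() {
	rotated, current, err := rotateDemo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("rotated file: %q, current file: %q\n", rotated, current)
}
```

A long-running service would call watchSIGHUP once at startup and pass the
writer to its logger; the demo calls Reopen directly so the result is
deterministic.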
Signed-off-by: Ionut Balutoiu <ibalutoiu@cloudbasesolutions.com>
Lock operations per instance name. This should prevent multiple goroutines
from trying to update the same instance at once when operations are slow.
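One way to picture per-instance locking is a map of mutexes keyed by instance
name. This is a hedged sketch; the keyedMutex type and its methods are
hypothetical and may differ from what this change actually does.

```go
package main

import (
	"fmt"
	"sync"
)

// keyedMutex hands out one mutex per key (here: an instance name).
// Entries are never removed in this sketch, so the map only grows.
type keyedMutex struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

func newKeyedMutex() *keyedMutex {
	return &keyedMutex{locks: map[string]*sync.Mutex{}}
}

// get returns the mutex for name, creating it on first use.
func (k *keyedMutex) get(name string) *sync.Mutex {
	k.mu.Lock()
	defer k.mu.Unlock()
	m, ok := k.locks[name]
	if !ok {
		m = &sync.Mutex{}
		k.locks[name] = m
	}
	return m
}

// TryLock reports whether the lock for name was acquired without blocking.
func (k *keyedMutex) TryLock(name string) bool {
	return k.get(name).TryLock()
}

// Unlock releases the lock for name. It must only be called by the holder.
func (k *keyedMutex) Unlock(name string) {
	k.get(name).Unlock()
}

func main() {
	locks := newKeyedMutex()
	fmt.Println(locks.TryLock("runner-1")) // first caller acquires the lock
	fmt.Println(locks.TryLock("runner-1")) // same instance: already held
	fmt.Println(locks.TryLock("runner-2")) // different instance: unaffected
	locks.Unlock("runner-1")
	fmt.Println(locks.TryLock("runner-1")) // free again after Unlock
}
```

A goroutine that fails TryLock can skip the instance and pick it up on the next
loop iteration instead of queueing behind a slow operation.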
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
This commit:
* adds more granular loops for various operations
* updates go-github to the latest version
* skips fetching runner info for canceled or skipped jobs
* makes loops use WaitGroups to signal exit
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
This commit:
* swaps WaitGroups with errgroups
* wraps errgroup.Wait() in a select to prevent situations in which an
operation takes a long time and prevents garm from being restarted.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
Add a build target to the Makefile that allows building a non-static
version of both garm and garm-cli. This is useful when you just need to
build the binaries locally and make sure the version is properly set.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
Convert the templates.go file to contain tabs instead of spaces
consistently.
Add some status messages when installing.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
* When a runner fails to set up the GitHub agent, we reap it after the
pool timeout is reached.
* Add a retry in the userdata when configuring the runner agent
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
Providers may return only 3 possible statuses:
* InstanceRunning
* InstanceError
* InstanceStopped
Every other status is reserved for the controller to set. Provider
responses will be split from the instance response in a future commit.
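A controller-side check for this rule could look roughly like the sketch
below. The three constant names come from the list above; the InstanceStatus
type, the string values, the isProviderStatus helper and the "pending_create"
example status are assumptions made for illustration.

```go
package main

import "fmt"

// InstanceStatus is a hypothetical string type for instance states.
type InstanceStatus string

// The string values are assumed; only the constant names come from the text.
const (
	InstanceRunning InstanceStatus = "running"
	InstanceError   InstanceStatus = "error"
	InstanceStopped InstanceStatus = "stopped"
)

// isProviderStatus reports whether a provider is allowed to return status.
// Any other value is treated as reserved for the controller.
func isProviderStatus(status InstanceStatus) bool {
	switch status {
	case InstanceRunning, InstanceError, InstanceStopped:
		return true
	default:
		return false
	}
}

func main() {
	// "pending_create" stands in for a controller-only status.
	for _, s := range []InstanceStatus{InstanceRunning, "pending_create"} {
		fmt.Printf("%s allowed from provider: %v\n", s, isProviderStatus(s))
	}
}
```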
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
Use an errgroup to wait for all instance deletion operations before
returning. Log any failure.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
* cleanupOrphanedGithubRunners() now uses errgroup to parallelize and
report errors when removing runners from the provider.
* retryFailedInstancesForOnePool() now uses errgroup
* Removed some setPoolRunningState calls; that state should be set in the
loop where those errors eventually bubble up and can be handled.
* Added a number of timeouts to the LXD provider's delete and list
instance operations. This provider should be converted into an external
provider.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>