We must create the DB entry for a runner with a JIT config included. Adding it later
via an update runs the risk of having the consolidate loop pick up the incomplete instance.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
When using JIT runners, we register the runner on GitHub before we get
a chance to spin up the instance in the provider. In such cases, we end
up with a runner in "offline" state while we're creating the actual resource
that will embody the runner. This change will give runners a chance to come
online before garm tries to clean them up.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
We need to abstract away the tools struct and not have garm-provider-common
depend on go-github just for that one struct. It makes it hard to update
go-github without updating garm-provider-common first and then all the rest
of the providers.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
This change adds a metadata endpoint that returns the list of root CA
certificates a runner must install in order to validate all relevant API
endpoints it may need to reach. This includes any GHES API that runs on a
self-signed certificate.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
* Update the garm-provider-common and go-github packages.
* Update sqlToParamsInstance to return an error when unmarshaling fails.
This change is needed to pull in the new Seal/Unseal functions in common.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
Ensure that there is a foreign key constraint between runners and jobs.
Once a runner is associated with a job, we want the job to be removed along
with the runner.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
Instead of asking for all the GARM instances from the API, and then
filtering them by `poolID`, we can ask the API to return only the
instances that are in the `poolID` pool.
Therefore, we only need to count the instances that are running and idle.
Signed-off-by: Ionut Balutoiu <ibalutoiu@cloudbasesolutions.com>
If we fail to get the tools for one pool, garm fails to start due to pool
manager startup timeout. Launch the initial tools update function as a
goroutine and return from Start(). If it fails, it will retry, and we won't
block garm from starting.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
The variable `runningIdleCount` would get incremented for instances
on every pool, instead of only for the pool we are interested in.
This change fixes this.
Also, adjust the log message emitted when an error occurs in this
timeout-exceeded scenario.
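
The per-pool counting fix can be illustrated with the sketch below. The
`instance` type, its field names, the status strings and the
`countRunningIdle` helper are assumptions made for this example; only the
`runningIdleCount` and `poolID` names come from the message above.

```go
package main

import "fmt"

// instance is a hypothetical, trimmed-down instance record.
type instance struct {
	PoolID       string
	Status       string // provider-side status, assumed to be "running" when up
	RunnerStatus string // runner-side status, assumed to be "idle" when free
}

// countRunningIdle counts running and idle instances that belong to poolID.
// Instances from any other pool are skipped before the status check.
func countRunningIdle(instances []instance, poolID string) int {
	runningIdleCount := 0
	for _, inst := range instances {
		if inst.PoolID != poolID {
			continue
		}
		if inst.Status == "running" && inst.RunnerStatus == "idle" {
			runningIdleCount++
		}
	}
	return runningIdleCount
}

func main() {
	instances := []instance{
		{PoolID: "pool-a", Status: "running", RunnerStatus: "idle"},
		{PoolID: "pool-a", Status: "running", RunnerStatus: "active"},
		{PoolID: "pool-b", Status: "running", RunnerStatus: "idle"},
	}
	fmt.Println(countRunningIdle(instances, "pool-a"))
}
```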
Signed-off-by: Ionut Balutoiu <ibalutoiu@cloudbasesolutions.com>
* General cleanup of the integration tests Golang code. Move the
`e2e.go` codebase into its own package and separate files.
* Reduce the overall log spam from the integration tests output.
* Add a final GitHub workflow step that stops the GARM server and cleans
up any orphaned GitHub resources.
* Add `TODO` to implement cleanup of the orphaned GitHub webhooks.
This is useful if uninstalling the webhooks failed.
* Add `TODO` for extra missing checks on the GitHub webhooks
install / uninstall logic.
Signed-off-by: Ionut Balutoiu <ibalutoiu@cloudbasesolutions.com>
It appears that in some cases, generating passwords by reading /dev/urandom
and piping to tr fails. Use apg instead.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>