Github will remove inactive scale sets after 7 days. This change
ensures the scale set exists in github before spinning up the listener.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
Add swagger annotations to models to allow generating a full swagger
definition. This will help generate clients in other languages if needed.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
This change renames a lot of variables, types and functions to be more
generic. The goal is to allow GARM to add more forges in the future.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
The JSON that gets returned by the API is filled with empty values
which serve no purpose. Adding omitempty will skip empty values.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
GARM has a backoff interval when consuming queued jobs. This backoff
is intended to allow any potential idle runners to pick up a job before
GARM attempts to spin up a new one. This change allows users to set a
custom backoff interval or disable it altogether by setting it to 0.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
This change moves the callback_url, metadata_url and webhooks_url from
the config to the database. The goal is to move as much as possible from
the config to the DB, in preparation for a potential refactor that will
allow GARM to scale out. This would allow multiple nodes to share a single
source of truth.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
Add database models that deal with github credentials. This change
adds models for github endpoints (github.com, GHES, etc). This change
also adds code to migrate config credntials to the DB.
Tests need to be fixed and new tests need to be written. This will come
in a later commit.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
This change adds the ability to specify the pool balancing strategy to
use when processing queued jobs. Before this change, GARM would round-robin
through all pools that matched the set of tags requested by queued jobs.
When round-robin (default) is used for an entity (repo, org or enterprise)
and you have 2 pools defined for that entity with a common set of tags that
match 10 jobs (for example), then those jobs would trigger the creation of
a new runner in each of the two pools in turn. Job 1 would go to pool 1,
job 2 would go to pool 2, job 3 to pool 1, job 4 to pool 2 and so on.
When "stack" is used, those same 10 jobs would trigger the creation of a
new runner in the pool with the highest priority, every time.
In both cases, if a pool is full, the next one would be tried automatically.
For the stack case, this would mean that if pool 2 had a priority of 10 and
pool 1 would have a priority of 5, pool 2 would be saturated first, then
pool 1.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
Allow runners to update their own system information. Runners can now send
back os_name, os_version and agent_id back as part of a POST to
CALLBACK_URL/system-info/.
The goal is to get better info in regard to the actual OS that's running
and to move the agent_id from the status updates to the system-info callback.
The status updates should be used only to send back info about the status of
the installation process.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
We must create the DB entry for a runner with a JIT config included. Adding it later
via an update runs the risk of having the consolidate loop pick up the incomplete instance.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
Break the lock on a job if it's still queued and the runner that it
triggered was assigned to another job. This may cause leftover runners
to be created, but we scale those down in ~3 minutes.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
The params package should not depend on config. The params packages
should be consumable by external applications that wish to interact with
garm, and it makes no sense to pull in the config package just for some
constants. As such, the following changes have been made:
* Moved some types from config to params
* Moved defaults in a new leaf package called appdefaults
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
This change renames the module from "garm" to "github.com/cloudbase/garm".
This will make it easier to consume public functions defined in garm, by
external applications, without having to resort to replace.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
Extra specs is an opaque valid JSON that can be set on a pool and which
will be passed along to the provider as part of instance bootstrap params.
This field is meant to allow operators to send extra configuration values
to external or built-in providers. The extra specs is not interpreted or
useful in any way to garm itself, but it may be useful to the provider
which interacts with the IaaS.
The extra specs are not meant to be used for secrets. Adding sensitive
information to this field is highly discouraged. This field is meant as a
means to add fine tuning knobs to the providers, on a per pool basis.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
There are several fields that are common among some of the data
structures in garm. The RunnerPrefix is just one of them. Perhaps we
should move some of the rest in a common type and embed that into the
types that share those fields.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
Define a metadata subrouter and move the token endpoint there. We may
end up needing multiple endpoints for various purposes in the future.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
This change adds a github registration endpoint that instances can use
to fetch a github registration token.
This change also invalidates disables access to an instance to the token
and status updates endpoints once the instance transitions from
"pending" or "installing" to any other state.
In order to allow mocking for some of the `runner` functions, we created a
separate interface (called `PoolManagerController`) with `Create`, `Get`,
`Delete` operations for the `organization` / `repository` pool managers.
Furthermore, a new runner struct (`poolManagerCtrl`) implements this new
interface. The existing code is refactored to use the `poolManagerCtrl`
whenever the pool managers for `org` / `repo` are handled.
This allows more unit testing for the runner functions since `poolManagerCtrl`
field can be mocked now.
Besides this, there are some typos fixed as well.
Pools can now define a bootstrap timeout for runners. The timeout can
be defined per pool and indicates the amount of time after which a runner
is considered defunct and removed.
If a runner doesn't join github in the configured amount of time, and it
receives no updates indicating that it is installing the runner via instance
status updates, it is considered defunct.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
Runners can now be manually removed using the CLI. Some restrictions apply:
* A runner must be idle in github. Github will not allow us to remove a runner
that is running a workflow.
* The runner status must be "running"
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>