This change adds a new "generation" field to pools, scalesets and
runners. The generation field is inherited by runners from scale sets
or pools at the time of creation.
The generation field on scalesets and pools is incremented when the
pool or scale set is updated in a way that might influence how runners
are created (flavor, image, specs, runner groups, etc).
Using this new field, we can determine if existing runners have diverged
from the settings of the pool/scale set that spawned them.
In the CLI we now have a new set of commands available for both
pools and scalesets that lists runners, with an optional --outdated
flag and a new "rotate" flag that removes all idle runners. Optionally
the --outdated flag can be passed to the rotate command to only remove
outdated runners.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
This change adds the ability to manage garm-agent tools downloads. Users
can:
* Set an upstream releases page (github releases api)
* Enable sync from upstream. In this case, GARM will automatically download
garm-agent tools from the releases page and save them in the internal
object store
* Manually upload tools. Manually uploaded tools for an OS/arch combination
will never be overwritten by auto-sync. Usrs will need to delete manually
uploaded tools to enable sync for that os/arch release.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
This change adds a new "agent mode" to GARM. The agent enables GARM to
set up a persistent websocket connection between the garm server and the
runners it spawns. The goal is to be able to easier keep track of state,
even without subsequent webhooks from the forge.
The Agent will report via websockets when the runner is actually online,
when it started a job and when it finished a job.
Additionally, the agent allows us to enable optional remote shell between
the user and any runner that is spun up using agent mode. The remote shell
is multiplexed over the same persistent websocket connection the agent
sets up with the server (the agent never listens on a port).
Enablement has also been done in the web UI for this functionality.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
This change adds the API endpoints, the CLI commands and the web UI elements
needed to manage objects in GARMs internal storage.
This storage system is meant to be used to distribute the garm-agent and as a
single source of truth for provider binaries, when we will add the ability for GARM
to scale out.
Potentially, we can also use this in air gapped systems to distribute the runner binaries
for forges that don't have their own internal storage system (like GHES).
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
* Add template api endpoints
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
* Added template bypass
Pools and scale sets will automatically migrate to the new template
system for runner install scripts. If a pool or a scale set cannot be
migrate, it is left alone. It is expected that users set a runner install
template manually for scenarios we don't yet have a template for (windows
on gitea for example).
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
* Integrate templates with pool create/update
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
* Add webapp integration with templates
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
* Add unit tests
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
* Populate all relevant context fields
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
* Update dependencies
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
* Fix lint
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
* Validate uint
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
* Add CLI template management
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
* Some editor improvements and bugfixes
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
* Fix scale set return values post create
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
* Fix template websocket events filter
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
---------
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
This change adds a single page application front-end to GARM. It uses
a generated REST client, built from the swagger definitions, the websocket
interface for live updates of entities and eager loading of everything
except runners, as users may have many runners and we don't want to load
hundreds of runners in memory.
Proper pagination should be implemented in the API, in future commits,
to avoid loading lots of elements for no reason.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
This change adds the ability to filter the list of entities returned
by the API by entity owner, name or endpoint, depending on the entity
type.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
* We were passing the wrong type to GORM for events
* We now expose entity events in the API and CLI
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
This change renames a lot of variables, types and functions to be more
generic. The goal is to allow GARM to add more forges in the future.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
This change moves the callback_url, metadata_url and webhooks_url from
the config to the database. The goal is to move as much as possible from
the config to the DB, in preparation for a potential refactor that will
allow GARM to scale out. This would allow multiple nodes to share a single
source of truth.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
Add database models that deal with github credentials. This change
adds models for github endpoints (github.com, GHES, etc). This change
also adds code to migrate config credntials to the DB.
Tests need to be fixed and new tests need to be written. This will come
in a later commit.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
Change instance DB functions from querying by ID to querying by name. Names
are unique in GARM, so we might as well use the name instead of the ID and
spare ourselves the extra query to get the ID when a qorkflow comes in.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
We'll use GithubEntityType throughout the codebase to determine the
type of operation that is about to take place, so this won't belimited
to determining only pool type. We'll also use this to dedupe the label
scope as well.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
This change adds the ability to specify the pool balancing strategy to
use when processing queued jobs. Before this change, GARM would round-robin
through all pools that matched the set of tags requested by queued jobs.
When round-robin (default) is used for an entity (repo, org or enterprise)
and you have 2 pools defined for that entity with a common set of tags that
match 10 jobs (for example), then those jobs would trigger the creation of
a new runner in each of the two pools in turn. Job 1 would go to pool 1,
job 2 would go to pool 2, job 3 to pool 1, job 4 to pool 2 and so on.
When "stack" is used, those same 10 jobs would trigger the creation of a
new runner in the pool with the highest priority, every time.
In both cases, if a pool is full, the next one would be tried automatically.
For the stack case, this would mean that if pool 2 had a priority of 10 and
pool 1 would have a priority of 5, pool 2 would be saturated first, then
pool 1.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
This change allows users to bypass GitHub Unauthorized errors when removing
github runners. This means that removing runners will now be possible even
if the pool manager is stopped.
There is a new flag added to the runner rm command and to the API that
tells GARM to bypass pool being stopped and any 401 error returned by
GitHub.
This means you will be able to remove the runners from garm and your
provider, but will mean that the runner will still exist in github as
"offline" if the credentials are not updated or the runner manually removed.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
This branch adds the ability to forcefully remove a runner from GARM.
When the operator wishes to manually remove a runner, the workflow is as
follows:
* Check that the runner exists in GitHub. If it does, attempt to
remove it. An error here indicates that the runner may be processing
a job. In this case, we don't continue and the operator gets immediate
feedback from the API.
* Mark the runner in the database as pending_delete
* Allow the consolidate loop to reap it from the provider and remove it
from the database.
Removing the instance from the provider is async. If the provider errs out,
GARM will keep trying to remove it in perpetuity until the provider succedes.
In situations where the provider is misconfigured, this will never happen, leaving
the instance in a permanent state of pending_delete.
A provider may fail for various reasons. Either credentials have expired, the
API endpoint has changed, the provider is misconfigured or the operator may just
have removed it from the config before cleaning up the runners. While some cases
are recoverable, some are not. We cannot have a situation in which we cannot clean
resources in garm because of a misconfiguration.
This change adds the pending_force_delete instance status. Instances marked with
this status, will be removed from GARM even if the provider reports an error.
The GARM cli has been modified to give new meaning to the --force-remove-runner
option. This option in the CLI is no longer mandatory. Instead, setting it will mark
the runner with the new pending_force_delete status. Omitting it will mark the runner
with the old status of pending_delete.
Fixes: #160
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
This change renames the module from "garm" to "github.com/cloudbase/garm".
This will make it easier to consume public functions defined in garm, by
external applications, without having to resort to replace.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>