This branch adds the ability to forcefully remove a runner from GARM.
When the operator wishes to manually remove a runner, the workflow is as
follows:
* Check that the runner exists in GitHub. If it does, attempt to
remove it. An error here indicates that the runner may be processing
a job. In this case, we don't continue and the operator gets immediate
feedback from the API.
* Mark the runner in the database as pending_delete
* Allow the consolidate loop to reap it from the provider and remove it
from the database.
Removing the instance from the provider is async. If the provider errs out,
GARM will keep trying to remove it in perpetuity until the provider succedes.
In situations where the provider is misconfigured, this will never happen, leaving
the instance in a permanent state of pending_delete.
A provider may fail for various reasons. Either credentials have expired, the
API endpoint has changed, the provider is misconfigured or the operator may just
have removed it from the config before cleaning up the runners. While some cases
are recoverable, some are not. We cannot have a situation in which we cannot clean
resources in garm because of a misconfiguration.
This change adds the pending_force_delete instance status. Instances marked with
this status, will be removed from GARM even if the provider reports an error.
The GARM cli has been modified to give new meaning to the --force-remove-runner
option. This option in the CLI is no longer mandatory. Instead, setting it will mark
the runner with the new pending_force_delete status. Omitting it will mark the runner
with the old status of pending_delete.
Fixes: #160
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
The user can opt to not delete the webhook (if installed) when removing
the entity from garm. Garm will only ever try to remove a webhook that
exactly matches the URL that is composed of the base webhook URL configured
in the config.toml file and the unique controller ID that is generated
when the controller is first installed. It should be safe to remove the
webhook when the entity is removed.
Of course, this behavior can be disabled.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
* When removing a repo or org, we uninstall the webhook as well.
* Upgrade cobra command and mark "webhook-secret" and "random-webhook-secret"
as MarkFlagsOneRequired()
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
* Added a webhook show command. This gives us info about the webhook and
if it is installed.
* Return webhook info when installing the webhook
* Small typo fixes.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
This change adds a new controller info endpoint and associated client and
CLI command. The controller info endpoint returns information about controller
status and configuration.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
* enable foreign key constraints on sqlite
* on delete cascade for addresses and status messages
* add debug server config option
* fix rr allocation
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
Break the lock on a job if it's still queued and the runner that it
triggered was assigned to another job. This may cause leftover runners
to be created, but we scale those down in ~3 minutes.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
Add functionality to rotate the log file when `SIGHUP` signal is received.
Also, a doc is added with few details about logging.
Signed-off-by: Ionut Balutoiu <ibalutoiu@cloudbasesolutions.com>
The params package should not depend on config. The params packages
should be consumable by external applications that wish to interact with
garm, and it makes no sense to pull in the config package just for some
constants. As such, the following changes have been made:
* Moved some types from config to params
* Moved defaults in a new leaf package called appdefaults
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
This change renames the module from "garm" to "github.com/cloudbase/garm".
This will make it easier to consume public functions defined in garm, by
external applications, without having to resort to replace.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
The TLS listener was not being set up correctly. The TLSConfig was changed
to include only cert and key. The cert now needs to be a full chain bundle
including intermediary CA certificates. The ca_cert config option was removed.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
Extra specs is an opaque valid JSON that can be set on a pool and which
will be passed along to the provider as part of instance bootstrap params.
This field is meant to allow operators to send extra configuration values
to external or built-in providers. The extra specs is not interpreted or
useful in any way to garm itself, but it may be useful to the provider
which interacts with the IaaS.
The extra specs are not meant to be used for secrets. Adding sensitive
information to this field is highly discouraged. This field is meant as a
means to add fine tuning knobs to the providers, on a per pool basis.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
There are several fields that are common among some of the data
structures in garm. The RunnerPrefix is just one of them. Perhaps we
should move some of the rest in a common type and embed that into the
types that share those fields.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
* Wait for http server graceful shutdown and for pool managers to
properly exit.
* Fix potential nil pointer dereference when checking response
code from github API.
Garm no longer fails on startup if a pool manager cannot be started. It
will attempt to start the pool manager in the background. If it fails
due to an unauthorized error, it will sleep for 3 hours. It is unlikely
it will work a second time if credentials are not updated in the config
and garm is restarted, so no point in getting rate limited.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
The GitHub credentials section now allows setting some API endpoints
that point the github client and the runner setup script to the propper
URLs. This allows us to use garm with an on-prem github enterprise server.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>