Commit graph

54 commits

Author SHA1 Message Date
Gabriel Adrian Samfira
d09f12dfd8 Add force delete runner
This branch adds the ability to forcefully remove a runner from GARM.

When the operator wishes to manually remove a runner, the workflow is as
follows:

* Check that the runner exists in GitHub. If it does, attempt to
  remove it. An error here indicates that the runner may be processing
  a job. In this case, we don't continue and the operator gets immediate
  feedback from the API.
* Mark the runner in the database as pending_delete
* Allow the consolidate loop to reap it from the provider and remove it
  from the database.

Removing the instance from the provider is async. If the provider errs out,
GARM will keep trying to remove it in perpetuity until the provider succedes.

In situations where the provider is misconfigured, this will never happen, leaving
the instance in a permanent state of pending_delete.

A provider may fail for various reasons. Either credentials have expired, the
API endpoint has changed, the provider is misconfigured or the operator may just
have removed it from the config before cleaning up the runners. While some cases
are recoverable, some are not. We cannot have a situation in which we cannot clean
resources in garm because of a misconfiguration.

This change adds the pending_force_delete instance status. Instances marked with
this status, will be removed from GARM even if the provider reports an error.

The GARM cli has been modified to give new meaning to the --force-remove-runner
option. This option in the CLI is no longer mandatory. Instead, setting it will mark
the runner with the new pending_force_delete status. Omitting it will mark the runner
with the old status of pending_delete.

Fixes: #160

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-10-12 06:15:36 +00:00
Gabriel Adrian Samfira
e53c271337 Add metadata URLs
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-09-24 13:50:21 +00:00
Gabriel Adrian Samfira
1268507ce6 Add jit config routes
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-09-24 13:50:20 +00:00
Gabriel Adrian Samfira
a26907fb91 Add root CA bundle metadata URL
Thic change adds a metadata endpoint that returns a list of root CA
certificates a runner must install in order to be able to validate all
relevant API endpoints it may require. This includes any GHES API that
runs on a self signed certificate.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-08-28 09:44:18 +00:00
Gabriel Adrian Samfira
1c0ff85a0d
Add flag to toggle webhook management
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-08-22 09:39:02 +03:00
Gabriel Adrian Samfira
c00048e128
Add optional keepWebhook flag when removing an entity
The user can opt to not delete the webhook (if installed) when removing
the entity from garm. Garm will only ever try to remove a webhook that
exactly matches the URL that is composed of the base webhook URL configured
in the config.toml file and the unique controller ID that is generated
when the controller is first installed. It should be safe to remove the
webhook when the entity is removed.

Of course, this behavior can be disabled.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-08-22 09:39:02 +03:00
Gabriel Adrian Samfira
bb6ee9c668
Add keepWebhook flag when deleting entities
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-08-22 09:39:01 +03:00
Gabriel Adrian Samfira
779afe980e
Add webhook show, return info and some fixes
* Added a webhook show command. This gives us info about the webhook and
    if it is installed.
  * Return webhook info when installing the webhook
  * Small typo fixes.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-08-22 09:39:01 +03:00
Gabriel Adrian Samfira
dbd41f518d
Add CLI webhook enablement
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-08-22 09:39:01 +03:00
Gabriel Adrian Samfira
7ce3f007b0
Add functions to (un)install webhooks for orgs and repos
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-08-22 09:39:01 +03:00
Gabriel Adrian Samfira
f2796f1d5a
Add admin required middleware and webhook endpoint
* Add a new middleware that tests for admin access
  * Add a new controller ID suffixed webhook endpoint. This will be used
    to accept webhook events on a webhook URL that is suffixed with our own
    controller ID.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-08-22 09:39:01 +03:00
Gabriel Adrian Samfira
99539edde7 Add controller info
This change adds a new controller info endpoint and associated client and
CLI command. The controller info endpoint returns information about controller
status and configuration.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-08-12 22:47:50 +00:00
Gabriel Adrian Samfira
e775c9c11d Move most of util package
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-22 22:39:17 +00:00
Gabriel Adrian Samfira
ed651bb7d0 Move errors to external package
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-22 22:26:47 +00:00
Mihaela Balutoiu
802c7527fe Add more swagger annotations for apiserver
Signed-off-by: Mihaela Balutoiu <mbalutoiu@cloudbasesolutions.com>
2023-07-19 18:06:13 +03:00
Mihaela Balutoiu
9ba3ae487b Update the swagger annotations for the apiserver
Signed-off-by: Mihaela Balutoiu <mbalutoiu@cloudbasesolutions.com>
2023-07-18 19:27:21 +03:00
Mihaela Balutoiu
f13d19b4ec Add more functionality to swagger client library
Signed-off-by: Mihaela Balutoiu <mbalutoiu@cloudbasesolutions.com>
2023-07-18 15:19:53 +03:00
Mihaela Balutoiu
98e415bf11 Add more swagger annotations to apiserver
Signed-off-by: Mihaela Balutoiu <mbalutoiu@cloudbasesolutions.com>
2023-07-17 12:00:51 +03:00
Ionut Balutoiu
ca878507b5 Add more functionality to swagger client library
* Update `go:generate` annotations to use stable swagger tag instead
  of relying on the `swagger` CLI tool already installed.
* Rename the following route IDs from repositories:
    * `Create` -> `CreateRepo`
    * `List` -> `ListRepos`
    * `Get` -> `GetRepo`
    * `Delete` -> `DeleteRepo`
  The swagger CLI spec validation will fail if the route IDs are not unique.
* Fully implement the all the API calls to:
    * `/api/v1/repositories`
    * `/api/v1/instances`

Signed-off-by: Ionut Balutoiu <ibalutoiu@cloudbasesolutions.com>
2023-07-05 13:47:58 +03:00
Gabriel Adrian Samfira
86ed06d6ff Rename UpdateRepositoryParams to UpdateEntityParams
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-05 00:00:24 +00:00
Gabriel Adrian Samfira
1287a93cf2 Add job list to API and CLI
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-03 07:46:20 +00:00
Ionut Balutoiu
d122f293cf Add swagger annotations to apiserver
Signed-off-by: Mihaela Balutoiu <mbalutoiu@cloudbasesolutions.com>
2023-06-30 19:03:57 +03:00
Gabriel Adrian Samfira
829db87f15
Rename module
This change renames the module from "garm" to "github.com/cloudbase/garm".

This will make it easier to consume public functions defined in garm, by
external applications, without having to resort to replace.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-03-12 16:01:49 +02:00
Gabriel Adrian Samfira
8d17498ab8
Remove caches, retry fetching the hostname
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-01-29 16:03:20 +02:00
Gabriel Adrian Samfira
8f56f51598 Move some code around
Move the metrics code into its own package.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-01-27 14:57:25 +00:00
Michael Kuhnt
6cd18ff1fd
improve access to controller info 2023-01-26 22:13:18 +01:00
Michael Kuhnt
6a032bfaa2
metrics: fix review findings 2023-01-26 15:46:27 +01:00
Michael Kuhnt
ee659f509f
feat: add prometheus metrics & endpoint 2023-01-26 14:15:16 +01:00
Gabriel Adrian Samfira
e93b6d73e5
Sanitize log entries
While most of these log entries come from either github or our own
database, it's still a good idea to sanitize them.
2023-01-23 18:01:46 +02:00
Gabriel Adrian Samfira
687c48127b
Fix some linting issues and add verify-vendor
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-01-20 21:55:52 +02:00
Gabriel Adrian Samfira
a078645ab2
Add token endpoint
This change adds a github registration endpoint that instances can use
to fetch a github registration token.

This change also invalidates disables access to an instance to the token
and status updates endpoints once the instance transitions from
"pending" or "installing" to any other state.
2022-12-01 18:00:22 +02:00
Gabriel Adrian Samfira
3e3b91ee59
Add enterprise support to garm-cli
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2022-10-21 17:14:03 +03:00
Gabriel Adrian Samfira
296333412a
Add enterprise support
This change adds enterprise support throughout garm.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2022-10-21 17:14:03 +03:00
Gabriel Adrian Samfira
a7f151e2d2
Add log streamer
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2022-10-21 11:13:42 +03:00
Michael Kuhnt
5bfecbaa30 feat: webhook does not return error on 'instance not found'.
This case is probably caused by a webhook event that was meant for
another runner controller / manager. No need to report this as
error, we can simply ignore this and avoid noise in the logs.
2022-10-07 11:18:25 +02:00
Gabriel Adrian Samfira
afb1d31394 Slight cleanup
* added interface for the github client. This will help mocking it
out for testing.
  * removed some unused code
  * moved some code around

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2022-07-07 16:48:00 +00:00
Gabriel Adrian Samfira
15a1308441 Add timeout functionality for pool runner bootstrap
Pools can now define a bootstrap timeout for runners. The timeout can
be defined per pool and indicates the amount of time after which a runner
is considered defunct and removed.

If a runner doesn't join github in the configured amount of time, and it
receives no updates indicating that it is installing the runner via instance
status updates, it is considered defunct.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2022-06-29 23:44:03 +00:00
Gabriel Adrian Samfira
5390efbaab Add manual runner removal
Runners can now be manually removed using the CLI. Some restrictions apply:

  * A runner must be idle in github. Github will not allow us to remove a runner
that is running a workflow.
  * The runner status must be "running"

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2022-06-29 16:23:01 +00:00
Gabriel Adrian Samfira
dc04bca95c Retry failed runners
* retry adding runners for up to 5 times if they fail.
  * various fixes
2022-05-10 12:28:39 +00:00
Gabriel Adrian Samfira
5e0a64f909 Add license headers 2022-05-05 13:25:50 +00:00
Gabriel Adrian Samfira
d9c65872e8 Added more CLI commands and API endpoints 2022-05-05 13:07:06 +00:00
Gabriel Adrian Samfira
f130798f41 Added org pool command
* added new command
  * fixed a bunch of bugs in orgs
2022-05-04 21:57:08 +00:00
Gabriel Adrian Samfira
095b43ffb4 Add organizations 2022-05-04 16:27:24 +00:00
Gabriel Adrian Samfira
1dda4a835c Rename project to garm
Project renamed to garm (Github Actions Runner Manager)
2022-05-04 11:44:10 +00:00
Gabriel Adrian Samfira
1bb7f51f56 Format error messages 2022-05-03 20:49:39 +00:00
Gabriel Adrian Samfira
2bd128af13 Runners now send status messages 2022-05-03 19:49:14 +00:00
Gabriel Adrian Samfira
8ceafff09b Add more CLI commands 2022-05-03 12:40:59 +00:00
Gabriel Adrian Samfira
475d424f32 Add a basic CLI 2022-05-02 17:55:29 +00:00
Gabriel Adrian Samfira
7ec937a138 Main webhook cases implemented
Queued, completed and in_progress workflow_job messages are now
acted upon.
2022-04-29 23:43:37 +00:00
Gabriel Adrian Samfira
a78ad539fe Auto create runners for pools 2022-04-29 16:08:31 +00:00