63 commits

Author SHA1 Message Date
Gabriel Adrian Samfira
0dd4f38691 Update go-github
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-12-18 16:20:44 +00:00
Gabriel Adrian Samfira
459906d97e Prevent abusing the GH API
On large deployments with many jobs, we cannot check each job that
we recorded in the DB against the GH API.

Before this change, if a job was updated more than 10 minutes ago,
garm would check against the GH API if that job still existed. While
this approach allowed us to maintain a consistent view over which jobs
still exist and which are stale, it had the potential of spamming the
GH API, leading to rate limiting.

This change uses the scale-down loop as an indicator for job staleness.

If a job remains in queued state in our DB, but has disappeared from GH
or was serviced by another runner and we never got the hook (garm was down
or GH had an issue - happened in the past), then garm will spin up a new
runner for it. If that runner or any other runner is scaled down, we check
if we have jobs in the queue that should have matched that runner. If we do,
there is a high chance that the job no longer exists in GH and we can remove
the job from the queue.

Of course, there is a chance that GH is having issues and the job is never
pushed to the runner, but we can't really account for everything. In this case
I'd rather avoid rate limiting ourselves.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-12-15 22:41:50 +00:00
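
The staleness heuristic described in commit 459906d97e can be sketched roughly as follows. This is a minimal illustration in Go, not garm's actual code: the `Job` type and the `matches` and `staleJobs` helpers are hypothetical names, and "matching" is assumed to mean that the scaled-down runner offers every label the job requests.

```go
package main

import "fmt"

// Job is a hypothetical, simplified view of a job recorded in the DB.
type Job struct {
	ID     int64
	Status string   // e.g. "queued" or "in_progress"
	Labels []string // labels requested by the job
}

// matches reports whether the runner offers every label the job requests.
// This matching rule is an assumption made for the sketch.
func matches(job Job, runnerLabels []string) bool {
	offered := make(map[string]struct{}, len(runnerLabels))
	for _, l := range runnerLabels {
		offered[l] = struct{}{}
	}
	for _, l := range job.Labels {
		if _, ok := offered[l]; !ok {
			return false
		}
	}
	return true
}

// staleJobs returns the queued jobs that the scaled-down runner could have
// serviced. Such jobs most likely no longer exist in GH.
func staleJobs(queue []Job, runnerLabels []string) []Job {
	var stale []Job
	for _, j := range queue {
		if j.Status == "queued" && matches(j, runnerLabels) {
			stale = append(stale, j)
		}
	}
	return stale
}

func main() {
	queue := []Job{
		{ID: 1, Status: "queued", Labels: []string{"linux", "x64"}},
		{ID: 2, Status: "queued", Labels: []string{"windows"}},
		{ID: 3, Status: "in_progress", Labels: []string{"linux"}},
	}
	// Called when a runner with these labels is scaled down.
	for _, j := range staleJobs(queue, []string{"linux", "x64", "self-hosted"}) {
		fmt.Println("removing stale job", j.ID)
	}
}
```

No GH API call is needed to reach this decision, which is the point of the change: the scale-down event itself serves as the signal.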
Gabriel Adrian Samfira
034cc47185 Add jitconfig model field
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-09-24 13:48:09 +00:00
Gabriel Adrian Samfira
fc77a4b735 Update go-github and garm-provider-common
We need to abstract away the tools struct and not have garm-provider-common
depend on go-github just for that one struct. It makes it hard to update
go-github without updating garm-provider-common first and then all the rest
of the providers.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-09-24 07:56:56 +00:00
Gabriel Adrian Samfira
a26907fb91 Add root CA bundle metadata URL
This change adds a metadata endpoint that returns a list of root CA
certificates a runner must install in order to be able to validate all
relevant API endpoints it may require. This includes any GHES API that
runs on a self signed certificate.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-08-28 09:44:18 +00:00
Gabriel Adrian Samfira
d700b790ac Update garm-provider-common and go-github
* Updates the garm-provider-common and go-github packages.
* Update sqlToParamsInstance to return an error when unmarshaling

This change is needed to pull in the new Seal/Unseal functions in common.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-08-28 08:13:44 +00:00
Gabriel Adrian Samfira
779afe980e
Add webhook show, return info and some fixes
* Added a webhook show command. This gives us info about the webhook and
  if it is installed.
* Return webhook info when installing the webhook
* Small typo fixes.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-08-22 09:39:01 +03:00
Gabriel Adrian Samfira
7ce3f007b0
Add functions to (un)install webhooks for orgs and repos
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-08-22 09:39:01 +03:00
Gabriel Adrian Samfira
f2796f1d5a
Add admin required middleware and webhook endpoint
* Add a new middleware that tests for admin access
* Add a new controller ID suffixed webhook endpoint. This will be used
  to accept webhook events on a webhook URL that is suffixed with our own
  controller ID.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-08-22 09:39:01 +03:00
Gabriel Adrian Samfira
99539edde7 Add controller info
This change adds a new controller info endpoint and associated client and
CLI command. The controller info endpoint returns information about controller
status and configuration.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-08-12 22:47:50 +00:00
Gabriel Adrian Samfira
da13cec2de Move code to external package
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-21 15:34:18 +00:00
Mihaela Balutoiu
802c7527fe Add more swagger annotations for apiserver
Signed-off-by: Mihaela Balutoiu <mbalutoiu@cloudbasesolutions.com>
2023-07-19 18:06:13 +03:00
Mihaela Balutoiu
f13d19b4ec Add more functionality to swagger client library
Signed-off-by: Mihaela Balutoiu <mbalutoiu@cloudbasesolutions.com>
2023-07-18 15:19:53 +03:00
Mihaela Balutoiu
98e415bf11 Add more swagger annotations to apiserver
Signed-off-by: Mihaela Balutoiu <mbalutoiu@cloudbasesolutions.com>
2023-07-17 12:00:51 +03:00
Ionut Balutoiu
ca878507b5 Add more functionality to swagger client library
* Update `go:generate` annotations to use stable swagger tag instead
  of relying on the `swagger` CLI tool already installed.
* Rename the following route IDs from repositories:
    * `Create` -> `CreateRepo`
    * `List` -> `ListRepos`
    * `Get` -> `GetRepo`
    * `Delete` -> `DeleteRepo`
  The swagger CLI spec validation will fail if the route IDs are not unique.
* Fully implement all the API calls to:
    * `/api/v1/repositories`
    * `/api/v1/instances`

Signed-off-by: Ionut Balutoiu <ibalutoiu@cloudbasesolutions.com>
2023-07-05 13:47:58 +03:00
Gabriel Adrian Samfira
3d26900d32 Set credentials in pool manager
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-04 22:11:45 +00:00
Gabriel Adrian Samfira
a526c1024c Various fixes
* enable foreign key constraints on sqlite
* on delete cascade for addresses and status messages
* add debug server config option
* fix rr allocation

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-03 07:46:20 +00:00
Gabriel Adrian Samfira
a15a91b974 Break lock and lower scale down timeout
Break the lock on a job if it's still queued and the runner that it
triggered was assigned to another job. This may cause leftover runners
to be created, but we scale those down in ~3 minutes.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-03 07:46:20 +00:00
Gabriel Adrian Samfira
1287a93cf2 Add job list to API and CLI
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-03 07:46:20 +00:00
Gabriel Adrian Samfira
fbffd8157b Add job tracking
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-03 07:46:20 +00:00
Ionut Balutoiu
d122f293cf Add swagger annotations to apiserver
Signed-off-by: Mihaela Balutoiu <mbalutoiu@cloudbasesolutions.com>
2023-06-30 19:03:57 +03:00
Gabriel Adrian Samfira
a9cf5127a9 More granular loops, update go-github
This commit adds:

  * more granular loops for various operations
  * update go-github to latest version
  * skip trying to fetch runner info for canceled or skipped jobs
  * loops use waitgroups to signal exit

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-06-23 08:16:41 +00:00
Lea Waller
e3065d6951
Allow installing additional packages in lxd 2023-06-11 17:32:30 +02:00
Michael Kuhnt
535c875eb8
add userdata options to allow disabling package updates during boot / cloud-init 2023-04-19 11:40:51 +02:00
Michael Kuhnt
37c8af0fb1
formatting stuff 2023-04-19 11:39:55 +02:00
Gabriel Adrian Samfira
6b3ea50ca5 Add runner group option to pools
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-03-27 09:21:21 +00:00
Gabriel Adrian Samfira
91e2d7b029
Remove unused param, add OSType to bootstrap param
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-03-16 12:29:00 +02:00
Gabriel Adrian Samfira
0074af9370
Move some defaults and types from config
The params package should not depend on config. The params package
should be consumable by external applications that wish to interact with
garm, and it makes no sense to pull in the config package just for some
constants. As such, the following changes have been made:

  * Moved some types from config to params
  * Moved defaults into a new leaf package called appdefaults

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-03-14 14:15:10 +02:00
Gabriel Adrian Samfira
829db87f15
Rename module
This change renames the module from "garm" to "github.com/cloudbase/garm".

This will make it easier to consume public functions defined in garm, by
external applications, without having to resort to replace.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-03-12 16:01:49 +02:00
Gabriel Adrian Samfira
f25951decb Add extra specs on pools
Extra specs is an opaque valid JSON that can be set on a pool and which
will be passed along to the provider as part of instance bootstrap params.

This field is meant to allow operators to send extra configuration values
to external or built-in providers. The extra specs are not interpreted by
garm and are of no use to garm itself, but they may be useful to the
provider which interacts with the IaaS.

The extra specs are not meant to be used for secrets. Adding sensitive
information to this field is highly discouraged. This field is meant as a    
means to add fine tuning knobs to the providers, on a per pool basis.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-01-30 13:10:21 +00:00
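
A rough Go sketch of how an opaque extra specs value (commit f25951decb) might be validated and passed through to a provider without being interpreted. The `BootstrapParams` struct shown here, the `newBootstrapParams` helper, the JSON field names, and the `disk_size_gb` key are all hypothetical and chosen for illustration only.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// BootstrapParams is a hypothetical, trimmed-down stand-in for the
// parameters handed to a provider when an instance is created.
type BootstrapParams struct {
	Name string `json:"name"`
	// ExtraSpecs is kept as raw JSON so it is never decoded or interpreted here.
	ExtraSpecs json.RawMessage `json:"extra_specs,omitempty"`
}

// newBootstrapParams only checks that extraSpecs is syntactically valid JSON
// and then passes it along untouched.
func newBootstrapParams(name string, extraSpecs []byte) (BootstrapParams, error) {
	if len(extraSpecs) > 0 && !json.Valid(extraSpecs) {
		return BootstrapParams{}, errors.New("extra specs must be valid JSON")
	}
	return BootstrapParams{Name: name, ExtraSpecs: extraSpecs}, nil
}

func main() {
	params, err := newBootstrapParams("runner-1", []byte(`{"disk_size_gb": 50}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	out, err := json.Marshal(params)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(out))
}
```

Keeping the value as `json.RawMessage` means only the provider decides what the keys mean. As the commit message notes, the field should not carry secrets.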
Michael Kuhnt
6a032bfaa2
metrics: fix review findings 2023-01-26 15:46:27 +01:00
Gabriel Adrian Samfira
f2cf947c00
Move pool type in params
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-01-20 17:08:15 +02:00
Gabriel Adrian Samfira
abcc9569bd
Add a common RunnerPrefix type
There are several fields that are common among some of the data
structures in garm. The RunnerPrefix is just one of them. Perhaps we
should move some of the rest in a common type and embed that into the
types that share those fields.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-01-20 12:12:15 +02:00
Michael Kuhnt
6af3025743
feat: allow to configure the runner name 2023-01-19 11:13:36 +01:00
Gabriel Adrian Samfira
a91f64331e Limit instances to one runner token 2022-12-29 22:57:10 +00:00
Gabriel Adrian Samfira
3a92a5be0e Some cleanup and safety checks
* Add logging middleware
* Remove some noise from logs
* Add some safety checks when managing runners
2022-12-29 16:50:11 +00:00
Gabriel Adrian Samfira
d3fe741cfe Don't save runner registration token in DB
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2022-12-06 19:48:00 +00:00
Gabriel Adrian Samfira
cb5baeb547 Add Enterprise tests 2022-12-04 17:30:42 +00:00
Gabriel Adrian Samfira
0869073906 Define a metadata subrouter
Define a metadata subrouter and move the token endpoint there. We may
end up needing multiple endpoints for various purposes in the future.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2022-12-02 19:48:38 +00:00
Gabriel Adrian Samfira
a078645ab2
Add token endpoint
This change adds a github registration endpoint that instances can use
to fetch a github registration token.

This change also disables an instance's access to the token and status
updates endpoints once the instance transitions from "pending" or
"installing" to any other state.
2022-12-01 18:00:22 +02:00
Gabriel Adrian Samfira
05057e37fd
Start pool managers in the background
Garm no longer fails on startup if a pool manager cannot be started. It
will attempt to start the pool manager in the background. If it fails
due to an unauthorized error, it will sleep for 3 hours. It is unlikely
it will work a second time if credentials are not updated in the config
and garm is restarted, so no point in getting rate limited.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2022-10-21 17:14:03 +03:00
Gabriel Adrian Samfira
80452aac39
Update go-github and remove redirect 2022-10-21 17:14:03 +03:00
Gabriel Adrian Samfira
3e3b91ee59
Add enterprise support to garm-cli
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2022-10-21 17:14:03 +03:00
Gabriel Adrian Samfira
296333412a
Add enterprise support
This change adds enterprise support throughout garm.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2022-10-21 17:14:03 +03:00
Gabriel Adrian Samfira
f40420bfb6
Add ability to specify github endpoints for creds
The GitHub credentials section now allows setting some API endpoints
that point the github client and the runner setup script to the proper
URLs. This allows us to use garm with an on-prem github enterprise server.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2022-10-21 17:14:03 +03:00
Ionut Balutoiu
7b6c2e6106 Refactor code to allow more unit testing
In order to allow mocking for some of the `runner` functions, we created a
separate interface (called `PoolManagerController`) with `Create`, `Get`,
`Delete` operations for the `organization` / `repository` pool managers.

Furthermore, a new runner struct (`poolManagerCtrl`) implements this new
interface. The existing code is refactored to use the `poolManagerCtrl`
whenever the pool managers for `org` / `repo` are handled.

This allows more unit testing for the runner functions since `poolManagerCtrl`
field can be mocked now.

Besides this, there are some typos fixed as well.
2022-08-18 17:47:05 +03:00
Gabriel Adrian Samfira
15a1308441 Add timeout functionality for pool runner bootstrap
Pools can now define a bootstrap timeout for runners. The timeout can
be defined per pool and indicates the amount of time after which a runner
is considered defunct and removed.

If a runner doesn't join github in the configured amount of time, and it
receives no updates indicating that it is installing the runner via instance
status updates, it is considered defunct.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2022-06-29 23:44:03 +00:00
Gabriel Adrian Samfira
5390efbaab Add manual runner removal
Runners can now be manually removed using the CLI. Some restrictions apply:

  * A runner must be idle in github. Github will not allow us to remove a runner
    that is running a workflow.
  * The runner status must be "running"

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2022-06-29 16:23:01 +00:00
Gabriel Adrian Samfira
dc04bca95c Retry failed runners
* retry adding runners up to 5 times if they fail.
* various fixes
2022-05-10 12:28:39 +00:00
Gabriel Adrian Samfira
5e0a64f909 Add license headers 2022-05-05 13:25:50 +00:00