Commit graph

131 commits

Author SHA1 Message Date
Gabriel Adrian Samfira
d8ed55288f Small comment change
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-06-06 14:55:08 +00:00
Gabriel Adrian Samfira
829933559d Allow installing runners to finish
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-06-06 14:49:28 +00:00
Gabriel Adrian Samfira
bd2f103743
Wait for addPendingInstances to finish
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-06-06 16:40:27 +03:00
Gabriel Adrian Samfira
e9f66c2035
Wait for deletePendingInstances() to finish
Use an errgroup to wait for all instance deletion operations before
returning. Log any failure.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-06-06 16:36:08 +03:00
Gabriel Adrian Samfira
05a79d298c
Move some code around
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-06-06 16:31:30 +03:00
Gabriel Adrian Samfira
9efefc0d6a
Parallelization and LXD timeouts
* cleanupOrphanedGithubRunners() now uses errgroup to parallelize and
    report errors when removing runners from the provider.
  * retryFailedInstancesForOnePool() now uses errgroup
  * Removed some setPoolRunningState which should be treated in the loop
    where those errors eventually bubble up and can be handled.
  * Added a number of timeouts in the LXD provider for delete and list
    instances. This provider should be converted into an external
    provider.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-06-06 16:07:07 +03:00
Gabriel Adrian Samfira
88a39220f5 Allow disabling updates
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-06-04 22:28:50 +00:00
Gabriel Adrian Samfira
132823b453 Let CreateInstance download and cache image
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-06-04 21:20:22 +00:00
Gabriel Adrian Samfira
234095e456
Fix pool ID check when listing instances
We should be looking for the poolIDKey in the extended LXD config.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-06-04 15:43:03 +03:00
Gabriel Adrian Samfira
a433bede96 Return only enabled pools
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-03-31 14:47:27 +00:00
Gabriel Adrian Samfira
6d4f297097 Fix error type passed into errors.As
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-03-31 04:36:16 +00:00
Gabriel Adrian Samfira
243ae75476 Properly handle stopped runners
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-03-27 15:02:25 +00:00
Gabriel Adrian Samfira
6b3ea50ca5 Add runner group option to pools
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-03-27 09:21:21 +00:00
Gabriel Adrian Samfira
80e8f6dc1e
Add some exit codes
The external provider needs a simple way to indicate certain types of
errors. Duplicate error and not found error are such an example.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-03-26 22:31:55 +03:00
Gabriel Adrian Samfira
e7f208367b
Error if we can't remove instance from provider
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-03-26 22:19:20 +03:00
Gabriel Adrian Samfira
c6366ab896
Add Windows userdata
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-03-20 20:10:48 +02:00
Gabriel Adrian Samfira
da5e8f9719
Fall back to info saved in tags
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-03-20 10:50:08 +02:00
Gabriel Adrian Samfira
92af290c27
Properly resolve OS tag
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-03-19 21:28:24 +02:00
Gabriel Adrian Samfira
a56ab9609a
Add Windows as a supported OS
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-03-19 19:29:01 +02:00
Gabriel Adrian Samfira
91e2d7b029
Remove unused param, add OSType to bootstrap param
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-03-16 12:29:00 +02:00
Gabriel Adrian Samfira
b17c921a7c
Add Run() function
* Add Run() helper for external providers
  * Make GARM_CONTROLLER_ID env var common to all commands

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-03-15 01:35:42 +02:00
Gabriel Adrian Samfira
e29d5db72c
Properly detect if we have something on stdin
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-03-14 20:20:26 +02:00
Gabriel Adrian Samfira
223477c4dd
Define interface for external providers
This interface is similar to the common.Provider interface, but lacks
the AsParams() function. Decoupling the external provider interface from
the internal provider interface allows us to account for any
particularities there may appear between them.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-03-14 19:49:54 +02:00
Gabriel Adrian Samfira
882d07a0da
Add execution environment to external provider
The execution package is a common package that can be used by external
providers to load environment variables and stdin, in a coherent struct
that can be consumed by the various commands that need to execute as
part of the provider.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-03-14 18:28:31 +02:00
Gabriel Adrian Samfira
0074af9370
Move some defaults and types from config
The params package should not depend on config. The params packages
should be consumable by external applications that wish to interact with
garm, and it makes no sense to pull in the config package just for some
constants. As such, the following changes have been made:

  * Moved some types from config to params
  * Moved defaults in a new leaf package called appdefaults

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-03-14 14:15:10 +02:00
Gabriel Adrian Samfira
829db87f15
Rename module
This change renames the module from "garm" to "github.com/cloudbase/garm".

This will make it easier to consume public functions defined in garm, by
external applications, without having to resort to replace.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-03-12 16:01:49 +02:00
Gabriel
24f61ceb8c
Merge pull request #78 from gabriel-samfira/add-scale-down-grace-period
Add grace period to scale-down
2023-02-08 15:16:28 +02:00
Gabriel Adrian Samfira
43d2fd8c2d
Add grace period to scale-down
Add a grace period for idle runners of 5 minutes. A new idle runner will
not be taken into consideration for scale-down unless it's older than 5
minutes. This should prevent situations where the scaleDown() routine
that runs every minute will evaluate candidates for reaping and
erroneously count the new one as well. The in_progress hooks that
transitiones an idle runner to "active" may arive a long while after the
"queued" hook has spun up a runner.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-02-07 13:36:15 +02:00
Gabriel
439eeee479
Update runner/pool/pool.go
Co-authored-by: Michael Kuhnt <maigl@users.noreply.github.com>
2023-02-07 13:14:28 +02:00
Gabriel Adrian Samfira
77307998ea
Bail if we fail to cleanup failed instance
if we fail to cleanup failed instance, we return before retrying to
recreate it.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-02-06 14:53:23 +02:00
Michael Kuhnt
4eb8d905ab
fix: skip spawn new runners if enough idle runner available 2023-01-31 16:38:04 +01:00
Gabriel Adrian Samfira
d00da32375 Deduplicate some code
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-01-30 14:29:55 +00:00
Gabriel Adrian Samfira
f25951decb Add extra specs on pools
Extra specs is an opaque valid JSON that can be set on a pool and which
will be passed along to the provider as part of instance bootstrap params.

This field is meant to allow operators to send extra configuration values
to external or built-in providers. The extra specs is not interpreted or
useful in any way to garm itself, but it may be useful to the provider
which interacts with the IaaS.

The extra specs are not meant to be used for secrets. Adding sensitive
information to this field is highly discouraged. This field is meant as a    
means to add fine tuning knobs to the providers, on a per pool basis.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-01-30 13:10:21 +00:00
Gabriel Adrian Samfira
8d17498ab8
Remove caches, retry fetching the hostname
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-01-29 16:03:20 +02:00
Gabriel Adrian Samfira
4d071b7d10 Return only alpha numeric characters as an ID
On some providers the default character set used by shortid may lead to
errors when creating runners, due to the fact that underscores are not
allowed in their names.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-01-27 14:57:25 +00:00
Gabriel Adrian Samfira
8f56f51598 Move some code around
Move the metrics code into its own package.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-01-27 14:57:25 +00:00
Michael Kuhnt
6cd18ff1fd
improve access to controller info 2023-01-26 22:13:18 +01:00
Michael Kuhnt
6a032bfaa2
metrics: fix review findings 2023-01-26 15:46:27 +01:00
Michael Kuhnt
ee659f509f
feat: add prometheus metrics & endpoint 2023-01-26 14:15:16 +01:00
mgoeppe
dce1808860 fixed runner tests 2023-01-24 08:51:25 +01:00
mgoeppe
f9f917ba05 aligned code enterprises,organizations and repositories and fixed sql tests 2023-01-24 08:51:25 +01:00
Gabriel Adrian Samfira
e93b6d73e5
Sanitize log entries
While most of these log entries come from either github or our own
database, it's still a good idea to sanitize them.
2023-01-23 18:01:46 +02:00
Gabriel Adrian Samfira
b354cedf7e
Fixed a bunch of linting issues
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-01-20 22:21:22 +02:00
Gabriel Adrian Samfira
f2cf947c00
Move pool type in params
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-01-20 17:08:15 +02:00
Gabriel Adrian Samfira
abcc9569bd
Add a common RunnerPrefix type
There are several fields that are common among some of the data
structures in garm. The RunnerPrefix is just one of them. Perhaps we
should move some of the rest in a common type and embed that into the
types that share those fields.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-01-20 12:12:15 +02:00
Michael Kuhnt
49762d7f9e
fix: shortid generates correctly now 2023-01-19 14:23:40 +01:00
Michael Kuhnt
6af3025743
feat: allow to configure the runner name 2023-01-19 11:13:36 +01:00
Michael Kuhnt
3a46a9d127
feat: scale down idle runners 2023-01-10 18:18:28 +01:00
Gabriel Adrian Samfira
b954038624 Ensure loop closes properly and provider update
* Ensure the pool loop exits properly when the pool is not yet in
a running state.
  * Use ListInstances() when cleaning orphaned runners. This ensures
We only run one API call per pool to list instances, instead of running
a GetInstance() for each individual instance we are checking.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-01-08 16:40:42 +00:00
Gabriel Adrian Samfira
d5f5524934 Wait for loop exit and some fixes
* Wait for http server graceful shutdown and for pool managers to
properly exit.
  * Fix potential nil pointer dereference when checking response
code from github API.
2022-12-30 15:07:40 +00:00