Commit graph

89 commits

Author SHA1 Message Date
Gabriel Adrian Samfira
118319c7c1 Switch to fmt.Errorf
Replace all instances of errors.Wrap() with fmt.Errorf.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2025-08-16 22:19:05 +00:00
Gabriel Adrian Samfira
499fbde60c Add a rudimentary filter option when listing entities
This change adds the ability to filter the list of entities returned
by the API by entity owner, name or endpoint, depending on the entity
type.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2025-06-18 21:23:34 +00:00
Gabriel Adrian Samfira
b2d5609352 Add some tests
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2025-05-19 19:45:45 +00:00
Gabriel Adrian Samfira
bb798a288a Properly set webhook secret
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2025-05-16 23:58:39 +00:00
Gabriel Adrian Samfira
f66b651b59 Fix findEndpointForJob
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2025-05-14 21:09:02 +00:00
Gabriel Adrian Samfira
40e6581a75 Rename GitHub specific types
This change renames a lot of variables, types and functions to be more
generic. The goal is to allow GARM to add more forges in the future.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2025-05-12 21:47:13 +00:00
Gabriel Adrian Samfira
884be62a4d Fix lint errors
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2025-05-03 22:29:40 +00:00
Gabriel Adrian Samfira
f2ad7a3481 Fix leftover instances and refactor
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2025-05-03 22:29:40 +00:00
Gabriel Adrian Samfira
cc6e985629 Fix: Scope entities to endpoint
This change scopes all github entities to a github endpoint, allowing
users to have the same repo/org/enterprise created for each endpoint.

This way, if your username is the same on github.com and on your GHES
server, and you have the same repository name or org in both places,
GARM can now handle that situation.

This change also fixes a leaky watcher in the pool manager.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2024-07-29 17:35:57 +00:00
Gabriel Adrian Samfira
2554f70b89 Replace time.After with time.NewTimer
Improper use of time.After can lead to memory leaks if the timer never
gets a chance to fire.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2024-07-05 12:55:35 +00:00
Gabriel Adrian Samfira
daaca0bd8f Use watcher and get rid of RefreshState()
This change uses the database watcher to watch for changes to the
github entities, credentials and controller info.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2024-06-21 13:47:48 +00:00
Gabriel Adrian Samfira
9748aa47af Move URLs from default section of config to DB
This change moves the callback_url, metadata_url and webhooks_url from
the config to the database. The goal is to move as much as possible from
the config to the DB, in preparation for a potential refactor that will
allow GARM to scale out. This would allow multiple nodes to share a single
source of truth.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2024-06-07 09:27:24 +00:00
Mario Constanti
1d14a26325 feat: garm pools do not force default labels
Signed-off-by: Mario Constanti <mario.constanti@mercedes-benz.com>
2024-05-21 11:55:12 +02:00
Gabriel Adrian Samfira
1256473089 Fetch credentials from DB
Do not rely on the entity object to hold updated or detailed credentials,
fetch them from the DB every time.

This change also ensures that we pass in the user context instead of the
runner context to the DB methods.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2024-04-24 13:59:15 +00:00
Gabriel Adrian Samfira
eadbe784b9 Add github credentials API and cli code
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2024-04-22 14:08:37 +00:00
Gabriel Adrian Samfira
4610f83209 List credentials from db
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2024-04-22 14:08:37 +00:00
Gabriel Adrian Samfira
3e60a48ca8 Preload credentials endpoint and remove extra code
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2024-04-22 14:03:25 +00:00
Gabriel Adrian Samfira
032d40f5f9 Fix tests
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2024-04-22 14:03:25 +00:00
Gabriel Adrian Samfira
90870c11be Use database for github creds
Add database models that deal with github credentials. This change
adds models for github endpoints (github.com, GHES, etc). This change
also adds code to migrate config credntials to the DB.

Tests need to be fixed and new tests need to be written. This will come
in a later commit.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2024-04-22 14:03:25 +00:00
Gabriel Adrian Samfira
36288c65e6 Slightly simplify code
Change instance DB functions from querying by ID to querying by name. Names
are unique in GARM, so we might as well use the name instead of the ID and
spare ourselves the extra query to get the ID when a qorkflow comes in.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2024-03-30 18:22:06 +00:00
Gabriel Adrian Samfira
9384e37bb1 Fix tests
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2024-03-28 18:23:49 +00:00
Gabriel Adrian Samfira
fa75ecfa8e Dedupe more code
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2024-03-17 10:59:09 +00:00
Gabriel Adrian Samfira
ce3c917ae5 Add pool balancing strategy
This change adds the ability to specify the pool balancing strategy to
use when processing queued jobs. Before this change, GARM would round-robin
through all pools that matched the set of tags requested by queued jobs.

When round-robin (default) is used for an entity (repo, org or enterprise)
and you have 2 pools defined for that entity with a common set of tags that
match 10 jobs (for example), then those jobs would trigger the creation of
a new runner in each of the two pools in turn. Job 1 would go to pool 1,
job 2 would go to pool 2, job 3 to pool 1, job 4 to pool 2 and so on.

When "stack" is used, those same 10 jobs would trigger the creation of a
new runner in the pool with the highest priority, every time.

In both cases, if a pool is full, the next one would be tried automatically.

For the stack case, this would mean that if pool 2 had a priority of 10 and
pool 1 would have a priority of 5, pool 2 would be saturated first, then
pool 1.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2024-03-14 20:04:34 +00:00
Gabriel Adrian Samfira
9a6770c3a3 Allow bypassing Unauthorized error when deleting runner
This change allows users to bypass GitHub Unauthorized errors when removing
github runners. This means that removing runners will now be possible even
if the pool manager is stopped.

There is a new flag added to the runner rm command and to the API that
tells GARM to bypass pool being stopped and any 401 error returned by
GitHub.

This means you will be able to remove the runners from garm and your
provider, but will mean that the runner will still exist in github as
"offline" if the credentials are not updated or the runner manually removed.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2024-03-10 15:21:39 +00:00
Gabriel Adrian Samfira
4668461603 Expose the credential type through the API
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2024-03-02 17:04:27 +00:00
Gabriel Adrian Samfira
cbb2134f0e Add GitHub App support
This change adds the ability to use GitHub Apps to authenticate against the
GitHub API. This gives us a larger quota for API requests (15k vs 5k for PATs).

Also, each GitHub App has its own quota, whereas PATs share the same user quota.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2024-03-01 19:47:50 +00:00
Mario Constanti
9e7ac60c09 fix: gosec linter finding
we have to keep crypto/sha1 as github is still using it

Signed-off-by: Mario Constanti <mario.constanti@mercedes-benz.com>
2024-02-22 17:34:12 +01:00
Mario Constanti
fd0550eb7f fix: godoc linter findings (TODO comments)
Signed-off-by: Mario Constanti <mario.constanti@mercedes-benz.com>
2024-02-22 17:33:19 +01:00
Mario Constanti
ee3a670456 fix: var-naming linter findings
Signed-off-by: Mario Constanti <mario.constanti@mercedes-benz.com>
2024-02-22 17:28:39 +01:00
Mario Constanti
9f5c38ef2d fix: unused-parameter linter findings
Signed-off-by: Mario Constanti <mario.constanti@mercedes-benz.com>
2024-02-22 16:54:38 +01:00
Gabriel Adrian Samfira
5b735eaaf4 Small adjustments
This change increases the tools refresh interval to 5 minutes, cleans
up the websocket code a bit, augments the error message that may be returned
when trying to delete a runner in an invalid state and removes a log message
that does not add much value.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2024-01-12 19:53:27 +00:00
Gabriel Adrian Samfira
e441b6ce89 Switch to log/slog
This change switches GARM to the new structured logging standard
library. This will allow us to set log levels and reduce some of
the log spam.

Given that we introduced new knobs to tweak logging, the number of
config options for logging now warrants it's own section.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2024-01-05 23:46:40 +00:00
Gabriel Adrian Samfira
2a5e2409b2 Add system-info instance callback
Allow runners to update their own system information. Runners can now send
back os_name, os_version and agent_id back as part of a POST to
CALLBACK_URL/system-info/.

The goal is to get better info in regard to the actual OS that's running
and to move the agent_id from the status updates to the system-info callback.

The status updates should be used only to send back info about the status of
the installation process.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2024-01-04 15:23:43 +00:00
Gabriel Adrian Samfira
d09f12dfd8 Add force delete runner
This branch adds the ability to forcefully remove a runner from GARM.

When the operator wishes to manually remove a runner, the workflow is as
follows:

* Check that the runner exists in GitHub. If it does, attempt to
  remove it. An error here indicates that the runner may be processing
  a job. In this case, we don't continue and the operator gets immediate
  feedback from the API.
* Mark the runner in the database as pending_delete
* Allow the consolidate loop to reap it from the provider and remove it
  from the database.

Removing the instance from the provider is async. If the provider errs out,
GARM will keep trying to remove it in perpetuity until the provider succedes.

In situations where the provider is misconfigured, this will never happen, leaving
the instance in a permanent state of pending_delete.

A provider may fail for various reasons. Either credentials have expired, the
API endpoint has changed, the provider is misconfigured or the operator may just
have removed it from the config before cleaning up the runners. While some cases
are recoverable, some are not. We cannot have a situation in which we cannot clean
resources in garm because of a misconfiguration.

This change adds the pending_force_delete instance status. Instances marked with
this status, will be removed from GARM even if the provider reports an error.

The GARM cli has been modified to give new meaning to the --force-remove-runner
option. This option in the CLI is no longer mandatory. Instead, setting it will mark
the runner with the new pending_force_delete status. Omitting it will mark the runner
with the old status of pending_delete.

Fixes: #160

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-10-12 06:15:36 +00:00
Gabriel Adrian Samfira
1268507ce6 Add jit config routes
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-09-24 13:50:20 +00:00
Gabriel Adrian Samfira
591641a8a3 Add temporary redirect to go-github fork
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-09-24 13:49:04 +00:00
Gabriel Adrian Samfira
7ce3f007b0
Add functions to (un)install webhooks for orgs and repos
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-08-22 09:39:01 +03:00
Gabriel Adrian Samfira
99539edde7 Add controller info
This change adds a new controller info endpoint and associated client and
CLI command. The controller info endpoint returns information about controller
status and configuration.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-08-12 22:47:50 +00:00
Gabriel Adrian Samfira
e775c9c11d Move most of util package
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-22 22:39:17 +00:00
Gabriel Adrian Samfira
ed651bb7d0 Move errors to external package
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-22 22:26:47 +00:00
Gabriel Adrian Samfira
da13cec2de Move code to external package
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-21 15:34:18 +00:00
Gabriel Adrian Samfira
fe2cb01528 Slight refactor and fix tests
Updating a pool will no longer try to create a pool manager if one does
not already exist. A pool manager must be started when a pool is created.
Updating an existing pool without a pool manager is an error condition.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-05 09:46:19 +00:00
Gabriel Adrian Samfira
3d26900d32 Set credentials in pool manager
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-04 22:11:45 +00:00
Gabriel Adrian Samfira
4a2ba68867 Use the same db connection in runner
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-03 07:46:20 +00:00
Gabriel Adrian Samfira
fbffd8157b Add job tracking
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-03 07:46:20 +00:00
Gabriel Adrian Samfira
5bca63eeb1 Replace wait implementation with errgroup
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-03 07:40:57 +00:00
Gabriel Adrian Samfira
92af290c27
Properly resolve OS tag
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-03-19 21:28:24 +02:00
Gabriel Adrian Samfira
829db87f15
Rename module
This change renames the module from "garm" to "github.com/cloudbase/garm".

This will make it easier to consume public functions defined in garm, by
external applications, without having to resort to replace.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-03-12 16:01:49 +02:00
Gabriel Adrian Samfira
8d17498ab8
Remove caches, retry fetching the hostname
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-01-29 16:03:20 +02:00
Gabriel Adrian Samfira
8f56f51598 Move some code around
Move the metrics code into its own package.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-01-27 14:57:25 +00:00