DevFW-CICD/garm - EDP: Build your thing in minutes

Author	SHA1	Message	Date
Gabriel Adrian Samfira	e2eb8f719b	Check for nil pointer before dereferencing Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2024-03-26 16:43:09 +02:00
Gabriel Adrian Samfira	ce3c917ae5	Add pool balancing strategy This change adds the ability to specify the pool balancing strategy to use when processing queued jobs. Before this change, GARM would round-robin through all pools that matched the set of tags requested by queued jobs. When round-robin (default) is used for an entity (repo, org or enterprise) and you have 2 pools defined for that entity with a common set of tags that match 10 jobs (for example), then those jobs would trigger the creation of a new runner in each of the two pools in turn. Job 1 would go to pool 1, job 2 would go to pool 2, job 3 to pool 1, job 4 to pool 2 and so on. When "stack" is used, those same 10 jobs would trigger the creation of a new runner in the pool with the highest priority, every time. In both cases, if a pool is full, the next one would be tried automatically. For the stack case, this would mean that if pool 2 had a priority of 10 and pool 1 would have a priority of 5, pool 2 would be saturated first, then pool 1. Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2024-03-14 20:04:34 +00:00
Gabriel Adrian Samfira	7d33e0f0cf	Add job info in runner list This change adds information about the job a runner is currently handling. Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2024-03-11 15:46:18 +00:00
Gabriel Adrian Samfira	9a6770c3a3	Allow bypassing Unauthorized error when deleting runner This change allows users to bypass GitHub Unauthorized errors when removing github runners. This means that removing runners will now be possible even if the pool manager is stopped. There is a new flag added to the runner rm command and to the API that tells GARM to bypass pool being stopped and any 401 error returned by GitHub. This means you will be able to remove the runners from garm and your provider, but will mean that the runner will still exist in github as "offline" if the credentials are not updated or the runner manually removed. Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2024-03-10 15:21:39 +00:00
Gabriel Adrian Samfira	4668461603	Expose the credential type through the API Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2024-03-02 17:04:27 +00:00
Mario Constanti	7221812dfa	fix: remove unused cobra args Signed-off-by: Mario Constanti <mario.constanti@mercedes-benz.com>	2024-02-22 17:20:05 +01:00
Mario Constanti	c137f9b662	fix: gocritic linter findings Signed-off-by: Mario Constanti <mario.constanti@mercedes-benz.com>	2024-02-22 15:06:53 +01:00
Mario Constanti	742ebabfc3	fix: unneeded type conversion Signed-off-by: Mario Constanti <mario.constanti@mercedes-benz.com>	2024-02-22 15:06:53 +01:00
Mario Constanti	b27f523259	fix: ifElseChain linter finding Signed-off-by: Mario Constanti <mario.constanti@mercedes-benz.com>	2024-02-22 15:06:53 +01:00
Mario Constanti	f4e51493f3	fix: var-naming linter findings Signed-off-by: Mario Constanti <mario.constanti@mercedes-benz.com>	2024-02-22 15:06:53 +01:00
Mario Constanti	cbe8f09412	fix: run make lint-fix Signed-off-by: Mario Constanti <mario.constanti@mercedes-benz.com>	2024-02-22 15:06:53 +01:00
Mario Constanti	55fe81fe32	fix: gosec linter findings Signed-off-by: Mario Constanti <mario.constanti@mercedes-benz.com>	2024-02-22 15:06:53 +01:00
Mario Constanti	e664639e98	fix: var-declaration linter findings Signed-off-by: Mario Constanti <mario.constanti@mercedes-benz.com>	2024-02-22 15:06:53 +01:00
Mario Constanti	bd0b27ab10	fix: gci section warnings Signed-off-by: Mario Constanti <mario.constanti@mercedes-benz.com>	2024-02-22 15:06:53 +01:00
Mario Constanti	8fc001f5f6	fix: misspell linter warnings Signed-off-by: Mario Constanti <mario.constanti@mercedes-benz.com>	2024-02-22 15:06:53 +01:00
Mario Constanti	b3854eaf18	fix: whitespace linter warnings Signed-off-by: Mario Constanti <mario.constanti@mercedes-benz.com>	2024-02-22 15:06:53 +01:00
Mario Constanti	0a53b8f6d8	fix: stop metrics collector ticker on ctx.Done Signed-off-by: Mario Constanti <mario.constanti@mercedes-benz.com>	2024-02-20 14:57:32 +01:00
Mario Constanti	17d74dfbf0	chore: rework prometheus metrics registration fail if metric registration panics Signed-off-by: Mario Constanti <mario.constanti@mercedes-benz.com>	2024-02-20 14:27:27 +01:00
Mario Constanti	3e025dda2f	feat: define a default duration for metrics update Signed-off-by: Mario Constanti <mario.constanti@mercedes-benz.com>	2024-02-20 10:23:48 +01:00
Mario Constanti	97f172eb51	fix: improve metrics collection loop by adding the context from main and make auth.GetAdminContext accepting a context we are now able to stop the metrics collection loop once the context is canceled Signed-off-by: Mario Constanti <mario.constanti@mercedes-benz.com>	2024-02-20 06:33:21 +01:00
Mario Constanti	1d8d9459eb	chore: refactor metrics endpoint refactoring is needed to make the metrics package usable from within the runner package for further metrics. This change also makes the metric-collector independent from requests to the /metrics endpoint Signed-off-by: Mario Constanti <mario.constanti@mercedes-benz.com>	2024-02-19 16:22:32 +01:00
Gabriel Adrian Samfira	eb729320f9	Fix typos in garm-cli Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2024-02-12 10:05:23 +00:00
Gabriel Adrian Samfira	70bfff96e0	Fix log streamer and cleanup code I accidentally disabled the log streamer when I moved the config options to their own section. This change fixes that. This change also adds some safety checks and locking when cleaning up stale clients. The websocket hub Write() function now copies the message before sending it on the channel to the clients. Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2024-01-06 14:05:38 +00:00
Gabriel Adrian Samfira	d44d64dbfd	Use log_file from logging config Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2024-01-06 00:32:39 +00:00
Gabriel Adrian Samfira	61e97f0896	Append pool_type and pool_mgr info to logs Pool managers will have 2 fields identifying which manager generated the log line. In the future, we will add tracking ids in various cases, allowing us to track down issues faster. Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2024-01-06 00:21:50 +00:00
Gabriel Adrian Samfira	e441b6ce89	Switch to log/slog This change switches GARM to the new structured logging standard library. This will allow us to set log levels and reduce some of the log spam. Given that we introduced new knobs to tweak logging, the number of config options for logging now warrants its own section. Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2024-01-05 23:46:40 +00:00
Gabriel Adrian Samfira	d09f12dfd8	Add force delete runner This branch adds the ability to forcefully remove a runner from GARM. When the operator wishes to manually remove a runner, the workflow is as follows: * Check that the runner exists in GitHub. If it does, attempt to remove it. An error here indicates that the runner may be processing a job. In this case, we don't continue and the operator gets immediate feedback from the API. * Mark the runner in the database as pending_delete * Allow the consolidate loop to reap it from the provider and remove it from the database. Removing the instance from the provider is async. If the provider errs out, GARM will keep trying to remove it in perpetuity until the provider succeeds. In situations where the provider is misconfigured, this will never happen, leaving the instance in a permanent state of pending_delete. A provider may fail for various reasons. Either credentials have expired, the API endpoint has changed, the provider is misconfigured or the operator may just have removed it from the config before cleaning up the runners. While some cases are recoverable, some are not. We cannot have a situation in which we cannot clean resources in garm because of a misconfiguration. This change adds the pending_force_delete instance status. Instances marked with this status will be removed from GARM even if the provider reports an error. The GARM cli has been modified to give new meaning to the --force-remove-runner option. This option in the CLI is no longer mandatory. Instead, setting it will mark the runner with the new pending_force_delete status. Omitting it will mark the runner with the old status of pending_delete. Fixes: #160 Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-10-12 06:15:36 +00:00
Gabriel Adrian Samfira	fb2bb7fb08	Fix nil pointer dereference Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-08-23 11:21:15 +00:00
Gabriel Adrian Samfira	3b651afe07	Add optional --install-webhook flag when creating repo/org Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-08-22 09:39:02 +03:00
Gabriel Adrian Samfira	c641e1d596	Fixup field name Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-08-22 09:39:02 +03:00
Gabriel Adrian Samfira	1c0ff85a0d	Add flag to toggle webhook management Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-08-22 09:39:02 +03:00
Gabriel Adrian Samfira	c00048e128	Add optional keepWebhook flag when removing an entity The user can opt to not delete the webhook (if installed) when removing the entity from garm. Garm will only ever try to remove a webhook that exactly matches the URL that is composed of the base webhook URL configured in the config.toml file and the unique controller ID that is generated when the controller is first installed. It should be safe to remove the webhook when the entity is removed. Of course, this behavior can be disabled. Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-08-22 09:39:02 +03:00
Gabriel Adrian Samfira	763eb705d8	Remove webhook when removing an entity & cmd fixes * When removing a repo or org, we uninstall the webhook as well. * Upgrade cobra command and mark "webhook-secret" and "random-webhook-secret" as MarkFlagsOneRequired() Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-08-22 09:39:01 +03:00
Gabriel Adrian Samfira	779afe980e	Add webhook show, return info and some fixes * Added a webhook show command. This gives us info about the webhook and if it is installed. * Return webhook info when installing the webhook * Small typo fixes. Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-08-22 09:39:01 +03:00
Gabriel Adrian Samfira	dbd41f518d	Add CLI webhook enablement Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-08-22 09:39:01 +03:00
Gabriel Adrian Samfira	7ce3f007b0	Add functions to (un)install webhooks for orgs and repos Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-08-22 09:39:01 +03:00
Gabriel Adrian Samfira	99539edde7	Add controller info This change adds a new controller info endpoint and associated client and CLI command. The controller info endpoint returns information about controller status and configuration. Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-08-12 22:47:50 +00:00
Mihaela Balutoiu	ff5abf1294	Implement API client functionality in the `garm-cli` Signed-off-by: Mihaela Balutoiu <mbalutoiu@cloudbasesolutions.com>	2023-07-25 19:26:00 +03:00
Gabriel Adrian Samfira	e775c9c11d	Move most of util package Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-07-22 22:39:17 +00:00
Gabriel Adrian Samfira	ed651bb7d0	Move errors to external package Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-07-22 22:26:47 +00:00
Gabriel Adrian Samfira	da13cec2de	Move code to external package Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-07-21 15:34:18 +00:00
Gabriel Adrian Samfira	86ed06d6ff	Rename UpdateRepositoryParams to UpdateEntityParams Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-07-05 00:00:24 +00:00
Gabriel Adrian Samfira	6c6c6636ba	Add org and enterprise update Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-07-04 23:24:19 +00:00
Gabriel Adrian Samfira	cec1d59991	Add repo update command Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-07-04 22:27:25 +00:00
Gabriel Adrian Samfira	f92ac2a74f	Lower backoff timer to 1 minute Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-07-03 07:46:20 +00:00
Gabriel Adrian Samfira	45dceae88c	Add requested labels to cli Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-07-03 07:46:20 +00:00
Gabriel Adrian Samfira	a526c1024c	Various fixes * enable foreign key constraints on sqlite * on delete cascade for addresses and status messages * add debug server config option * fix rr allocation Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-07-03 07:46:20 +00:00
Gabriel Adrian Samfira	4a2ba68867	Use the same db connection in runner Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-07-03 07:46:20 +00:00
Gabriel Adrian Samfira	a15a91b974	Break lock and lower scale down timeout Break the lock on a job if it's still queued and the runner that it triggered was assigned to another job. This may cause leftover runners to be created, but we scale those down in ~3 minutes. Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-07-03 07:46:20 +00:00
Gabriel Adrian Samfira	1287a93cf2	Add job list to API and CLI Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>	2023-07-03 07:46:20 +00:00
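The pool balancing behaviour described in commit ce3c917ae5 can be illustrated with a small sketch. This is not GARM's code (GARM is written in Go); it is a minimal Python model of the two strategies as the commit message words them. The `Pool` class, the `assign_jobs` function, the `max_runners` field and the strategy strings `"roundrobin"` and `"stack"` are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    # Hypothetical model of a pool; the field names are assumptions.
    name: str
    priority: int
    max_runners: int
    runners: int = 0

    def is_full(self) -> bool:
        return self.runners >= self.max_runners


def assign_jobs(pools, job_count, strategy="roundrobin"):
    """Return the name of the pool chosen for each job (None if all are full)."""
    assignments = []
    next_index = 0  # where round-robin starts looking for the next job
    for _ in range(job_count):
        if strategy == "stack":
            # Always try the highest-priority pool first.
            order = sorted(range(len(pools)),
                           key=lambda i: pools[i].priority, reverse=True)
        else:
            # Rotate through the pools, starting after the last one used.
            order = [(next_index + k) % len(pools) for k in range(len(pools))]
        # In both strategies a full pool is skipped and the next one is tried.
        chosen = next((i for i in order if not pools[i].is_full()), None)
        if chosen is None:
            assignments.append(None)
            continue
        pools[chosen].runners += 1
        assignments.append(pools[chosen].name)
        if strategy != "stack":
            next_index = (chosen + 1) % len(pools)
    return assignments


if __name__ == "__main__":
    rr = [Pool("pool 1", 5, 10), Pool("pool 2", 10, 10)]
    print(assign_jobs(rr, 4))  # alternates between pool 1 and pool 2
    st = [Pool("pool 1", 5, 10), Pool("pool 2", 10, 3)]
    print(assign_jobs(st, 5, "stack"))  # fills pool 2 first, then pool 1
```

With priorities of 5 for pool 1 and 10 for pool 2, as in the commit message, the stack run saturates pool 2 before any job reaches pool 1.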

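The difference between the pending_delete and pending_force_delete statuses introduced in commit d09f12dfd8 can be sketched in the same way. The two status names come from the commit message; the `can_remove_record` function and the `provider_delete` callable are hypothetical stand-ins, not GARM's API, and the retry loop that GARM runs around this decision is left out.

```python
def can_remove_record(status, provider_delete):
    """Decide whether the instance record may be dropped from the database.

    provider_delete is a hypothetical callable that raises on provider errors.
    """
    try:
        provider_delete()
    except Exception:
        # With pending_delete the record stays and removal is retried later;
        # with pending_force_delete the provider error is ignored.
        return status == "pending_force_delete"
    return True


def broken_provider():
    # Stand-in for a misconfigured provider that always fails.
    raise RuntimeError("provider misconfigured")


if __name__ == "__main__":
    print(can_remove_record("pending_delete", broken_provider))
    print(can_remove_record("pending_force_delete", broken_provider))
```

Under this model a runner marked pending_delete behind a permanently failing provider is never removed, which is the situation the commit message sets out to fix.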
111 commits