Commit graph

180 commits

Author SHA1 Message Date
Gabriel Adrian Samfira
99539edde7 Add controller info
This change adds a new controller info endpoint and associated client and
CLI command. The controller info endpoint returns information about controller
status and configuration.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-08-12 22:47:50 +00:00
Gabriel Adrian Samfira
3e446a8191 Update docs and deps
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-26 16:03:36 +00:00
Gabriel Adrian Samfira
72f45bba47 Remove unused code
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-24 07:14:29 +00:00
Gabriel Adrian Samfira
7e4dcea056 Update deps
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-23 15:50:49 +00:00
Gabriel Adrian Samfira
e33b64aacb Providers now return ProviderInstance{}
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-23 12:47:56 +00:00
Gabriel Adrian Samfira
2a6b453fc1 Move the exec helper
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-23 12:33:49 +00:00
Gabriel Adrian Samfira
e775c9c11d Move most of util package
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-22 22:39:17 +00:00
Gabriel Adrian Samfira
ed651bb7d0 Move errors to external package
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-22 22:26:47 +00:00
Gabriel Adrian Samfira
da13cec2de Move code to external package
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-21 15:34:18 +00:00
Gabriel
3fe5d510fe Merge pull request #124 from gabriel-samfira/fix-entity-update
Fix entity update
2023-07-05 13:39:07 +03:00
Gabriel Adrian Samfira
fe2cb01528 Slight refactor and fix tests
Updating a pool will no longer try to create a pool manager if one does
not already exist. A pool manager must be started when a pool is created.
Updating an existing pool without a pool manager is an error condition.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-05 09:46:19 +00:00
Gabriel Adrian Samfira
86ed06d6ff Rename UpdateRepositoryParams to UpdateEntityParams
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-05 00:00:24 +00:00
Gabriel Adrian Samfira
c162bde6cb Set pool manager status
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-04 23:05:01 +00:00
Gabriel Adrian Samfira
3d26900d32 Set credentials in pool manager
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-04 22:11:45 +00:00
Gabriel Adrian Samfira
a41eeb6f1e Update comment on function
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-04 10:48:14 +00:00
Gabriel Adrian Samfira
6c06afb8e8 Don't add additional labels to GH runner
For now, the additional labels would only contain the job ID that triggered
the creation of the runner. It does not make sense to add this label to the
actual runner that registers against GitHub. We can simply use it internally
by fetching it from the DB.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-03 07:48:22 +00:00
Gabriel Adrian Samfira
0ab8f73bb4 Use r.log()
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-03 07:46:20 +00:00
Gabriel Adrian Samfira
9101cdc0a2 Lower the tool update interval to 1 minute
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-03 07:46:20 +00:00
Gabriel Adrian Samfira
f92ac2a74f Lower backoff timer to 1 minute
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-03 07:46:20 +00:00
Gabriel Adrian Samfira
7f510ec40a Check if we have a recorded job
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-03 07:46:20 +00:00
Gabriel Adrian Samfira
a526c1024c Various fixes
  * enable foreign key constraints on sqlite
  * on delete cascade for addresses and status messages
  * add debug server config option
  * fix rr allocation

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-03 07:46:20 +00:00
Gabriel Adrian Samfira
f7cf6bb619 Increase backoff to 30 seconds
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-03 07:46:20 +00:00
Gabriel Adrian Samfira
3796c25228 Amend some log messages
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-03 07:46:20 +00:00
Gabriel Adrian Samfira
bf90eb323a Add back update locks
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-03 07:46:20 +00:00
Gabriel Adrian Samfira
4a2ba68867 Use the same db connection in runner
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-03 07:46:20 +00:00
Gabriel Adrian Samfira
b52f107bde Update log messages
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-03 07:46:20 +00:00
Gabriel Adrian Samfira
c04a93dde9 Add basic round robin for pools
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-03 07:46:20 +00:00
Gabriel Adrian Samfira
4b9c20e1be Reduce timeout to 10 seconds
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-03 07:46:20 +00:00
Gabriel Adrian Samfira
28360fd662 Do not record jobs not meant for us
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-03 07:46:20 +00:00
Gabriel Adrian Samfira
a15a91b974 Break lock and lower scale down timeout
Break the lock on a job if it's still queued and the runner that it
triggered was assigned to another job. This may cause leftover runners
to be created, but we scale those down in ~3 minutes.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-03 07:46:20 +00:00
Gabriel Adrian Samfira
1287a93cf2 Add job list to API and CLI
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-03 07:46:20 +00:00
Gabriel Adrian Samfira
b6a02db446 Remove completed jobs and slight optimization
  * Removes completed jobs from the db
  * Skip ensure min idle runners for pools with min idle runners set to 0

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-03 07:46:20 +00:00
Gabriel Adrian Samfira
5153738359 Small fixes
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-03 07:46:20 +00:00
Gabriel Adrian Samfira
fbffd8157b Add job tracking
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-03 07:46:20 +00:00
Gabriel Adrian Samfira
5bca63eeb1 Replace wait implementation with errgroup
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-07-03 07:40:57 +00:00
Gabriel Adrian Samfira
67b871488d Log the actual error
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-06-30 09:14:10 +03:00
Gabriel Adrian Samfira
0a27acd818 Remove extra loop and add logging
  * removes an extra loop. The fetch tools loop does the same job
  * add a lot of log messages

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-06-30 08:52:16 +03:00
Gabriel Adrian Samfira
7358beb2b9 Merge Unlock() and UnlockAndDelete()
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-06-30 08:48:29 +03:00
Gabriel Adrian Samfira
1edb9247a8 Add per instance mux
Lock operations per instance name. This should prevent goroutines from
updating the same instance concurrently when operations are slow.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-06-23 15:43:31 +00:00
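The per-instance locking described in 1edb9247a8 can be sketched roughly as below. This is a minimal illustration, not the code from the commit: the keyMutex type and its get and withLock methods are hypothetical names, and the sketch assumes a plain map of mutexes guarded by one outer mutex.

```go
package main

import (
	"fmt"
	"sync"
)

// keyMutex hands out one mutex per key (here: an instance name).
// Hypothetical type for illustration; not garm's actual implementation.
type keyMutex struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

func newKeyMutex() *keyMutex {
	return &keyMutex{locks: make(map[string]*sync.Mutex)}
}

// get returns the mutex for name, creating it on first use.
func (k *keyMutex) get(name string) *sync.Mutex {
	k.mu.Lock()
	defer k.mu.Unlock()
	m, ok := k.locks[name]
	if !ok {
		m = &sync.Mutex{}
		k.locks[name] = m
	}
	return m
}

// withLock runs fn while holding the lock for name, so two goroutines
// working on the same instance name never run fn at the same time.
func (k *keyMutex) withLock(name string, fn func()) {
	m := k.get(name)
	m.Lock()
	defer m.Unlock()
	fn()
}

func main() {
	km := newKeyMutex()
	counter := 0
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// All goroutines target the same (made-up) instance name.
			km.withLock("runner-1", func() { counter++ })
		}()
	}
	wg.Wait()
	fmt.Println(counter)
}
```

Operations on different instance names do not block each other, since each name gets its own mutex. The sketch never removes entries from the map; a long-running service would likely need some cleanup step for instances that no longer exist.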
Gabriel Adrian Samfira
a9cf5127a9 More granular loops, update go-github
This commit adds:

  * more granular loops for various operations
  * update go-github to latest version
  * skip trying to fetch runner info for canceled or skipped jobs
  * loops use waitgroups to signal exit

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-06-23 08:16:41 +00:00
Gabriel Adrian Samfira
4921692ee2 Wrap errgroup in select
This commit:

  * swaps WaitGroups with errgroups
  * wraps errgroup.Wait() in a select to prevent situations in which an
    operation takes a long time and prevents garm from being restarted.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-06-23 01:07:55 +03:00
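The idea in 4921692ee2 of wrapping a blocking Wait() call in a select might look roughly like the sketch below. It is an illustration under assumptions, not the commit's code: waitWithTimeout and errWaitTimeout are hypothetical names, and the helper accepts any func() error (the Wait method of an errgroup.Group from golang.org/x/sync/errgroup has that shape) so the example needs only the standard library.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errWaitTimeout is a hypothetical sentinel error for this sketch.
var errWaitTimeout = errors.New("timed out waiting for workers")

// waitWithTimeout runs wait in its own goroutine and returns its result,
// or errWaitTimeout if wait does not finish within timeout.
// wait can be any func() error, for example an errgroup.Group's Wait method.
func waitWithTimeout(wait func() error, timeout time.Duration) error {
	done := make(chan error, 1) // buffered so the goroutine never blocks on send
	go func() { done <- wait() }()

	select {
	case err := <-done:
		return err
	case <-time.After(timeout):
		return errWaitTimeout
	}
}

func main() {
	// A stand-in for a slow operation that would otherwise block shutdown.
	slow := func() error {
		time.Sleep(time.Second)
		return nil
	}
	fmt.Println(waitWithTimeout(slow, 50*time.Millisecond))
}
```

With this shape, a slow operation can delay shutdown by at most the timeout. The goroutine calling wait keeps running in the background after a timeout, so the approach fits best when the process is about to exit anyway.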
Mihaela Balutoiu
d698e2815e Fix runner/pools.go typo
Signed-off-by: Mihaela Balutoiu <mbalutoiu@cloudbasesolutions.com>
2023-06-21 10:56:51 +03:00
Mihaela Balutoiu
00c0ada0aa Add more runner/pools.go unit tests
Signed-off-by: Mihaela Balutoiu <mbalutoiu@cloudbasesolutions.com>
2023-06-19 23:39:07 +03:00
Mihaela Balutoiu
7ac2455379 Fix runner/pools.go typo
Signed-off-by: Mihaela Balutoiu <mbalutoiu@cloudbasesolutions.com>
2023-06-15 14:12:07 +03:00
Gabriel
05168ac994 Merge pull request #106 from gabriel-samfira/reap-failed-agent
Reap failed agent
2023-06-14 22:19:20 +03:00
Mihaela Balutoiu
e1ad300f79 Add test cases for the runner/pools.go
Signed-off-by: Mihaela Balutoiu <mbalutoiu@cloudbasesolutions.com>
2023-06-14 14:17:47 +03:00
Gabriel Adrian Samfira
0e637c10e3 Remove failed runner and some retries
  * When a runner fails to set up the github agent, we reap it after the
    pool timeout is reached.
  * add a retry in the userdata when configuring the runner agent

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-06-13 21:01:40 +03:00
Lea Waller
e3065d6951 Allow installing additional packages in lxd
2023-06-11 17:32:30 +02:00
Gabriel Adrian Samfira
06745eb88a Validate the result returned by providers
Providers may return only 3 possible statuses:

  * InstanceRunning
  * InstanceError
  * InstanceStopped

Every other status is reserved for the controller to set. Provider
responses will be split from the instance response in a future commit.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-06-11 15:02:36 +03:00
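The status check described in 06745eb88a could be sketched as follows. Only the three status names come from the commit message. The InstanceStatus type, the string values, the InstancePendingCreate constant and the validateProviderStatus function are assumptions made for this illustration.

```go
package main

import "fmt"

// InstanceStatus and the string values below are assumptions for this sketch.
type InstanceStatus string

const (
	// The three statuses a provider is allowed to return.
	InstanceRunning InstanceStatus = "running"
	InstanceError   InstanceStatus = "error"
	InstanceStopped InstanceStatus = "stopped"

	// InstancePendingCreate stands in for a controller-reserved status
	// (hypothetical name and value).
	InstancePendingCreate InstanceStatus = "pending_create"
)

// validateProviderStatus rejects any status a provider may not return.
func validateProviderStatus(s InstanceStatus) error {
	switch s {
	case InstanceRunning, InstanceError, InstanceStopped:
		return nil
	default:
		return fmt.Errorf("invalid status returned by provider: %q", s)
	}
}

func main() {
	fmt.Println(validateProviderStatus(InstanceRunning))
	fmt.Println(validateProviderStatus(InstancePendingCreate))
}
```

An allow-list in a switch keeps the check strict: any status added later is rejected for providers unless it is explicitly listed.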
Gabriel Adrian Samfira
d8ed55288f Small comment change
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-06-06 14:55:08 +00:00