On github, attempt to use the scaleset API to list all runners without
pagination. This will avoid missing runners and accidentally removing them.
Fall back to paginated API if we can't use the scaleset API.
Add ability to retrieve all instances from cache, for an entity.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
Github will remove inactive scale sets after 7 days. This change
ensures the scale set exists in github before spinning up the listener.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
Although runner names are unique, we still have an ID on the model
which is used as a primary key. We should allow using that ID to
reference a runner in the API.
This change allows users to specify ID or runner name.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
There seems to be a change in the scale set message. It now includes
a jobID and sets the runner request ID to 0. This change adds separate
job ID fields for workflow jobs and scaleset jobs.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
Do not look for a name when composing the scale set. Preload may not
have been called on an entity, but we still have the ID, which is the
only thing needed when GetEntity() is called.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
* time.NewTicker will panic if the duration is 0. Make it return
early if duration is 0.
* Return a pre-closed channel in Wait() instead of nil. Ensures receiver
will not block forever.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
This change renames a lot of variables, types and functions to be more
generic. The goal is to allow GARM to add more forges in the future.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
Adds a periodic cleanup function that cross checks runners between github,
the provider and the GARM database. If an inconsistency is found, GARM will
attempt to fix it.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
The locking logic was added to its own package as it may need to be used
by other parts of the code.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
This change moves the github client to a subpackage in utils
and adds the scaleset github client code.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
This change adds a backoff mechanism when deleting github runners.
If the delete operation fails, we record the event and retry with
a geometric progression of 1.5 starting from 5 seconds, which is the
pool consolidation timeout.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
This change scopes all github entities to a github endpoint, allowing
users to have the same repo/org/enterprise created for each endpoint.
This way, if your username is the same on github.com and on your GHES
server, and you have the same repository name or org in both places,
GARM can now handle that situation.
This change also fixes a leaky watcher in the pool manager.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
Improper use of time.After can lead to memory leaks if the timer never
gets a chance to fire.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
GARM has a backoff interval when consuming queued jobs. This backoff
is intended to allow any potential idle runners to pick up a job before
GARM attempts to spin up a new one. This change allows users to set a
custom backoff interval or disable it altogether by setting it to 0.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
This change uses the database watcher to watch for changes to the
github entities, credentials and controller info.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
Adds a simple database watcher. At this point it's just one process, but
the plan is to allow different implementations that inform the local running
workers of changes that have occured on entities of interest in the database.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
There are only a few cases, where we get a job information from github
where the runner name is not set.
For all this cases we do not need to check github API at all because
these jobs are never ever get scheduled to a runner:
job.Action is:
* queued:
a queued job is just queued and not scheduled to a runner so we do
not get a runner name from the GH API
* completed:
when conclusion=cancelled|failure github never scheduled the job to a
runner and with that we do not get a runner name from the GH API
Signed-off-by: Mario Constanti <mario.constanti@mercedes-benz.com>
github is sending job events where conclusion=cancelled is spelled in american english.
Signed-off-by: Mario Constanti <mario.constanti@mercedes-benz.com>
Remove code that was just wrapping other functions at this point, and
move some code around. We need to get a better idea what is actually
still needed in the pool manager, to begin to refactor it into something
that can scale out.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
Change instance DB functions from querying by ID to querying by name. Names
are unique in GARM, so we might as well use the name instead of the ID and
spare ourselves the extra query to get the ID when a qorkflow comes in.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
When no runner group is set, do not attempt to resolve the runner group.
Looking for an empty runner group will just return a not found error, which
will make GARM fall back to registration token.
This change fixes that.
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>