Prevent abusing the GH API
On large deployments with many jobs, we cannot check each job that we recorded in the DB against the GH API. Before this change, if a job was updated more than 10 minutes ago, garm would check against the GH API whether that job still existed. While this approach allowed us to maintain a consistent view of which jobs still exist and which are stale, it had the potential of spamming the GH API, leading to rate limiting.

This change uses the scale-down loop as an indicator of job staleness. If a job remains in queued state in our DB, but has disappeared from GH or was serviced by another runner and we never got the hook (garm was down or GH had an issue - it happened in the past), then garm will spin up a new runner for it. If that runner or any other runner is scaled down, we check whether we have jobs in the queue that should have matched that runner. If we do, there is a high chance that the job no longer exists in GH and we can remove it from the queue.

Of course, there is a chance that GH is having issues and the job is never pushed to the runner, but we can't really account for everything. In this case I'd rather avoid rate limiting ourselves.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
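The pruning heuristic described above can be sketched roughly as follows. This is a minimal illustration, not garm's actual implementation: the `Job`, `Runner`, `matchesRunner`, and `pruneStaleJobs` names and shapes are assumptions made for this example.

```go
package main

import "fmt"

// Job and Runner are simplified stand-ins for garm's internal types;
// their names and fields are assumptions for illustration only.
type Job struct {
	ID     int64
	Labels []string // labels the job requested
}

type Runner struct {
	Name   string
	Labels map[string]struct{} // labels the runner advertises
}

// matchesRunner reports whether every label requested by the job is
// present on the runner - the "should have matched" check from the
// commit message.
func matchesRunner(j Job, r Runner) bool {
	for _, l := range j.Labels {
		if _, ok := r.Labels[l]; !ok {
			return false
		}
	}
	return true
}

// pruneStaleJobs drops queued jobs that should have matched the runner
// being scaled down: if that runner sat idle and is now being removed,
// the job it was spun up for most likely no longer exists on GH.
func pruneStaleJobs(queued []Job, scaledDown Runner) []Job {
	var kept []Job
	for _, j := range queued {
		if !matchesRunner(j, scaledDown) {
			kept = append(kept, j)
		}
	}
	return kept
}

func main() {
	runner := Runner{
		Name:   "runner-1",
		Labels: map[string]struct{}{"linux": {}, "x64": {}},
	}
	queue := []Job{
		{ID: 1, Labels: []string{"linux", "x64"}}, // stale: would have matched
		{ID: 2, Labels: []string{"windows"}},      // kept: different labels
	}
	fmt.Println(pruneStaleJobs(queue, runner)) // → [{2 [windows]}]
}
```

Note the deliberate trade-off: no GH API call is made at all, so a job that GH simply never dispatched may be pruned too early, but the rate-limit risk disappears.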
This commit is contained in:
parent
46ac1b8166
commit
459906d97e
2 changed files with 39 additions and 42 deletions
@@ -297,6 +297,20 @@ func (p *Pool) PoolType() PoolType {
 	return ""
 }
 
+func (p *Pool) HasRequiredLabels(set []string) bool {
+	asMap := make(map[string]struct{}, len(p.Tags))
+	for _, t := range p.Tags {
+		asMap[t.Name] = struct{}{}
+	}
+
+	for _, l := range set {
+		if _, ok := asMap[l]; !ok {
+			return false
+		}
+	}
+	return true
+}
+
 // used by swagger client generated code
 type Pools []Pool
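The `HasRequiredLabels` method added in the diff above can be exercised in isolation. The `Tag` and `Pool` types below are minimal stand-ins reconstructed from what the diff shows (a pool with `Tags`, each tag having a `Name`); garm's real types carry more fields.

```go
package main

import "fmt"

// Tag and Pool are minimal stand-ins for garm's params types,
// containing just the fields HasRequiredLabels needs.
type Tag struct {
	Name string
}

type Pool struct {
	Tags []Tag
}

// HasRequiredLabels reports whether the pool carries every label in
// set (same logic as the method added in the diff).
func (p *Pool) HasRequiredLabels(set []string) bool {
	asMap := make(map[string]struct{}, len(p.Tags))
	for _, t := range p.Tags {
		asMap[t.Name] = struct{}{}
	}

	for _, l := range set {
		if _, ok := asMap[l]; !ok {
			return false
		}
	}
	return true
}

func main() {
	pool := Pool{Tags: []Tag{{Name: "self-hosted"}, {Name: "linux"}, {Name: "x64"}}}
	fmt.Println(pool.HasRequiredLabels([]string{"linux", "x64"}))     // → true
	fmt.Println(pool.HasRequiredLabels([]string{"linux", "windows"})) // → false
}
```

Building the tag set once and probing it per requested label keeps the check O(len(Tags) + len(set)) instead of a nested scan, which matters when this runs for every queued job on each scale-down pass.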