GitHub Actions Runner Manager
Gabriel Adrian Samfira 459906d97e Prevent abusing the GH API
On large deployments with many jobs, we cannot check each job that
we recorded in the DB against the GH API.

Before this change, if a job was updated more than 10 minutes ago,
garm would check against the GH API if that job still existed. While
this approach allowed us to maintain a consistent view over which jobs
still exist and which are stale, it had the potential of spamming the
GH API, leading to rate limiting.

This change uses the scale-down loop as an indicator for job staleness.

If a job remains in queued state in our DB, but has disappeared from GH
or was serviced by another runner and we never got the hook (garm was down
or GH had an issue - happened in the past), then garm will spin up a new
runner for it. If that runner or any other runner is scaled down, we check
if we have jobs in the queue that should have matched that runner. If we do,
there is a high chance that the job no longer exists in GH and we can remove
the job from the queue.

Of course, there is a chance that GH is having issues and the job is never
pushed to the runner, but we can't really account for everything. In this case
I'd rather avoid rate limiting ourselves.
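
The staleness check described above could be sketched roughly as follows. This is a minimal illustration and not garm's actual implementation: the Job and Runner types, the label-matching rule and the pruneStaleJobs function are all hypothetical names introduced here.

```go
package main

import "fmt"

// Job is a hypothetical, simplified view of a job recorded in the DB.
type Job struct {
	ID     int64
	Status string   // e.g. "queued", "in_progress", "completed"
	Labels []string // labels the job requested
}

// Runner is a hypothetical, simplified view of a runner being scaled down.
type Runner struct {
	Name   string
	Labels []string
}

// matches reports whether the runner carries every label the job asked for.
// This matching rule is an assumption made for the sketch.
func matches(job Job, runner Runner) bool {
	have := make(map[string]struct{}, len(runner.Labels))
	for _, l := range runner.Labels {
		have[l] = struct{}{}
	}
	for _, l := range job.Labels {
		if _, ok := have[l]; !ok {
			return false
		}
	}
	return true
}

// pruneStaleJobs is called when a runner is scaled down. Any job that is
// still queued and would have matched that runner is treated as stale and
// dropped, without asking the GH API whether the job still exists.
func pruneStaleJobs(queue []Job, scaledDown Runner) (kept []Job, removed []int64) {
	for _, job := range queue {
		if job.Status == "queued" && matches(job, scaledDown) {
			removed = append(removed, job.ID)
			continue
		}
		kept = append(kept, job)
	}
	return kept, removed
}

func main() {
	queue := []Job{
		{ID: 1, Status: "queued", Labels: []string{"self-hosted", "linux"}},
		{ID: 2, Status: "queued", Labels: []string{"self-hosted", "gpu"}},
		{ID: 3, Status: "in_progress", Labels: []string{"self-hosted", "linux"}},
	}
	idle := Runner{Name: "runner-a", Labels: []string{"self-hosted", "linux", "x64"}}

	kept, removed := pruneStaleJobs(queue, idle)
	fmt.Println("kept:", len(kept), "removed IDs:", removed)
}
```

The trade-off is the one the message accepts: a job that GH has simply not yet pushed to a runner can be dropped by mistake, in exchange for making no per-job API calls.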

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
2023-12-15 22:41:50 +00:00
.github/workflows Refactor integration E2E tests 2023-08-24 15:22:46 +03:00
apiserver Add force delete runner 2023-10-12 06:15:36 +00:00
auth Add temporary redirect to go-github fork 2023-09-24 13:49:04 +00:00
client Add force delete runner 2023-10-12 06:15:36 +00:00
cmd Add force delete runner 2023-10-12 06:15:36 +00:00
config Add option to disable JIT config 2023-12-11 12:37:33 +00:00
contrib Rotate log file on SIGHUP 2023-06-27 20:04:20 +03:00
database Add force delete runner 2023-10-12 06:15:36 +00:00
doc feat: add new metrics 2023-10-06 10:21:56 +02:00
internal/testing Rename module 2023-03-12 16:01:49 +02:00
metrics feat: add new metrics 2023-10-06 10:21:56 +02:00
params Prevent abusing the GH API 2023-12-15 22:41:50 +00:00
runner Prevent abusing the GH API 2023-12-15 22:41:50 +00:00
scripts Enable Windows builds 2023-08-18 14:46:00 +00:00
test/integration Fix linting issues 2023-11-21 20:52:40 +02:00
testdata Update sample config 2023-12-11 14:05:10 +00:00
util Replace deprecated function call 2023-09-24 08:04:00 +00:00
vendor Add force delete runner 2023-10-12 06:15:36 +00:00
websocket Fixed a bunch of linting issues 2023-01-20 22:21:22 +02:00
.gitignore Refactor integration E2E tests 2023-08-24 15:22:46 +03:00
Dockerfile Enable Windows builds 2023-08-18 14:46:00 +00:00
Dockerfile.build-static Enable Windows builds 2023-08-18 14:46:00 +00:00
go.mod feat: build garm with go 1.21 2023-12-01 11:56:06 +01:00
go.sum feat: build garm with go 1.21 2023-12-01 11:56:06 +01:00
LICENSE Initial commit 2022-04-13 19:45:01 +03:00
Makefile Enable Windows builds 2023-08-18 14:46:00 +00:00
README.md Merge pull request #191 from gabriel-samfira/update-readme 2023-12-11 17:53:32 +02:00

GitHub Actions Runner Manager (GARM)


Welcome to GARM!

GARM enables you to create and automatically maintain auto-scaling pools of self-hosted GitHub runners that can be used in your GitHub workflow runs.

The goal of GARM is to be simple to set up, simple to configure and simple to use. It is a single binary that can run on any GNU/Linux machine, with no requirements other than the providers it creates the runners in. It is intended to be easy to deploy in any environment and can create runners in any system you can write a provider for. There is no complicated setup process and no extremely complex concepts to understand. Once set up, it's meant to stay out of your way.

GARM supports creating pools on either GitHub itself or on your own deployment of GitHub Enterprise Server. For instructions on how to use GARM with GHE, see the credentials section of the documentation.

Through the use of providers, GARM can create runners in a variety of environments using the same GARM instance. Want to create pools of runners in your OpenStack cloud, your Azure cloud and your Kubernetes cluster? No problem! Just install the appropriate providers, configure them in GARM and you're good to go. Create zero-runner pools for instances with high costs (large VMs, GPU-enabled instances, etc.) and have them spin up on demand, or create large pools of k8s-backed runners that can be used for your CI/CD pipelines at a moment's notice. You can mix them up and create pools in any combination of providers or resource allocations you want.

Join us on Slack

Whether you're running into issues or just want to drop by and say "hi", feel free to join us on Slack.


Installing

On virtual or physical machines

Check out the quickstart document for instructions on how to install GARM. If you'd like to build from source, check out the building from source document.

On Kubernetes

Thanks to the efforts of the amazing folks at @mercedes-benz, GARM can now be integrated into k8s via their operator. Check out the GARM operator for more details.

Supported providers

GARM has a built-in LXD provider that you can use out of the box to spin up runners on any machine that runs either a stand-alone LXD instance, or an LXD cluster. The quick start guide mentioned above will get you up and running with the LXD provider.

GARM also supports external providers for a variety of other targets.

Installing external providers

External providers are binaries that GARM calls into to create runners in a particular IaaS. There are currently two external providers available:

Follow the instructions in the README of each provider to install them.

Configuration

The GARM configuration is a simple TOML file. The sample config file in the testdata folder is fairly well commented and should be enough to get you started. The configuration file is split into several sections, each of which is documented on its own page. The sections are:

Optimizing your runners

If you would like to optimize the startup time of new instances, take a look at the performance considerations page.

Write your own provider

The providers are interfaces between GARM and a particular IaaS in which we spin up GitHub Runners. These providers can be either native or external. The native providers are written in Go, and must implement the interface defined here. External providers can be written in any language, as they are in the form of an external executable that GARM calls into.

There is currently one native provider for LXD and several external providers linked above.

If you want to write your own provider, you can choose to write a native one, or implement an external one. I encourage you to opt for an external provider, as those are the easiest to write and you don't need to merge it into GARM itself to be able to use it. Faster to write, faster to iterate. The LXD provider may at some point be split from GARM into its own external project, at which point we will remove the native provider interface and only support external providers.

Please see the Writing an external provider document for details. Also, feel free to inspect the two available sample external providers in this repository.
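
To give a feel for the "external executable that GARM calls into" idea, here is a minimal sketch of how such an executable might route a requested operation to a handler. The GARM_COMMAND environment variable, the command names and the JSON output shown are assumptions made for illustration only; the handler type and dispatch function are hypothetical. The Writing an external provider document remains the reference for the real contract.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// handler is a hypothetical callback that performs one operation and
// returns the text the executable should print on stdout.
type handler func() (string, error)

var errUnknownCommand = errors.New("unknown command")

// dispatch looks up the handler registered for command and runs it.
func dispatch(command string, handlers map[string]handler) (string, error) {
	h, ok := handlers[command]
	if !ok {
		return "", fmt.Errorf("%w: %q", errUnknownCommand, command)
	}
	return h()
}

func main() {
	// Command names and outputs below are placeholders (assumptions).
	handlers := map[string]handler{
		"CreateInstance": func() (string, error) {
			// A real provider would create a VM or container here.
			return `{"name": "example-runner"}`, nil
		},
		"ListInstances": func() (string, error) {
			return "[]", nil
		},
	}

	// Assumed: the caller names the operation in an environment variable.
	out, err := dispatch(os.Getenv("GARM_COMMAND"), handlers)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(out)
}
```

Because the executable only reads its input and writes its result, the same structure can be written in any language, which is what makes external providers quick to iterate on.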