This document will walk you through the various commands and options available in GARM. It is assumed that you have already installed GARM and have it running. If you haven't, please check out the [quickstart](/doc/quickstart.md) document for instructions on how to install GARM.
While using the GARM cli, you will most likely spend most of your time listing pools and runners, but we will cover most of the available commands and options. Some of them we'll skip (like the `init` or `profile` subcommands), as they've been covered in the [quickstart](/doc/quickstart.md) document.
The `controller` is essentially GARM itself. Every deployment of GARM will have its own controller ID which will be used to tag runners in github. The controller is responsible for managing runners, webhooks, repositories, organizations and enterprises. There are a few settings at the controller level which you can tweak, which we will cover below.
*`Controller ID` - This is the unique identifier of the controller. Each GARM installation, on first run will automatically generate a unique controller ID. This is important for several reasons. For one, it allows us to run several GARM controllers on the same repos/orgs/enterprises, without accidentally clashing with each other. Each runner started by a GARM controller, will be tagged with this controller ID in order to easily identify runners that we manage.
*`Metadata URL` - This URL is configured by the user, and is the URL that is presented to the runners via userdata when they get set up. Runners will connect to this URL and retrieve information they might need to set themselves up. GARM cannot automatically determine this URL, as it is dependent on the user's network setup. GARM may be hidden behind a load balancer or a reverse proxy, in which case, the URL by which the GARM controller can be accessed may be different than the IP addresses that are locally visible to GARM. Runners must be able to connect to this URL.
*`Callback URL` - This URL is configured by the user, and is the URL that is presented to the runners via userdata when they get set up. Runners will connect to this URL and send status updates and system information (OS version, OS name, github runner agent ID, etc) to the controller. Runners must be able to connect to this URL.
*`Webhook Base URL` - This is the base URL for webhooks. It is configured by the user in the GARM config file. This URL can be called into by GitHub itself when hooks get triggered by a workflow. GARM needs to know when a new job is started in order to schedule the creation of a new runner. Job webhooks sent to this URL will be recorded by GARM and acted upon. While you can configure this URL directly in your GitHub repo settings, it is advised to use the `Controller Webhook URL` instead, as it is unique to each controller, and allows you to potentially install multiple GARM controller inside the same repo. Github must be able to connect to this URL.
*`Controller Webhook URL` - This is the URL that GitHub will call into when a webhook is triggered. This URL is unique to each GARM controller and is the preferred URL to use in order to receive webhooks from GitHub. It serves the same purpose as the `Webhook Base URL`, but is unique to each controller, allowing you to potentially install multiple GARM controllers inside the same repo. Github must be able to connect to this URL.
*`Minimum Job Age Backoff` - This is the job age in seconds, after which GARM will consider spinning up a new runner to handle it. By default GARM waits for 30 seconds after receiving a new job, before it spins up a runner. This delay is there to allow any existing idle runners (managed by GARM or not) to pick up the job, before reacting to it. This way we avoid being too eager and spin up a runner for a job that would have been picked up by an existing runner anyway. You can set this to 0 if you want GARM to react immediately.
*`Version` - This is the version of GARM that is running.
These URLs depend heavily on how GARM was set up and what the network topology of the user is set up. GARM may be behind a NAT or reverse proxy. There may be different hostnames/URL paths set up for each of the above, etc. The short of it is that we cannot determine these URLs reliably and we must ask the user to tell GARM what they are.
We can assume that the URL that the user logs in at to manage garm is the same URL that the rest of the URLs are present at, but that is just an assumption. By default, when you initialize GARM for the first time, we make this assumption to make things easy. It's also safe to assume that most users will do this anyway, but in case you don't, you will need to update the URLs in the controller and tell GARM what they are.
In the previous section we saw that most URLs were set to `https://garm.example.com`. The URL path was the same as the routes that GARM sets up. For example, the `metadata_url` has `/api/v1/metadata`. The `callback_url` has `/api/v1/callbacks` and the `webhook_url` has `/webhooks`. This is the default setup and is what most users will use.
If you need to update these URLs, you can use the following command:
The `Controller Webhook URL` you saw in the previous section is automatically calculated by GARM and is essentially the `webhook_url` with the controller ID appended to it. This URL is unique to each controller and is the preferred URL to use in order to receive webhooks from GitHub.
After updating the URLs, make sure that they are properly routed to the appropriate API endpoint in GARM **and** that they are accessible by the interested parties (runners or github).
Endpoints are the way that GARM identifies where the credentials and entities you create are located and where the API endpoints for the GitHub API can be reached, along with a possible CA certificate that validates the connection. There is a default endpoint for `github.com`, so you don't need to add it, unless you're using GHES.
GARM needs access to your GitHub repositories, organizations or enterprise in order to manage runners. This is done via a [GitHub personal access token or via a GitHub App](/doc/github_credentials.md). You can configure multiple tokens or apps with access to various repositories, organizations or enterprises, either on GitHub or on GitHub Enterprise Server.
To add each of these types of credentials, slightly different command line arguments (obviously) are required. I'm going to give you an example of both.
> **NOTE**: You may not delete credentials that are currently associated with a repository, organization or enterprise. You will need to first replace the credentials on the entity, and then you can delete the credentials.
To add a new repository we need to use credentials that has access to the repository. We've listed credentials above, so let's add our first repository:
Lets break down the command a bit and explain what happened above. We added a new repository to GARM, that belogs to the user `gabriel-samfira` and is called `garm`. When using GitHub, this translates to `https://github.com/gabriel-samfira/garm`.
As part of the above command, we used the credentials called `gabriel` to authenticate to GitHub. If those credentials didn't have access to the repository, we would have received an error when adding the repo.
The other interesting bit about the above command is that we automatically added the `webhook` to the repository and generated a secure random secret to authenticate the webhooks that come in from GitHub for this new repo. Any webhook claiming to be for the `gabriel-samfira/garm` repo, will be validated against the secret that was generated.
Another important aspect to remember is that once the entity (in this case a repository) is created, the credentials associated with the repo at creation time, dictates the GitHub endpoint in which this repository exists.
When updating credentials for this entity, the new credentials **must** be associated with the same endpoint as the old ones. An error is returned if the repo is associated with `github.com` but the new credentials you're trying to set are associated with a GHES endpoint.
> **NOTE**: GARM will not remove a webhook that points to the `Base Webhook URL`. It will only remove webhooks that are namespaced to the running controller.
Adding a new organization is similar to adding a new repository. You need to use credentials that have access to the organization, and you can add the organization to GARM using the following command:
This will add the organization `gsamfira` to GARM, and install a webhook for it. The webhook will be validated against the secret that was generated. The only difference between adding an organization and adding a repository is that you use the `organization` subcommand instead of the `repository` subcommand, and the `--name` option represents the `name` of the organization.
Managing webhooks for organizations is similar to managing webhooks for repositories. You can *list*, *show*, *install* and *uninstall* webhooks for organizations using the `garm-cli organization webhook` subcommand. We won't go into details here, as it's similar to managing webhooks for repositories.
All the other operations that exist on repositories, like listing, removing, etc, also exist for organizations and enterprises. Check out the help for the `garm-cli organization` subcommand for more details.
Enterprises are a bit special. Currently we don't support managing webhooks for enterprises, mainly because the level of access that would be required to do so seems a bit too much to enable in GARM itself. And considering that you'll probably ever only have one enterprise with multiple organizations and repositories, the effort/risk to benefit ratio makes this feature not worth implementing at the moment.
To add an enterprise to GARM, you can use the following command:
The `name` of the enterprise is the ["slug" of the enterprise](https://docs.github.com/en/enterprise-cloud@latest/admin/managing-your-enterprise-account/creating-an-enterprise-account).
You will then have to manually add the `Controller Webhook URL` to the enterprise in the GitHub UI.
All the other operations that exist on repositories, like listing, removing, etc, also exist for organizations and enterprises. Have a look at the help for the `garm-cli enterprise` subcommand for more details.
At that point the enterprise will be added to GARM and you can start managing runners for it.
Webhook management is available for repositories and organizations. I'm going to show you how to manage webhooks for a repository, but the same commands apply for organizations. See `--help` for more details.
When we added the repository in the previous section, we specified the `--install-webhook` and the `--random-webhook-secret` options. These two options automatically added a webhook to the repository and generated a random secret for it. The `webhook` URL that was used, will correspond to the `Controller Webhook URL` that we saw earlier when we listed the controller info. Let's list it and see what it looks like:
```bash
ubuntu@garm:~$ garm-cli repository webhook show be3a0673-56af-4395-9ebf-4521fea67567
We can see that it's active, and the events to which it subscribed.
The `--install-webhook` and `--random-webhook-secret` options are convenience options that allow you to quickly add a new repository to GARM and have it ready to receive webhooks from GitHub. As long as you configured the URLs correctly (see previous sections for details), you should see a green checkmark in the GitHub settings page, under `Webhooks`.
If you don't want to install the webhook, you can add the repository without it, and then install it later using the `garm-cli repository webhook install` command (which we'll show in a second) or manually add it in the GitHub UI.
To uninstall a webhook from a repository, you can use the following command:
To allow GARM to manage webhooks, the PAT or app you're using must have the `admin:repo_hook` and `admin:org_hook` scopes (or equivalent). Webhook management is not available for enterprises. For enterprises you will have to add the webhook manually.
To manually add a webhook, see the [webhooks](/doc/webhooks.md) section.
Now that we have a repository, organization or enterprise added to GARM, we can create a runner pool for it. A runner pool is a collection of runners of the same type, that are managed by GARM and are used to run workflows for the repository, organization or enterprise.
You can create multiple pools of runners for the same entity (repository, organization or enterprise), and you can create multiple pools of runners, each pool defining different runner types. For example, you can have a pool of runners that are created on AWS, and another pool of runners that are created on Azure, k8s, LXD, etc. For repositories or organizations with complex needs, you can set up a number of pools that cover a wide range of needs, based on cost, capability (GPUs, FPGAs, etc) or sheer raw computing power. You don't have to pick just one, especially since managing all of them is done using the exact same commands, as we'll show below.
Before we create a pool, we have to decide which provider we want to use. We've listed the providers above, so let's pick one and create a pool of runners for our repository. For the purpose of this example, we'll use the `incus` provider. We'll show you how to create a pool using this provider, but keep in mind that adding another pool using a different provider is done using the exact same commands. The only difference will be in the `--image`, `--flavor` and `--extra-specs` options that you'll use when creating the pool.
Out of those three options, only the `--image` and `--flavor` are mandatory. The `--extra-specs` flag is optional and is used to pass additional information to the provider when creating the pool. The `--extra-specs` option is provider specific, and you'll have to consult the provider documentation to see what options are available.
Let's unpack the command and explain what happened above. We added a new pool of runners to GARM, that belongs to the `gabriel-samfira/garm` repository. We used the `incus` provider to create the pool, and we specified the `--image` and `--flavor` options to tell the provider what kind of runners we want to create. On Incus and LXD, the flavor maps to a `profile`. The profile can specify the resources allocated to a container or VM (RAM, CPUs, disk space, etc). The image maps to an incus or LXD image, as you would normally use when spinning up a new container or VM using the `incus launch` command.
We also specified the `--min-idle-runners` option to tell GARM to always keep at least 1 runner idle in the pool. This is useful for repositories that have a lot of workflows that run often, and we want to make sure that we always have a runner ready to pick up a job.
If we review the output of the command, we can see that the pool was created with a maximum number of 5 runners. This is just a default we can tweak when creating the pool, or later using the `garm-cli pool update` command. We can also see that the pool was created with a runner botstrap timeout of 20 minutes. This timeout is important on provider where the instance may take a long time to spin up. For example, on Equinix Metal, some operating systems can take a few minutes to install and reboot. This timeout can be tweaked to a higher value to account for this.
The pool was created with the `--enabled` flag set to `false`, so the pool won't create any runners yet:
```bash
ubuntu@garm:~$ garm-cli runner list 9daa34aa-a08a-4f29-a782-f54950d8521a
In order to delete a pool, you must first make sure there are no runners in the pool. To ensure this, we can first disable the pool, to make sure no new runners are created, remove the runners or allow them to be user, then we can delete the pool.
To disable a pool, you can use the following command:
```bash
ubuntu@garm:~$ garm-cli pool update 9daa34aa-a08a-4f29-a782-f54950d8521a --enabled=false
You can update a pool by using the `garm-cli pool update` command. Nearly every aspect of a pool can be updated after it has been created. To demonstrate the command, we can enable the pool we created earlier:
```bash
ubuntu@garm:~$ garm-cli pool update 9daa34aa-a08a-4f29-a782-f54950d8521a --enabled=true
You can delete a runner by running the following command:
```bash
garm-cli runner rm garm-BFrp51VoVBCO
```
Only idle runners can be removed. If a runner is executing a job, it cannot be removed. However, a runner that is currently running a job, will be removed anyway when that job finishes. You can wait for the job to finish or you can cancel the job from the github workflow page.
In some cases, providers may error out when creating or deleting a runner. This can happen if the provider is misconfigured. To avoid situations in which GARM gets deadlocked trying to remove a runner from a provider that is in err, we can forcefully remove a runner. The `--force` flag will make GARM ignore any error returned by the provider when attempting to delete an instance:
Awesome! We've covered all the major parts of using GARM. This is all you need to have your workflows run on your self-hosted runners. Of course, each provider may have its own particularities, config options, extra specs and caveats (all of which should be documented in the provider README), but once added to the GARM config, creating a pool should be the same.
## The debug-log command
GARM outputs logs to standard out, log files and optionally to a websocket for easy debugging. This is just a convenience feature that allows you to stream logs to your terminal without having to log into the server. It's disabled by default, but if you enable it, you'll be able to run:
This will bring a real-time log to your terminal. While this feature should be fairly secure, I encourage you to only expose it within networks you know are secure. This can be done by configuring a reverse proxy in front of GARM that only allows connections to the websocket endpoint from certain locations.
Starting with GARM v0.1.5 a new command has been added to the CLI that consumes database events recorded by GARM. Whenever something is updated in the database, a new event is generated. These events are generated by the database watcher and are also exported via a websocket endpoint. This websocket endpoint is meant to be consumed by applications that wish to integrate GARM and want to avoid having to poll the API.
This command is not meant to be used to integrate GARM events, it is mearly a debug tool that allows you to see what events are being generated by GARM. To use it, you can run:
The payloads that get sent to your terminal are described in the [events](/doc/events.md) section, but the short description is that you get the operation type (create, update, delete), the entity type (instance, pool, repo, etc) and the json payload as you normaly would when you fetch them through the API. Sensitive info like tokens or passwords are never returned.
GARM will record any job that comes in and for which we have a pool configured. If we don't have a pool for a particular job, then that job is ignored. There is no point in recording jobs that we can't do anything about. It would just bloat the database for no reason.
To view existing jobs, run the following command:
```bash
garm-cli job list
```
If you've just set up GARM and have not yet created a pool or triggered a job, this will be empty. If you've configured everything and still don't receive jobs, you'll need to make sure that your URLs (discussed at the begining of this article), are correct. GitHub needs to be able to reach the webhook URL that our GARM instance listens on.