Restructure docs
Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
parent 6640599584
commit 1a54bcfc35
15 changed files with 787 additions and 509 deletions
40
README.md
|
|
@ -2,16 +2,38 @@
|
||||||
|
|
||||||
[](https://github.com/cloudbase/garm/actions/workflows/go-tests.yml)
|
[](https://github.com/cloudbase/garm/actions/workflows/go-tests.yml)
|
||||||
|
|
||||||
|
<!-- TOC -->
|
||||||
|
|
||||||
|
- [About GARM](#about-garm)
|
||||||
|
- [Join us on slack](#join-us-on-slack)
|
||||||
|
- [Installing](#installing)
|
||||||
|
- [Quickstart](#quickstart)
|
||||||
|
- [Installing on Kubernetes](#installing-on-kubernetes)
|
||||||
|
- [Using GARM](#using-garm)
|
||||||
|
- [Supported providers](#supported-providers)
|
||||||
|
- [Installing external providers](#installing-external-providers)
|
||||||
|
- [Optimizing your runners](#optimizing-your-runners)
|
||||||
|
- [Write your own provider](#write-your-own-provider)
|
||||||
|
|
||||||
|
<!-- /TOC -->
|
||||||
|
|
||||||
|
## About GARM
|
||||||
|
|
||||||
Welcome to GARM!
|
Welcome to GARM!
|
||||||
|
|
||||||
GARM enables you to create and automatically maintain auto-scaling pools of [self-hosted GitHub runners](https://docs.github.com/en/actions/hosting-your-own-runners/about-self-hosted-runners) that can be used inside your GitHub workflow runs.
|
GARM enables you to create and automatically maintain auto-scaling pools of [self-hosted GitHub runners](https://docs.github.com/en/actions/hosting-your-own-runners/about-self-hosted-runners) that can be used inside your GitHub workflow runs.
|
||||||
|
|
||||||
The goal of ```GARM``` is to be simple to set up, simple to configure and simple to use. The server itself is a single binary that can run on any GNU/Linux machine with no requirements other than the providers you want to enable in your setup. It is intended to be easy to deploy in any environment and can create runners in virtually any system you can write a provider for. There is no complicated setup process and no extremely complex concepts to understand. Once set up, it's meant to stay out of your way.
|
The goal of ```GARM``` is to be simple to set up, simple to configure and simple to use. The server itself is a single binary that can run on any GNU/Linux machine with no requirements other than the providers you want to enable in your setup. It is intended to be easy to deploy in any environment and can create runners in virtually any system you can write a provider for. There is no complicated setup process and no extremely complex concepts to understand. Once set up, it's meant to stay out of your way.
|
||||||
|
|
||||||
GARM supports creating pools in either GitHub itself or in your own deployment of [GitHub Enterprise Server](https://docs.github.com/en/enterprise-server@3.5/admin/overview/about-github-enterprise-server). For instructions on how to use ```GARM``` with GHE, see the [credentials](/doc/github_credentials.md) section of the documentation.
|
GARM supports creating pools in either GitHub itself or in your own deployment of [GitHub Enterprise Server](https://docs.github.com/en/enterprise-server@3.10/admin/overview/about-github-enterprise-server). For instructions on how to use ```GARM``` with GHE, see the [credentials](/doc/github_credentials.md) section of the documentation.
|
||||||
|
|
||||||
Through the use of providers, `GARM` can create runners in a variety of environments using the same `GARM` instance. Whether you want to create pools of runners in your OpenStack cloud, your Azure cloud or your Kubernetes cluster, that is easily achieved by just installing the appropriate providers, configuring them in `GARM` and creating pools that use them. You can create zero-runner pools for instances with high costs (large VMs, GPU-enabled instances, etc.) and have them spin up on demand, or you can create large pools of eagerly created k8s-backed runners that can be used for your CI/CD pipelines at a moment's notice. You can mix them up and create pools in any combination of providers or resource allocations you want.
|
Through the use of providers, `GARM` can create runners in a variety of environments using the same `GARM` instance. Whether you want to create pools of runners in your OpenStack cloud, your Azure cloud or your Kubernetes cluster, that is easily achieved by just installing the appropriate providers, configuring them in `GARM` and creating pools that use them. You can create zero-runner pools for instances with high costs (large VMs, GPU-enabled instances, etc.) and have them spin up on demand, or you can create large pools of eagerly created k8s-backed runners that can be used for your CI/CD pipelines at a moment's notice. You can mix them up and create pools in any combination of providers or resource allocations you want.
|
||||||
|
|
||||||
|
Here is a brief architectural diagram of how GARM reacts to workflows triggered in GitHub (click the image to see a larger version):
|
||||||
|
|
||||||
|

|
||||||
|

|
||||||
|
|
||||||
:warning: **Important note**: The README and documentation in the `main` branch describe the not-yet-released code that is present in `main`. Following the documentation from the `main` branch for a stable release of GARM may lead to errors. To view the documentation for the latest stable release, please switch to the appropriate tag. For information about setting up `v0.1.5`, please refer to the [v0.1.5 tag](https://github.com/cloudbase/garm/tree/v0.1.5).
|
:warning: **Important note**: The README and documentation in the `main` branch describe the not-yet-released code that is present in `main`. Following the documentation from the `main` branch for a stable release of GARM may lead to errors. To view the documentation for the latest stable release, please switch to the appropriate tag. For information about setting up `v0.1.5`, please refer to the [v0.1.5 tag](https://github.com/cloudbase/garm/tree/v0.1.5).
|
||||||
|
|
||||||
## Join us on slack
|
## Join us on slack
|
||||||
|
|
@ -22,26 +44,14 @@ Whether you're running into issues or just want to drop by and say "hi", feel fr
|
||||||
|
|
||||||
## Installing
|
## Installing
|
||||||
|
|
||||||
### On virtual or physical machines
|
### Quickstart
|
||||||
|
|
||||||
Check out the [quickstart](/doc/quickstart.md) document for instructions on how to install ```GARM```. If you'd like to build from source, check out the [building from source](/doc/building_from_source.md) document.
|
Check out the [quickstart](/doc/quickstart.md) document for instructions on how to install ```GARM```. If you'd like to build from source, check out the [building from source](/doc/building_from_source.md) document.
|
||||||
|
|
||||||
### On Kubernetes
|
### Installing on Kubernetes
|
||||||
|
|
||||||
Thanks to the efforts of the amazing folks at [@mercedes-benz](https://github.com/mercedes-benz/), GARM can now be integrated into k8s via their operator. Check out the [GARM operator](https://github.com/mercedes-benz/garm-operator/) for more details.
|
Thanks to the efforts of the amazing folks at [@mercedes-benz](https://github.com/mercedes-benz/), GARM can now be integrated into k8s via their operator. Check out the [GARM operator](https://github.com/mercedes-benz/garm-operator/) for more details.
|
||||||
|
|
||||||
## Configuration
|
|
||||||
|
|
||||||
The ```GARM``` configuration is a simple ```toml```. The sample config file in [the testdata folder](/testdata/config.toml) is fairly well commented and should be enough to get you started. The configuration file is split into several sections, each of which is documented in its own page. The sections are:
|
|
||||||
|
|
||||||
* [The default section](/doc/config_default.md)
|
|
||||||
* [Logging](/doc/config_logging.md)
|
|
||||||
* [Database](/doc/database.md)
|
|
||||||
* [Providers](/doc/providers.md)
|
|
||||||
* [Metrics](/doc/config_metrics.md)
|
|
||||||
* [JWT authentication](/doc/config_jwt_auth.md)
|
|
||||||
* [API server](/doc/config_api_server.md)
|
|
||||||
|
|
||||||
## Using GARM
|
## Using GARM
|
||||||
|
|
||||||
GARM is designed with simplicity in mind. At least we try to keep it as simple as possible. We're aware that adding a new tool in your workflow can be painful, especially when you already have to deal with so many. The cognitive load for OPS has reached a level where it feels overwhelming at times to even wrap your head around a new tool. As such, we believe that tools should be simple, should take no more than a few hours to understand and set up and if you absolutely need to interact with the tool, it should be as intuitive as possible. Although we try our best to make this happen, we're aware that GARM has some rough edges, especially for new users. If you encounter issues or feel like the setup process was too complicated, please let us know. We're always looking to improve the user experience.
|
GARM is designed with simplicity in mind. At least we try to keep it as simple as possible. We're aware that adding a new tool in your workflow can be painful, especially when you already have to deal with so many. The cognitive load for OPS has reached a level where it feels overwhelming at times to even wrap your head around a new tool. As such, we believe that tools should be simple, should take no more than a few hours to understand and set up and if you absolutely need to interact with the tool, it should be as intuitive as possible. Although we try our best to make this happen, we're aware that GARM has some rough edges, especially for new users. If you encounter issues or feel like the setup process was too complicated, please let us know. We're always looking to improve the user experience.
|
||||||
|
|
|
||||||
|
|
@ -5,8 +5,8 @@ import (
|
||||||
)
|
)
|
||||||
|
|
||||||
type Filter struct {
|
type Filter struct {
|
||||||
Operations []common.OperationType `json:"operations"`
|
Operations []common.OperationType `json:"operations,omitempty" jsonschema:"title=operations,description=A list of operations to filter on,default=[],enum=create,enum=update,enum=delete"`
|
||||||
EntityType common.DatabaseEntityType `json:"entity-type"`
|
EntityType common.DatabaseEntityType `json:"entity-type,omitempty" jsonschema:"title=entity type,description=The type of entity to filter on,default=repository,enum=repository,enum=organization,enum=enterprise,enum=pool,enum=user,enum=instance,enum=job,enum=controller,enum=github_credentials,enum=github_endpoint"`
|
||||||
}
|
}
|
||||||
|
|
||||||
func (f Filter) Validate() error {
|
func (f Filter) Validate() error {
|
||||||
|
|
@ -30,8 +30,8 @@ func (f Filter) Validate() error {
|
||||||
}
|
}
|
||||||
|
|
||||||
type Options struct {
|
type Options struct {
|
||||||
SendEverything bool `json:"send-everything"`
|
SendEverything bool `json:"send-everything,omitempty" jsonschema:"title=send everything,description=send all events,default=false"`
|
||||||
Filters []Filter `json:"filters"`
|
Filters []Filter `json:"filters,omitempty" jsonschema:"title=filters,description=A list of filters to apply to the events. This is ignored when send-everything is true,default=[]"`
|
||||||
}
|
}
|
||||||
|
|
||||||
func (o Options) Validate() error {
|
func (o Options) Validate() error {
|
||||||
|
|
|
||||||
480
doc/config.md
Normal file
|
|
@ -0,0 +1,480 @@
|
||||||
|
# Configuration
|
||||||
|
|
||||||
|
The ```GARM``` configuration is a simple ```toml``` file. The sample config file in [the testdata folder](/testdata/config.toml) is fairly well commented and should be enough to get you started. The configuration file is split into several sections, each of which is documented on this page. The sections are:
|
||||||
|
|
||||||
|
<!-- TOC -->
|
||||||
|
|
||||||
|
- [Configuration](#configuration)
|
||||||
|
- [The default config section](#the-default-config-section)
|
||||||
|
- [The callback_url option](#the-callback_url-option)
|
||||||
|
- [The metadata_url option](#the-metadata_url-option)
|
||||||
|
- [The debug_server option](#the-debug_server-option)
|
||||||
|
- [The log_file option](#the-log_file-option)
|
||||||
|
- [Rotating log files](#rotating-log-files)
|
||||||
|
- [The enable_log_streamer option](#the-enable_log_streamer-option)
|
||||||
|
- [The logging section](#the-logging-section)
|
||||||
|
- [Database configuration](#database-configuration)
|
||||||
|
- [Provider configuration](#provider-configuration)
|
||||||
|
- [Providers](#providers)
|
||||||
|
- [Available external providers](#available-external-providers)
|
||||||
|
- [The metrics section](#the-metrics-section)
|
||||||
|
- [Common metrics](#common-metrics)
|
||||||
|
- [Enterprise metrics](#enterprise-metrics)
|
||||||
|
- [Organization metrics](#organization-metrics)
|
||||||
|
- [Repository metrics](#repository-metrics)
|
||||||
|
- [Provider metrics](#provider-metrics)
|
||||||
|
- [Pool metrics](#pool-metrics)
|
||||||
|
- [Runner metrics](#runner-metrics)
|
||||||
|
- [Github metrics](#github-metrics)
|
||||||
|
- [Enabling metrics](#enabling-metrics)
|
||||||
|
- [Configuring prometheus](#configuring-prometheus)
|
||||||
|
- [The JWT authentication config section](#the-jwt-authentication-config-section)
|
||||||
|
- [The API server config section](#the-api-server-config-section)
|
||||||
|
|
||||||
|
<!-- /TOC -->
|
||||||
|
|
||||||
|
## The default config section
|
||||||
|
|
||||||
|
The `default` config section holds configuration options that don't need a category of their own, but are essential to the operation of the service. In this section we will detail each of the options available in the `default` section.
|
||||||
|
|
||||||
|
```toml
|
||||||
|
[default]
|
||||||
|
# Uncomment this line if you'd like to log to a file instead of standard output.
|
||||||
|
# log_file = "/tmp/runner-manager.log"
|
||||||
|
|
||||||
|
# Enable streaming logs via web sockets. Use garm-cli debug-log.
|
||||||
|
enable_log_streamer = false
|
||||||
|
|
||||||
|
# Enable the golang debug server. See the documentation in the "doc" folder for more information.
|
||||||
|
debug_server = false
|
||||||
|
```
|
||||||
|
|
||||||
|
### The callback_url option
|
||||||
|
|
||||||
|
Your runners will call back home with status updates as they install. Once they are set up, they will also send the GitHub agent ID they were allocated. You will need to configure the ```callback_url``` option in the ```garm``` server config. This URL needs to point to the following API endpoint:
|
||||||
|
|
||||||
|
```txt
|
||||||
|
POST /api/v1/callbacks/status
|
||||||
|
```
|
||||||
|
|
||||||
|
Example of a runner sending status updates:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
garm-cli runner show garm-DvxiVAlfHeE7
|
||||||
|
+-----------------+------------------------------------------------------------------------------------+
|
||||||
|
| FIELD | VALUE |
|
||||||
|
+-----------------+------------------------------------------------------------------------------------+
|
||||||
|
| ID | 16b96ba2-d406-45b8-ab66-b70be6237b4e |
|
||||||
|
| Provider ID | garm-DvxiVAlfHeE7 |
|
||||||
|
| Name | garm-DvxiVAlfHeE7 |
|
||||||
|
| OS Type | linux |
|
||||||
|
| OS Architecture | amd64 |
|
||||||
|
| OS Name | ubuntu |
|
||||||
|
| OS Version | jammy |
|
||||||
|
| Status | running |
|
||||||
|
| Runner Status | idle |
|
||||||
|
| Pool ID | 8ec34c1f-b053-4a5d-80d6-40afdfb389f9 |
|
||||||
|
| Addresses | 10.198.117.120 |
|
||||||
|
| Status Updates | 2023-07-08T06:26:46: runner registration token was retrieved |
|
||||||
|
| | 2023-07-08T06:26:46: using cached runner found in /opt/cache/actions-runner/latest |
|
||||||
|
| | 2023-07-08T06:26:50: configuring runner |
|
||||||
|
| | 2023-07-08T06:26:56: runner successfully configured after 1 attempt(s) |
|
||||||
|
| | 2023-07-08T06:26:56: installing runner service |
|
||||||
|
| | 2023-07-08T06:26:56: starting service |
|
||||||
|
| | 2023-07-08T06:26:57: runner successfully installed |
|
||||||
|
+-----------------+------------------------------------------------------------------------------------+
|
||||||
|
|
||||||
|
```
|
||||||
|
|
||||||
|
This URL must be set and must be reachable from the instances. If you wish to restrict access to it, a reverse proxy can be configured to accept requests only from the networks in which the runners that ```garm``` manages will be spun up. The URL doesn't need to be globally accessible; it only needs to be reachable from the instances.
|
||||||
|
|
||||||
|
For example, in a scenario where you expose the API endpoint directly, this setting could look like the following:
|
||||||
|
|
||||||
|
```toml
|
||||||
|
callback_url = "https://garm.example.com/api/v1/callbacks"
|
||||||
|
```
|
||||||
|
|
||||||
|
Authentication is done using a short-lived JWT token that gets generated for the particular instance being spun up. That JWT token only allows the instance to update its own status and to fetch metadata for itself. No other API endpoints will work with that JWT token. The validity of the token is equal to the pool bootstrap timeout value (default 20 minutes) plus the garm polling interval (5 minutes).
|
||||||
|
|
||||||
|
There is a sample ```nginx``` config [in the testdata folder](/testdata/nginx-server.conf). Feel free to customize it whichever way you see fit.
|
||||||
|
|
||||||
|
### The metadata_url option
|
||||||
|
|
||||||
|
The metadata URL is the base URL for any information an instance may need to fetch in order to finish setting itself up. As this URL may be placed behind a reverse proxy, you'll need to configure it in the ```garm``` config file. Ultimately this URL will need to point to the following ```garm``` API endpoint:
|
||||||
|
|
||||||
|
```txt
|
||||||
|
GET /api/v1/metadata
|
||||||
|
```
|
||||||
|
|
||||||
|
This URL needs to be accessible only by the instances ```garm``` sets up. This URL will not be used by anyone else. To configure it in ```garm``` add the following line in the ```[default]``` section of your ```garm``` config:
|
||||||
|
|
||||||
|
```toml
|
||||||
|
metadata_url = "https://garm.example.com/api/v1/metadata"
|
||||||
|
```
|
||||||
|
|
||||||
|
### The debug_server option
|
||||||
|
|
||||||
|
GARM can optionally enable the golang profiling server. This is useful if you suspect garm may have a bottleneck somewhere. To enable the profiling server, add the following to the garm config:
|
||||||
|
|
||||||
|
```toml
|
||||||
|
[default]
|
||||||
|
|
||||||
|
debug_server = true
|
||||||
|
```
|
||||||
|
|
||||||
|
And restart garm. You can then use the following command to start profiling:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
go tool pprof http://127.0.0.1:9997/debug/pprof/profile?seconds=120
|
||||||
|
```
|
||||||
|
|
||||||
|
An important note on profiling when behind a reverse proxy: the above command blocks for the full sampling window (120 seconds in this example), while most reverse proxies time out after about 60 seconds. To avoid this, you should only profile on localhost by connecting directly to garm.
|
||||||
|
|
||||||
|
It's also advisable to exclude the debug server URLs from your reverse proxy and only make them available locally.
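For readers unfamiliar with the Go profiling endpoints, the sketch below shows how a Go service typically exposes them through the standard `net/http/pprof` package. Whether GARM wires its debug server exactly this way is an assumption. The example uses a throwaway `httptest` server so that it can run anywhere; a real service would bind to a loopback address such as the `127.0.0.1:9997` used in the command above.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	_ "net/http/pprof" // side effect: registers /debug/pprof/* on http.DefaultServeMux
)

// pprofIndexStatus starts a temporary local server backed by the default mux
// and returns the HTTP status code of the pprof index page.
func pprofIndexStatus() (int, error) {
	srv := httptest.NewServer(http.DefaultServeMux)
	defer srv.Close()

	resp, err := http.Get(srv.URL + "/debug/pprof/")
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

func main() {
	status, err := pprofIndexStatus()
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println("GET /debug/pprof/ ->", status)
}
```

Because the handlers live under a single `/debug/pprof/` prefix, excluding them from a reverse proxy usually comes down to one path rule.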
|
||||||
|
|
||||||
|
Now that the debug server is enabled, here is a blog post on how to profile golang applications: https://blog.golang.org/profiling-go-programs
|
||||||
|
|
||||||
|
|
||||||
|
### The log_file option
|
||||||
|
|
||||||
|
By default, GARM logs everything to standard output.
|
||||||
|
|
||||||
|
You can optionally log to file by adding the following to your config file:
|
||||||
|
|
||||||
|
```toml
|
||||||
|
[default]
|
||||||
|
# Use this if you'd like to log to a file instead of standard output.
|
||||||
|
log_file = "/tmp/runner-manager.log"
|
||||||
|
```
|
||||||
|
|
||||||
|
#### Rotating log files
|
||||||
|
|
||||||
|
GARM automatically rotates the log if it reaches 500 MB in size or 28 days, whichever comes first.
|
||||||
|
|
||||||
|
However, if you want to manually rotate the log file, you can send a `SIGHUP` signal to the GARM process.
|
||||||
|
|
||||||
|
You can add the following to your systemd unit file to enable `reload`:
|
||||||
|
|
||||||
|
```ini
|
||||||
|
[Service]
|
||||||
|
ExecReload=/bin/kill -HUP $MAINPID
|
||||||
|
```
|
||||||
|
|
||||||
|
Then you can simply:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
systemctl reload garm
|
||||||
|
```
|
||||||
|
|
||||||
|
### The enable_log_streamer option
|
||||||
|
|
||||||
|
This option allows you to stream garm logs directly to your terminal. Set this option to `true`, then use the following command to stream logs:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
garm-cli debug-log
|
||||||
|
```
|
||||||
|
|
||||||
|
An important note on enabling this option when behind a reverse proxy: the log streamer uses websockets to stream logs to you, so you will need to configure your reverse proxy to allow websocket connections. If you're using nginx, you will need to add the following to your nginx `server` config:
|
||||||
|
|
||||||
|
```nginx
|
||||||
|
location /api/v1/ws {
|
||||||
|
proxy_pass http://garm_backend;
|
||||||
|
proxy_http_version 1.1;
|
||||||
|
proxy_set_header Upgrade $http_upgrade;
|
||||||
|
proxy_set_header Connection "Upgrade";
|
||||||
|
proxy_set_header Host $host;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
## The logging section
|
||||||
|
|
||||||
|
GARM has switched to the `slog` package for logging, adding structured logging. As such, we added a dedicated `logging` section to the config to tweak the logging settings. We moved the `enable_log_streamer` and the `log_file` options from the `default` section to the `logging` section. They are still available in the `default` section for backwards compatibility, but they are deprecated and will be removed in a future release.
|
||||||
|
|
||||||
|
An example of the new `logging` section:
|
||||||
|
|
||||||
|
```toml
|
||||||
|
[logging]
|
||||||
|
# Uncomment this line if you'd like to log to a file instead of standard output.
|
||||||
|
# log_file = "/tmp/runner-manager.log"
|
||||||
|
|
||||||
|
# enable_log_streamer enables streaming the logs over websockets
|
||||||
|
enable_log_streamer = true
|
||||||
|
# log_format is the output format of the logs. GARM uses structured logging and can
|
||||||
|
# output as "text" or "json"
|
||||||
|
log_format = "text"
|
||||||
|
# log_level is the logging level GARM will output. Available log levels are:
|
||||||
|
# * debug
|
||||||
|
# * info
|
||||||
|
# * warn
|
||||||
|
# * error
|
||||||
|
log_level = "debug"
|
||||||
|
# log_source will output information about the function that generated the log line.
|
||||||
|
log_source = false
|
||||||
|
```
|
||||||
|
|
||||||
|
By default GARM logs everything to standard output. You can optionally log to file by adding the `log_file` option to the `logging` section. The `enable_log_streamer` option allows you to stream GARM logs directly to your terminal. Set this option to `true`, then you can use the following command to stream logs:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
garm-cli debug-log
|
||||||
|
```
|
||||||
|
|
||||||
|
The `log_format`, `log_level` and `log_source` options allow you to tweak the logging output. The `log_format` option can be set to `text` or `json`. The `log_level` option can be set to `debug`, `info`, `warn` or `error`. The `log_source` option will output information about the function that generated the log line. All these options influence how the structured logging is output.
|
||||||
|
|
||||||
|
This will allow you to ingest GARM logs in a central location such as an ELK stack or similar.
|
||||||
|
|
||||||
|
## Database configuration
|
||||||
|
|
||||||
|
GARM currently supports SQLite3. Support for other stores will be added in the future.
|
||||||
|
|
||||||
|
```toml
|
||||||
|
[database]
|
||||||
|
# Turn on/off debugging for database queries.
|
||||||
|
debug = false
|
||||||
|
# Database backend to use. Currently supported backends are:
|
||||||
|
# * sqlite3
|
||||||
|
backend = "sqlite3"
|
||||||
|
# the passphrase option is a temporary measure by which we encrypt the webhook
|
||||||
|
# secret that gets saved to the database, using AES256. In the future, secrets
|
||||||
|
# will be saved to something like Barbican or Vault, eliminating the need for
|
||||||
|
# this. This string needs to be 32 characters in size.
|
||||||
|
passphrase = "shreotsinWadquidAitNefayctowUrph"
|
||||||
|
[database.sqlite3]
|
||||||
|
# Path on disk to the sqlite3 database file.
|
||||||
|
db_file = "/home/runner/garm.db"
|
||||||
|
```
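The 32-character requirement follows from AES-256 using a 256-bit (32-byte) key. The sketch below illustrates this with the Go standard library. The use of GCM mode and all function names are assumptions, since the documentation only states that AES256 is used.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
	"io"
)

// newAEAD turns a 32-byte passphrase into an AES-256-GCM cipher.
// GCM is an assumption made for this example.
func newAEAD(passphrase string) (cipher.AEAD, error) {
	if len(passphrase) != 32 {
		return nil, errors.New("passphrase must be exactly 32 bytes long")
	}
	block, err := aes.NewCipher([]byte(passphrase)) // a 32-byte key selects AES-256
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encrypt seals plaintext and prepends the random nonce to the result.
func encrypt(aead cipher.AEAD, plaintext []byte) ([]byte, error) {
	nonce := make([]byte, aead.NonceSize())
	if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
		return nil, err
	}
	return aead.Seal(nonce, nonce, plaintext, nil), nil
}

// decrypt splits off the nonce and opens the remaining ciphertext.
func decrypt(aead cipher.AEAD, sealed []byte) ([]byte, error) {
	if len(sealed) < aead.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ciphertext := sealed[:aead.NonceSize()], sealed[aead.NonceSize():]
	return aead.Open(nil, nonce, ciphertext, nil)
}

func main() {
	// Sample passphrase from the config above; generate your own for real deployments.
	aead, err := newAEAD("shreotsinWadquidAitNefayctowUrph")
	if err != nil {
		fmt.Println("invalid passphrase:", err)
		return
	}
	sealed, err := encrypt(aead, []byte("webhook-secret"))
	if err != nil {
		fmt.Println("encrypt failed:", err)
		return
	}
	plain, err := decrypt(aead, sealed)
	if err != nil {
		fmt.Println("decrypt failed:", err)
		return
	}
	fmt.Println(string(plain))
}
```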
|
||||||
|
|
||||||
|
## Provider configuration
|
||||||
|
|
||||||
|
GARM was designed to be extensible. Providers can be written as external executables which implement the needed interface to create/delete/list compute systems that are used by ```GARM``` to create runners.
|
||||||
|
|
||||||
|
### Providers
|
||||||
|
|
||||||
|
GARM delegates the functionality needed to create the runners to external executables. These executables can be either binaries or scripts. As long as they adhere to the needed interface, they can be used to create runners in any target IaaS. You might find this behavior familiar if you've ever had to deal with installing `CNIs` in `containerd`. The principle is the same.
|
||||||
|
|
||||||
|
The configuration for an external provider is quite simple:
|
||||||
|
|
||||||
|
```toml
|
||||||
|
# This is an example external provider. External providers are executables that
|
||||||
|
# implement the needed interface to create/delete/list compute systems that are used
|
||||||
|
# by GARM to create runners.
|
||||||
|
[[provider]]
|
||||||
|
name = "openstack_external"
|
||||||
|
description = "external openstack provider"
|
||||||
|
provider_type = "external"
|
||||||
|
[provider.external]
|
||||||
|
# config file passed to the executable via GARM_PROVIDER_CONFIG_FILE environment variable
|
||||||
|
config_file = "/etc/garm/providers.d/openstack/keystonerc"
|
||||||
|
# Absolute path to an executable that implements the provider logic. This executable can be
|
||||||
|
# anything (bash, a binary, python, etc). See documentation in this repo on how to write an
|
||||||
|
# external provider.
|
||||||
|
provider_executable = "/etc/garm/providers.d/openstack/garm-external-provider"
|
||||||
|
# This option will pass all environment variables that start with AWS_ to the provider.
|
||||||
|
# To pass in individual variables, you can add the entire name to the list.
|
||||||
|
environment_variables = ["AWS_"]
|
||||||
|
```
|
||||||
|
|
||||||
|
The external provider has three options:
|
||||||
|
|
||||||
|
* `provider_executable`
|
||||||
|
* `config_file`
|
||||||
|
* `environment_variables`
|
||||||
|
|
||||||
|
The ```provider_executable``` option is the absolute path to an executable that implements the provider logic. GARM will delegate all provider operations to this executable. This executable can be anything (bash, python, perl, go, etc). See [Writing an external provider](./external_provider.md) for more details.
|
||||||
|
|
||||||
|
The ```config_file``` option is a path on disk to an arbitrary file that is passed to the external executable via the environment variable ```GARM_PROVIDER_CONFIG_FILE```. This file is only relevant to the external provider. GARM itself does not read it. Let's take the [OpenStack provider](https://github.com/cloudbase/garm-provider-openstack) as an example. The [config file](https://github.com/cloudbase/garm-provider-openstack/blob/ac46d4d5a542bca96cd0309c89437d3382c3ea26/testdata/config.toml) contains access information for an OpenStack cloud as well as some provider-specific options, like whether or not to boot from volume and which tenant network to use.
|
||||||
|
|
||||||
|
The `environment_variables` option is a list of environment variables that will be passed to the external provider. By default GARM will pass a clean env to providers, consisting only of variables that the [provider interface](./external_provider.md) expects. However, in some situations, a provider may need access to certain environment variables set in the env of GARM itself. This might be needed to enable access to IAM roles (EC2) or managed identity (Azure). This option takes a list of environment variables or prefixes of environment variables that will be passed to the provider. For example, if you want to pass all environment variables that start with `AWS_` to the provider, you can set this option to `["AWS_"]`.
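The prefix matching described above can be sketched in a few lines of Go. This is an illustration only. The `filterEnv` function is hypothetical, and it treats every list entry as a prefix, so a full variable name in the list also matches longer names that start with it.

```go
package main

import (
	"fmt"
	"strings"
)

// filterEnv returns the entries of env ("KEY=value" strings, in the format
// returned by os.Environ) whose key starts with one of the allowed entries.
func filterEnv(env []string, allowed []string) []string {
	var out []string
	for _, kv := range env {
		key, _, _ := strings.Cut(kv, "=")
		for _, prefix := range allowed {
			if strings.HasPrefix(key, prefix) {
				out = append(out, kv)
				break
			}
		}
	}
	return out
}

func main() {
	env := []string{"AWS_REGION=eu-west-1", "AWS_PROFILE=ci", "HOME=/root", "PATH=/usr/bin"}
	// Only the AWS_ variables would reach the provider.
	fmt.Println(filterEnv(env, []string{"AWS_"}))
}
```

Everything that does not match is withheld, which keeps the provider's environment clean by default.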
|
||||||
|
|
||||||
|
If you want to implement an external provider, you can use the file referenced by ```config_file``` for anything you need to pass into the binary when ```GARM``` calls it to execute a particular operation.
|
||||||
|
|
||||||
|
#### Available external providers
|
||||||
|
|
||||||
|
For non-testing purposes, the following external providers are currently available:
|
||||||
|
|
||||||
|
* [OpenStack](https://github.com/cloudbase/garm-provider-openstack)
|
||||||
|
* [Azure](https://github.com/cloudbase/garm-provider-azure)
|
||||||
|
* [Kubernetes](https://github.com/mercedes-benz/garm-provider-k8s) - Thanks to the amazing folks at @mercedes-benz for sharing their awesome provider!
|
||||||
|
* [LXD](https://github.com/cloudbase/garm-provider-lxd)
|
||||||
|
* [Incus](https://github.com/cloudbase/garm-provider-incus)
|
||||||
|
* [Equinix Metal](https://github.com/cloudbase/garm-provider-equinix)
|
||||||
|
* [Amazon EC2](https://github.com/cloudbase/garm-provider-aws)
|
||||||
|
* [Google Cloud Platform (GCP)](https://github.com/cloudbase/garm-provider-gcp)
|
||||||
|
* [Oracle Cloud Infrastructure (OCI)](https://github.com/cloudbase/garm-provider-oci)
|
||||||
|
|
||||||
|
Details on how to install and configure them are available in their respective repositories.
|
||||||
|
|
||||||
|
If you wrote a provider and would like to add it to the above list, feel free to open a PR.
|
||||||
|
|
||||||
|
|
||||||
|
## The metrics section
|
||||||
|
|
||||||
|
This is one of the features in GARM that I really love having. For one thing, it's community contributed and for another, it really adds value to the project. It allows us to create some pretty nice visualizations of what is happening with GARM.
|
||||||
|
|
||||||
|
### Common metrics
|
||||||
|
|
||||||
|
| Metric name | Type | Labels | Description |
|
||||||
|
|--------------------------|---------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------|
|
||||||
|
| `garm_health` | Gauge | `controller_id`=<controller id> <br>`callback_url`=<callback url> <br>`controller_webhook_url`=<controller webhook url> <br>`metadata_url`=<metadata url> <br>`webhook_url`=<webhook url> <br>`name`=<hostname> | This is a gauge that is set to 1 if GARM is healthy and 0 if it is not. This is useful for alerting. |
|
||||||
|
| `garm_webhooks_received` | Counter | `valid`=<valid request> <br>`reason`=<reason for invalid requests> | This is a counter that increments every time GARM receives a webhook from GitHub. |
|
||||||
|
|
||||||
|
### Enterprise metrics
|
||||||
|
|
||||||
|
| Metric name | Type | Labels | Description |
|
||||||
|
|---------------------------------------|-------|-------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------|
|
||||||
|
| `garm_enterprise_info` | Gauge | `id`=<enterprise id> <br>`name`=<enterprise name> | This is a gauge that is set to 1 and exposes enterprise information |
|
||||||
|
| `garm_enterprise_pool_manager_status` | Gauge | `id`=<enterprise id> <br>`name`=<enterprise name> <br>`running`=<true\|false> | This is a gauge that is set to 1 if the enterprise pool manager is running and set to 0 if not |
|
||||||
|
|
||||||
|
### Organization metrics
|
||||||
|
|
||||||
|
| Metric name | Type | Labels | Description |
|-------------|------|--------|-------------|
| `garm_organization_info` | Gauge | `id`=<organization id> <br>`name`=<organization name> | This is a gauge that is set to 1 and exposes organization information |
| `garm_organization_pool_manager_status` | Gauge | `id`=<organization id> <br>`name`=<organization name> <br>`running`=<true\|false> | This is a gauge that is set to 1 if the organization pool manager is running and set to 0 if not |
|
||||||
|
|
||||||
|
### Repository metrics
|
||||||
|
|
||||||
|
| Metric name | Type | Labels | Description |
|-------------|------|--------|-------------|
| `garm_repository_info` | Gauge | `id`=<repository id> <br>`name`=<repository name> | This is a gauge that is set to 1 and exposes repository information |
| `garm_repository_pool_manager_status` | Gauge | `id`=<repository id> <br>`name`=<repository name> <br>`running`=<true\|false> | This is a gauge that is set to 1 if the repository pool manager is running and set to 0 if not |
|
||||||
|
|
||||||
|
### Provider metrics
|
||||||
|
|
||||||
|
| Metric name | Type | Labels | Description |
|-------------|------|--------|-------------|
| `garm_provider_info` | Gauge | `description`=<provider description> <br>`name`=<provider name> <br>`type`=<internal\|external> | This is a gauge that is set to 1 and exposes provider information |
|
||||||
|
|
||||||
|
### Pool metrics
|
||||||
|
|
||||||
|
| Metric name | Type | Labels | Description |
|-------------|------|--------|-------------|
| `garm_pool_info` | Gauge | `flavor`=<flavor> <br>`id`=<pool id> <br>`image`=<image name> <br>`os_arch`=<defined OS arch> <br>`os_type`=<defined OS name> <br>`pool_owner`=<owner name> <br>`pool_type`=<repository\|organization\|enterprise> <br>`prefix`=<prefix> <br>`provider`=<provider name> <br>`tags`=<concatenated list of pool tags> <br> | This is a gauge that is set to 1 and exposes pool information |
| `garm_pool_status` | Gauge | `enabled`=<true\|false> <br>`id`=<pool id> | This is a gauge that is set to 1 if the pool is enabled and set to 0 if not |
| `garm_pool_bootstrap_timeout` | Gauge | `id`=<pool id> | This is a gauge that is set to the pool bootstrap timeout |
| `garm_pool_max_runners` | Gauge | `id`=<pool id> | This is a gauge that is set to the pool max runners |
| `garm_pool_min_idle_runners` | Gauge | `id`=<pool id> | This is a gauge that is set to the pool min idle runners |
|
||||||
|
|
||||||
|
### Runner metrics
|
||||||
|
|
||||||
|
| Metric name | Type | Labels | Description |
|-------------|------|--------|-------------|
| `garm_runner_status` | Gauge | `name`=<runner name> <br>`pool_owner`=<owner name> <br>`pool_type`=<repository\|organization\|enterprise> <br>`provider`=<provider name> <br>`runner_status`=<running\|stopped\|error\|pending_delete\|deleting\|pending_create\|creating\|unknown> <br>`status`=<idle\|pending\|terminated\|installing\|failed\|active> <br> | This is a gauge value that gives us details about the runners garm spawns |
| `garm_runner_operations_total` | Counter | `provider`=<provider name> <br>`operation`=<CreateInstance\|DeleteInstance\|GetInstance\|ListInstances\|RemoveAllInstances\|Start\|Stop> | This is a counter that increments every time a runner operation is performed |
| `garm_runner_errors_total` | Counter | `provider`=<provider name> <br>`operation`=<CreateInstance\|DeleteInstance\|GetInstance\|ListInstances\|RemoveAllInstances\|Start\|Stop> | This is a counter that increments every time a runner operation results in an error |
|
||||||
|
|
||||||
|
### Github metrics
|
||||||
|
|
||||||
|
| Metric name | Type | Labels | Description |
|-------------|------|--------|-------------|
| `garm_github_operations_total` | Counter | `operation`=<ListRunners\|CreateRegistrationToken\|...> <br>`scope`=<Organization\|Repository\|Enterprise> | This is a counter that increments every time a github operation is performed |
| `garm_github_errors_total` | Counter | `operation`=<ListRunners\|CreateRegistrationToken\|...> <br>`scope`=<Organization\|Repository\|Enterprise> | This is a counter that increments every time a github operation results in an error |
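
The operation and error counters are most useful when read together. The following is a minimal Python sketch, not part of GARM, that parses Prometheus text output and computes an error ratio per provider from `garm_runner_errors_total` and `garm_runner_operations_total`. The sample scrape text and the `lxd_local` provider name are made up for illustration, and the parser is deliberately simplified (it skips sample lines that carry a timestamp).

```python
import re
from collections import defaultdict

# Matches sample lines such as: metric_name{label="value",...} 12
SAMPLE_LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Yield (name, labels, value) for each sample line; comments are skipped."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_LINE.match(line)
        if match is None:
            continue  # simplified parser: ignore anything it does not understand
        labels = dict(LABEL.findall(match.group("labels") or ""))
        yield match.group("name"), labels, float(match.group("value"))


def provider_error_ratio(text):
    """Return {provider: errors / operations}, summed over all operations."""
    ops = defaultdict(float)
    errs = defaultdict(float)
    for name, labels, value in parse_samples(text):
        provider = labels.get("provider", "")
        if name == "garm_runner_operations_total":
            ops[provider] += value
        elif name == "garm_runner_errors_total":
            errs[provider] += value
    return {p: errs[p] / total for p, total in ops.items() if total > 0}


# Hypothetical scrape output, for illustration only.
EXAMPLE = '''
# TYPE garm_runner_operations_total counter
garm_runner_operations_total{operation="CreateInstance",provider="lxd_local"} 8
garm_runner_operations_total{operation="DeleteInstance",provider="lxd_local"} 2
garm_runner_errors_total{operation="CreateInstance",provider="lxd_local"} 1
'''
```

In a real setup you would more likely express this as a query in Prometheus itself. The sketch only shows how the two counters relate.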
|
||||||
|
|
||||||
|
### Enabling metrics
|
||||||
|
|
||||||
|
Metrics are disabled by default. To enable them, add the following to your config file:
|
||||||
|
|
||||||
|
```toml
|
||||||
|
[metrics]
|
||||||
|
|
||||||
|
# Toggle to disable authentication (not recommended) on the metrics endpoint.
|
||||||
|
# If you do disable authentication, I encourage you to put a reverse proxy in front
|
||||||
|
# of garm and limit which systems can access that particular endpoint. Ideally, you
|
||||||
|
# would enable some kind of authentication using the reverse proxy, if the built-in auth
|
||||||
|
# is not sufficient for your needs.
|
||||||
|
#
|
||||||
|
# Default: false
|
||||||
|
disable_auth = true
|
||||||
|
|
||||||
|
# Toggle metrics. If set to false, the API endpoint for metrics collection will
|
||||||
|
# be disabled.
|
||||||
|
#
|
||||||
|
# Default: false
|
||||||
|
enable = true
|
||||||
|
|
||||||
|
# period is the time interval when the /metrics endpoint will update internal metrics about
|
||||||
|
# controller specific objects (e.g. runners, pools, etc.)
|
||||||
|
#
|
||||||
|
# Default: "60s"
|
||||||
|
period = "30s"
|
||||||
|
```
|
||||||
|
|
||||||
|
You can disable authentication if you wish. However, it is not difficult to set up, so I generally advise against disabling it.
|
||||||
|
|
||||||
|
### Configuring prometheus
|
||||||
|
|
||||||
|
The following section assumes that your garm instance is running at `garm.example.com` and has TLS enabled.
|
||||||
|
|
||||||
|
First, generate a new JWT token valid only for the metrics endpoint:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
garm-cli metrics-token create
|
||||||
|
```
|
||||||
|
|
||||||
|
Note: The token validity is equal to the TTL you set in the [JWT config section](#the-jwt-authentication-config-section).
|
||||||
|
|
||||||
|
Copy the resulting token, and add it to your prometheus config file. The following is an example of how to add garm as a target in your prometheus config file:
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
scrape_configs:
  - job_name: "garm"
    # Connect over https. If you don't have TLS enabled, change this to http.
    scheme: https
    static_configs:
      - targets: ["garm.example.com"]
    authorization:
      credentials: "superSecretTokenYouGeneratedEarlier"
|
||||||
|
```
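
If you want to check the token before wiring it into Prometheus, a small script can issue the same kind of request. This is a sketch under a few assumptions: the metrics endpoint is served at `/metrics` (as the config comments above suggest), the token is sent as a bearer token (the default type for the Prometheus `authorization` block), and the function names are invented for this example.

```python
import urllib.request


def build_metrics_request(base_url, token):
    """Build a GET request for the metrics endpoint with a bearer token."""
    url = base_url.rstrip("/") + "/metrics"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def fetch_metrics(base_url, token, timeout=10):
    """Fetch the metrics page and return it as text. Raises on HTTP errors."""
    request = build_metrics_request(base_url, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For example, `fetch_metrics("https://garm.example.com", "superSecretTokenYouGeneratedEarlier")` should return the text Prometheus would scrape, provided the token is still valid.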
|
||||||
|
|
||||||
|
## The JWT authentication config section
|
||||||
|
|
||||||
|
This section configures the JWT authentication used by the API server. GARM is currently a single-user system, and that user has the right to do anything and everything GARM is capable of. As a result, the JWT auth we have does not include a refresh token. The token is valid for the duration of the time to live (TTL) set in the config file. Once the token expires, you will need to log in again.
|
||||||
|
|
||||||
|
It is recommended that the secret be a long, randomly generated string. Changing the secret at any time will invalidate all existing tokens.
|
||||||
|
|
||||||
|
```toml
|
||||||
|
[jwt_auth]
|
||||||
|
# A JWT token secret used to sign tokens. Obviously, this needs to be changed :).
|
||||||
|
secret = ")9gk_4A6KrXz9D2u`0@MPea*sd6W`%@5MAWpWWJ3P3EqW~qB!!(Vd$FhNc*eU4vG"
|
||||||
|
|
||||||
|
# Time to live for tokens. Both the instances and you will use JWT tokens to
|
||||||
|
# authenticate against the API. However, this TTL is applied only to tokens you
|
||||||
|
# get when logging into the API. The tokens issued to the instances we manage,
|
||||||
|
# have a TTL based on the runner bootstrap timeout set on each pool. The minimum
|
||||||
|
# TTL for this token is 24h.
|
||||||
|
time_to_live = "8760h"
|
||||||
|
```
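
As an illustration of the two settings above, here is a short Python sketch that generates a long random secret and checks a TTL string against the 24h minimum mentioned in the comments. The helper names are hypothetical, and the duration parser only understands simple `h`, `m` and `s` units such as `"8760h"`. It is not a full implementation of the duration syntax GARM accepts.

```python
import re
import secrets
from datetime import timedelta

_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def parse_duration(value):
    """Parse strings such as "8760h" or "1h30m" into a timedelta."""
    parts = re.findall(r"(\d+)([hms])", value)
    if not parts or "".join(number + unit for number, unit in parts) != value:
        raise ValueError(f"unsupported duration: {value!r}")
    return timedelta(seconds=sum(int(n) * _UNIT_SECONDS[u] for n, u in parts))


def ttl_is_acceptable(value, minimum=timedelta(hours=24)):
    """Return True if the TTL is at least the documented 24h minimum."""
    return parse_duration(value) >= minimum


def new_secret(nbytes=48):
    """Return a URL-safe random string built from nbytes of randomness."""
    return secrets.token_urlsafe(nbytes)
```

With these helpers, `ttl_is_acceptable("8760h")` is true (8760 hours is 365 days), while a value such as `"30m"` is rejected.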
|
||||||
|
|
||||||
|
## The API server config section
|
||||||
|
|
||||||
|
This section allows you to configure the GARM API server. The API server is responsible for serving all the API endpoints used by the `garm-cli`, by the runners that phone home their status, and by GitHub when it sends us webhooks.
|
||||||
|
|
||||||
|
The config options are fairly straightforward.
|
||||||
|
|
||||||
|
```toml
|
||||||
|
[apiserver]
|
||||||
|
# Bind the API to this IP
|
||||||
|
bind = "0.0.0.0"
|
||||||
|
# Bind the API to this port
|
||||||
|
port = 9997
|
||||||
|
# Whether or not to set up TLS for the API endpoint. If this is set to true,
|
||||||
|
# you must have a valid apiserver.tls section.
|
||||||
|
use_tls = false
|
||||||
|
# Set a list of allowed origins
|
||||||
|
# By default, if this option is omitted or empty, we will check
|
||||||
|
# only that the origin is the same as the originating server.
|
||||||
|
# A literal of "*" will allow any origin
|
||||||
|
cors_origins = ["*"]
|
||||||
|
[apiserver.tls]
|
||||||
|
# Path on disk to an x509 certificate bundle.
|
||||||
|
# NOTE: if your certificate is signed by an intermediate CA, this file
|
||||||
|
# must contain the entire certificate bundle needed for clients to validate
|
||||||
|
# the certificate. This usually means concatenating the certificate and the
|
||||||
|
# CA bundle you received.
|
||||||
|
certificate = ""
|
||||||
|
# The path on disk to the corresponding private key for the certificate.
|
||||||
|
key = ""
|
||||||
|
```
|
||||||
|
|
||||||
|
The GARM API server has the option to enable TLS, but I suggest you use a reverse proxy and enable TLS termination in that reverse proxy. There is an `nginx` sample in this repository with TLS termination enabled.
|
||||||
|
|
||||||
|
You can of course enable TLS in both garm and the reverse proxy. The choice is yours.
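
The relationship between `use_tls` and the `apiserver.tls` table can be checked mechanically before starting the service. The sketch below is a hypothetical pre-flight check, not something GARM ships. It works on a plain dictionary with the same keys as the `[apiserver]` section, and the wording of the messages is invented.

```python
def validate_apiserver(cfg):
    """Return a list of human-readable problems found in an [apiserver] table."""
    problems = []

    port = cfg.get("port")
    if not isinstance(port, int) or not 1 <= port <= 65535:
        problems.append("port must be an integer between 1 and 65535")

    # When TLS is enabled, both the certificate and the key must be set.
    if cfg.get("use_tls", False):
        tls = cfg.get("tls", {})
        for field in ("certificate", "key"):
            if not tls.get(field):
                problems.append(f"use_tls is true but apiserver.tls.{field} is empty")

    # A literal "*" allows any origin, which is worth flagging.
    if "*" in cfg.get("cors_origins", []):
        problems.append("cors_origins allows any origin")

    return problems
```

On Python 3.11 or newer, the standard `tomllib` module returns nested dictionaries, so the `apiserver` table of a parsed config file can be passed to this function directly.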
|
||||||
|
|
@ -1,34 +0,0 @@
|
||||||
# The API server config section
|
|
||||||
|
|
||||||
This section allows you to configure the GARM API server. The API server is responsible for serving all the API endpoints used by the `garm-cli`, by the runners that phone home their status, and by GitHub when it sends us webhooks.
|
|
||||||
|
|
||||||
The config options are fairly straightforward.
|
|
||||||
|
|
||||||
```toml
|
|
||||||
[apiserver]
|
|
||||||
# Bind the API to this IP
|
|
||||||
bind = "0.0.0.0"
|
|
||||||
# Bind the API to this port
|
|
||||||
port = 9997
|
|
||||||
# Whether or not to set up TLS for the API endpoint. If this is set to true,
|
|
||||||
# you must have a valid apiserver.tls section.
|
|
||||||
use_tls = false
|
|
||||||
# Set a list of allowed origins
|
|
||||||
# By default, if this option is omitted or empty, we will check
|
|
||||||
# only that the origin is the same as the originating server.
|
|
||||||
# A literal of "*" will allow any origin
|
|
||||||
cors_origins = ["*"]
|
|
||||||
[apiserver.tls]
|
|
||||||
# Path on disk to an x509 certificate bundle.
|
|
||||||
# NOTE: if your certificate is signed by an intermediate CA, this file
|
|
||||||
# must contain the entire certificate bundle needed for clients to validate
|
|
||||||
# the certificate. This usually means concatenating the certificate and the
|
|
||||||
# CA bundle you received.
|
|
||||||
certificate = ""
|
|
||||||
# The path on disk to the corresponding private key for the certificate.
|
|
||||||
key = ""
|
|
||||||
```
|
|
||||||
|
|
||||||
The GARM API server has the option to enable TLS, but I suggest you use a reverse proxy and enable TLS termination in that reverse proxy. There is an `nginx` sample in this repository with TLS termination enabled.
|
|
||||||
|
|
||||||
You can of course enable TLS in both garm and the reverse proxy. The choice is yours.
|
|
||||||
|
|
@ -1,152 +0,0 @@
|
||||||
# The default config section
|
|
||||||
|
|
||||||
The `default` config section holds configuration options that don't need a category of their own, but are essential to the operation of the service. In this section we will detail each of the options available in the `default` section.
|
|
||||||
|
|
||||||
```toml
|
|
||||||
[default]
|
|
||||||
# Uncomment this line if you'd like to log to a file instead of standard output.
|
|
||||||
# log_file = "/tmp/runner-manager.log"
|
|
||||||
|
|
||||||
# Enable streaming logs via web sockets. Use garm-cli debug-log.
|
|
||||||
enable_log_streamer = false
|
|
||||||
|
|
||||||
# Enable the golang debug server. See the documentation in the "doc" folder for more information.
|
|
||||||
debug_server = false
|
|
||||||
```
|
|
||||||
|
|
||||||
## The callback_url option
|
|
||||||
|
|
||||||
Your runners will call back home with status updates as they install. Once they are set up, they will also send the GitHub agent ID they were allocated. You will need to configure the ```callback_url``` option in the ```garm``` server config. This URL needs to point to the following API endpoint:
|
|
||||||
|
|
||||||
```txt
|
|
||||||
POST /api/v1/callbacks/status
|
|
||||||
```
|
|
||||||
|
|
||||||
Example of a runner sending status updates:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
garm-cli runner show garm-DvxiVAlfHeE7
|
|
||||||
+-----------------+------------------------------------------------------------------------------------+
|
|
||||||
| FIELD | VALUE |
|
|
||||||
+-----------------+------------------------------------------------------------------------------------+
|
|
||||||
| ID | 16b96ba2-d406-45b8-ab66-b70be6237b4e |
|
|
||||||
| Provider ID | garm-DvxiVAlfHeE7 |
|
|
||||||
| Name | garm-DvxiVAlfHeE7 |
|
|
||||||
| OS Type | linux |
|
|
||||||
| OS Architecture | amd64 |
|
|
||||||
| OS Name | ubuntu |
|
|
||||||
| OS Version | jammy |
|
|
||||||
| Status | running |
|
|
||||||
| Runner Status | idle |
|
|
||||||
| Pool ID | 8ec34c1f-b053-4a5d-80d6-40afdfb389f9 |
|
|
||||||
| Addresses | 10.198.117.120 |
|
|
||||||
| Status Updates | 2023-07-08T06:26:46: runner registration token was retrieved |
|
|
||||||
| | 2023-07-08T06:26:46: using cached runner found in /opt/cache/actions-runner/latest |
|
|
||||||
| | 2023-07-08T06:26:50: configuring runner |
|
|
||||||
| | 2023-07-08T06:26:56: runner successfully configured after 1 attempt(s) |
|
|
||||||
| | 2023-07-08T06:26:56: installing runner service |
|
|
||||||
| | 2023-07-08T06:26:56: starting service |
|
|
||||||
| | 2023-07-08T06:26:57: runner successfully installed |
|
|
||||||
+-----------------+------------------------------------------------------------------------------------+
|
|
||||||
|
|
||||||
```
|
|
||||||
|
|
||||||
This URL must be set and must be accessible by the instance. If you wish to restrict access to it, a reverse proxy can be configured to accept requests only from networks in which the runners ```garm``` manages will be spun up. This URL doesn't need to be globally accessible; it only needs to be reachable by the instances.
|
|
||||||
|
|
||||||
For example, in a scenario where you expose the API endpoint directly, this setting could look like the following:
|
|
||||||
|
|
||||||
```toml
|
|
||||||
callback_url = "https://garm.example.com/api/v1/callbacks"
|
|
||||||
```
|
|
||||||
|
|
||||||
Authentication is done using a short-lived JWT token that is generated for the particular instance being spun up. That token only allows the instance to update its own status and to fetch metadata for itself. No other API endpoints will work with it. The validity of the token is equal to the pool bootstrap timeout value (default 20 minutes) plus the garm polling interval (5 minutes).
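
As a worked example of that validity rule, the following Python sketch adds the two durations. The function name is made up, and the defaults simply restate the values quoted above.

```python
from datetime import timedelta


def instance_token_validity(
    bootstrap_timeout=timedelta(minutes=20),
    polling_interval=timedelta(minutes=5),
):
    """Validity of an instance token: bootstrap timeout plus polling interval."""
    return bootstrap_timeout + polling_interval
```

With the defaults, the token is valid for 25 minutes. A pool with a 30 minute bootstrap timeout would get tokens valid for 35 minutes.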
|
|
||||||
|
|
||||||
There is a sample ```nginx``` config [in the testdata folder](/testdata/nginx-server.conf). Feel free to customize it whichever way you see fit.
|
|
||||||
|
|
||||||
## The metadata_url option
|
|
||||||
|
|
||||||
The metadata URL is the base URL for any information an instance may need to fetch in order to finish setting itself up. As this URL may be placed behind a reverse proxy, you'll need to configure it in the ```garm``` config file. Ultimately this URL will need to point to the following ```garm``` API endpoint:
|
|
||||||
|
|
||||||
```txt
|
|
||||||
GET /api/v1/metadata
|
|
||||||
```
|
|
||||||
|
|
||||||
This URL needs to be accessible only by the instances ```garm``` sets up. This URL will not be used by anyone else. To configure it in ```garm``` add the following line in the ```[default]``` section of your ```garm``` config:
|
|
||||||
|
|
||||||
```toml
|
|
||||||
metadata_url = "https://garm.example.com/api/v1/metadata"
|
|
||||||
```
|
|
||||||
|
|
||||||
## The debug_server option
|
|
||||||
|
|
||||||
GARM can optionally enable the golang profiling server. This is useful if you suspect garm may have a bottleneck. To enable the profiling server, add the following section to the garm config:
|
|
||||||
|
|
||||||
```toml
|
|
||||||
[default]
|
|
||||||
|
|
||||||
debug_server = true
|
|
||||||
```
|
|
||||||
|
|
||||||
And restart garm. You can then use the following command to start profiling:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
go tool pprof http://127.0.0.1:9997/debug/pprof/profile?seconds=120
|
|
||||||
```
|
|
||||||
|
|
||||||
An important note on profiling behind a reverse proxy: the above command blocks for the duration of the profile (120 seconds in this example), and most reverse proxies time out after about 60 seconds. To avoid this, profile only on localhost by connecting directly to garm.
|
|
||||||
|
|
||||||
It's also advisable to exclude the debug server URLs from your reverse proxy and only make them available locally.
|
|
||||||
|
|
||||||
Now that the debug server is enabled, here is a blog post on how to profile golang applications: https://blog.golang.org/profiling-go-programs
|
|
||||||
|
|
||||||
|
|
||||||
## The log_file option
|
|
||||||
|
|
||||||
By default, GARM logs everything to standard output.
|
|
||||||
|
|
||||||
You can optionally log to file by adding the following to your config file:
|
|
||||||
|
|
||||||
```toml
|
|
||||||
[default]
|
|
||||||
# Use this if you'd like to log to a file instead of standard output.
|
|
||||||
log_file = "/tmp/runner-manager.log"
|
|
||||||
```
|
|
||||||
|
|
||||||
### Rotating log files
|
|
||||||
|
|
||||||
GARM automatically rotates the log if it reaches 500 MB in size or 28 days, whichever comes first.
|
|
||||||
|
|
||||||
However, if you want to manually rotate the log file, you can send a `SIGHUP` signal to the GARM process.
|
|
||||||
|
|
||||||
You can add the following to your systemd unit file to enable `reload`:
|
|
||||||
|
|
||||||
```ini
|
|
||||||
[Service]
|
|
||||||
ExecReload=/bin/kill -HUP $MAINPID
|
|
||||||
```
|
|
||||||
|
|
||||||
Then you can simply:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
systemctl reload garm
|
|
||||||
```
|
|
||||||
|
|
||||||
## The enable_log_streamer option
|
|
||||||
|
|
||||||
This option allows you to stream garm logs directly to your terminal. Set this option to true, then you can use the following command to stream logs:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
garm-cli debug-log
|
|
||||||
```
|
|
||||||
|
|
||||||
An important note on enabling this option behind a reverse proxy: the log streamer uses websockets to stream logs to you, so you will need to configure your reverse proxy to allow websocket connections. If you're using nginx, add the following to your nginx `server` config:
|
|
||||||
|
|
||||||
```nginx
|
|
||||||
location /api/v1/ws {
|
|
||||||
proxy_pass http://garm_backend;
|
|
||||||
proxy_http_version 1.1;
|
|
||||||
proxy_set_header Upgrade $http_upgrade;
|
|
||||||
proxy_set_header Connection "Upgrade";
|
|
||||||
proxy_set_header Host $host;
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
@ -1,18 +0,0 @@
|
||||||
# The JWT authentication config section
|
|
||||||
|
|
||||||
This section configures the JWT authentication used by the API server. GARM is currently a single-user system, and that user has the right to do anything and everything GARM is capable of. As a result, the JWT auth we have does not include a refresh token. The token is valid for the duration of the time to live (TTL) set in the config file. Once the token expires, you will need to log in again.
|
|
||||||
|
|
||||||
It is recommended that the secret be a long, randomly generated string. Changing the secret at any time will invalidate all existing tokens.
|
|
||||||
|
|
||||||
```toml
|
|
||||||
[jwt_auth]
|
|
||||||
# A JWT token secret used to sign tokens. Obviously, this needs to be changed :).
|
|
||||||
secret = ")9gk_4A6KrXz9D2u`0@MPea*sd6W`%@5MAWpWWJ3P3EqW~qB!!(Vd$FhNc*eU4vG"
|
|
||||||
|
|
||||||
# Time to live for tokens. Both the instances and you will use JWT tokens to
|
|
||||||
# authenticate against the API. However, this TTL is applied only to tokens you
|
|
||||||
# get when logging into the API. The tokens issued to the instances we manage,
|
|
||||||
# have a TTL based on the runner bootstrap timeout set on each pool. The minimum
|
|
||||||
# TTL for this token is 24h.
|
|
||||||
time_to_live = "8760h"
|
|
||||||
```
|
|
||||||
|
|
@ -1,35 +0,0 @@
|
||||||
# The logging section
|
|
||||||
|
|
||||||
GARM has switched to the `slog` package for logging, adding structured logging. As such, we added a dedicated `logging` section to the config to tweak the logging settings. We moved the `enable_log_streamer` and the `log_file` options from the `default` section to the `logging` section. They are still available in the `default` section for backwards compatibility, but they are deprecated and will be removed in a future release.
|
|
||||||
|
|
||||||
An example of the new `logging` section:
|
|
||||||
|
|
||||||
```toml
|
|
||||||
[logging]
|
|
||||||
# Uncomment this line if you'd like to log to a file instead of standard output.
|
|
||||||
# log_file = "/tmp/runner-manager.log"
|
|
||||||
|
|
||||||
# enable_log_streamer enables streaming the logs over websockets
|
|
||||||
enable_log_streamer = true
|
|
||||||
# log_format is the output format of the logs. GARM uses structured logging and can
|
|
||||||
# output as "text" or "json"
|
|
||||||
log_format = "text"
|
|
||||||
# log_level is the logging level GARM will output. Available log levels are:
|
|
||||||
# * debug
|
|
||||||
# * info
|
|
||||||
# * warn
|
|
||||||
# * error
|
|
||||||
log_level = "debug"
|
|
||||||
# log_source will output information about the function that generated the log line.
|
|
||||||
log_source = false
|
|
||||||
```
|
|
||||||
|
|
||||||
By default GARM logs everything to standard output. You can optionally log to file by adding the `log_file` option to the `logging` section. The `enable_log_streamer` option allows you to stream GARM logs directly to your terminal. Set this option to `true`, then you can use the following command to stream logs:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
garm-cli debug-log
|
|
||||||
```
|
|
||||||
|
|
||||||
The `log_format`, `log_level` and `log_source` options allow you to tweak the logging output. The `log_format` option can be set to `text` or `json`. The `log_level` option can be set to `debug`, `info`, `warn` or `error`. The `log_source` option will output information about the function that generated the log line. All these options influence how the structured logging is output.
|
|
||||||
|
|
||||||
This will allow you to ingest GARM logs in a central location such as an ELK stack or similar.
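
To give an idea of what working with the `json` format looks like, here is a minimal Python sketch that keeps only records at or above a chosen level. It assumes the default field names used by Go's `slog` JSON output (`time`, `level`, `msg`) and upper-case level names. The sample lines are invented, and GARM's real output may carry additional fields.

```python
import json

# Severity order for the four documented log levels.
LEVELS = {"DEBUG": 0, "INFO": 1, "WARN": 2, "ERROR": 3}


def filter_records(lines, min_level="WARN"):
    """Return parsed JSON log records whose level is at least min_level."""
    threshold = LEVELS[min_level]
    kept = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # not a JSON line, for example text-format output
        if not isinstance(record, dict):
            continue
        level = str(record.get("level", "")).upper()
        # Unknown levels map to -1 and are therefore dropped.
        if LEVELS.get(level, -1) >= threshold:
            kept.append(record)
    return kept


# Hypothetical log lines, for illustration only.
SAMPLE = [
    '{"time":"2024-01-01T10:00:00Z","level":"INFO","msg":"pool manager started"}',
    '{"time":"2024-01-01T10:00:05Z","level":"ERROR","msg":"failed to create instance"}',
    "not json",
]
```

Calling `filter_records(SAMPLE)` keeps only the `ERROR` record, while `filter_records(SAMPLE, "DEBUG")` keeps both JSON records and skips the malformed line.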
|
|
||||||
|
|
@ -1,118 +0,0 @@
|
||||||
# The metrics section
|
|
||||||
|
|
||||||
This is one of the features in GARM that I really love having. For one thing, it's community contributed and for another, it really adds value to the project. It allows us to create some pretty nice visualizations of what is happening with GARM.
|
|
||||||
|
|
||||||
## Common metrics
|
|
||||||
|
|
||||||
| Metric name | Type | Labels | Description |
|-------------|------|--------|-------------|
| `garm_health` | Gauge | `controller_id`=<controller id> <br>`callback_url`=<callback url> <br>`controller_webhook_url`=<controller webhook url> <br>`metadata_url`=<metadata url> <br>`webhook_url`=<webhook url> <br>`name`=<hostname> | This is a gauge that is set to 1 if GARM is healthy and 0 if it is not. This is useful for alerting. |
| `garm_webhooks_received` | Counter | `valid`=<valid request> <br>`reason`=<reason for invalid requests> | This is a counter that increments every time GARM receives a webhook from GitHub. |
|
|
||||||
|
|
||||||
## Enterprise metrics
|
|
||||||
|
|
||||||
| Metric name | Type | Labels | Description |
|-------------|------|--------|-------------|
| `garm_enterprise_info` | Gauge | `id`=<enterprise id> <br>`name`=<enterprise name> | This is a gauge that is set to 1 and exposes enterprise information |
| `garm_enterprise_pool_manager_status` | Gauge | `id`=<enterprise id> <br>`name`=<enterprise name> <br>`running`=<true\|false> | This is a gauge that is set to 1 if the enterprise pool manager is running and set to 0 if not |
|
|
||||||
|
|
||||||
## Organization metrics
|
|
||||||
|
|
||||||
| Metric name | Type | Labels | Description |
|-------------|------|--------|-------------|
| `garm_organization_info` | Gauge | `id`=<organization id> <br>`name`=<organization name> | This is a gauge that is set to 1 and exposes organization information |
| `garm_organization_pool_manager_status` | Gauge | `id`=<organization id> <br>`name`=<organization name> <br>`running`=<true\|false> | This is a gauge that is set to 1 if the organization pool manager is running and set to 0 if not |
|
|
||||||
|
|
||||||
## Repository metrics
|
|
||||||
|
|
||||||
| Metric name | Type | Labels | Description |
|-------------|------|--------|-------------|
| `garm_repository_info` | Gauge | `id`=<repository id> <br>`name`=<repository name> | This is a gauge that is set to 1 and exposes repository information |
| `garm_repository_pool_manager_status` | Gauge | `id`=<repository id> <br>`name`=<repository name> <br>`running`=<true\|false> | This is a gauge that is set to 1 if the repository pool manager is running and set to 0 if not |
|
|
||||||
|
|
||||||
## Provider metrics
|
|
||||||
|
|
||||||
| Metric name | Type | Labels | Description |
|-------------|------|--------|-------------|
| `garm_provider_info` | Gauge | `description`=<provider description> <br>`name`=<provider name> <br>`type`=<internal\|external> | This is a gauge that is set to 1 and exposes provider information |
|
|
||||||
|
|
||||||
## Pool metrics
|
|
||||||
|
|
||||||
| Metric name | Type | Labels | Description |
|-------------|------|--------|-------------|
| `garm_pool_info` | Gauge | `flavor`=<flavor> <br>`id`=<pool id> <br>`image`=<image name> <br>`os_arch`=<defined OS arch> <br>`os_type`=<defined OS name> <br>`pool_owner`=<owner name> <br>`pool_type`=<repository\|organization\|enterprise> <br>`prefix`=<prefix> <br>`provider`=<provider name> <br>`tags`=<concatenated list of pool tags> <br> | This is a gauge that is set to 1 and exposes pool information |
| `garm_pool_status` | Gauge | `enabled`=<true\|false> <br>`id`=<pool id> | This is a gauge that is set to 1 if the pool is enabled and set to 0 if not |
| `garm_pool_bootstrap_timeout` | Gauge | `id`=<pool id> | This is a gauge that is set to the pool bootstrap timeout |
| `garm_pool_max_runners` | Gauge | `id`=<pool id> | This is a gauge that is set to the pool max runners |
| `garm_pool_min_idle_runners` | Gauge | `id`=<pool id> | This is a gauge that is set to the pool min idle runners |
|
|
||||||
|
|
||||||
## Runner metrics
|
|
||||||
|
|
||||||
| Metric name | Type | Labels | Description |
|
|
||||||
|--------------------------------|---------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------|
|
|
||||||
| `garm_runner_status` | Gauge | `name`=<runner name> <br>`pool_owner`=<owner name> <br>`pool_type`=<repository\|organization\|enterprise> <br>`provider`=<provider name> <br>`runner_status`=<running\|stopped\|error\|pending_delete\|deleting\|pending_create\|creating\|unknown> <br>`status`=<idle\|pending\|terminated\|installing\|failed\|active> <br> | This is a gauge value that gives us details about the runners garm spawns |
|
|
||||||
| `garm_runner_operations_total` | Counter | `provider`=<provider name> <br>`operation`=<CreateInstance\|DeleteInstance\|GetInstance\|ListInstances\|RemoveAllInstances\|Start\|Stop> | This is a counter that increments every time a runner operation is performed |
|
|
||||||
| `garm_runner_errors_total` | Counter | `provider`=<provider name> <br>`operation`=<CreateInstance\|DeleteInstance\|GetInstance\|ListInstances\|RemoveAllInstances\|Start\|Stop> | This is a counter that increments every time a runner operation errored |
|
|
||||||
|
|
||||||
## Github metrics
|
|
||||||
|
|
||||||
| Metric name | Type | Labels | Description |
|
|
||||||
|--------------------------------|---------|------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------|
|
|
||||||
| `garm_github_operations_total` | Counter | `operation`=<ListRunners\|CreateRegistrationToken\|...> <br>`scope`=<Organization\|Repository\|Enterprise> | This is a counter that increments every time a github operation is performed |
|
|
||||||
| `garm_github_errors_total` | Counter | `operation`=<ListRunners\|CreateRegistrationToken\|...> <br>`scope`=<Organization\|Repository\|Enterprise> | This is a counter that increments every time a github operation errored |
|
|
||||||
|
|
||||||
## Enabling metrics
|
|
||||||
|
|
||||||
Metrics are disabled by default. To enable them, add the following to your config file:
|
|
||||||
|
|
||||||
```toml
|
|
||||||
[metrics]
|
|
||||||
|
|
||||||
# Toggle to disable authentication (not recommended) on the metrics endpoint.
|
|
||||||
# If you do disable authentication, I encourage you to put a reverse proxy in front
|
|
||||||
# of garm and limit which systems can access that particular endpoint. Ideally, you
|
|
||||||
# would enable some kind of authentication using the reverse proxy, if the built-in auth
|
|
||||||
# is not sufficient for your needs.
|
|
||||||
#
|
|
||||||
# Default: false
|
|
||||||
disable_auth = true
|
|
||||||
|
|
||||||
# Toggle metrics. If set to false, the API endpoint for metrics collection will
|
|
||||||
# be disabled.
|
|
||||||
#
|
|
||||||
# Default: false
|
|
||||||
enable = true
|
|
||||||
|
|
||||||
# period is the time interval when the /metrics endpoint will update internal metrics about
|
|
||||||
# controller specific objects (e.g. runners, pools, etc.)
|
|
||||||
#
|
|
||||||
# Default: "60s"
|
|
||||||
period = "30s"
|
|
||||||
```
|
|
||||||
|
|
||||||
You can choose to disable authentication if you wish. However, it's not terribly difficult to set up, so I generally advise against disabling it.
|
|
||||||
|
|
||||||
## Configuring prometheus
|
|
||||||
|
|
||||||
The following section assumes that your garm instance is running at `garm.example.com` and has TLS enabled.
|
|
||||||
|
|
||||||
First, generate a new JWT token valid only for the metrics endpoint:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
garm-cli metrics-token create
|
|
||||||
```
|
|
||||||
|
|
||||||
Note: The token validity is equal to the TTL you set in the [JWT config section](/doc/config_jwt_auth.md).
|
|
||||||
|
|
||||||
Copy the resulting token, and add it to your prometheus config file. The following is an example of how to add garm as a target in your prometheus config file:
|
|
||||||
|
|
||||||
```yaml
|
|
||||||
scrape_configs:
|
|
||||||
- job_name: "garm"
|
|
||||||
# Connect over https. If you don't have TLS enabled, change this to http.
|
|
||||||
scheme: https
|
|
||||||
static_configs:
|
|
||||||
- targets: ["garm.example.com"]
|
|
||||||
authorization:
|
|
||||||
credentials: "superSecretTokenYouGeneratedEarlier"
|
|
||||||
```
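
Before pointing prometheus at GARM, it can help to check by hand that the token is accepted. The following is a minimal sketch in Go, not part of GARM itself. It assumes the endpoint lives at `/metrics` and that the token is sent as an `Authorization: Bearer` header, which is what the `authorization` block above makes prometheus send by default. The local stand-in server, the `example-token` value and the `newMetricsRequest` helper are made up for illustration, so that the sketch runs without a GARM instance.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"os"
)

// newMetricsRequest builds a GET request for the /metrics endpoint and
// attaches the metrics token as a bearer credential.
func newMetricsRequest(baseURL, token string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, baseURL+"/metrics", nil)
	if err != nil {
		return nil, fmt.Errorf("building request: %w", err)
	}
	req.Header.Set("Authorization", "Bearer "+token)
	return req, nil
}

func run() error {
	// Local stand-in for the GARM metrics endpoint (an assumption, so that
	// the sketch runs offline). It only accepts the hypothetical token below.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") != "Bearer example-token" {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		fmt.Fprintln(w, `garm_pool_status{enabled="true",id="example"} 1`)
	}))
	defer srv.Close()

	// Against a real deployment, pass https://garm.example.com and the
	// token returned by garm-cli instead.
	req, err := newMetricsRequest(srv.URL, "example-token")
	if err != nil {
		return err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return fmt.Errorf("scraping metrics: %w", err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return fmt.Errorf("reading response: %w", err)
	}
	// Print the HTTP status followed by the raw metrics text.
	fmt.Printf("%s\n%s", resp.Status, body)
	return nil
}

func main() {
	if err := run(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```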
|
|
||||||
208
doc/events.md
|
|
@ -1,6 +1,6 @@
|
||||||
# GARM database events
|
# GARM database events
|
||||||
|
|
||||||
Starting with GARM version `v0.1.5`, we now have a new websocket endpoint that allows us to subscribe to some events that are emitted by the database watcher. Whenever a database entity is created, updated or deleted, the database watcher will notify all interested consumers that an event has occurred, and as part of that event, the database entity that was affected.
|
Starting with GARM version `v0.1.5`, we now have a new websocket endpoint that allows us to subscribe to some events that are emitted by the database watcher. Whenever a database entity is created, updated or deleted, the database watcher will notify all interested consumers that an event has occurred and as part of that event, we get a copy of the database entity that was affected.
|
||||||
|
|
||||||
For example, if a new runner is created, the watcher will emit a `Create` event for the `Instances` entity and in the `Payload` field, we will have a copy of the `Instance` entity that was created. Internally, this will be a golang struct, but when exported via the websocket endpoint, it will be a JSON object, with all sensitive info (passwords, keys, secrets in general) stripped out.
|
For example, if a new runner is created, the watcher will emit a `Create` event for the `Instances` entity and in the `Payload` field, we will have a copy of the `Instance` entity that was created. Internally, this will be a golang struct, but when exported via the websocket endpoint, it will be a JSON object, with all sensitive info (passwords, keys, secrets in general) stripped out.
|
||||||
|
|
||||||
|
|
@ -43,10 +43,212 @@ The event structure is defined in the [database common package](https://github.c
|
||||||
|
|
||||||
Where the `payload` will be a JSON representation of one of the entities defined above. Essentially, you can expect to receive a JSON identical to the one you would get if you made an API call to the GARM REST API for that particular entity.
|
Where the `payload` will be a JSON representation of one of the entities defined above. Essentially, you can expect to receive a JSON identical to the one you would get if you made an API call to the GARM REST API for that particular entity.
|
||||||
|
|
||||||
Note that in some cases, the `delete` operation will return the full object prior to the deletion of the entity, while others will only ever return the `ID` of the entity. This will probably be changed in future releases to only return the `ID` in case of a `delete` operation, for all entities.
|
Note that in some cases, the `delete` operation will return the full object prior to the deletion of the entity, while others will only ever return the `ID` of the entity. This will probably be changed in future releases to only return the `ID` in case of a `delete` operation, for all entities. You should operate under the assumption that in the future, delete operations will only return the `ID` of the entity.
|
||||||
|
|
||||||
# Subscribing to events
|
# Subscribing to events
|
||||||
|
|
||||||
By default the events endpoint returns no events. All events are filtered by default. To start receiving events, you need to emit a message on the websocket connection indicating the entities and/or operations you're interested in.
|
By default the events endpoint returns no events. All events are filtered by default. To start receiving events, you need to emit a message on the websocket connection indicating the entities and/or operations you're interested in.
|
||||||
|
|
||||||
This gives you the option to get fine grained control over what you receive at any given point in time. Of course, you can opt to receive everything and deal with the potential deluge (depends on how busy your GARM instance is) on your own.
|
This gives you the option to get fine grained control over what you receive at any given point in time. Of course, you can opt to receive everything and deal with the potential deluge (depends on how busy your GARM instance is) on your own.
|
||||||
|
|
||||||
|
## The filter message
|
||||||
|
|
||||||
|
The filter is defined as a JSON that you write over the websocket connections. That JSON must adhere to the following schema:
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"$schema": "https://json-schema.org/draft/2020-12/schema",
|
||||||
|
"$id": "https://github.com/cloudbase/garm/apiserver/events/options",
|
||||||
|
"$ref": "#/$defs/Options",
|
||||||
|
"$defs": {
|
||||||
|
"Filter": {
|
||||||
|
"properties": {
|
||||||
|
"operations": {
|
||||||
|
"items": {
|
||||||
|
"type": "string",
|
||||||
|
"enum": [
|
||||||
|
"create",
|
||||||
|
"update",
|
||||||
|
"delete"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"type": "array",
|
||||||
|
"title": "operations",
|
||||||
|
"description": "A list of operations to filter on"
|
||||||
|
},
|
||||||
|
"entity-type": {
|
||||||
|
"type": "string",
|
||||||
|
"enum": [
|
||||||
|
"repository",
|
||||||
|
"organization",
|
||||||
|
"enterprise",
|
||||||
|
"pool",
|
||||||
|
"user",
|
||||||
|
"instance",
|
||||||
|
"job",
|
||||||
|
"controller",
|
||||||
|
"github_credentials",
|
||||||
|
"github_endpoint"
|
||||||
|
],
|
||||||
|
"title": "entity type",
|
||||||
|
"description": "The type of entity to filter on",
|
||||||
|
"default": "repository"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"additionalProperties": false,
|
||||||
|
"type": "object"
|
||||||
|
},
|
||||||
|
"Options": {
|
||||||
|
"properties": {
|
||||||
|
"send-everything": {
|
||||||
|
"type": "boolean",
|
||||||
|
"title": "send everything",
|
||||||
|
"default": false
|
||||||
|
},
|
||||||
|
"filters": {
|
||||||
|
"items": {
|
||||||
|
"$ref": "#/$defs/Filter"
|
||||||
|
},
|
||||||
|
"type": "array",
|
||||||
|
"title": "filters",
|
||||||
|
"description": "A list of filters to apply to the events. This is ignored when send-everything is true"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"additionalProperties": false,
|
||||||
|
"type": "object"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
But I realize a JSON schema is not the best way to explain how to use the filter. The following examples should give you a better idea of how to use the filter.
|
||||||
|
|
||||||
|
### Example 1: Send all events
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"send-everything": true
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Example 2: Send only `create` events for `repository` entities
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"send-everything": false,
|
||||||
|
"filters": [
|
||||||
|
{
|
||||||
|
"entity-type": "repository",
|
||||||
|
"operations": ["create"]
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Example 3: Send `create` and `update` for repositories and `delete` for instances
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"send-everything": false,
|
||||||
|
"filters": [
|
||||||
|
{
|
||||||
|
"entity-type": "repository",
|
||||||
|
"operations": ["create", "update"]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"entity-type": "instance",
|
||||||
|
"operations": ["delete"]
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
```
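
If your client is written in Go, you may prefer to build the filter from typed values rather than hand-written strings. The following is a minimal sketch: the `Filter` and `Options` type names and the `buildFilterMessage` helper are made up for illustration and are not GARM's own types. Only the JSON keys are taken from the schema above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// Filter mirrors the "Filter" definition from the schema above.
// The Go type name is hypothetical; only the JSON keys matter.
type Filter struct {
	EntityType string   `json:"entity-type,omitempty"`
	Operations []string `json:"operations,omitempty"`
}

// Options mirrors the "Options" definition from the schema above.
type Options struct {
	SendEverything bool     `json:"send-everything"`
	Filters        []Filter `json:"filters,omitempty"`
}

// buildFilterMessage serializes the options into the JSON text that is
// written over the websocket connection.
func buildFilterMessage(opts Options) (string, error) {
	data, err := json.Marshal(opts)
	if err != nil {
		return "", fmt.Errorf("encoding filter: %w", err)
	}
	return string(data), nil
}

func main() {
	// Same filter as Example 3: create and update for repositories,
	// delete for instances.
	msg, err := buildFilterMessage(Options{
		Filters: []Filter{
			{EntityType: "repository", Operations: []string{"create", "update"}},
			{EntityType: "instance", Operations: []string{"delete"}},
		},
	})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(msg)
}
```

The resulting string is what you would send as a text message on the websocket connection.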
|
||||||
|
|
||||||
|
## Connecting to the events endpoint
|
||||||
|
|
||||||
|
You can use any websocket client, written in any programming language, to interact with the events endpoint. In the following example I'll show you how to do it from Go.
|
||||||
|
|
||||||
|
Before we start, we'll need a JWT token to access the events endpoint. Normally, if you use the CLI, you should have it in your `~/.local/share/garm-cli` folder. But if you know your username and password, we can fetch a fresh one using `curl`:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Read the password from the terminal
|
||||||
|
read -s PASSWD
|
||||||
|
|
||||||
|
# Get the token
|
||||||
|
curl -s -X POST -d '{"username": "admin", "password": "'"$PASSWD"'"}' \
|
||||||
|
https://garm.example.com/api/v1/auth/login | jq -r .token
|
||||||
|
```
|
||||||
|
|
||||||
|
Save the token, we'll need it for later.
|
||||||
|
|
||||||
|
Now, let's write a simple go program that connects to the events endpoint and subscribes to all events. We'll use the reader that was added to [`garm-provider-common`](https://github.com/cloudbase/garm-provider-common) in version `v0.1.3`, to make this easier:
|
||||||
|
|
||||||
|
```go
|
||||||
|
package main
|
||||||
|
|
||||||
|
import (
|
||||||
|
"context"
|
||||||
|
"fmt"
|
||||||
|
"os"
|
||||||
|
"os/signal"
|
||||||
|
"syscall"
|
||||||
|
|
||||||
|
garmWs "github.com/cloudbase/garm-provider-common/util/websocket"
|
||||||
|
"github.com/gorilla/websocket"
|
||||||
|
)
|
||||||
|
|
||||||
|
// List of signals to interrupt the program
|
||||||
|
var signals = []os.Signal{
|
||||||
|
os.Interrupt,
|
||||||
|
syscall.SIGTERM,
|
||||||
|
}
|
||||||
|
|
||||||
|
// printToConsoleHandler is a simple function that prints the message to the console.
|
||||||
|
// In a real world implementation, you can use this function to decide how to properly
|
||||||
|
// handle the events.
|
||||||
|
func printToConsoleHandler(_ int, msg []byte) error {
|
||||||
|
fmt.Println(string(msg))
|
||||||
|
return nil
|
||||||
|
}
|
||||||
|
|
||||||
|
func main() {
|
||||||
|
// Set up the context to listen for signals.
|
||||||
|
ctx, stop := signal.NotifyContext(context.Background(), signals...)
|
||||||
|
defer stop()
|
||||||
|
|
||||||
|
// This is the JWT token you got from the curl command above.
|
||||||
|
token := "superSecretJWTToken"
|
||||||
|
// The base URL of your GARM server
|
||||||
|
baseURL := "https://garm.example.com"
|
||||||
|
// This is the path to the events endpoint
|
||||||
|
pth := "/api/v1/ws/events"
|
||||||
|
|
||||||
|
// Instantiate the websocket reader
|
||||||
|
reader, err := garmWs.NewReader(ctx, baseURL, pth, token, printToConsoleHandler)
|
||||||
|
if err != nil {
|
||||||
|
fmt.Println(err)
|
||||||
|
return
|
||||||
|
}
|
||||||
|
|
||||||
|
// Start the loop.
|
||||||
|
if err := reader.Start(); err != nil {
|
||||||
|
fmt.Println(err)
|
||||||
|
return
|
||||||
|
}
|
||||||
|
|
||||||
|
// Set the filter to receive all events. You can use a more fine grained filter if you wish.
|
||||||
|
reader.WriteMessage(websocket.TextMessage, []byte(`{"send-everything":true}`))
|
||||||
|
|
||||||
|
fmt.Println("Listening for events. Press Ctrl+C to stop.")
|
||||||
|
// Wait for the context to be done.
|
||||||
|
<-ctx.Done()
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
If you run this program and change something in the GARM database, you should see the event being printed to the console:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
gabriel@rossak:/tmp/ex$ go run ./main.go
|
||||||
|
{"entity-type":"pool","operation":"update","payload":{"runner_prefix":"garm","id":"8ec34c1f-b053-4a5d-80d6-40afdfb389f9","provider_name":"lxd","max_runners":10,"min_idle_runners":0,"image":"ubuntu:22.04","flavor":"default","os_type":"linux","os_arch":"amd64","tags":[{"id":"76781c93-e354-402e-907a-785caab36207","name":"self-hosted"},{"id":"2ff4a89e-e3b4-4e78-b977-6c21e83cca3d","name":"x64"},{"id":"5b3ffec6-0402-4322-b2a9-fa7f692bbc00","name":"Linux"},{"id":"e95e106d-1a3d-11ee-bd1d-00163e1f621a","name":"ubuntu"},{"id":"3b54ae6c-5e9b-4a81-8e6c-0f78a7b37b04","name":"repo"}],"enabled":true,"instances":[],"repo_id":"70227434-e7c0-4db1-8c17-e9ae3683f61e","repo_name":"gsamfira/scripts","runner_bootstrap_timeout":20,"extra_specs":{"disable_updates":true,"enable_boot_debug":true},"github-runner-group":"","priority":10}}
|
||||||
|
```
|
||||||
|
|
||||||
|
In the above example, you can see an `update` event on a `pool` entity. The `payload` field contains the full, updated `pool` entity.
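
A handler will usually want to do more than print the raw message. The following is a minimal sketch of decoding an event before acting on it. The `event` and `poolSummary` types and the `describeEvent` function are hypothetical names. The JSON keys are the ones visible in the output above, and only a few of the pool fields are decoded. Because delete events may carry only the `ID`, the sketch does not decode the payload for them.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// event is the envelope seen in the output above. The payload is kept raw
// so that it can be decoded according to the entity type.
type event struct {
	EntityType string          `json:"entity-type"`
	Operation  string          `json:"operation"`
	Payload    json.RawMessage `json:"payload"`
}

// poolSummary picks a few fields out of a pool payload.
type poolSummary struct {
	ID         string `json:"id"`
	MaxRunners int    `json:"max_runners"`
	Enabled    bool   `json:"enabled"`
}

// describeEvent returns a one-line summary of a raw event message.
func describeEvent(msg []byte) (string, error) {
	var ev event
	if err := json.Unmarshal(msg, &ev); err != nil {
		return "", fmt.Errorf("decoding event: %w", err)
	}
	// Only pool create/update payloads are decoded in this sketch.
	if ev.EntityType != "pool" || ev.Operation == "delete" {
		return fmt.Sprintf("%s %s", ev.Operation, ev.EntityType), nil
	}
	var p poolSummary
	if err := json.Unmarshal(ev.Payload, &p); err != nil {
		return "", fmt.Errorf("decoding pool payload: %w", err)
	}
	return fmt.Sprintf("%s pool %s (max_runners=%d, enabled=%t)",
		ev.Operation, p.ID, p.MaxRunners, p.Enabled), nil
}

func main() {
	// A shortened version of the message shown above.
	sample := []byte(`{"entity-type":"pool","operation":"update","payload":{"id":"8ec34c1f-b053-4a5d-80d6-40afdfb389f9","max_runners":10,"enabled":true}}`)
	summary, err := describeEvent(sample)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(summary)
}
```

A function like `describeEvent` could be called from a handler with the same signature as `printToConsoleHandler`.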
|
||||||
|
|
|
||||||
|
|
@ -1,31 +1,31 @@
|
||||||
# Writing an external provider
|
# Writing an external provider
|
||||||
|
|
||||||
External providers enable you to write a fully functional provider using any scripting or programming language. Garm will call your executable to manage the lifecycle of the instances hosting the runners. This document describes the API that an executable needs to implement to be usable by ```garm```.
|
External providers enable you to write a fully functional provider using any scripting or programming language. Garm will call your executable to manage the lifecycle of the instances hosting the runners. This document describes the API that an executable needs to implement to be usable by `garm`.
|
||||||
|
|
||||||
## Environment variables
|
## Environment variables
|
||||||
|
|
||||||
When ```garm``` calls your executable, a number of environment variables are set, depending on the operation. There are three environment variables that will always be set regardless of operation. Those variables are:
|
When `garm` calls your executable, a number of environment variables are set, depending on the operation. There are three environment variables that will always be set regardless of operation. Those variables are:
|
||||||
|
|
||||||
* ```GARM_COMMAND```
|
* `GARM_COMMAND`
|
||||||
* ```GARM_PROVIDER_CONFIG_FILE```
|
* `GARM_PROVIDER_CONFIG_FILE`
|
||||||
* ```GARM_CONTROLLER_ID```
|
* `GARM_CONTROLLER_ID`
|
||||||
|
|
||||||
The following are variables that are specific to some operations:
|
The following are variables that are specific to some operations:
|
||||||
|
|
||||||
* ```GARM_POOL_ID```
|
* `GARM_POOL_ID`
|
||||||
* ```GARM_INSTANCE_ID```
|
* `GARM_INSTANCE_ID`
|
||||||
|
|
||||||
### The GARM_COMMAND variable
|
### The GARM_COMMAND variable
|
||||||
|
|
||||||
The ```GARM_COMMAND``` environment variable will be set to one of the operations defined in the interface. When your executable is called, you'll need to inspect this variable to know which operation you need to execute.
|
The `GARM_COMMAND` environment variable will be set to one of the operations defined in the interface. When your executable is called, you'll need to inspect this variable to know which operation you need to execute.
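
If you write your provider in Go, inspecting the variable could look something like the following minimal sketch. The `dispatch` function is a hypothetical name, and the operation names are the ones mentioned across these docs. A real provider would call its own implementation of each operation where the sketch only reports which one was requested.

```go
package main

import (
	"fmt"
	"os"
)

// dispatch decides what to do based on the value of GARM_COMMAND.
// This sketch only validates the command and reports it.
func dispatch(command string) (string, error) {
	switch command {
	case "CreateInstance", "DeleteInstance", "GetInstance",
		"ListInstances", "RemoveAllInstances", "Start", "Stop":
		return "requested operation: " + command, nil
	case "":
		return "", fmt.Errorf("GARM_COMMAND is not set")
	default:
		return "", fmt.Errorf("unknown GARM_COMMAND %q", command)
	}
}

func main() {
	msg, err := dispatch(os.Getenv("GARM_COMMAND"))
	if err != nil {
		// A non-zero exit code signals the failure to the caller.
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Log to stderr so that stdout stays free for the command's output.
	fmt.Fprintln(os.Stderr, msg)
}
```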
|
||||||
|
|
||||||
### The GARM_PROVIDER_CONFIG_FILE variable
|
### The GARM_PROVIDER_CONFIG_FILE variable
|
||||||
|
|
||||||
The ```GARM_PROVIDER_CONFIG_FILE``` variable will contain a path on disk to a file that can contain whatever configuration your executable needs. For example, in the case of the [sample OpenStack external provider](../contrib/providers.d/openstack/keystonerc), this file contains variables that you would normally find in a ```keystonerc``` file, used to access an OpenStack cloud. But you can use it to add any extra configuration you need.
|
The `GARM_PROVIDER_CONFIG_FILE` variable will contain a path on disk to a file that can contain whatever configuration your executable needs. For example, in the case of the [OpenStack external provider](https://github.com/cloudbase/garm-provider-openstack), this file is a toml which contains provider specific configuration options. The provider author decides what this file needs to contain for the provider to function properly.
|
||||||
|
|
||||||
The config is opaque to ```garm``` itself. It only has meaning for your external provider.
|
GARM does not read this file in any way. It is simply passed to the executable via the environment variable.
|
||||||
|
|
||||||
In your executable, you could implement something like this:
|
The OpenStack provider mentioned above is written in Go, but it doesn't need to be. For example, if your provider is written in BASH, handling the config file could look something like this:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
if [ -f "${GARM_PROVIDER_CONFIG_FILE}" ];then
|
if [ -f "${GARM_PROVIDER_CONFIG_FILE}" ];then
|
||||||
|
|
@ -54,37 +54,37 @@ esac
|
||||||
|
|
||||||
### The GARM_CONTROLLER_ID variable
|
### The GARM_CONTROLLER_ID variable
|
||||||
|
|
||||||
The ```GARM_CONTROLLER_ID``` variable is set for all operations.
|
The `GARM_CONTROLLER_ID` variable is set for all operations.
|
||||||
|
|
||||||
When garm first starts up, it generates a unique ID that identifies it as an instance. This ID is passed to the provider and should always be used to tag resources in whichever cloud you write your provider for. This ensures that if you have multiple garm installations, one particular deployment of garm will never touch any resources it did not create.
|
When garm first starts up, it generates a unique ID that identifies it as an instance. This ID is passed to the provider and should always be used to tag resources in whichever cloud you write your provider for. This ensures that if you have multiple garm installations, one particular deployment of garm will never touch any resources it did not create.
|
||||||
|
|
||||||
In most clouds you can attach ```tags``` to resources. You can use the controller ID as one of the tags during the ```CreateInstance``` operation.
|
In most clouds you can attach `tags` to resources. You can use the controller ID as one of the tags during the `CreateInstance` operation.
|
||||||
|
|
||||||
### The GARM_POOL_ID variable
|
### The GARM_POOL_ID variable
|
||||||
|
|
||||||
The ```GARM_POOL_ID``` environment variable is a ```UUID4``` describing the pool in which a runner is created. This variable is set in two operations:
|
The `GARM_POOL_ID` environment variable is a `UUID4` describing the pool in which a runner is created. This variable is set in two operations:
|
||||||
|
|
||||||
* CreateInstance
|
* CreateInstance
|
||||||
* ListInstances
|
* ListInstances
|
||||||
|
|
||||||
As with the ```GARM_CONTROLLER_ID```, this ID **must** also be attached as a tag or whichever mechanism your target cloud supports, to identify the pool to which the resources (in most cases the VMs) belong.
|
As with the `GARM_CONTROLLER_ID`, this ID **must** also be attached as a tag or whichever mechanism your target cloud supports, to identify the pool to which the resources (in most cases the VMs) belong.
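
As an illustration, a provider written in Go could build the set of tags for a new instance along these lines. This is a minimal sketch: the tag keys `garm-controller-id` and `garm-pool-id` and the `resourceTags` function are made-up names, and how tags are actually attached depends on the cloud you target.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Hypothetical tag keys. Pick whatever naming fits your cloud.
const (
	controllerTag = "garm-controller-id"
	poolTag       = "garm-pool-id"
)

// resourceTags returns the tags to attach to every resource created for an
// instance. Both IDs are required, so a missing one is treated as an error.
func resourceTags(controllerID, poolID string) (map[string]string, error) {
	if controllerID == "" {
		return nil, errors.New("GARM_CONTROLLER_ID is not set")
	}
	if poolID == "" {
		return nil, errors.New("GARM_POOL_ID is not set")
	}
	return map[string]string{
		controllerTag: controllerID,
		poolTag:       poolID,
	}, nil
}

func main() {
	tags, err := resourceTags(os.Getenv("GARM_CONTROLLER_ID"), os.Getenv("GARM_POOL_ID"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Log the tags to stderr, keeping stdout free for the command's output.
	for key, value := range tags {
		fmt.Fprintf(os.Stderr, "%s=%s\n", key, value)
	}
}
```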
|
||||||
|
|
||||||
### The GARM_INSTANCE_ID variable
|
### The GARM_INSTANCE_ID variable
|
||||||
|
|
||||||
The ```GARM_INSTANCE_ID``` environment variable is used in four operations:
|
The `GARM_INSTANCE_ID` environment variable is used in four operations:
|
||||||
|
|
||||||
* GetInstance
|
* GetInstance
|
||||||
* DeleteInstance
|
* DeleteInstance
|
||||||
* Start
|
* Start
|
||||||
* Stop
|
* Stop
|
||||||
|
|
||||||
It contains the ```provider_id``` of the instance. The ```provider_id``` is a unique identifier, specific to the IaaS in which the compute resource was created. In OpenStack, it's a ```UUID4```, while in LXD, it's the virtual machine's name.
|
It contains the `provider_id` of the instance. The `provider_id` is a unique identifier, specific to the IaaS in which the compute resource was created. In OpenStack, it's a `UUID4`, while in LXD, it's the virtual machine's name.
|
||||||
|
|
||||||
We need this ID whenever we need to execute an operation that targets one specific runner.
|
We need this ID whenever we need to execute an operation that targets one specific runner.
|
||||||
|
|
||||||
## Operations
|
## Operations
|
||||||
|
|
||||||
The operations that a provider must implement are described in the ```Provider``` [interface available here](https://github.com/cloudbase/garm/blob/223477c4ddfb6b6f9079c444d2f301ef587f048b/runner/providers/external/execution/interface.go#L9-L27). The external provider implements this interface, and delegates each operation to your external executable. [These operations are](https://github.com/cloudbase/garm/blob/223477c4ddfb6b6f9079c444d2f301ef587f048b/runner/providers/external/execution/commands.go#L5-L13):
|
The operations that a provider must implement are described in the `Provider` [interface available here](https://github.com/cloudbase/garm/blob/223477c4ddfb6b6f9079c444d2f301ef587f048b/runner/providers/external/execution/interface.go#L9-L27). The external provider implements this interface, and delegates each operation to your external executable. [These operations are](https://github.com/cloudbase/garm/blob/223477c4ddfb6b6f9079c444d2f301ef587f048b/runner/providers/external/execution/commands.go#L5-L13):
|
||||||
|
|
||||||
* CreateInstance
|
* CreateInstance
|
||||||
* DeleteInstance
|
* DeleteInstance
|
||||||
|
|
@ -96,30 +96,30 @@ The operations that a provider must implement are described in the ```Provider``
|
||||||
|
|
||||||
## CreateInstance
|
## CreateInstance
|
||||||
|
|
||||||
The ```CreateInstance``` command has the most moving parts. The ideal external provider is one that will create all required resources for a fully functional instance and will start the instance. Waiting for the instance to start is not necessary. If the instance can reach the ```callback_url``` configured in ```garm```, it will update its own status when it starts running the userdata script.
|
The `CreateInstance` command has the most moving parts. The ideal external provider is one that will create all required resources for a fully functional instance and will start the instance. Waiting for the instance to start is not necessary. If the instance can reach the `callback_url` configured in `garm`, it will update its own status when it starts running the userdata script.
|
||||||
|
|
||||||
But aside from creating resources, the ideal external provider is also idempotent, and will clean up after itself in case of failure. If the executable fails to create the instance for any reason, any dependency that it has created up to the point of failure should be cleaned up before returning an error code.
|
But aside from creating resources, the ideal external provider is also idempotent, and will clean up after itself in case of failure. If the executable fails to create the instance for any reason, any dependency that it has created up to the point of failure should be cleaned up before returning an error code.
|
||||||
|
|
||||||
At the very least, it must be able to clean up those resources, if it is called with the ```DeleteInstance``` command by ```garm```. Garm will retry creating a failed instance. Before it tries again, it will attempt to run a ```DeleteInstance``` using the ```provider_id``` returned by your executable.
|
At the very least, it must be able to clean up those resources, if it is called with the `DeleteInstance` command by `garm`. Garm will retry creating a failed instance. Before it tries again, it will attempt to run a `DeleteInstance` using the `provider_id` returned by your executable.
|
||||||
|
|
||||||
If your executable failed before a ```provider_id``` could be supplied, ```garm``` will send the name of the instance as a ```GARM_INSTANCE_ID``` environment variable.
|
If your executable failed before a `provider_id` could be supplied, `garm` will send the name of the instance as a `GARM_INSTANCE_ID` environment variable.
|
||||||
|
|
||||||
Your external provider will need to be able to handle both. The instance name generated by ```garm``` will be unique, so it's fairly safe to use when deleting instances.
|
Your external provider will need to be able to handle both. The instance name generated by `garm` will be unique, so it's fairly safe to use when deleting instances.
|
||||||
|
|
||||||
### CreateInstance inputs
|
### CreateInstance inputs
|
||||||
|
|
||||||
The ```CreateInstance``` command is the only command that needs to handle standard input. Garm will send the runner bootstrap information in stdin. The environment variables set for this command are:
|
The `CreateInstance` command is the only command that needs to handle standard input. Garm will send the runner bootstrap information in stdin. The environment variables set for this command are:
|
||||||
|
|
||||||
* GARM_PROVIDER_CONFIG_FILE - Config file specific to your executable
|
* GARM_PROVIDER_CONFIG_FILE - Config file specific to your executable
|
||||||
* GARM_COMMAND - the command we need to run
|
* GARM_COMMAND - the command we need to run
|
||||||
* GARM_CONTROLLER_ID - The unique ID of the ```garm``` installation
|
* GARM_CONTROLLER_ID - The unique ID of the `garm` installation
|
||||||
* GARM_POOL_ID - The unique ID of the pool this node is a part of
|
* GARM_POOL_ID - The unique ID of the pool this node is a part of
|
||||||
|
|
||||||
The information sent in via standard input is a ```json``` serialized instance of the [BootstrapInstance structure](https://github.com/cloudbase/garm/blob/6b3ea50ca54501595e541adde106703d289bb804/params/params.go#L164-L217)
|
The information sent in via standard input is a `json` serialized instance of the [BootstrapInstance structure](https://github.com/cloudbase/garm/blob/6b3ea50ca54501595e541adde106703d289bb804/params/params.go#L164-L217)
|
||||||
|
|
||||||
Here is a sample of that:
|
Here is a sample of that:
|
||||||
|
|
||||||
```json
|
```json
|
||||||
{
|
{
|
||||||
"name": "garm-ny9HeeQYw2rl",
|
"name": "garm-ny9HeeQYw2rl",
|
||||||
"tools": [
|
"tools": [
|
||||||
|
|
@ -194,20 +194,20 @@ Here is a sample of that:
|
||||||
],
|
],
|
||||||
"pool_id": "9dcf590a-1192-4a9c-b3e4-e0902974c2c0"
|
"pool_id": "9dcf590a-1192-4a9c-b3e4-e0902974c2c0"
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
In your executable you can read in this blob, by using something like this:
|
In your executable you can read in this blob, by using something like this:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# Test if the stdin file descriptor is opened
|
# Test if the stdin file descriptor is opened
|
||||||
if [ ! -t 0 ]
|
if [ ! -t 0 ]
|
||||||
then
|
then
|
||||||
# Read in the information from standard in
|
# Read in the information from standard in
|
||||||
INPUT=$(cat -)
|
INPUT=$(cat -)
|
||||||
fi
|
fi
|
||||||
```
|
```
|
||||||
|
|
||||||
Then you can easily parse it. If you're using ```bash```, you can use the amazing [jq json processor](https://stedolan.github.io/jq/). Other programming languages have suitable libraries that can handle ```json```.
|
Then you can easily parse it. If you're using `bash`, you can use the amazing [jq json processor](https://stedolan.github.io/jq/). Other programming languages have suitable libraries that can handle `json`.
|
||||||
|
|
||||||
You will have to parse the bootstrap params, verify that the requested image exists, gather operating system information, CPU architecture information and using that information, you will need to select the appropriate tools for the arch/OS combination you are deploying.
|
You will have to parse the bootstrap params, verify that the requested image exists, gather operating system information, CPU architecture information and using that information, you will need to select the appropriate tools for the arch/OS combination you are deploying.
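
For a provider written in Go, reading and parsing the blob could look like the following minimal sketch. The `bootstrapParams` type and the `parseBootstrapParams` function are hypothetical names, and only the `name` and `pool_id` fields from the sample above are decoded. A real provider would decode every field it needs.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"os"
)

// bootstrapParams holds the subset of the bootstrap information used by
// this sketch. The JSON keys are the ones visible in the sample above.
type bootstrapParams struct {
	Name   string `json:"name"`
	PoolID string `json:"pool_id"`
}

// parseBootstrapParams decodes the JSON blob received on standard input.
func parseBootstrapParams(data []byte) (bootstrapParams, error) {
	var params bootstrapParams
	if err := json.Unmarshal(data, &params); err != nil {
		return params, fmt.Errorf("decoding bootstrap params: %w", err)
	}
	if params.Name == "" {
		return params, errors.New("bootstrap params do not include an instance name")
	}
	return params, nil
}

func main() {
	// Read everything that was sent on standard input.
	data, err := io.ReadAll(os.Stdin)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	params, err := parseBootstrapParams(data)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Log to stderr so that stdout stays free for the command's output.
	fmt.Fprintf(os.Stderr, "bootstrapping %s in pool %s\n", params.Name, params.PoolID)
}
```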
|
||||||
|
|
||||||
|
|
@ -220,11 +220,11 @@ Examples of external providers written in Go can be found at the following locat
|
||||||
|
|
||||||
### CreateInstance outputs
|
### CreateInstance outputs
|
||||||
|
|
||||||
On success, your executable is expected to print to standard output a json that can be deserialized into an ```Instance{}``` structure [defined here](https://github.com/cloudbase/garm/blob/6b3ea50ca54501595e541adde106703d289bb804/params/params.go#L90-L154).
|
On success, your executable is expected to print to standard output a json that can be deserialized into an `Instance{}` structure [defined here](https://github.com/cloudbase/garm/blob/6b3ea50ca54501595e541adde106703d289bb804/params/params.go#L90-L154).
|
||||||
|
|
||||||
Not all fields are expected to be populated by the provider. The ones that should be set are:
|
Not all fields are expected to be populated by the provider. The ones that should be set are:
|
||||||
|
|
||||||
```json
|
```json
|
||||||
{
|
{
|
||||||
"provider_id": "88818ff3-1fca-4cb5-9b37-84bfc3511ea6",
|
"provider_id": "88818ff3-1fca-4cb5-9b37-84bfc3511ea6",
|
||||||
"name": "garm-ny9HeeQYw2rl",
|
"name": "garm-ny9HeeQYw2rl",
|
||||||
|
|
@ -236,17 +236,17 @@ Not all fields are expected to be populated by the provider. The ones that shoul
|
||||||
"pool_id": "41c4a43a-acee-493a-965b-cf340b2c775d",
|
"pool_id": "41c4a43a-acee-493a-965b-cf340b2c775d",
|
||||||
"provider_fault": ""
|
"provider_fault": ""
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
In case of error, ```garm``` expects at the very least to see a non-zero exit code. If possible, your executable should return as much information as possible via the above ```json```, with the ```status``` field set to ```error``` and the ```provider_fault``` set to a meaningful error message describing what has happened. That information will be visible when doing a:
|
In case of error, `garm` expects at the very least to see a non-zero exit code. If possible, your executable should return as much information as possible via the above `json`, with the `status` field set to `error` and the `provider_fault` set to a meaningful error message describing what has happened. That information will be visible when doing a:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
garm-cli runner show <runner name>
|
garm-cli runner show <runner name>
|
||||||
```
|
```
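
The error contract described above (non-zero exit code, `status` set to `error`, a meaningful `provider_fault`) could look roughly like this in a Python provider. The helper names `build_error_payload` and `fail` are hypothetical, and the payload only includes fields that appear in the sample json; a real provider may need to return more.

```python
import json
import sys


def build_error_payload(name, message):
    """Build the json printed on failure: status "error" plus a provider_fault message."""
    return {
        "name": name,
        "status": "error",
        "provider_fault": message,
    }


def fail(name, message, exit_code=1):
    """Print the error payload to standard output and exit with a non-zero code."""
    print(json.dumps(build_error_payload(name, message)))
    sys.exit(exit_code)


# Show the payload only; a real provider would call fail() at the point of failure.
payload = build_error_payload("garm-ny9HeeQYw2rl", "requested image does not exist")
print(json.dumps(payload))
```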
|
||||||
|
|
||||||
## DeleteInstance
|
## DeleteInstance
|
||||||
|
|
||||||
The ```DeleteInstance``` command will permanently remove an instance from the cloud provider.
|
The `DeleteInstance` command will permanently remove an instance from the cloud provider.
|
||||||
|
|
||||||
The environment variables set for this command are:
|
The environment variables set for this command are:
|
||||||
|
|
||||||
|
|
@ -255,13 +255,13 @@ The environment variables set for this command are:
|
||||||
* GARM_INSTANCE_ID
|
* GARM_INSTANCE_ID
|
||||||
* GARM_PROVIDER_CONFIG_FILE
|
* GARM_PROVIDER_CONFIG_FILE
|
||||||
|
|
||||||
This command is not expected to output anything. On success it should simply ```exit 0```.
|
This command is not expected to output anything. On success it should simply `exit 0`.
|
||||||
|
|
||||||
If the target instance does not exist in the provider, this command is expected to be a no-op.
|
If the target instance does not exist in the provider, this command is expected to be a no-op.
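
The no-op requirement can be met by treating "not found" as success. The sketch below uses an in-memory `FakeCloud` class as a stand-in for a real cloud API; the class, the `InstanceNotFound` exception and `delete_instance` are all hypothetical names used only for illustration.

```python
class InstanceNotFound(Exception):
    """Raised by the stand-in cloud client when an instance does not exist."""


class FakeCloud:
    """In-memory stand-in for a real cloud API (hypothetical, for illustration)."""

    def __init__(self, instances):
        self.instances = set(instances)

    def delete(self, instance_id):
        if instance_id not in self.instances:
            raise InstanceNotFound(instance_id)
        self.instances.remove(instance_id)


def delete_instance(cloud, instance_id):
    """Delete the instance; a missing instance is treated as already deleted."""
    try:
        cloud.delete(instance_id)
    except InstanceNotFound:
        pass  # nothing to do: the command is expected to be a no-op here
    return 0  # exit code to use on success


cloud = FakeCloud({"vm-1"})
first = delete_instance(cloud, "vm-1")   # removes the instance
second = delete_instance(cloud, "vm-1")  # instance is gone, still succeeds
print(first, second)
```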
|
||||||
|
|
||||||
## GetInstance
|
## GetInstance
|
||||||
|
|
||||||
The ```GetInstance``` command will return details about the instance, as seen by the provider.
|
The `GetInstance` command will return details about the instance, as seen by the provider.
|
||||||
|
|
||||||
The environment variables set for this command are:
|
The environment variables set for this command are:
|
||||||
|
|
||||||
|
|
@ -270,13 +270,13 @@ The environment variables set for this command are:
|
||||||
* GARM_INSTANCE_ID
|
* GARM_INSTANCE_ID
|
||||||
* GARM_PROVIDER_CONFIG_FILE
|
* GARM_PROVIDER_CONFIG_FILE
|
||||||
|
|
||||||
On success, this command is expected to return a valid ```json``` that can be deserialized into an ```Instance{}``` structure (see CreateInstance). If possible, IP addresses allocated to the VM should be returned in addition to the sample ```json``` printed above.
|
On success, this command is expected to return a valid `json` that can be deserialized into an `Instance{}` structure (see CreateInstance). If possible, IP addresses allocated to the VM should be returned in addition to the sample `json` printed above.
|
||||||
|
|
||||||
On failure, this command is expected to return a non-zero exit code.
|
On failure, this command is expected to return a non-zero exit code.
|
||||||
|
|
||||||
## ListInstances
|
## ListInstances
|
||||||
|
|
||||||
The ```ListInstances``` command will print to standard output a json that can be deserialized into an **array** of ```Instance{}```.
|
The `ListInstances` command will print to standard output a json that can be deserialized into an **array** of `Instance{}`.
|
||||||
|
|
||||||
The environment variables set for this command are:
|
The environment variables set for this command are:
|
||||||
|
|
||||||
|
|
@ -285,15 +285,15 @@ The environment variables set for this command are:
|
||||||
* GARM_PROVIDER_CONFIG_FILE
|
* GARM_PROVIDER_CONFIG_FILE
|
||||||
* GARM_POOL_ID
|
* GARM_POOL_ID
|
||||||
|
|
||||||
This command must list all instances that have been tagged with the value in ```GARM_POOL_ID```.
|
This command must list all instances that have been tagged with the value in `GARM_POOL_ID`.
|
||||||
|
|
||||||
On success, a ```json``` is expected on standard output.
|
On success, a `json` is expected on standard output.
|
||||||
|
|
||||||
On failure, a non-zero exit code is expected.
|
On failure, a non-zero exit code is expected.
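
As a rough sketch of the filtering described above, a Python provider could select instances by their pool tag and print them as a json array. The input shape (dicts with `id`, `name` and a `tags` mapping holding `pool_id`) is an assumption for this example; how tags are stored depends entirely on the target cloud.

```python
import json


def list_pool_instances(all_instances, pool_id):
    """Return Instance-like dicts for every VM tagged with the given pool ID."""
    return [
        {"provider_id": vm["id"], "name": vm["name"], "pool_id": pool_id}
        for vm in all_instances
        if vm.get("tags", {}).get("pool_id") == pool_id
    ]


# Example data invented for this sketch.
vms = [
    {"id": "a1", "name": "garm-one", "tags": {"pool_id": "pool-1"}},
    {"id": "b2", "name": "garm-two", "tags": {"pool_id": "pool-2"}},
    {"id": "c3", "name": "unrelated", "tags": {}},
]
result = list_pool_instances(vms, "pool-1")
print(json.dumps(result))
```

When no instance matches, the function returns an empty list, which serializes to an empty json array rather than an error.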
|
||||||
|
|
||||||
## RemoveAllInstances
|
## RemoveAllInstances
|
||||||
|
|
||||||
The ```RemoveAllInstances``` operation will remove all resources created in a cloud that have been tagged with the ```GARM_CONTROLLER_ID```. External providers should tag all resources they create with the garm controller ID. That tag can then be used to identify all resources when attempting to delete all instances.
|
The `RemoveAllInstances` operation will remove all resources created in a cloud that have been tagged with the `GARM_CONTROLLER_ID`. External providers should tag all resources they create with the garm controller ID. That tag can then be used to identify all resources when attempting to delete all instances.
|
||||||
|
|
||||||
The environment variables set for this command are:
|
The environment variables set for this command are:
|
||||||
|
|
||||||
|
|
@ -309,7 +309,7 @@ Note: This command is currently not used by garm.
|
||||||
|
|
||||||
## Start
|
## Start
|
||||||
|
|
||||||
The ```Start``` operation will start the virtual machine in the selected cloud.
|
The `Start` operation will start the virtual machine in the selected cloud.
|
||||||
|
|
||||||
The environment variables set for this command are:
|
The environment variables set for this command are:
|
||||||
|
|
||||||
|
|
@ -324,9 +324,9 @@ On failure, a non-zero exit code is expected.
|
||||||
|
|
||||||
## Stop
|
## Stop
|
||||||
|
|
||||||
NOTE: This operation is currently not used by ```garm```, but should be implemented.
|
NOTE: This operation is currently not used by `garm`, but should be implemented.
|
||||||
|
|
||||||
The ```Stop``` operation will stop the virtual machine in the selected cloud.
|
The `Stop` operation will stop the virtual machine in the selected cloud.
|
||||||
|
|
||||||
Available environment variables:
|
Available environment variables:
|
||||||
|
|
||||||
|
|
|
||||||
4
doc/images/garm-dark.drawio.svg
Normal file
File diff suppressed because one or more lines are too long
|
After Width: | Height: | Size: 92 KiB |
4
doc/images/garm-light.drawio.svg
Normal file
File diff suppressed because one or more lines are too long
|
After Width: | Height: | Size: 91 KiB |
|
|
@ -1,64 +0,0 @@
|
||||||
# Provider configuration
|
|
||||||
|
|
||||||
GARM was designed to be extensible. Providers can be written as external executables which implement the needed interface to create/delete/list compute systems that are used by ```GARM``` to create runners.
|
|
||||||
|
|
||||||
- [Providers](#providers)
|
|
||||||
- [Available external providers](#available-external-providers)
|
|
||||||
|
|
||||||
## Providers
|
|
||||||
|
|
||||||
GARM delegates the functionality needed to create the runners to external executables. These executables can be either binaries or scripts. As long as they adhere to the needed interface, they can be used to create runners in any target IaaS. You might find this behavior familiar if you've ever had to deal with installing `CNIs` in `containerd`. The principle is the same.
|
|
||||||
|
|
||||||
The configuration for an external provider is quite simple:
|
|
||||||
|
|
||||||
```toml
|
|
||||||
# This is an example external provider. External providers are executables that
|
|
||||||
# implement the needed interface to create/delete/list compute systems that are used
|
|
||||||
# by GARM to create runners.
|
|
||||||
[[provider]]
|
|
||||||
name = "openstack_external"
|
|
||||||
description = "external openstack provider"
|
|
||||||
provider_type = "external"
|
|
||||||
[provider.external]
|
|
||||||
# config file passed to the executable via GARM_PROVIDER_CONFIG_FILE environment variable
|
|
||||||
config_file = "/etc/garm/providers.d/openstack/keystonerc"
|
|
||||||
# Absolute path to an executable that implements the provider logic. This executable can be
|
|
||||||
# anything (bash, a binary, python, etc). See documentation in this repo on how to write an
|
|
||||||
# external provider.
|
|
||||||
provider_executable = "/etc/garm/providers.d/openstack/garm-external-provider"
|
|
||||||
# This option will pass all environment variables that start with AWS_ to the provider.
|
|
||||||
# To pass in individual variables, you can add the entire name to the list.
|
|
||||||
environment_variables = ["AWS_"]
|
|
||||||
```
|
|
||||||
|
|
||||||
The external provider has three options:
|
|
||||||
|
|
||||||
* `provider_executable`
|
|
||||||
* `config_file`
|
|
||||||
* `environment_variables`
|
|
||||||
|
|
||||||
The ```provider_executable``` option is the absolute path to an executable that implements the provider logic. GARM will delegate all provider operations to this executable. This executable can be anything (bash, python, perl, go, etc). See [Writing an external provider](./external_provider.md) for more details.
|
|
||||||
|
|
||||||
The ```config_file``` option is a path on disk to an arbitrary file that is passed to the external executable via the environment variable ```GARM_PROVIDER_CONFIG_FILE```. This file is only relevant to the external provider. GARM itself does not read it. In the case of the sample OpenStack provider, this file contains access information for an OpenStack cloud (what you would typically find in a ```keystonerc``` file) as well as some provider-specific options like whether or not to boot from volume and which tenant network to use. You can check out the [sample config file](../contrib/providers.d/openstack/keystonerc) in this repository.
|
|
||||||
|
|
||||||
The `environment_variables` option is a list of environment variables that will be passed to the external provider. By default GARM will pass a clean env to providers, consisting only of variables that the [provider interface](./external_provider.md) expects. However, in some situations, a provider may need access to certain environment variables set in the env of GARM itself. This might be needed to enable access to IAM roles (EC2) or managed identity (Azure). This option takes a list of environment variables or prefixes of environment variables that will be passed to the provider. For example, if you want to pass all environment variables that start with `AWS_` to the provider, you can set this option to `["AWS_"]`.
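
To make the prefix behaviour concrete, here is a small Python sketch of that kind of filtering. It is not GARM's code: `filter_environment` is a hypothetical helper, and it treats every list entry as a prefix (so a full variable name matches itself), which is an assumption about the matching rules.

```python
def filter_environment(env, allowed):
    """Return the variables whose name starts with any entry in `allowed`."""
    return {
        name: value
        for name, value in env.items()
        if any(name.startswith(prefix) for prefix in allowed)
    }


# Example environment invented for this sketch.
env = {"AWS_REGION": "us-east-1", "AWS_PROFILE": "ci", "HOME": "/root"}
passed = filter_environment(env, ["AWS_"])
print(sorted(passed))
```

With an empty list nothing extra is passed through, which matches the clean-environment default described above.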
|
|
||||||
|
|
||||||
If you want to implement an external provider, you can use the file set in `config_file` for anything you need to pass into the binary when ```GARM``` calls it to execute a particular operation.
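
A provider written in Python might pick up that file along these lines. This is a minimal sketch under stated assumptions: `load_provider_config` is a hypothetical helper, and the file is read as plain text because its format is entirely up to the provider. The temporary file below only stands in for a real config.

```python
import os
import tempfile


def load_provider_config(environ):
    """Read the file named by GARM_PROVIDER_CONFIG_FILE and return its text."""
    path = environ.get("GARM_PROVIDER_CONFIG_FILE")
    if not path:
        raise RuntimeError("GARM_PROVIDER_CONFIG_FILE is not set")
    with open(path, encoding="utf-8") as handle:
        return handle.read()


# Demonstration with a temporary file standing in for the real config.
with tempfile.NamedTemporaryFile("w", suffix=".conf", delete=False) as tmp:
    tmp.write("region = example\n")
config_text = load_provider_config({"GARM_PROVIDER_CONFIG_FILE": tmp.name})
os.unlink(tmp.name)
print(config_text, end="")
```

In a real provider you would pass `os.environ` instead of a hand-built dict.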
|
|
||||||
|
|
||||||
### Available external providers
|
|
||||||
|
|
||||||
For non-testing purposes, the following external providers are currently available:
|
|
||||||
|
|
||||||
* [OpenStack](https://github.com/cloudbase/garm-provider-openstack)
|
|
||||||
* [Azure](https://github.com/cloudbase/garm-provider-azure)
|
|
||||||
* [Kubernetes](https://github.com/mercedes-benz/garm-provider-k8s) - Thanks to the amazing folks at @mercedes-benz for sharing their awesome provider!
|
|
||||||
* [LXD](https://github.com/cloudbase/garm-provider-lxd)
|
|
||||||
* [Incus](https://github.com/cloudbase/garm-provider-incus)
|
|
||||||
* [Equinix Metal](https://github.com/cloudbase/garm-provider-equinix)
|
|
||||||
* [Amazon EC2](https://github.com/cloudbase/garm-provider-aws)
|
|
||||||
* [Google Cloud Platform (GCP)](https://github.com/cloudbase/garm-provider-gcp)
|
|
||||||
* [Oracle Cloud Infrastructure (OCI)](https://github.com/cloudbase/garm-provider-oci)
|
|
||||||
|
|
||||||
Details on how to install and configure them are available in their respective repositories.
|
|
||||||
|
|
||||||
If you wrote a provider and would like to add it to the above list, feel free to open a PR.
|
|
||||||
|
|
@ -2,19 +2,18 @@
|
||||||
|
|
||||||
<!-- TOC -->
|
<!-- TOC -->
|
||||||
|
|
||||||
- [Quick start](#quick-start)
|
- [Create the config folder](#create-the-config-folder)
|
||||||
- [Create the config folder](#create-the-config-folder)
|
- [The config file](#the-config-file)
|
||||||
- [The config file](#the-config-file)
|
- [The provider section](#the-provider-section)
|
||||||
- [The provider section](#the-provider-section)
|
- [Starting the service](#starting-the-service)
|
||||||
- [Starting the service](#starting-the-service)
|
- [Using Docker](#using-docker)
|
||||||
- [Using Docker](#using-docker)
|
- [Setting up GARM as a system service](#setting-up-garm-as-a-system-service)
|
||||||
- [Setting up GARM as a system service](#setting-up-garm-as-a-system-service)
|
- [Initializing GARM](#initializing-garm)
|
||||||
- [Initializing GARM](#initializing-garm)
|
- [Setting up the webhook](#setting-up-the-webhook)
|
||||||
- [Setting up the webhook](#setting-up-the-webhook)
|
- [Creating a GitHub endpoint Optional](#creating-a-github-endpoint-optional)
|
||||||
- [Creating a GitHub endpoint Optional](#creating-a-github-endpoint-optional)
|
- [Adding credentials](#adding-credentials)
|
||||||
- [Adding credentials](#adding-credentials)
|
- [Define a repo](#define-a-repo)
|
||||||
- [Define a repo](#define-a-repo)
|
- [Create a pool](#create-a-pool)
|
||||||
- [Create a pool](#create-a-pool)
|
|
||||||
|
|
||||||
<!-- /TOC -->
|
<!-- /TOC -->
|
||||||
|
|
||||||
|
|
@ -26,11 +25,11 @@ All of our config files and data will be stored in `/etc/garm`. Let's create tha
|
||||||
sudo mkdir -p /etc/garm
|
sudo mkdir -p /etc/garm
|
||||||
```
|
```
|
||||||
|
|
||||||
Coincidentally, this is also where the docker container [looks for the config](../Dockerfile#L29) when it starts up. You can either use `Docker` or you can set up garm directly on your system. We'll walk you through both options. In both cases, we need to first create the config folder and a proper config file.
|
Coincidentally, this is also where the docker container [looks for the config](/Dockerfile#L29) when it starts up. You can either use `Docker` or you can set up garm directly on your system. We'll walk you through both options. In both cases, we need to first create the config folder and a proper config file.
|
||||||
|
|
||||||
## The config file
|
## The config file
|
||||||
|
|
||||||
There is a full config file, with detailed comments for each option, in the [testdata folder](../testdata/config.toml). You can use that as a reference. But for the purposes of this guide, we'll be using a minimal config file and add things on as we proceed.
|
There is a full config file, with detailed comments for each option, in the [testdata folder](/testdata/config.toml). You can use that as a reference. But for the purposes of this guide, we'll be using a minimal config file and add things on as we proceed.
|
||||||
|
|
||||||
Open `/etc/garm/config.toml` in your favorite editor and paste the following:
|
Open `/etc/garm/config.toml` in your favorite editor and paste the following:
|
||||||
|
|
||||||
|
|
@ -71,7 +70,7 @@ time_to_live = "8760h"
|
||||||
db_file = "/etc/garm/garm.db"
|
db_file = "/etc/garm/garm.db"
|
||||||
```
|
```
|
||||||
|
|
||||||
This is a minimal config, with no providers defined. In this example we have the [default](./config_default.md), [logging](./config_logging.md), [metrics](./config_metrics.md), [jwt_auth](./config_jwt_auth.md), [apiserver](./config_api_server.md) and [database](./database.md) sections. Each is documented separately. Feel free to read through the available docs if, for example, you need to enable TLS without using an nginx reverse proxy or if you want to enable the debug server, the log streamer or a log file.
|
This is a minimal config, with no providers defined. In this example we have the [default](/doc/config.md#the-default-config-section), [logging](/doc/config.md#the-logging-section), [metrics](/doc/config.md#the-metrics-section), [jwt_auth](/doc/config.md#the-jwt-authentication-config-section), [apiserver](/doc/config.md#the-api-server-config-section) and [database](/doc/config.md#database-configuration) sections. Each is documented separately. Feel free to read through the available docs if, for example, you need to enable TLS without using an nginx reverse proxy or if you want to enable the debug server, the log streamer or a log file.
|
||||||
|
|
||||||
In this sample config we:
|
In this sample config we:
|
||||||
|
|
||||||
|
|
@ -295,9 +294,9 @@ garm-cli profile switch another_garm
|
||||||
|
|
||||||
## Setting up the webhook
|
## Setting up the webhook
|
||||||
|
|
||||||
There are two options when it comes to setting up the webhook in GitHub. You can manually set up the webhook in the GitHub UI, and then use the resulting secret when creating the entity (repo, org, enterprise), or you can let GARM do it automatically if the app or PAT you're using has the [required privileges](./github_credentials.md).
|
There are two options when it comes to setting up the webhook in GitHub. You can manually set up the webhook in the GitHub UI, and then use the resulting secret when creating the entity (repo, org, enterprise), or you can let GARM do it automatically if the app or PAT you're using has the [required privileges](/doc/github_credentials.md).
|
||||||
|
|
||||||
If you want to manually set up the webhooks, have a look at the [webhooks doc](./webhooks.md) for more information.
|
If you want to manually set up the webhooks, have a look at the [webhooks doc](/doc/webhooks.md) for more information.
|
||||||
|
|
||||||
In this guide, I'll show you how to do it automatically when adding a new repo, assuming you have the required privileges. Note, you'll still have to manually set up webhooks if you want to use GARM at the enterprise level. Automatic webhook management is only available for repos and orgs.
|
In this guide, I'll show you how to do it automatically when adding a new repo, assuming you have the required privileges. Note, you'll still have to manually set up webhooks if you want to use GARM at the enterprise level. Automatic webhook management is only available for repos and orgs.
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -116,7 +116,7 @@ GARM uses providers to create runners. These providers are external executables
|
||||||
|
|
||||||
### Listing configured providers
|
### Listing configured providers
|
||||||
|
|
||||||
Once configured (see [provider configuration](/doc/providers.md)), you can list the configured providers by running the following command:
|
Once configured (see [provider configuration](/doc/config.md#providers)), you can list the configured providers by running the following command:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
ubuntu@garm:~$ garm-cli provider list
|
ubuntu@garm:~$ garm-cli provider list
|
||||||
|
|
|
||||||