Fix instance JWT token expiration

The instance JWT token expiration time was set at 15 minutes, regardless
of bootstrap timeout. This meant that instances that take longer than 15
minutes, would not be able to send their status updates and github agent
ID back to garm.

Signed-off-by: Gabriel Adrian Samfira <gsamfira@cloudbasesolutions.com>
This commit is contained in:
Gabriel Adrian Samfira 2022-07-17 07:21:20 +00:00
parent 5d02de92d3
commit 067197c1b5
4 changed files with 24 additions and 17 deletions

View file

@ -43,9 +43,11 @@ type InstanceJWTClaims struct {
jwt.StandardClaims
}
func NewInstanceJWTToken(instance params.Instance, secret, entity string, poolType common.PoolType) (string, error) {
// make TTL configurable?
expireToken := time.Now().Add(15 * time.Minute).Unix()
func NewInstanceJWTToken(instance params.Instance, secret, entity string, poolType common.PoolType, ttlMinutes uint) (string, error) {
// Token expiration is equal to the bootstrap timeout set on the pool plus the polling
// interval garm uses to check for timed out runners. Runners that have not sent their info
// by the end of this interval are most likely failed and will be reaped by garm anyway.
expireToken := time.Now().Add(time.Duration(ttlMinutes)*time.Minute + common.PoolReapTimeoutInterval).Unix()
claims := InstanceJWTClaims{
StandardClaims: jwt.StandardClaims{
ExpiresAt: expireToken,

View file

@ -21,7 +21,7 @@ In the webhook configuration page under ```Content type``` you will need to sele
The webhook secret must be secure. Use something like this to generate one:
```bash
gabriel@rossak:~$ function generate_secret () {
gabriel@rossak:~$ function generate_secret () {
tr -dc 'a-zA-Z0-9!@#$%^&*()_+?><~\`;' < /dev/urandom | head -c 64;
echo ''
}
@ -35,19 +35,13 @@ Next, you can choose which events GitHub should send to ```garm``` via webhooks.
## The callback_url option
If you want your runners to be able to call back home and update their status as they install, you will need to configure the ```callback_url``` option in the ```garm``` server config. This URL needs to point to the following API endpoint:
Your runners will call back home with status updates as they install. Once they are set up, they will also send the GitHub agent ID they were allocated. You will need to configure the ```callback_url``` option in the ```garm``` server config. This URL needs to point to the following API endpoint:
```
POST /api/v1/callbacks/status
```
While not critical, this allows instances to call back home, set their own status as installation procedes and send back messages which can be viewed by running:
```bash
garm-cli runner show <runner_name>
```
For example:
Example of a runner sending status updates:
```bash
garm-cli runner show garm-f5227755-129d-4e2d-b306-377a8f3a5dfe
@ -82,6 +76,6 @@ For example, in a scenario where you expose the API endpoint directly, this sett
callback_url = "https://garm.example.com/api/v1/callbacks/status"
```
Authentication is done using a short-lived (15 minutes) JWT token, that gets generated for a particular instance that we are spinning up. That JWT token only has access to update it's own status. No other API endpoints will work with that JWT token.
Authentication is done using a short-lived JWT token, that gets generated for a particular instance that we are spinning up. That JWT token only has access to update it's own status. No other API endpoints will work with that JWT token. The validity of the token is equal to the pool bootstrap timeout value (default 20 minutes) plus the garm polling interval (5 minutes).
There is a sample ```nginx``` config [in the testdata folder](/testdata/nginx-server.conf). Feel free to customize it whichever way you see fit.

View file

@ -16,6 +16,7 @@ package common
import (
"garm/params"
"time"
)
type PoolType string
@ -23,6 +24,10 @@ type PoolType string
const (
RepositoryPool PoolType = "repository"
OrganizationPool PoolType = "organization"
PoolConsilitationInterval = 5 * time.Second
PoolReapTimeoutInterval = 5 * time.Minute
PoolToolUpdateInterval = 3 * time.Hour
)
type PoolManager interface {

View file

@ -334,9 +334,9 @@ func (r *basePool) AddRunner(ctx context.Context, poolID string) error {
}
func (r *basePool) loop() {
consolidateTimer := time.NewTicker(5 * time.Second)
reapTimer := time.NewTicker(5 * time.Minute)
toolUpdateTimer := time.NewTicker(3 * time.Hour)
consolidateTimer := time.NewTicker(common.PoolConsilitationInterval)
reapTimer := time.NewTicker(common.PoolReapTimeoutInterval)
toolUpdateTimer := time.NewTicker(common.PoolToolUpdateInterval)
defer func() {
log.Printf("repository %s loop exited", r.helper.String())
consolidateTimer.Stop()
@ -416,8 +416,14 @@ func (r *basePool) addInstanceToProvider(instance params.Instance) error {
return errors.Wrap(err, "fetching registration token")
}
jwtValidity := pool.RunnerTimeout()
var poolType common.PoolType = common.RepositoryPool
if pool.OrgID != "" {
poolType = common.OrganizationPool
}
entity := r.helper.String()
jwtToken, err := auth.NewInstanceJWTToken(instance, r.helper.JwtToken(), entity, common.RepositoryPool)
jwtToken, err := auth.NewInstanceJWTToken(instance, r.helper.JwtToken(), entity, poolType, jwtValidity)
if err != nil {
return errors.Wrap(err, "fetching instance jwt token")
}