forgejo-runner-optimiser/CLAUDE.md
Martin McCaffery d0aea88a5b
refactor: Rename recommender to sizer
2026-02-13 16:42:37 +01:00

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Build and Development Commands

# Build
make build                    # Build the collector binary
go build -o collector ./cmd/collector
go build -o receiver ./cmd/receiver

# Test
make test                     # Run all tests
go test -v ./...              # Run all tests with verbose output
go test -v ./internal/collector/...  # Run tests for a specific package
make test-coverage            # Run tests with coverage report

# Code Quality
make fmt                      # Format code
make vet                      # Run go vet
make lint                     # Run golangci-lint (v2.6.2)
make all                      # Format, vet, lint, and build

# Git Hooks
make install-hooks            # Install pre-commit and commit-msg hooks

Architecture Overview

A resource optimiser for CI/CD environments with shared PID namespaces. It consists of two binaries: a collector and a receiver, which includes the sizer.

Collector (cmd/collector)

Runs alongside CI workloads, periodically reads the /proc filesystem, and pushes a summary to the receiver on shutdown (SIGINT/SIGTERM).

Data Flow:

  1. metrics.Aggregator reads /proc to collect CPU/memory for all processes
  2. collector.Collector orchestrates collection at intervals and writes to output
  3. summary.Accumulator tracks samples across the run, computing peak/avg/percentiles
  4. On shutdown, summary.PushClient sends the summary to the receiver HTTP endpoint
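
As an illustration of step 3, the sketch below accumulates samples and derives peak, average, and a 95th percentile. The Stats and Accumulator types here are assumptions made for the example, and the nearest-rank percentile method is one common choice. The actual summary package may use different fields and a different method.

```go
package main

import (
	"fmt"
	"sort"
)

// Stats is a hypothetical summary shape, not the repository's type.
type Stats struct {
	Peak, Avg, P95 float64
}

// Accumulator keeps every sample so percentiles can be computed at the end.
type Accumulator struct {
	samples []float64
}

// Add records one sample.
func (a *Accumulator) Add(v float64) { a.samples = append(a.samples, v) }

// Summarize returns the peak, the mean, and the nearest-rank 95th percentile.
func (a *Accumulator) Summarize() Stats {
	n := len(a.samples)
	if n == 0 {
		return Stats{}
	}
	// Sort a copy so the recorded order is left untouched.
	sorted := append([]float64(nil), a.samples...)
	sort.Float64s(sorted)

	sum := 0.0
	for _, v := range sorted {
		sum += v
	}
	// Nearest rank: ceil(0.95 * n), computed with integer arithmetic.
	rank := (95*n + 99) / 100
	return Stats{Peak: sorted[n-1], Avg: sum / float64(n), P95: sorted[rank-1]}
}

func main() {
	var acc Accumulator
	for _, v := range []float64{0.4, 1.2, 0.8, 2.0} {
		acc.Add(v)
	}
	fmt.Printf("%+v\n", acc.Summarize())
}
```

Keeping all samples is the simplest approach. For very long runs, a streaming estimator would bound memory at the cost of exactness.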

Receiver (cmd/receiver)

HTTP service that stores metric summaries in SQLite (via GORM), provides a query API, and includes the sizer — which computes right-sized Kubernetes resource requests and limits from historical data.

Key Endpoints:

  • POST /api/v1/metrics - Receive metrics from collectors
  • GET /api/v1/metrics/repo/{org}/{repo}/{workflow}/{job} - Query stored metrics
  • GET /api/v1/sizing/repo/{org}/{repo}/{workflow}/{job} - Compute container sizes from historical data
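
A client needs to fill the four path parameters of the sizing route. The helper below is a hypothetical sketch that builds such a URL. Whether the receiver decodes percent-escaped segments is an assumption, and the response format is not specified here, so the example stops at constructing the URL.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// sizingURL builds the sizing endpoint URL for one job. Each path segment is
// escaped so that names containing spaces or slashes stay in their own slot.
// (Assumption: the receiver accepts percent-escaped segments.)
func sizingURL(base, org, repo, workflow, job string) string {
	return fmt.Sprintf("%s/api/v1/sizing/repo/%s/%s/%s/%s",
		strings.TrimRight(base, "/"),
		url.PathEscape(org),
		url.PathEscape(repo),
		url.PathEscape(workflow),
		url.PathEscape(job),
	)
}

func main() {
	// The org, repo, workflow, and job values are made up for illustration.
	fmt.Println(sizingURL("http://localhost:8080", "acme", "app", "ci.yaml", "build"))
}
```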

Internal Packages

| Package | Purpose |
|---|---|
| internal/collector | Orchestrates collection loop, handles shutdown |
| internal/metrics | Aggregates system/process metrics from /proc |
| internal/proc | Low-level /proc parsing (stat, status, cgroup) |
| internal/cgroup | Parses CGROUP_LIMITS and CGROUP_PROCESS_MAP env vars |
| internal/summary | Accumulates samples, computes stats, pushes to receiver |
| internal/receiver | HTTP handlers, SQLite store, and sizer logic |
| internal/output | Metrics output formatting (JSON/text) |
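
To give a sense of the low-level /proc parsing, here is a minimal sketch that extracts the user and system CPU tick counters from one line of /proc/<pid>/stat. The function name is hypothetical and the real internal/proc package may be structured differently. The field positions follow the Linux proc(5) layout, where utime and stime are fields 14 and 15.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseStatCPU extracts utime and stime (in clock ticks) from one line of
// /proc/<pid>/stat.
func parseStatCPU(line string) (utime, stime uint64, err error) {
	// Field 2 (comm) is wrapped in parentheses and may itself contain spaces
	// or ')', so everything up to the LAST ')' is skipped.
	end := strings.LastIndexByte(line, ')')
	if end < 0 {
		return 0, 0, fmt.Errorf("malformed stat line: no closing parenthesis")
	}
	fields := strings.Fields(line[end+1:])
	// fields[0] is field 3 (state), so fields 14 and 15 sit at indexes 11 and 12.
	if len(fields) < 13 {
		return 0, 0, fmt.Errorf("malformed stat line: %d fields after comm", len(fields))
	}
	if utime, err = strconv.ParseUint(fields[11], 10, 64); err != nil {
		return 0, 0, err
	}
	if stime, err = strconv.ParseUint(fields[12], 10, 64); err != nil {
		return 0, 0, err
	}
	return utime, stime, nil
}

func main() {
	// A shortened, made-up stat line for illustration.
	line := "1234 (my proc) S 1 1234 1234 0 -1 4194304 100 0 0 0 250 75 0 0 20 0 1 0 5000"
	u, s, err := parseStatCPU(line)
	fmt.Println(u, s, err)
}
```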

Container Metrics

The collector groups processes by container using cgroup paths. Two environment variables configure this:

  • CGROUP_PROCESS_MAP: JSON mapping process names to container names (e.g., {"node":"runner"})
  • CGROUP_LIMITS: JSON with CPU/memory limits per container for percentage calculations
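
The CGROUP_PROCESS_MAP example above is a flat JSON object, so it can be decoded into a string map. The sketch below shows one way to do that. The function name and the handling of an unset variable are assumptions. CGROUP_LIMITS is left out because its exact JSON shape is not given here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// parseProcessMap decodes a JSON object such as {"node":"runner"} into a
// process-name to container-name map.
func parseProcessMap(raw string) (map[string]string, error) {
	if raw == "" {
		// Assumed behaviour: an unset variable means "no mapping".
		return map[string]string{}, nil
	}
	var m map[string]string
	if err := json.Unmarshal([]byte(raw), &m); err != nil {
		return nil, fmt.Errorf("CGROUP_PROCESS_MAP: %w", err)
	}
	return m, nil
}

func main() {
	m, err := parseProcessMap(os.Getenv("CGROUP_PROCESS_MAP"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for process, container := range m {
		fmt.Printf("%s -> %s\n", process, container)
	}
}
```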

CPU values in container metrics are reported as cores (not percentage), enabling direct comparison with Kubernetes resource limits.

Commit Message Convention

Commit messages follow the Conventional Commits format, enforced by a git hook:

<type>(<scope>)?: <description>

Types: feat, fix, chore, docs, style, refactor, perf, test, build, ci

Examples:

  • feat: add user authentication
  • fix(collector): handle nil cgroup paths
  • feat!: breaking change in API
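
For illustration, the convention can be checked with a regular expression like the one below. This is a sketch, not the repository's hook: the optional "!" before the colon is inferred from the third example above, and the rule that only the first line is validated is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// commitRe matches "<type>(<scope>)?: <description>" with the types listed
// above, plus an optional "!" breaking-change marker before the colon.
var commitRe = regexp.MustCompile(
	`^(feat|fix|chore|docs|style|refactor|perf|test|build|ci)(\([^()\s]+\))?!?: \S.*$`)

// validCommit checks only the subject (first) line of a commit message.
func validCommit(msg string) bool {
	subject := strings.SplitN(msg, "\n", 2)[0]
	return commitRe.MatchString(subject)
}

func main() {
	for _, msg := range []string{
		"feat: add user authentication",
		"fix(collector): handle nil cgroup paths",
		"feat!: breaking change in API",
		"update stuff",
	} {
		fmt.Printf("%-45q %v\n", msg, validCommit(msg))
	}
}
```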

Running Locally

# Run receiver
./receiver --addr=:8080 --db=metrics.db

# Run collector with push endpoint
./collector --interval=2s --top=10 --push-endpoint=http://localhost:8080/api/v1/metrics

# Docker Compose stress test
docker compose -f test/docker/docker-compose-stress.yaml up -d