Commit graph

12 commits

Author SHA1 Message Date
eb01c1c842
fix(receiver): return Payload as JSON object instead of string
All checks were successful
ci / build (push) Successful in 1m49s
Changed the API response to embed Payload as a JSON object using
json.RawMessage instead of returning it as a JSON-encoded string
inside the JSON response. Added MetricResponse type with ToResponse
converter method.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-06 15:22:27 +01:00
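A minimal sketch of the technique named in eb01c1c842, embedding a stored payload as a JSON object via json.RawMessage. Only MetricResponse, ToResponse, and Payload come from the commit message; the Metric type, the ID field, the JSON key names, and the encode helper are hypothetical and may not match the repository.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Metric is a hypothetical stored record; Payload holds JSON text as saved.
type Metric struct {
	ID      uint
	Payload string
}

// MetricResponse embeds the payload as raw JSON instead of a quoted string.
type MetricResponse struct {
	ID      uint            `json:"id"`
	Payload json.RawMessage `json:"payload"`
}

// ToResponse converts the stored record into its API representation.
func (m Metric) ToResponse() MetricResponse {
	return MetricResponse{ID: m.ID, Payload: json.RawMessage(m.Payload)}
}

// encode marshals the response; it returns an error if Payload is not valid JSON.
func encode(m Metric) (string, error) {
	b, err := json.Marshal(m.ToResponse())
	return string(b), err
}

func main() {
	out, err := encode(Metric{ID: 1, Payload: `{"cpu":1.5}`})
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // {"id":1,"payload":{"cpu":1.5}}
}
```

With a plain string field the same record would serialize as "payload":"{\"cpu\":1.5}", which forces clients to decode twice.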
0af8c28bc2
fix(aggregator): prevent CPU cores overflow when processes restart
All checks were successful
ci / build (push) Successful in 28s
Guard against unsigned integer underflow in cgroup CPU calculation.
When processes exit and new ones start, totalTicks can be less than
the previous value, causing the subtraction to wrap around to a huge
positive number.

Now checks totalTicks >= prev before calculating delta, treating
process churn as 0 CPU usage for that sample.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-06 15:15:30 +01:00
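The guard described in 0af8c28bc2 can be sketched as follows. The names totalTicks and prev come from the commit message; the function name tickDelta and the uint64 type are assumptions made for illustration.

```go
package main

import "fmt"

// tickDelta returns the ticks consumed since the previous sample.
// If the total went down (processes exited and new ones started),
// it returns 0 instead of letting the unsigned subtraction wrap around.
func tickDelta(totalTicks, prev uint64) uint64 {
	if totalTicks >= prev {
		return totalTicks - prev
	}
	return 0
}

func main() {
	fmt.Println(tickDelta(1500, 1000)) // 500
	fmt.Println(tickDelta(200, 1000))  // 0

	// Without the guard, the same subtraction wraps around.
	var total, prev uint64 = 200, 1000
	fmt.Println(total - prev) // 18446744073709550816
}
```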
6770cfcea7
feat(summary): add per-container metrics with extended percentiles
All checks were successful
ci / build (push) Successful in 34s
- Extend StatSummary with p99, p75, p50 percentiles (in addition to peak, p95, avg)
- Add ContainerSummary type for per-container CPU cores and memory bytes stats
- Track container metrics from Cgroups map in Accumulator
- Include containers array in RunSummary sent to receiver

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-06 15:01:01 +01:00
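As an illustration of the statistics listed in 6770cfcea7, here is a sketch that computes peak, p99, p95, p75, p50, and avg from a slice of samples. The commit does not say how percentiles are calculated; the nearest-rank method, the field names, and the summarize function below are assumptions.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// StatSummary mirrors the statistics named in the commit; field names are assumed.
type StatSummary struct {
	Peak, P99, P95, P75, P50, Avg float64
}

// percentile returns the nearest-rank percentile of an ascending-sorted slice.
func percentile(sorted []float64, p float64) float64 {
	if len(sorted) == 0 {
		return 0
	}
	rank := int(math.Ceil(p / 100 * float64(len(sorted))))
	if rank < 1 {
		rank = 1
	}
	return sorted[rank-1]
}

// summarize sorts a copy of the samples and derives all statistics from it.
func summarize(samples []float64) StatSummary {
	if len(samples) == 0 {
		return StatSummary{}
	}
	sorted := append([]float64(nil), samples...)
	sort.Float64s(sorted)
	var sum float64
	for _, v := range sorted {
		sum += v
	}
	return StatSummary{
		Peak: sorted[len(sorted)-1],
		P99:  percentile(sorted, 99),
		P95:  percentile(sorted, 95),
		P75:  percentile(sorted, 75),
		P50:  percentile(sorted, 50),
		Avg:  sum / float64(len(sorted)),
	}
}

func main() {
	s := summarize([]float64{1, 2, 3, 4, 5, 6, 7, 8, 9, 10})
	fmt.Printf("%+v\n", s)
}
```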
5e470c33a5
feat(collector): group CPU and memory metrics by cgroup
All checks were successful
ci / build (push) Successful in 30s
Add cgroup-based process grouping to the resource collector. Processes are
grouped by their cgroup path, with container names resolved via configurable
process-to-container mapping.

New features:
- Read cgroup info from /proc/[pid]/cgroup (supports v1 and v2)
- Parse K8s resource notation (500m, 1Gi, etc.) for CPU/memory limits
- Group metrics by container using CGROUP_PROCESS_MAP env var
- Calculate usage percentages against limits from CGROUP_LIMITS env var
- Output cgroup metrics with CPU cores used, memory RSS, and percentages

Environment variables:
- CGROUP_PROCESS_MAP: Map process names to container names for discovery
- CGROUP_LIMITS: Define CPU/memory limits per container in K8s notation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-06 14:50:36 +01:00
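Two of the steps listed in 5e470c33a5 lend themselves to a short sketch: extracting the cgroup path from a /proc/[pid]/cgroup line and parsing K8s resource notation. The function names are hypothetical, only a common subset of suffixes is handled, and the collector's actual parsing may differ.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// cgroupPath extracts the path from one /proc/[pid]/cgroup line, which has the
// form "hierarchy-ID:controller-list:path" (v2 lines look like "0::/path").
// It returns "" for malformed lines.
func cgroupPath(line string) string {
	parts := strings.SplitN(line, ":", 3)
	if len(parts) != 3 {
		return ""
	}
	return parts[2]
}

// parseCPU converts K8s CPU notation to cores: "500m" is 0.5, "2" is 2.
func parseCPU(s string) (float64, error) {
	if strings.HasSuffix(s, "m") {
		n, err := strconv.ParseFloat(strings.TrimSuffix(s, "m"), 64)
		if err != nil {
			return 0, err
		}
		return n / 1000, nil
	}
	return strconv.ParseFloat(s, 64)
}

// parseMemory converts K8s memory notation to bytes for integer quantities
// with binary (Ki, Mi, Gi, Ti) or decimal (k, M, G, T) suffixes.
func parseMemory(s string) (uint64, error) {
	units := []struct {
		suffix string
		mult   uint64
	}{
		{"Ki", 1 << 10}, {"Mi", 1 << 20}, {"Gi", 1 << 30}, {"Ti", 1 << 40},
		{"k", 1e3}, {"M", 1e6}, {"G", 1e9}, {"T", 1e12},
	}
	for _, u := range units {
		if strings.HasSuffix(s, u.suffix) {
			n, err := strconv.ParseUint(strings.TrimSuffix(s, u.suffix), 10, 64)
			if err != nil {
				return 0, err
			}
			return n * u.mult, nil
		}
	}
	return strconv.ParseUint(s, 10, 64)
}

func main() {
	fmt.Println(cgroupPath("0::/kubepods/pod1")) // /kubepods/pod1
	cpu, _ := parseCPU("500m")
	mem, _ := parseMemory("1Gi")
	fmt.Println(cpu, mem) // 0.5 1073741824
}
```

A usage percentage then follows as used / limit * 100, for example 0.25 cores used against a 500m limit is 50%.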
0bf7dfee38
test: add integration tests for collector-receiver interaction
All checks were successful
ci / build (push) Successful in 38s
Add integration tests that verify the push client can successfully
send metrics to the receiver and they are stored correctly in SQLite.

Tests:
- TestPushClientToReceiver: Direct HTTP POST verification
- TestPushClientIntegration: Full PushClient with env vars
- TestMultiplePushes: Multiple pushes and filtering (parallel-safe)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-06 12:12:36 +01:00
7da7dc138f
refactor(receiver): change query endpoint to filter by workflow and job
All checks were successful
ci / build (push) Successful in 32s
Replace /api/v1/metrics/run/{runID} and /api/v1/metrics/repo/{org}/{repo}
with /api/v1/metrics/repo/{org}/{repo}/{workflow}/{job} for more precise
filtering by workflow and job name.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-06 12:00:22 +01:00
cfe583fbc4
feat(collector): add HTTP push for metrics to receiver
All checks were successful
ci / build (push) Successful in 46s
Add push client that sends run summary to a configurable HTTP endpoint
on shutdown. Execution context is read from GitHub Actions style
environment variables (with Gitea fallbacks).

New flag: -push-endpoint <url>

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-06 11:44:20 +01:00
c309bd810d
feat(receiver): add HTTP metrics receiver with SQLite storage
All checks were successful
ci / build (push) Successful in 2m33s
Add a new receiver application under cmd/receiver that accepts metrics
via HTTP POST and stores them in SQLite using GORM. The receiver expects
GitHub Actions style execution context (org, repo, workflow, job, run_id).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-06 11:40:03 +01:00
c5c872a373
fix(output): correct JSON serialization of top process metrics
All checks were successful
ci / build (push) Successful in 1m56s
slog.Any() does not properly serialize slices of slog.Group() attributes,
resulting in broken output like {"Key":"","Value":{}}. Fixed by passing
structs with JSON tags directly to slog.Any() instead.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-06 11:00:34 +01:00
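The fix in c5c872a373 can be illustrated with a sketch that passes a slice of structs with JSON tags to slog.Any(). The ProcessMetric type, its fields, the attribute key, and the render helper are hypothetical; the timestamp is dropped only to keep the output stable.

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
)

// ProcessMetric is a hypothetical per-process record with JSON tags.
type ProcessMetric struct {
	PID  int     `json:"pid"`
	Name string  `json:"name"`
	CPU  float64 `json:"cpu_percent"`
}

// render logs the slice through a JSON handler and returns the emitted line.
func render(top []ProcessMetric) string {
	var buf bytes.Buffer
	h := slog.NewJSONHandler(&buf, &slog.HandlerOptions{
		// Discard the top-level time attribute so output is deterministic.
		ReplaceAttr: func(groups []string, a slog.Attr) slog.Attr {
			if len(groups) == 0 && a.Key == slog.TimeKey {
				return slog.Attr{}
			}
			return a
		},
	})
	// The JSON handler encodes the struct slice with encoding/json,
	// so the field tags determine the output keys.
	slog.New(h).Info("top processes", slog.Any("top_cpu", top))
	return buf.String()
}

func main() {
	fmt.Print(render([]ProcessMetric{{PID: 1, Name: "go", CPU: 12.5}}))
}
```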
7201a527d8
feat(collector): summarize metrics at the end of the process
All checks were successful
ci / build (push) Successful in 1m39s
2026-02-04 16:21:17 +01:00
ddaf5fbd0f
chore: update module path to DevFW-CICD org
Some checks failed
ci / goreleaser (push) Failing after 1s
Rename module from edp.buildth.ing/DevFW/forgejo-runner-resource-collector
to edp.buildth.ing/DevFW-CICD/forgejo-runner-resource-collector
2026-02-04 14:25:50 +01:00
219d26959f
feat: add resource collector for Forgejo runners
Add Go application that collects CPU and RAM metrics from /proc filesystem:
- Parse /proc/[pid]/stat for CPU usage (user/system time)
- Parse /proc/[pid]/status for memory usage (RSS, VmSize, etc.)
- Aggregate metrics across all processes
- Output via structured logging (JSON/text)
- Continuous collection with configurable interval

Designed for monitoring pipeline runner resource utilization to enable
dynamic runner sizing.
2026-02-04 14:13:24 +01:00
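The /proc/[pid]/stat parsing step from 219d26959f can be sketched like this. The function name parseStat is hypothetical. The field positions follow the proc(5) layout, where utime and stime are fields 14 and 15; the sketch splits after the last ')' because the command name may itself contain spaces or parentheses.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// parseStat extracts user and system CPU time (in clock ticks) from the
// contents of /proc/[pid]/stat.
func parseStat(line string) (utime, stime uint64, err error) {
	end := strings.LastIndexByte(line, ')')
	if end < 0 {
		return 0, 0, errors.New("malformed stat line: no command name")
	}
	// fields[0] is field 3 (state), so field N sits at index N-3.
	fields := strings.Fields(line[end+1:])
	if len(fields) < 13 {
		return 0, 0, errors.New("malformed stat line: too few fields")
	}
	if utime, err = strconv.ParseUint(fields[11], 10, 64); err != nil {
		return 0, 0, err
	}
	if stime, err = strconv.ParseUint(fields[12], 10, 64); err != nil {
		return 0, 0, err
	}
	return utime, stime, nil
}

func main() {
	line := "42 (my proc) S 1 42 42 0 -1 4194304 100 0 0 0 150 30 0 0 20 0 1 0 12345"
	u, s, err := parseStat(line)
	if err != nil {
		panic(err)
	}
	fmt.Println(u, s) // 150 30
}
```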