Fix duplicate titles in Autonomous UAT Agent docs

Tom Sakretz 2026-02-04 10:57:28 +01:00
parent 5e4ef8df05
commit 6276f75216
15 changed files with 1 addition and 39 deletions

@@ -1,12 +1,10 @@
---
title: "Autonomous UAT Agent"
-linkTitle: "autonomous-uat-agent"
+linkTitle: "Autonomous UAT Agent"
weight: 10
description: >
General documentation for D66 and the Autonomous UAT Agent
---
-# General Documentation (D66)
This section contains the core documentation for D66, focusing on how the Autonomous UAT Agent works and how to run it.

@@ -5,9 +5,6 @@ weight: 5
description: >
Visual workflow of a typical Agent S (Autonomous UAT Agent) run (gui_agent_cli.py) across Ministral, Holo, and VNC
---
-# Agent Workflow Diagram (Autonomous UAT Agent)
This page provides a **visual sketch** of the typical workflow (example: `gui_agent_cli.py`).
## Workflow (fallback without Mermaid)

@@ -6,8 +6,6 @@ description: >
Thinking vs grounding model split for D66 (current state and target state)
---
-# Model Stack
For a visual overview of how the models interact with the VNC-based GUI automation loop, see: [Workflow Diagram](./agent-workflow-diagram.md)
## Requirement

@@ -5,9 +5,6 @@ weight: 20
description: >
Results, findings, and evidence artifacts for D66
---
-# Results & Findings (D66)
This section contains the outputs that support D66 claims: findings summaries and pointers to logs, screenshots, and run artifacts.
## Pages

@@ -6,8 +6,6 @@ description: >
Evidence pack (screenshots + logs) for the golden run on www.telekom.de header navigation
---
-# Golden Run: Telekom Header Navigation
This page is the evidence pack for the **Autonomous UAT Agent** golden run on **www.telekom.de**.
## Run intent

@@ -6,8 +6,6 @@ description: >
Where to find logs, screenshots, and reports relevant to D66
---
-# Logs & Artifacts
## Repo locations
- Local calibration and run logs: `logs/`

@@ -5,9 +5,6 @@ weight: 1
description: >
What was validated and where to find the evidence
---
-# PoC Validation Evidence
## What was validated
- Autonomous GUI interaction via the Autonomous UAT Agent (Agent S3-based scripts)

@@ -7,8 +7,6 @@ description: >
Raw results (PoC Validation) for Use Case 1 Agent S2
---
-# Use Case 1 Results Agent S2
## Context
- Use Case: **Functional correctness of UI elements**

@@ -7,8 +7,6 @@ description: >
Raw results (PoC Validation) for Use Case 1 Anthropic Computer Use
---
-# Use Case 1 Results Anthropic Computer Use
## Context
- Use Case: **Functional correctness of UI elements**

@@ -6,9 +6,6 @@ draft: true
description: >
Comparison of Agent S2 vs. Anthropic Computer Use (task-based UX analysis)
---
-# Use Case 3 Expert Assessment (Agent Frameworks)
- Both AI agents complete the task.
- While Agent S2 tends to give technical recommendations for optimization, Anthropic Computer Use focuses on usability improvements.
- In practice, you first have to scroll to find the button for the newsletter subscription. The assumption is that real users would already spend some time searching here.

@@ -5,9 +5,6 @@ weight: 1
description: >
What was validated in the PoC and where to find the evidence
---
-# PoC Validation Evidence
This page summarizes what was validated in the Proof of Concept (PoC) for the **Autonomous UAT Agent** and where to find supporting evidence.
## Scope and objective

@@ -5,9 +5,6 @@ weight: 10
description: >
Legacy PoC run executed with the Anthropic API and Claude Sonnet 4.5 to validate functional UI correctness checks
---
-# Run 1 Functional Correctness (Legacy PoC)
## Purpose
This run demonstrates how an autonomous agent can systematically validate the **functional correctness** of UI elements (e.g., links, buttons, menus, dialogs, forms) and produce a structured result set.

@@ -5,9 +5,6 @@ weight: 20
description: >
Legacy PoC run executed with the Anthropic API and Claude Sonnet 4.5 to assess visual quality and UX consistency
---
-# Run 2 Visual Quality & Consistency (UX Health Check) (Legacy PoC)
## Purpose
This run demonstrates how an autonomous agent can perform a lightweight **UX health check** focused on visual quality and consistency, and turn observations into actionable recommendations.

@@ -5,9 +5,6 @@ weight: 30
description: >
Legacy PoC run executed with the Anthropic API and Claude Sonnet 4.5 to analyze end-to-end task flows and usability friction
---
-# Run 3 Task-based UX Analysis (Legacy PoC)
## Purpose
This run demonstrates how an autonomous agent can execute a representative **end-to-end user task** and produce an analysis of usability and experience quality.

@@ -6,8 +6,6 @@ description: >
How to run the key D66 evaluation scripts and what they produce
---
-# Running Autonomous UAT Agent Scripts
The **Autonomous UAT Agent** is the overall UX/UI testing use case built on top of the Agent S codebase and scripts in this repo.
All commands below assume you are running from the **Agent-S repository root** (Linux/ECS), `~/Projects/Agent_S3/Agent-S`. To do that, connect to the server via SSH. You will need a key pair for authentication and an open inbound port in the firewall. For information on how to obtain the key pair and request firewall access, contact [tom.sakretz@telekom.de](mailto:tom.sakretz@telekom.de).