diff --git a/content/en/docs/Autonomous UAT Agent/_index.md b/content/en/docs/Autonomous UAT Agent/_index.md
index 0800a62..03b6689 100644
--- a/content/en/docs/Autonomous UAT Agent/_index.md
+++ b/content/en/docs/Autonomous UAT Agent/_index.md
@@ -1,12 +1,10 @@
 ---
 title: "Autonomous UAT Agent"
-linkTitle: "autonomous-uat-agent"
+linkTitle: "Autonomous UAT Agent"
 weight: 10
 description: >
   General documentation for D66 and the Autonomous UAT Agent
 ---
 
-# General Documentation (D66)
-
 This section contains the core documentation for D66, focusing on how the Autonomous UAT Agent works and how to run it.
 
diff --git a/content/en/docs/Autonomous UAT Agent/agent-workflow-diagram.md b/content/en/docs/Autonomous UAT Agent/agent-workflow-diagram.md
index ff85f0d..dd57d5d 100644
--- a/content/en/docs/Autonomous UAT Agent/agent-workflow-diagram.md
+++ b/content/en/docs/Autonomous UAT Agent/agent-workflow-diagram.md
@@ -5,9 +5,6 @@ weight: 5
 description: >
   Visual workflow of a typical Agent S (Autonomous UAT Agent) run (gui_agent_cli.py) across Ministral, Holo, and VNC
 ---
-
-# Agent Workflow Diagram (Autonomous UAT Agent)
-
 This page provides a **visual sketch** of the typical workflow (example: `gui_agent_cli.py`).
 
 ## Workflow (fallback without Mermaid)
diff --git a/content/en/docs/Autonomous UAT Agent/model-stack.md b/content/en/docs/Autonomous UAT Agent/model-stack.md
index d0589a6..557d30b 100644
--- a/content/en/docs/Autonomous UAT Agent/model-stack.md
+++ b/content/en/docs/Autonomous UAT Agent/model-stack.md
@@ -6,8 +6,6 @@ description: >
   Thinking vs grounding model split for D66 (current state and target state)
 ---
 
-# Model Stack
-
 For a visual overview of how the models interact with the VNC-based GUI automation loop, see: [Workflow Diagram](./agent-workflow-diagram.md)
 
 ## Requirement
diff --git a/content/en/docs/Autonomous UAT Agent/results/_index.md b/content/en/docs/Autonomous UAT Agent/results/_index.md
index 8de5742..2c5e733 100644
--- a/content/en/docs/Autonomous UAT Agent/results/_index.md
+++ b/content/en/docs/Autonomous UAT Agent/results/_index.md
@@ -5,9 +5,6 @@ weight: 20
 description: >
   Results, findings, and evidence artifacts for D66
 ---
-
-# Results & Findings (D66)
-
 This section contains the outputs that support D66 claims: findings summaries and pointers to logs, screenshots, and run artifacts.
 
 ## Pages
diff --git a/content/en/docs/Autonomous UAT Agent/results/golden-run-telekom-header-nav/index.md b/content/en/docs/Autonomous UAT Agent/results/golden-run-telekom-header-nav/index.md
index fd2b196..f6990f3 100644
--- a/content/en/docs/Autonomous UAT Agent/results/golden-run-telekom-header-nav/index.md
+++ b/content/en/docs/Autonomous UAT Agent/results/golden-run-telekom-header-nav/index.md
@@ -6,8 +6,6 @@ description: >
   Evidence pack (screenshots + logs) for the golden run on www.telekom.de header navigation
 ---
 
-# Golden Run: Telekom Header Navigation
-
 This page is the evidence pack for the **Autonomous UAT Agent** golden run on **www.telekom.de**.
 
 ## Run intent
diff --git a/content/en/docs/Autonomous UAT Agent/results/logs-and-artifacts.md b/content/en/docs/Autonomous UAT Agent/results/logs-and-artifacts.md
index cee348f..d2c740e 100644
--- a/content/en/docs/Autonomous UAT Agent/results/logs-and-artifacts.md
+++ b/content/en/docs/Autonomous UAT Agent/results/logs-and-artifacts.md
@@ -6,8 +6,6 @@ description: >
   Where to find logs, screenshots, and reports relevant to D66
 ---
 
-# Logs & Artifacts
-
 ## Repo locations
 
 - Local calibration and run logs: `logs/`
diff --git a/content/en/docs/Autonomous UAT Agent/results/poc-validation.md b/content/en/docs/Autonomous UAT Agent/results/poc-validation.md
index f2a345a..de96342 100644
--- a/content/en/docs/Autonomous UAT Agent/results/poc-validation.md
+++ b/content/en/docs/Autonomous UAT Agent/results/poc-validation.md
@@ -5,9 +5,6 @@ weight: 1
 description: >
   What was validated and where to find the evidence
 ---
-
-# PoC Validation Evidence
-
 ## What was validated
 
 - Autonomous GUI interaction via the Autonomous UAT Agent (Agent S3-based scripts)
diff --git a/content/en/docs/Autonomous UAT Agent/results/poc-validation/POC Validation Confluence docs/1- Funktionale Korrektheit von UI-Elementen/1 - Ergebnisse Agent S2.md b/content/en/docs/Autonomous UAT Agent/results/poc-validation/POC Validation Confluence docs/1- Funktionale Korrektheit von UI-Elementen/1 - Ergebnisse Agent S2.md
index 2c21b9e..f70341f 100644
--- a/content/en/docs/Autonomous UAT Agent/results/poc-validation/POC Validation Confluence docs/1- Funktionale Korrektheit von UI-Elementen/1 - Ergebnisse Agent S2.md
+++ b/content/en/docs/Autonomous UAT Agent/results/poc-validation/POC Validation Confluence docs/1- Funktionale Korrektheit von UI-Elementen/1 - Ergebnisse Agent S2.md
@@ -7,8 +7,6 @@ description: >
   Roh-Ergebnisse (PoC Validation) für Use Case 1 – Agent S2
 ---
 
-# Use Case 1 – Ergebnisse Agent S2
-
 ## Kontext
 
 - Use Case: **Funktionale Korrektheit von UI-Elementen**
diff --git a/content/en/docs/Autonomous UAT Agent/results/poc-validation/POC Validation Confluence docs/1- Funktionale Korrektheit von UI-Elementen/1 - Ergebnisse Anthropic Computer Use.md b/content/en/docs/Autonomous UAT Agent/results/poc-validation/POC Validation Confluence docs/1- Funktionale Korrektheit von UI-Elementen/1 - Ergebnisse Anthropic Computer Use.md
index fe11c9f..4ea639d 100644
--- a/content/en/docs/Autonomous UAT Agent/results/poc-validation/POC Validation Confluence docs/1- Funktionale Korrektheit von UI-Elementen/1 - Ergebnisse Anthropic Computer Use.md
+++ b/content/en/docs/Autonomous UAT Agent/results/poc-validation/POC Validation Confluence docs/1- Funktionale Korrektheit von UI-Elementen/1 - Ergebnisse Anthropic Computer Use.md
@@ -7,8 +7,6 @@ description: >
   Roh-Ergebnisse (PoC Validation) für Use Case 1 – Anthropic Computer Use
 ---
 
-# Use Case 1 – Ergebnisse Anthropic Computer Use
-
 ## Kontext
 
 - Use Case: **Funktionale Korrektheit von UI-Elementen**
diff --git a/content/en/docs/Autonomous UAT Agent/results/poc-validation/POC Validation Confluence docs/3 - Task-based UX-Analyse/3 - Experteneinschätzung zum Vergleich der Agent Frameworks.md b/content/en/docs/Autonomous UAT Agent/results/poc-validation/POC Validation Confluence docs/3 - Task-based UX-Analyse/3 - Experteneinschätzung zum Vergleich der Agent Frameworks.md
index 896071f..2220d8a 100644
--- a/content/en/docs/Autonomous UAT Agent/results/poc-validation/POC Validation Confluence docs/3 - Task-based UX-Analyse/3 - Experteneinschätzung zum Vergleich der Agent Frameworks.md
+++ b/content/en/docs/Autonomous UAT Agent/results/poc-validation/POC Validation Confluence docs/3 - Task-based UX-Analyse/3 - Experteneinschätzung zum Vergleich der Agent Frameworks.md
@@ -6,9 +6,6 @@ draft: true
 description: >
   Vergleich Agent S2 vs. Anthropic Computer Use (Task-based UX-Analyse)
 ---
-
-# Use Case 3 – Experteneinschätzung (Agent Frameworks)
-
 - Beide KI Agenten erledigen die Aufgabe.
 - Während Agent S2 eher technische Empfehlungen zur Optimierung gibt, fokussiert sich Anthropic Computer Use auf Usability Optimierungen.
 - Tatsächlich muss man erstmal scrollen, um den Button für das Newsletter Abo zu finden. Annahme, dass hier reale Nutzer bereits etwas mit der Suche beschäftigt wären.
diff --git a/content/en/docs/Autonomous UAT Agent/results/poc-validation/_index.md b/content/en/docs/Autonomous UAT Agent/results/poc-validation/_index.md
index 966d717..1f94994 100644
--- a/content/en/docs/Autonomous UAT Agent/results/poc-validation/_index.md
+++ b/content/en/docs/Autonomous UAT Agent/results/poc-validation/_index.md
@@ -5,9 +5,6 @@ weight: 1
 description: >
   What was validated in the PoC and where to find the evidence
 ---
-
-# PoC Validation Evidence
-
 This page summarizes what was validated in the Proof of Concept (PoC) for the **Autonomous UAT Agent** and where to find supporting evidence.
 
 ## Scope and objective
diff --git a/content/en/docs/Autonomous UAT Agent/results/poc-validation/run-1-functional-correctness/index.md b/content/en/docs/Autonomous UAT Agent/results/poc-validation/run-1-functional-correctness/index.md
index 4bc0d69..a8db9ae 100644
--- a/content/en/docs/Autonomous UAT Agent/results/poc-validation/run-1-functional-correctness/index.md
+++ b/content/en/docs/Autonomous UAT Agent/results/poc-validation/run-1-functional-correctness/index.md
@@ -5,9 +5,6 @@ weight: 10
 description: >
   Legacy PoC run executed with the Anthropic API and Claude Sonnet 4.5 to validate functional UI correctness checks
 ---
-
-# Run 1 – Functional Correctness (Legacy PoC)
-
 ## Purpose
 
 This run demonstrates how an autonomous agent can systematically validate the **functional correctness** of UI elements (e.g., links, buttons, menus, dialogs, forms) and produce a structured result set.
diff --git a/content/en/docs/Autonomous UAT Agent/results/poc-validation/run-2-ux-health-check/index.md b/content/en/docs/Autonomous UAT Agent/results/poc-validation/run-2-ux-health-check/index.md
index 88de621..fe71b41 100644
--- a/content/en/docs/Autonomous UAT Agent/results/poc-validation/run-2-ux-health-check/index.md
+++ b/content/en/docs/Autonomous UAT Agent/results/poc-validation/run-2-ux-health-check/index.md
@@ -5,9 +5,6 @@ weight: 20
 description: >
   Legacy PoC run executed with the Anthropic API and Claude Sonnet 4.5 to assess visual quality and UX consistency
 ---
-
-# Run 2 – Visual Quality & Consistency (UX Health Check) (Legacy PoC)
-
 ## Purpose
 
 This run demonstrates how an autonomous agent can perform a lightweight **UX health check** focused on visual quality and consistency, and turn observations into actionable recommendations.
diff --git a/content/en/docs/Autonomous UAT Agent/results/poc-validation/run-3-task-based-ux-analysis/index.md b/content/en/docs/Autonomous UAT Agent/results/poc-validation/run-3-task-based-ux-analysis/index.md
index 6c781d0..3753746 100644
--- a/content/en/docs/Autonomous UAT Agent/results/poc-validation/run-3-task-based-ux-analysis/index.md
+++ b/content/en/docs/Autonomous UAT Agent/results/poc-validation/run-3-task-based-ux-analysis/index.md
@@ -5,9 +5,6 @@ weight: 30
 description: >
   Legacy PoC run executed with the Anthropic API and Claude Sonnet 4.5 to analyze end-to-end task flows and usability friction
 ---
-
-# Run 3 – Task-based UX Analysis (Legacy PoC)
-
 ## Purpose
 
 This run demonstrates how an autonomous agent can execute a representative **end-to-end user task** and produce an analysis of usability and experience quality.
diff --git a/content/en/docs/Autonomous UAT Agent/running-auata-scripts.md b/content/en/docs/Autonomous UAT Agent/running-auata-scripts.md
index 3408f3f..c728f80 100644
--- a/content/en/docs/Autonomous UAT Agent/running-auata-scripts.md
+++ b/content/en/docs/Autonomous UAT Agent/running-auata-scripts.md
@@ -6,8 +6,6 @@ description: >
   How to run the key D66 evaluation scripts and what they produce
 ---
 
-# Running Autonomous UAT Agent Scripts
-
 The **Autonomous UAT Agent** is the overall UX/UI testing use case built on top of the Agent S codebase and scripts in this repo.
 
 All commands below assume you are running from the **Agent-S repository root** (Linux/ECS), `~/Projects/Agent_S3/Agent-S`. To do that, connect to the server via SSH. You will need a key pair for authentication and an open inbound port in the firewall. For information on how to obtain the key pair and request firewall access, contact [tom.sakretz@telekom.de](mailto:tom.sakretz@telekom.de).