Minor updates to existing pages

2026-01-30 15:23:28 +01:00 · 097724558a
commit 097724558a
parent 242a4b8a79
4 changed files with 51 additions and 58 deletions
--- a/content/en/docs/Autonomous
+++ b/content/en/docs/Autonomous
@@ -10,12 +10,3 @@ description: >

 This section contains the core documentation for D66, focusing on how the Autonomous UAT Agent works and how to run it.

-## Pages
-
- [Overview](./overview.md)
- [Quickstart](./quickstart.md)
- [Running Autonomous UAT Agent Scripts](./running-auata-scripts.md)
- [Workflow Diagram](./agent-workflow-diagram.md)
- [Model Stack](./model-stack.md)
- [Outputs & Artifacts](./outputs-and-artifacts.md)
- [Troubleshooting](./troubleshooting.md)
--- a/Agent/agent-workflow-diagram.md
+++ b/Agent/agent-workflow-diagram.md
@@ -10,42 +10,68 @@ description: >

 This page provides a **visual sketch** of the typical workflow (example: `gui_agent_cli.py`).

+## Workflow (fallback without Mermaid)
+
+If Mermaid rendering is not available or fails in your build, this section shows the same workflow as plain text.
+
+```text
+Operator/Prompt
+  -> gui_agent_cli.py
+     -> (1) Planning request  -> Ministral vLLM (thinking)
+     <-     Next action intent
+     -> (2) Screenshot capture -> VNC Desktop / Firefox
+     <-     PNG screenshot
+     -> (3) Grounding request  -> Holo vLLM (vision)
+     <-     Coordinates + element metadata
+     -> (4) Execute action     -> VNC Desktop / Firefox
+     -> Artifacts saved        -> results/ (logs, screenshots, JSON)
+```
+
+| Step | From | To | What | Output |
+|---:|---|---|---|---|
+| 0 | Operator | gui_agent_cli.py | Provide goal / prompt | Goal text |
+| 1 | gui_agent_cli.py | Ministral vLLM | Plan next step (text) | Next action intent |
+| 2 | gui_agent_cli.py | VNC Desktop | Capture screenshot | PNG screenshot |
+| 3 | gui_agent_cli.py | Holo vLLM | Ground UI element(s) | Coordinates + element metadata |
+| 4 | gui_agent_cli.py | VNC Desktop | Execute click/type/scroll | UI state change |
+| 5 | gui_agent_cli.py | results/ | Persist evidence | Logs + screenshots + JSON |
+
 ## High-level data flow

 ```mermaid
 flowchart LR
  %% Left-to-right overview of one typical agent loop

-  user[Operator / Prompt] --> cli[Agent S script (gui_agent_cli.py)]
+  user[Operator / Prompt] --> cli[Agent S script<br/>gui_agent_cli.py]

-  subgraph otc[OTC (Open Telekom Cloud)]
-    subgraph ecsMin["ecs_ministral_L4 (164.30.28.242:8001) - Ministral vLLM"]
-      ministral[(Ministral 3 8B - Thinking / Planning)]
+  subgraph OTC[OTC (Open Telekom Cloud)]
+    subgraph MIN_HOST[ecs_ministral_L4]
+      MIN[(Ministral 3 8B<br/>Thinking / Planning)]
    end

-    subgraph ecsHolo["ecs_holo_A40 (164.30.22.166:8000) - Holo vLLM"]
-      holo[(Holo 1.5-7B - Vision / Grounding)]
+    subgraph HOLO_HOST[ecs_holo_A40]
+      HOLO[(Holo 1.5-7B<br/>Vision / Grounding)]
    end

-    subgraph ecsGui["GUI test target (VNC + Firefox)"]
-      vnc[VNC / Desktop]
-      browser[Firefox]
+    subgraph TARGET[GUI test target]
+      VNC[VNC / Desktop]
+      FF[Firefox]
+      VNC --> FF
    end
  end

-  cli -->|1. plan step (vLLM_THINKING_ENDPOINT)| ministral
-  ministral -->|next action (click/type/wait)| cli
+  cli -->|1. plan step<br/>vLLM_THINKING_ENDPOINT| MIN
+  MIN -->|next action<br/>click / type / wait| cli

-  cli -->|2. capture screenshot| vnc
-  vnc -->|screenshot (PNG)| cli
+  cli -->|2. capture screenshot| VNC
+  VNC -->|screenshot (PNG)| cli

-  cli -->|3. grounding request (vLLM_VISION_ENDPOINT)| holo
-  holo -->|coordinates + UI element info| cli
+  cli -->|3. grounding request<br/>vLLM_VISION_ENDPOINT| HOLO
+  HOLO -->|coordinates + UI element info| cli

-  cli -->|4. execute action (mouse/keyboard)| vnc
-  vnc --> browser
+  cli -->|4. execute action<br/>mouse / keyboard| VNC

-  cli -->|logs + screenshots (results/ folder)| artifacts[(Artifacts: logs, screenshots, JSON comms)]
+  cli -->|logs + screenshots| artifacts[(Artifacts<br/>logs, screenshots, JSON comms)]
 ```

 ## Sequence (one loop)
@@ -55,9 +81,9 @@ sequenceDiagram
  autonumber
  actor U as Operator
  participant CLI as gui_agent_cli.py
-  participant MIN as Ministral vLLM\n(ecs_ministral_L4)
-  participant VNC as VNC Desktop\n(Firefox)
-  participant HOLO as Holo vLLM\n(ecs_holo_A40)
+  participant MIN as Ministral vLLM (ecs_ministral_L4)
+  participant VNC as VNC Desktop (Firefox)
+  participant HOLO as Holo vLLM (ecs_holo_A40)

  U->>CLI: Provide goal / prompt

--- a/Agent/results/golden-run-telekom-header-nav/index.md
+++ b/Agent/results/golden-run-telekom-header-nav/index.md
@@ -35,32 +35,11 @@ Return findings in the 'issues' field as a list of objects:
 - problem: What doesn't work
 - recommendation: How to fix it
 If no problems found, return an empty array: []" \
-  --max-steps 30
+  --max-steps 15
 ```

 ## Artifacts

-### In-repo evidence (this page bundle)
-
-Place the evidence files here:
-
- Screenshots: `screenshots/`
- Text log: `logs/run.log`
- Optional JSON communication log(s): `logs/calibration_log_*.json`
-
-If you have ~15 screenshots, name them in a stable order, e.g.:
-
- `screenshots/uat_agent_step_001.png` … `screenshots/uat_agent_step_015.png`
-
-### Runtime output location (where they come from)
-
-The CLI defaults to:
-
- `./results/gui_agent_cli/<timestamp>/screenshots/`
- `./results/gui_agent_cli/<timestamp>/logs/run.log`
-
-Copy the files you want to publish into this page bundle so they render in the docs.
-
 ## Screenshot gallery

 ### Thumbnail grid (recommended for many screenshots)
@@ -135,8 +114,3 @@ Click any thumbnail to open the full image.
  {{< figure src="screenshots/uat_agent_step_013.png" caption="Step 013" >}}

 </details>
-
-## Notes
-
- If repo size becomes an issue, publish only a curated subset (e.g. 6–8 key frames) and link to the full run folder externally.
- If you want a thumbnail grid instead of full-width figures, say so and BMad Master will add a compact gallery layout.
--- a/Agent/running-auata-scripts.md
+++ b/Agent/running-auata-scripts.md
@@ -12,6 +12,8 @@ The **Autonomous UAT Agent** is the overall UX/UI testing use case built on top

 All commands below assume you are running from the **Agent-S repository root** (Linux/ECS), `~/Projects/Agent_S3/Agent-S`. To do that, connect to the server via SSH. You will need a key pair for authentication and an open inbound port in the firewall. For information on how to obtain the key pair and request firewall access, contact [tom.sakretz@telekom.de](mailto:tom.sakretz@telekom.de).

+## Template for running a script from command line terminal
+
 ### 1) Connect from Windows

 ```powershell
@@ -36,7 +38,7 @@ firefox &

 ### 3) One-command recommended run (ECS)

-If you only run one thing to produce clean, repeatable evidence (screenshots with click markers), run the following command CLI:
+If you only want to produce clean, repeatable evidence (screenshots with click markers), run the following command CLI:

 ```bash
 python staging_scripts/gui_agent_cli.py --prompt "Go to telekom.de and click the cart icon" --max-steps 10