Minor updates to existing pages
parent 242a4b8a79
commit 097724558a
4 changed files with 51 additions and 58 deletions

@@ -10,12 +10,3 @@ description: >

This section contains the core documentation for D66, focusing on how the Autonomous UAT Agent works and how to run it.

## Pages

- [Overview](./overview.md)
- [Quickstart](./quickstart.md)
- [Running Autonomous UAT Agent Scripts](./running-auata-scripts.md)
- [Workflow Diagram](./agent-workflow-diagram.md)
- [Model Stack](./model-stack.md)
- [Outputs & Artifacts](./outputs-and-artifacts.md)
- [Troubleshooting](./troubleshooting.md)

@@ -10,42 +10,68 @@ description: >

This page provides a **visual sketch** of the typical workflow (example: `gui_agent_cli.py`).

## Workflow (fallback without Mermaid)

If Mermaid rendering is not available or fails in your build, this section shows the same workflow as plain text.

```text
Operator/Prompt
-> gui_agent_cli.py
-> (1) Planning request -> Ministral vLLM (thinking)
<- Next action intent
-> (2) Screenshot capture -> VNC Desktop / Firefox
<- PNG screenshot
-> (3) Grounding request -> Holo vLLM (vision)
<- Coordinates + element metadata
-> (4) Execute action -> VNC Desktop / Firefox
-> Artifacts saved -> results/ (logs, screenshots, JSON)
```

| Step | From | To | What | Output |
|---:|---|---|---|---|
| 0 | Operator | gui_agent_cli.py | Provide goal / prompt | Goal text |
| 1 | gui_agent_cli.py | Ministral vLLM | Plan next step (text) | Next action intent |
| 2 | gui_agent_cli.py | VNC Desktop | Capture screenshot | PNG screenshot |
| 3 | gui_agent_cli.py | Holo vLLM | Ground UI element(s) | Coordinates + element metadata |
| 4 | gui_agent_cli.py | VNC Desktop | Execute click/type/scroll | UI state change |
| 5 | gui_agent_cli.py | results/ | Persist evidence | Logs + screenshots + JSON |
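
The steps in the table above can be read as one loop iteration. The following is a minimal sketch of that loop, not the actual `gui_agent_cli.py` implementation: `AgentLoop`, `StepRecord`, the four callables, and the `"done"` stop signal are all hypothetical names introduced for illustration, and the model and desktop calls are replaced by stubs so the example runs on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class StepRecord:
    """Evidence for one loop iteration (what step 5 would persist)."""
    intent: str
    screenshot: bytes
    coordinates: Tuple[int, int]


@dataclass
class AgentLoop:
    # All four callables are hypothetical stand-ins, not gui_agent_cli.py APIs.
    plan: Callable[[str, List[StepRecord]], str]      # step 1: planning model
    capture: Callable[[], bytes]                      # step 2: PNG screenshot
    ground: Callable[[str, bytes], Tuple[int, int]]   # step 3: grounding model
    execute: Callable[[str, Tuple[int, int]], None]   # step 4: click/type/scroll
    history: List[StepRecord] = field(default_factory=list)

    def run(self, goal: str, max_steps: int) -> List[StepRecord]:
        """Repeat plan -> capture -> ground -> execute until done or capped."""
        for _ in range(max_steps):
            intent = self.plan(goal, self.history)
            if intent == "done":  # assumed stop signal, for illustration only
                break
            screenshot = self.capture()
            coords = self.ground(intent, screenshot)
            self.execute(intent, coords)
            # Step 5: keep the evidence; a real run would write it to results/.
            self.history.append(StepRecord(intent, screenshot, coords))
        return self.history


def fake_plan(goal: str, history: List[StepRecord]) -> str:
    """Stub planner: one click, then stop."""
    return "click cart icon" if not history else "done"


clicks: List[Tuple[int, int]] = []
loop = AgentLoop(
    plan=fake_plan,
    capture=lambda: b"\x89PNG",                   # placeholder bytes, not a real image
    ground=lambda intent, png: (640, 48),         # made-up coordinates
    execute=lambda intent, xy: clicks.append(xy),
)
records = loop.run("Go to telekom.de and click the cart icon", max_steps=10)
print(f"{len(records)} step(s) recorded, clicks: {clicks}")
```

Passing the four stages in as callables keeps the loop independent of any particular endpoint, which is why the stubs can stand in for the planning model, the grounding model, and the VNC desktop here.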

## High-level data flow

```mermaid
flowchart LR
%% Left-to-right overview of one typical agent loop

user[Operator / Prompt] --> cli[Agent S script (gui_agent_cli.py)]
user[Operator / Prompt] --> cli[Agent S script<br/>gui_agent_cli.py]

subgraph otc[OTC (Open Telekom Cloud)]
subgraph ecsMin["ecs_ministral_L4 (164.30.28.242:8001) - Ministral vLLM"]
ministral[(Ministral 3 8B - Thinking / Planning)]
subgraph OTC[OTC (Open Telekom Cloud)]
subgraph MIN_HOST[ecs_ministral_L4]
MIN[(Ministral 3 8B<br/>Thinking / Planning)]
end

subgraph ecsHolo["ecs_holo_A40 (164.30.22.166:8000) - Holo vLLM"]
holo[(Holo 1.5-7B - Vision / Grounding)]
subgraph HOLO_HOST[ecs_holo_A40]
HOLO[(Holo 1.5-7B<br/>Vision / Grounding)]
end

subgraph ecsGui["GUI test target (VNC + Firefox)"]
vnc[VNC / Desktop]
browser[Firefox]
subgraph TARGET[GUI test target]
VNC[VNC / Desktop]
FF[Firefox]
VNC --> FF
end
end

cli -->|1. plan step (vLLM_THINKING_ENDPOINT)| ministral
ministral -->|next action (click/type/wait)| cli
cli -->|1. plan step<br/>vLLM_THINKING_ENDPOINT| MIN
MIN -->|next action<br/>click / type / wait| cli

cli -->|2. capture screenshot| vnc
vnc -->|screenshot (PNG)| cli
cli -->|2. capture screenshot| VNC
VNC -->|screenshot (PNG)| cli

cli -->|3. grounding request (vLLM_VISION_ENDPOINT)| holo
holo -->|coordinates + UI element info| cli
cli -->|3. grounding request<br/>vLLM_VISION_ENDPOINT| HOLO
HOLO -->|coordinates + UI element info| cli

cli -->|4. execute action (mouse/keyboard)| vnc
vnc --> browser
cli -->|4. execute action<br/>mouse / keyboard| VNC

cli -->|logs + screenshots (results/ folder)| artifacts[(Artifacts: logs, screenshots, JSON comms)]
cli -->|logs + screenshots| artifacts[(Artifacts<br/>logs, screenshots, JSON comms)]
```

## Sequence (one loop)

@@ -55,9 +81,9 @@ sequenceDiagram
autonumber
actor U as Operator
participant CLI as gui_agent_cli.py
participant MIN as Ministral vLLM\n(ecs_ministral_L4)
participant VNC as VNC Desktop\n(Firefox)
participant HOLO as Holo vLLM\n(ecs_holo_A40)
participant MIN as Ministral vLLM (ecs_ministral_L4)
participant VNC as VNC Desktop (Firefox)
participant HOLO as Holo vLLM (ecs_holo_A40)

U->>CLI: Provide goal / prompt

@@ -35,32 +35,11 @@ Return findings in the 'issues' field as a list of objects:
- problem: What doesn't work
- recommendation: How to fix it
If no problems found, return an empty array: []" \
--max-steps 30
--max-steps 15
```

## Artifacts

### In-repo evidence (this page bundle)

Place the evidence files here:

- Screenshots: `screenshots/`
- Text log: `logs/run.log`
- Optional JSON communication log(s): `logs/calibration_log_*.json`

If you have ~15 screenshots, name them in a stable order, e.g.:

- `screenshots/uat_agent_step_001.png` … `screenshots/uat_agent_step_015.png`

### Runtime output location (where they come from)

The CLI defaults to:

- `./results/gui_agent_cli/<timestamp>/screenshots/`
- `./results/gui_agent_cli/<timestamp>/logs/run.log`

Copy the files you want to publish into this page bundle so they render in the docs.

## Screenshot gallery

### Thumbnail grid (recommended for many screenshots)

@@ -135,8 +114,3 @@ Click any thumbnail to open the full image.
{{< figure src="screenshots/uat_agent_step_013.png" caption="Step 013" >}}

</details>

## Notes

- If repo size becomes an issue, publish only a curated subset (e.g. 6–8 key frames) and link to the full run folder externally.
- If you want a thumbnail grid instead of full-width figures, say so and BMad Master will add a compact gallery layout.

@@ -12,6 +12,8 @@ The **Autonomous UAT Agent** is the overall UX/UI testing use case built on top

All commands below assume you are running from the **Agent-S repository root** (Linux/ECS), `~/Projects/Agent_S3/Agent-S`. To do that, connect to the server via SSH. You will need a key pair for authentication and an open inbound port in the firewall. For information on how to obtain the key pair and request firewall access, contact [tom.sakretz@telekom.de](mailto:tom.sakretz@telekom.de).

## Template for running a script from command line terminal

### 1) Connect from Windows

```powershell

@@ -36,7 +38,7 @@ firefox &

### 3) One-command recommended run (ECS)

If you only run one thing to produce clean, repeatable evidence (screenshots with click markers), run the following command CLI:
If you only want to produce clean, repeatable evidence (screenshots with click markers), run the following command CLI:

```bash
python staging_scripts/gui_agent_cli.py --prompt "Go to telekom.de and click the cart icon" --max-steps 10