docs(operations): added operations chapter

Stephan Lo 2025-12-16 12:16:35 +01:00
parent c75aa06d04
commit 9c16f17968
5 changed files with 35 additions and 45 deletions


@@ -3,72 +3,62 @@ title: "Operations"
linkTitle: "Operations"
weight: 40
description: >
Operational guides for deploying, monitoring, and maintaining the Edge Developer Platform.
Operational guides for deploying, monitoring, and maintaining the Edge Developer Platform components.
---
{{% alert title="Draft" color="warning" %}}
**Editorial Status**: This page is currently being developed.
* **Jira Ticket**: TBD
* **Assignee**: Team
* **Status**: Draft - Structure only
* **Last Updated**: 2025-11-16
* **TODO**:
* [ ] Add deployment procedures
* [ ] Document monitoring setup and dashboards
* [ ] Include troubleshooting guides
* [ ] Add maintenance procedures
{{% /alert %}}
## Operations Overview
This section covers operational aspects of the Edge Developer Platform including deployment, monitoring, troubleshooting, and maintenance.
This section covers operational aspects of the Edge Developer Platform. In general there is no dedicated operations mode: the platform is simply monitored and fixed in a developer mode of operation.
## Deployment
## Deployments
### Platform Deployment
### EDP
Instructions for deploying EDP components to your infrastructure.
EDP runs on two OTC clusters. Because everything is code, this simply means that we run the [infra-deploy pipeline](https://edp.buildth.ing/DevFW/infra-deploy/actions?workflow=deploy.yaml&actor=0&status=0) twice.
### Application Deployment
![alt text](otc-hub.png)
#### Further references for infrastructure information
* OTC:
* [IPCEI-CIS Confluence](https://confluence.telekom-mms.com/spaces/IPCEICIS/pages/1000105031/OTC)
### Edge Connect
From the perspective of eDF, the Edge and Orka clouds in Edge Connect are just deployment targets for EDP applications, so we are merely users of them. Physically, both are Gardener Kubernetes clusters.
Users are basically intended to use only the web UI: https://hub.apps.edge.platform.mg3.mdb.osc.live
![alt text](edge-hub.png)
#### Further references for infrastructure information
![alt text](gardener.png)
We also have cluster-level access for operational issues (see the picture above). How to obtain access is described in the following links:
* [IPCEI-CIS Confluence](https://confluence.telekom-mms.com/spaces/IPCEICIS/pages/1122869593/Edge+Cloud)
* [IPCEI-CIS Jira](https://jira.telekom-mms.com/browse/IPCEICIS-6222?focusedId=3411527&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-3411527)
Guides for deploying applications to the platform using available deployment methods.
## Monitoring & Observability
### Metrics
On EDP, the observability cluster is meant to monitor the platform stacks, e.g. via [Grafana](https://grafana.observability.buildth.ing).
However, there is no operational monitoring lifecycle in place. We have not defined metrics or alerts, as there is no operational mode yet.
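Since no metrics or alerts are defined yet, a first manual check could be as simple as probing Grafana itself. The following is only a minimal sketch, not an established EDP procedure: it assumes the instance exposes Grafana's standard unauthenticated `/api/health` endpoint, and the helper names `health_url`, `is_healthy`, and `check` are hypothetical.

```python
import json
import urllib.request

# Grafana URL as linked in this chapter; reachability without login is an assumption.
GRAFANA_URL = "https://grafana.observability.buildth.ing"


def health_url(base_url: str) -> str:
    """Build the URL of Grafana's standard health endpoint."""
    return base_url.rstrip("/") + "/api/health"


def is_healthy(payload: str) -> bool:
    """Return True if the health payload reports the database as "ok"."""
    try:
        data = json.loads(payload)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and data.get("database") == "ok"


def check(base_url: str = GRAFANA_URL, timeout: float = 5.0) -> bool:
    """Fetch the health endpoint and evaluate it; any network error counts as unhealthy."""
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
            return resp.status == 200 and is_healthy(resp.read().decode("utf-8"))
    except OSError:
        # URLError, HTTPError, and socket timeouts are all subclasses of OSError.
        return False
```

Calling `check()` returns `True` only if the endpoint answers with HTTP 200 and a healthy payload. This says nothing about the monitored platform stacks, only about Grafana's own availability.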
Access and interpret platform and application metrics using VictoriaMetrics, Prometheus, and Grafana.
### Logging
Log aggregation and analysis for troubleshooting and audit purposes.
### Alerting
Configure alerts for critical platform events and application issues.
## Troubleshooting
Common issues and their solutions for platform operations.
![alt text](edp-grafana.png)
## Maintenance
We maintain EDP on a per-issue basis.
### Updates & Upgrades
Procedures for updating platform components and maintaining system health.
We update occasionally, per [component](../components/), which can be either a [platform application](../components/orchestration/stacks/) or an [orchestration pipeline](../components/forgejo/actions/actions.md).
### Backup & Recovery
Data backup strategies and disaster recovery procedures.
EDP customer data is backed up; see https://jira.telekom-mms.com/browse/IPCEICIS-5017
## Documentation Template
When creating operational documentation:
1. **Purpose**: What this operation achieves
2. **Prerequisites**: Required access, tools, and knowledge
3. **Procedure**: Step-by-step instructions with commands
4. **Verification**: How to confirm successful completion
5. **Rollback**: How to revert if needed
6. **Troubleshooting**: Common issues and solutions
