Merge branch 'development' into idpbuilder-backstage-templates
commit a68629559d
10 changed files with 488 additions and 24 deletions
content/en/docs/solution/tools/CNOE/argocd/_index.md | 141 (new file)
@@ -0,0 +1,141 @@
---
title: ArgoCD
weight: 30
description: A description of ArgoCD and its role in CNOE
---

## What is ArgoCD?

ArgoCD is a Continuous Delivery tool for Kubernetes based on GitOps principles.

> ELI5: ArgoCD is an application running in Kubernetes which monitors Git
> repositories containing some sort of Kubernetes manifests and automatically
> deploys them to configured Kubernetes clusters.

From ArgoCD's perspective, applications are defined as custom resources within
the Kubernetes clusters that ArgoCD monitors. Such a definition describes a
source Git repository that contains Kubernetes manifests, in the form of a
Helm chart, Kustomize, Jsonnet definitions, or plain YAML files, as well as a
target Kubernetes cluster and namespace the manifests should be applied to.
Thus, ArgoCD is capable of deploying applications to various (remote) clusters
and namespaces.
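
As a minimal sketch, such a definition could look like the following; the
repository URL, path, and namespace are illustrative placeholders, not part of
the CNOE setup:

```shell
# Register an application with ArgoCD by applying an Application custom
# resource (example values only)
kubectl apply -f - <<'EOF'
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://gitea.example.com/org/my-app.git
    targetRevision: HEAD
    path: manifests
  destination:
    server: https://kubernetes.default.svc
    namespace: my-app
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
EOF
```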

ArgoCD monitors both the source and the destination. It applies changes from
the Git repository that acts as the source of truth for the destination as
soon as they occur, i.e. if a change is pushed to the Git repository, ArgoCD
applies it to the Kubernetes destination. Subsequently, it checks whether the
desired state was established. For example, it verifies that all resources
were created, enough replicas started, and that all pods are in the `running`
state and healthy.
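
With the `argocd` CLI logged in to an instance, this sync and health state can
be inspected directly; the application name is again a placeholder:

```shell
# Show sync status, health, and per-resource state of an application
argocd app get my-app

# Trigger a sync manually and wait until the application reports healthy
argocd app sync my-app
argocd app wait my-app --health
```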

## Architecture

### Core Components

An ArgoCD deployment consists of three main components:

#### Application Controller

The application controller is a Kubernetes operator that synchronizes the live
state within a Kubernetes cluster with the desired state derived from the Git
sources. It monitors the live state, can detect deviations, and performs
corrective actions. Additionally, it can execute hooks on lifecycle stages
such as pre- and post-sync.
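
Such hooks are ordinary manifests carrying an ArgoCD hook annotation. A sketch
with a hypothetical pre-sync migration Job:

```shell
# A Job annotated as a PreSync hook: ArgoCD runs it before applying the other
# manifests and deletes it once it succeeded
cat > pre-sync-job.yaml <<'EOF'
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migration
  annotations:
    argocd.argoproj.io/hook: PreSync
    argocd.argoproj.io/hook-delete-policy: HookSucceeded
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: example.org/db-migrate:latest
EOF
```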

#### Repository Server

The repository server interacts with Git repositories and caches their state
to reduce the amount of polling necessary. Furthermore, it is responsible for
generating the Kubernetes manifests from the resources within the Git
repositories, i.e. executing Helm or Jsonnet templates.
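
Conceptually this is the same rendering step one can run locally, e.g. for a
Helm chart (chart path and release name are placeholders):

```shell
# Render a Helm chart into plain Kubernetes manifests, as the repository
# server does before handing them to the application controller
helm template my-release ./charts/my-app --namespace my-app
```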

#### API Server

The API server is a REST/gRPC service that allows the Web UI and CLI, as well
as other API clients, to interact with the system. It also acts as the
callback for webhooks, particularly from Git repository platforms such as
GitHub or GitLab, to reduce repository polling.

### Others

The system primarily stores its configuration as Kubernetes resources. Thus,
other external storage is not vital.

Redis
: A Redis store is optional but recommended as a cache to reduce load on
ArgoCD components and connected systems, e.g. Git repositories.

ApplicationSetController
: The ApplicationSet Controller is, like the Application Controller, a
Kubernetes operator. It can deploy applications based on parameterized
application templates, which allows deploying different versions of an
application into various environments from a single template.
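
A sketch of such a template, using a simple list generator and hypothetical
environment names:

```shell
# An ApplicationSet that stamps out one Application per environment
kubectl apply -f - <<'EOF'
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: my-app-envs
  namespace: argocd
spec:
  generators:
    - list:
        elements:
          - env: staging
          - env: production
  template:
    metadata:
      name: 'my-app-{{env}}'
    spec:
      project: default
      source:
        repoURL: https://gitea.example.com/org/my-app.git
        targetRevision: HEAD
        path: 'envs/{{env}}'
      destination:
        server: https://kubernetes.default.svc
        namespace: 'my-app-{{env}}'
EOF
```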

### Overview



## Role in CNOE

ArgoCD is one of the core components, besides Gitea/Forgejo, that are
bootstrapped by the idpbuilder. Future project creation, e.g. through
Backstage, relies on the availability of ArgoCD.

After the initial bootstrapping phase, effectively all components in the stack
that are deployed in Kubernetes are managed by ArgoCD. This includes the
bootstrapped components Gitea and ArgoCD, which are onboarded afterward. Thus,
the idpbuilder is only necessary in the bootstrapping phase of the platform,
and the technical coordination of all components eventually shifts to ArgoCD.

In general, the creation of new projects and applications should take place in
Backstage. It is a catalog of software components and best practices that
allows developers to grasp and manage their software portfolio. Underneath,
however, the deployment of applications and platform components is managed by
ArgoCD. Among others, Backstage creates Application CRDs to instruct ArgoCD to
manage deployments and subsequently report on their current state.

## Glossary

_Initially shamelessly copied from [the docs](https://argo-cd.readthedocs.io/en/stable/core_concepts/)_

Application
: A group of Kubernetes resources as defined by a manifest. This is a Custom Resource Definition (CRD).

ApplicationSet
: A CRD that acts as a template for creating multiple parameterized Applications.

Application source type
: Which Tool is used to build the application.

Configuration management tool
: See Tool.

Configuration management plugin
: A custom tool.

Health
: The health of the application. Is it running correctly? Can it serve requests?

Live state
: The live state of that application. Which pods etc. are deployed.

Refresh
: Compare the latest code in Git with the live state. Figure out what is different.

Sync
: The process of making an application move to its target state, e.g. by applying changes to a Kubernetes cluster.

Sync status
: Whether or not the live state matches the target state. Is the deployed application the same as Git says it should be?

Sync operation status
: Whether or not a sync succeeded.

Target state
: The desired state of an application, as represented by files in a Git repository.

Tool
: A tool to create manifests from a directory of files, e.g. Kustomize. See Application source type.

Binary files not shown (4 new images: 86 KiB, 76 KiB, 48 KiB, 38 KiB).

@@ -55,3 +55,124 @@ if necessary.

DNS solutions like `nip.io` or the already used `localtest.me` mitigate the
need for path-based routing.

## Excerpt

HTTP is a cornerstone of the internet due to its high flexibility. Starting
from HTTP/1.1, each request in the protocol contains, among other fields, a
path and a `Host` name in its header. While an HTTP request is sent to a
single IP address / server, these two pieces of data allow (distributed)
systems to handle requests in various ways.

```shell
$ curl -v http://google.com/something > /dev/null

* Connected to google.com (2a00:1450:4001:82f::200e) port 80
* using HTTP/1.x
> GET /something HTTP/1.1
> Host: google.com
> User-Agent: curl/8.10.1
> Accept: */*
...
```

### Path-Routing

Imagine requesting `http://myhost.foo/some/file.html`. In a simple setup, the
web server that `myhost.foo` resolves to would serve static files from some
directory, e.g. `/<some_dir>/some/file.html`.

In more complex systems, one might have multiple services that fulfill various
roles, for example a service that generates HTML sites of articles from a CMS
and a service that converts images into various formats. Using path-routing,
both services are available on the same host from a user's point of view.

An article served from `http://myhost.foo/articles/news1.html` would be
generated by the article service and points to an image
`http://myhost.foo/images/pic.jpg`, which in turn is generated by the image
converter service. When a user sends an HTTP request to `myhost.foo`, they hit
a reverse proxy which forwards the request, based on the requested path, to
some other system, waits for a response, and subsequently returns that
response to the user.

![Path-Routing](./path-routing.png)

Such a setup hides the complexity from the user and allows the creation of
large distributed, scalable systems acting as a unified entity from the
outside. Since everything is served on the same host, the browser is inclined
to trust all downstream services. This allows for easier 'communication'
between services through the browser. For example, cookies could be valid for
the entire host, and thus authentication data could be forwarded to requested
downstream services without the user having to explicitly re-authenticate.

Furthermore, services 'know' their user-facing location by knowing their own
path and the paths to other services, as paths are usually set by convention
and/or hard-coded. In practice, this makes configuration of the entire system
somewhat easier, especially if you have various environments for testing,
development, and production. The hostname of the system does not matter, as
one can use hostname-relative URLs, e.g. `/some/service`.

Load balancing is also easily achievable by multiplying the number of service
instances. Most reverse proxy systems are able to apply various load balancing
strategies to forward traffic to downstream systems.

Problems might arise if downstream systems are not built with path-routing in
mind. Some systems require being served from the root of a domain; see for
example the container registry spec.

### Hostname-Routing

Each downstream service in a distributed system is served from a different
host, typically a subdomain, e.g. `serviceA.myhost.foo` and
`serviceB.myhost.foo`. This gives services full control over their respective
host and even allows them to do path-routing within each system. Moreover,
hostname-routing allows the entire system to create more flexible and powerful
routing schemes in terms of scalability. Intra-system communication becomes
somewhat harder, as the browser treats each subdomain as a separate host,
shielding cookies, for example, from one another.

Each host that serves some service requires a DNS entry that has to be
published to the clients (from some DNS server). Depending on the environment,
this can become quite tedious, as DNS resolution on the internet and intranets
might have to deviate. This applies to intra-cluster communication as well, as
seen with the idpbuilder's platform. In this case, external DNS resolution has
to be replicated within the cluster to be able to use the same URLs to
address, for example, Gitea.

The following example depicts DNS-only routing. By defining separate DNS
entries for each service / subdomain, requests are resolved to the respective
servers. In theory, no additional infrastructure is necessary to route user
traffic to each service. However, as services are completely separated, other
infrastructure like authentication possibly has to be duplicated.

![DNS-Routing](./dns-routing.png)

When using hostname-based routing, one does not have to set different IPs for
each hostname. Instead, having multiple DNS entries pointing to the same set
of IPs allows re-using existing infrastructure. As shown below, a reverse
proxy is able to forward requests to downstream services based on the `Host`
request header. This way, a specific hostname can be forwarded to a defined
service.

![Host-Routing](./host-routing.png)

At the same time, one could imagine a multi-tenant system that differentiates
customer systems by name, e.g. `tenant-1.cool.system` and
`tenant-2.cool.system`. Configured as a wildcard-style domain, `*.cool.system`
could point to a reverse proxy that forwards requests to a tenant's instance
of the system, allowing re-use of central infrastructure while still hosting
separate systems per tenant.

The implicit dependency on DNS resolution generally makes this kind of routing
more complex and error-prone, as changes to DNS server entries are not always
possible or modifiable by everyone. Also, local changes to your `/etc/hosts`
file are a constant pain and should be seen as a dirty hack. As mentioned
above, dynamic DNS solutions like `nip.io` are often helpful in this case.
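
`nip.io` encodes the target IP in the hostname itself, so any such name
resolves without touching a DNS server:

```shell
# Any subdomain of <ip>.nip.io resolves to that IP, no DNS setup required
dig +short app.203.0.113.10.nip.io
# 203.0.113.10
```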

### Conclusion

Path- and hostname-based routing are the two most common methods of HTTP
traffic routing. They can be used separately, but more often they are used in
conjunction. Due to HTTP's versatility, other forms of HTTP routing, for
example based on the `Content-Type` header, are also very common.

content/en/docs/solution/tools/CNOE/idpbuilder/path-routing.png (new binary file, 52 KiB, not shown)