2024-11-08 12:30:31 +00:00
4 changed files with 121 additions and 0 deletions
--- a/content/en/docs/solution/tools/CNOE/idpbuilder/hostname-routing-proxy.png
+++ b/content/en/docs/solution/tools/CNOE/idpbuilder/hostname-routing-proxy.png
--- a/content/en/docs/solution/tools/CNOE/idpbuilder/hostname-routing.png
+++ b/content/en/docs/solution/tools/CNOE/idpbuilder/hostname-routing.png
--- a/content/en/docs/solution/tools/CNOE/idpbuilder/http-routing.md
+++ b/content/en/docs/solution/tools/CNOE/idpbuilder/http-routing.md
@@ -55,3 +55,124 @@ if necessary.

 DNS solutions like `nip.io` or the already used `localtest.me` mitigate the
 need for path based routing
+
+## Excerpt
+
+HTTP is a cornerstone of the internet due to its high flexibility. Since
+HTTP/1.1, each request carries, among other things, a path in its request
+line and a hostname in its `Host` header. While an HTTP request is sent to a
+single IP address / server, these two pieces of data allow (distributed)
+systems to handle requests in various ways.
+
+```shell
+$ curl -v http://google.com/something > /dev/null
+
+* Connected to google.com (2a00:1450:4001:82f::200e) port 80
+* using HTTP/1.x
+> GET /something HTTP/1.1
+> Host: google.com
+> User-Agent: curl/8.10.1
+> Accept: */*
+...
+```
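+
+As a minimal sketch of how a server could read these two values, the
+following Python snippet extracts the path and the `Host` header from the head
+of a raw request. The function name `parse_request_head` is hypothetical and
+the parsing is deliberately simplified; real servers handle many more cases.
+
+```python
+from typing import Tuple
+
+
+def parse_request_head(raw: str) -> Tuple[str, str]:
+    """Return (host, path) from the head of an HTTP/1.1 request."""
+    lines = raw.split("\r\n")
+    # Request line: METHOD SP request-target SP HTTP-version
+    _method, path, _version = lines[0].split(" ", 2)
+    host = ""
+    for line in lines[1:]:
+        if not line:  # an empty line ends the header section
+            break
+        name, _, value = line.partition(":")
+        if name.strip().lower() == "host":  # header names are case-insensitive
+            host = value.strip()
+    return host, path
+
+
+raw = "GET /something HTTP/1.1\r\nHost: google.com\r\nAccept: */*\r\n\r\n"
+print(parse_request_head(raw))
+```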
+
+### Path-Routing
+
+Imagine requesting `http://myhost.foo/some/file.html`. In a simple setup, the
+web server that `myhost.foo` resolves to would serve a static file from some
+directory, e.g. `/<some_dir>/some/file.html`.
+
+In more complex systems, one might have multiple services that fulfill various
+roles, for example a service that generates HTML pages for articles from a CMS
+and a service that converts images into various formats. Using path-routing,
+both services are available on the same host from a user's point of view.
+
+An article served from `http://myhost.foo/articles/news1.html` would be
+generated by the article service and reference an image,
+`http://myhost.foo/images/pic.jpg`, which in turn is generated by the image
+converter service. When a user sends an HTTP request to `myhost.foo`, they hit
+a reverse proxy which forwards the request, based on the requested path, to
+one of these services, waits for a response, and subsequently returns that
+response to the user.
+
+![Path-Routing Example](../path-routing.png)
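+
+The routing decision of such a reverse proxy can be sketched as a prefix
+lookup. This is only an illustration: the upstream addresses and the function
+name `route_by_path` are assumptions, and only the `/articles/` and `/images/`
+prefixes are taken from the example above.
+
+```python
+from typing import Optional
+
+# Hypothetical upstream addresses; only the path prefixes come from the text.
+ROUTES = {
+    "/articles/": "http://article-service.internal:8080",
+    "/images/": "http://image-converter.internal:8080",
+}
+
+
+def route_by_path(path: str) -> Optional[str]:
+    """Return the upstream for a request path, or None if nothing matches."""
+    # Check longer prefixes first so the most specific route wins.
+    for prefix in sorted(ROUTES, key=len, reverse=True):
+        if path.startswith(prefix):
+            return ROUTES[prefix]
+    return None
+
+
+print(route_by_path("/articles/news1.html"))
+print(route_by_path("/images/pic.jpg"))
+```
+
+A real proxy would then forward the request to the selected upstream and
+relay the response; the sketch only covers the selection step.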
+
+Such a setup hides the complexity from the user and allows the creation of
+large distributed, scalable systems acting as a unified entity from the
+outside. Since everything is served on the same host, the browser treats all
+downstream services as a single origin. This allows for easier 'communication'
+between services through the browser. For example, cookies could be valid for
+the entire host and thus authentication data could be forwarded to requested
+downstream services without the user having to explicitly re-authenticate.
+
+Furthermore, services 'know' their user-facing location: their own path and
+the paths to other services are usually set by convention and / or
+hard-coded. In practice, this makes configuration of the entire system
+somewhat easier, especially if you have various environments for testing,
+development, and production. The hostname of the system does not matter as one
+can use hostname-relative URLs, e.g. `/some/service`.
+
+Load balancing is also easily achievable by running multiple instances of a
+service. Most reverse proxies can apply various load balancing strategies when
+forwarding traffic to downstream systems.
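+
+One of the simplest such strategies is round-robin, sketched below. The class
+name `RoundRobin` and the instance addresses are hypothetical; production
+proxies typically add health checks, weights, and connection tracking.
+
+```python
+import itertools
+from typing import Iterable
+
+
+class RoundRobin:
+    """Hand out backends in a fixed rotation."""
+
+    def __init__(self, backends: Iterable[str]) -> None:
+        pool = list(backends)
+        if not pool:
+            raise ValueError("at least one backend is required")
+        self._cycle = itertools.cycle(pool)
+
+    def pick(self) -> str:
+        # Each call returns the next backend and wraps around at the end.
+        return next(self._cycle)
+
+
+# Hypothetical instance addresses of one service.
+balancer = RoundRobin(["http://10.0.0.11:8080", "http://10.0.0.12:8080"])
+print([balancer.pick() for _ in range(3)])
+```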
+
+Problems might arise if downstream systems are not built with path-routing in
+mind. Some systems must be served from the root of a domain; see, for
+example, the container registry spec.
+
+
+### Hostname-Routing
+
+With hostname-routing, each downstream service in a distributed system is
+served from a different host, typically a subdomain, e.g.
+`serviceA.myhost.foo` and `serviceB.myhost.foo`. This gives services full
+control over their respective host, and even allows them to do path-routing
+within each system. Moreover, hostname-routing allows the entire system to
+create more flexible and powerful routing schemes in terms of scalability.
+Intra-system communication becomes somewhat harder as the browser treats each
+subdomain as a separate host, shielding cookies, for example, from one another.
+
+Each host that serves some services requires a DNS entry that has to be
+published to the clients (from some DNS server). Depending on the environment,
+this can become quite tedious as DNS resolution on the internet and on
+intranets may differ. This applies to intra-cluster communication as well, as
+seen with the idpbuilder's platform. In this case, external DNS resolution has
+to be replicated within the cluster to be able to use the same URLs to
+address, for example, gitea.
+
+The following example depicts DNS-only routing. By defining separate DNS
+entries for each service / subdomain, requests are resolved to the respective
+servers. In theory, no additional infrastructure is necessary to route user
+traffic to each service. However, as services are completely separated, other
+infrastructure like authentication possibly has to be duplicated.
+
+![DNS-only routing](../hostname-routing.png)
+
+When using hostname-based routing, one does not have to set different IPs for
+each hostname. Instead, having multiple DNS entries pointing to the same set of
+IPs allows re-using existing infrastructure. As shown below, a reverse proxy is
+able to forward requests to downstream services based on the `Host` request
+header. This way, requests for a specific hostname can be forwarded to a
+defined service.
+
+![Hostname Proxy](../hostname-routing-proxy.png)
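+
+A minimal sketch of this lookup, assuming a static table of hostnames: the
+upstream addresses and the function name `route_by_host` are hypothetical.
+Hostnames are compared case-insensitively and an optional port is dropped;
+IPv6 literals in the `Host` header are not handled here.
+
+```python
+from typing import Optional
+
+# Hypothetical upstreams, keyed by lower-cased hostname.
+HOSTS = {
+    "servicea.myhost.foo": "http://10.0.0.21:8080",
+    "serviceb.myhost.foo": "http://10.0.0.22:8080",
+}
+
+
+def route_by_host(host_header: str) -> Optional[str]:
+    """Return the upstream for a Host header value, or None if unknown."""
+    # Drop an optional ":port" suffix and normalise the case.
+    hostname = host_header.split(":", 1)[0].strip().lower()
+    return HOSTS.get(hostname)
+
+
+print(route_by_host("serviceA.myhost.foo"))
+print(route_by_host("serviceB.myhost.foo:8443"))
+```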
+
+At the same time, one could imagine a multi-tenant system that differentiates
+customer systems by name, e.g. `tenant-1.cool.system` and
+`tenant-2.cool.system`. Configured as a wildcard-style domain, `*.cool.system`
+could point to a reverse proxy that forwards requests to a tenant's instance
+of a system, allowing re-use of central infrastructure while still hosting
+separate systems per tenant.
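+
+How such a proxy might derive the tenant from the hostname is sketched below.
+The function name `tenant_from_host` is hypothetical, and accepting exactly
+one label in front of the base domain is an assumption of this sketch, not a
+requirement stated above.
+
+```python
+from typing import Optional
+
+
+def tenant_from_host(host_header: str, base: str = "cool.system") -> Optional[str]:
+    """Return the tenant label of '<tenant>.<base>', or None if it does not match."""
+    # Drop an optional ":port" suffix and normalise the case.
+    hostname = host_header.split(":", 1)[0].strip().lower()
+    suffix = "." + base
+    if not hostname.endswith(suffix):
+        return None
+    tenant = hostname[: -len(suffix)]
+    # Assumption: only a single label in front of the base domain is accepted.
+    if not tenant or "." in tenant:
+        return None
+    return tenant
+
+
+print(tenant_from_host("tenant-1.cool.system"))
+print(tenant_from_host("tenant-2.cool.system:443"))
+```
+
+The returned label could then be used to look up the tenant's instance in a
+table similar to the one used for plain hostname routing.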
+
+
+The implicit dependency on DNS resolution generally makes this kind of routing
+more complex and error-prone, as not everyone is able or permitted to change
+DNS server entries. Also, local changes to your `/etc/hosts` file are a
+constant pain and should be seen as a dirty hack. As mentioned above, wildcard
+DNS services like `nip.io` are often helpful in this case.
+
+### Conclusion
+
+Path- and hostname-based routing are the two most common methods of HTTP
+traffic routing. They can be used separately, but more often they are used in
+conjunction. Due to HTTP's versatility, other forms of HTTP routing, for
+example based on the `Content-Type` header, are also very common.
--- a/content/en/docs/solution/tools/CNOE/idpbuilder/path-routing.png
+++ b/content/en/docs/solution/tools/CNOE/idpbuilder/path-routing.png