diff --git a/content/en/docs/solution/tools/CNOE/idpbuilder/hostname-routing-proxy.png b/content/en/docs/solution/tools/CNOE/idpbuilder/hostname-routing-proxy.png new file mode 100644 index 0000000..d100481 Binary files /dev/null and b/content/en/docs/solution/tools/CNOE/idpbuilder/hostname-routing-proxy.png differ diff --git a/content/en/docs/solution/tools/CNOE/idpbuilder/hostname-routing.png b/content/en/docs/solution/tools/CNOE/idpbuilder/hostname-routing.png new file mode 100644 index 0000000..a6b9742 Binary files /dev/null and b/content/en/docs/solution/tools/CNOE/idpbuilder/hostname-routing.png differ diff --git a/content/en/docs/solution/tools/CNOE/idpbuilder/http-routing.md b/content/en/docs/solution/tools/CNOE/idpbuilder/http-routing.md index 21e709d..f2da697 100644 --- a/content/en/docs/solution/tools/CNOE/idpbuilder/http-routing.md +++ b/content/en/docs/solution/tools/CNOE/idpbuilder/http-routing.md @@ -55,3 +55,124 @@ if necessary. DNS solutions like `nip.io` or the already used `localtest.me` mitigate the need for path based routing + +## Excerpt + +HTTP is a cornerstone of the internet due to its high flexibility. Starting +with HTTP/1.1, each request in the protocol contains, among other things, a path and a +`Host`name in its header. While an HTTP request is sent to a single IP address +/ server, these two pieces of data allow (distributed) systems to handle +requests in various ways. + +```shell +$ curl -v http://google.com/something > /dev/null + +* Connected to google.com (2a00:1450:4001:82f::200e) port 80 +* using HTTP/1.x +> GET /something HTTP/1.1 +> Host: google.com +> User-Agent: curl/8.10.1 +> Accept: */* +... +``` + +### Path-Routing + +Imagine requesting `http://myhost.foo/some/file.html`. In a simple setup, the +web server that `myhost.foo` resolves to would serve static files from some +directory, `//some/file.html`. 
+ +In more complex systems, one might have multiple services that fulfill various +roles, for example a service that generates HTML sites of articles from a CMS +and a service that can convert images into various formats. Using path-routing, +both services are available on the same host from a user's point of view. + +An article served from `http://myhost.foo/articles/news1.html` would be +generated by the article service and point to an image +`http://myhost.foo/images/pic.jpg`, which in turn is generated by the image +converter service. When a user sends an HTTP request to `myhost.foo`, they hit +a reverse proxy which forwards the request based on the requested path to some +other system, waits for a response, and subsequently returns that response to +the user. + +![Path-Routing Example](../path-routing.png) + +Such a setup hides the complexity from the user and allows the creation of +large distributed, scalable systems acting as a unified entity from the +outside. Since everything is served on the same host, the browser is inclined +to trust all downstream services. This allows for easier 'communication' +between services through the browser. For example, cookies could be valid for +the entire host and thus authentication data could be forwarded to requested +downstream services without the user having to explicitly re-authenticate. + +Furthermore, services 'know' their user-facing location by knowing their path +and the paths to other services, as paths are usually set as a convention and / +or hard-coded. In practice, this makes configuration of the entire system +somewhat easier, especially if you have various environments for testing, +development, and production. The hostname of the system does not matter as one +can use hostname-relative URLs, e.g. `/some/service`. + +Load balancing is also easily achievable by multiplying the number of service +instances. 
Most reverse proxy systems are able to apply various load balancing +strategies to forward traffic to downstream systems. + +Problems might arise if downstream systems are not built with path-routing in +mind. Some systems must be served from the root of a domain; see, for +example, the container registry spec. + + +### Hostname-Routing + +Each downstream service in a distributed system is served from a different +host, typically a subdomain, e.g. `serviceA.myhost.foo` and +`serviceB.myhost.foo`. This gives services full control over their respective +host, and even allows them to do path-routing within each system. Moreover, +hostname-routing allows the entire system to create more flexible and powerful +routing schemes in terms of scalability. Intra-system communication becomes +somewhat harder as the browser treats each subdomain as a separate host, +shielding cookies, for example, from one another. + +Each host that serves some services requires a DNS entry that has to be +published to the clients (from some DNS server). Depending on the environment, +this can become quite tedious as DNS resolution on the internet and intranets +might have to differ. This applies to intra-cluster communication as well, as +seen with the idpbuilder's platform. In this case, external DNS resolution has +to be replicated within the cluster to be able to use the same URLs to address, +for example, gitea. + +The following example depicts DNS-only routing. By defining separate DNS +entries for each service / subdomain, requests are resolved to the respective +servers. In theory, no additional infrastructure is necessary to route user +traffic to each service. However, as services are completely separated, other +infrastructure like authentication possibly has to be duplicated. + +![DNS-only routing](../hostname-routing.png) + +When using hostname-based routing, one does not have to set different IPs for +each hostname. 
Instead, having multiple DNS entries pointing to the same set of +IPs allows re-using existing infrastructure. As shown below, a reverse proxy is +able to forward requests to downstream services based on the `Host` request +parameter. This way, requests for a specific hostname can be forwarded to a defined service. + +![Hostname Proxy](../hostname-routing-proxy.png) + +At the same time, one could imagine a multi-tenant system that differentiates +customer systems by name, e.g. `tenant-1.cool.system` and +`tenant-2.cool.system`. Configured as a wildcard-style domain, `*.cool.system` +could point to a reverse proxy that forwards requests to a tenant's instance of +a system, allowing re-use of central infrastructure while still hosting +separate systems per tenant. + + +The implicit dependency on DNS resolution generally makes this kind of routing +more complex and error-prone, as DNS server entries are not always +modifiable by everyone. Also, local changes to your `/etc/hosts` +file are a constant pain and should be seen as a dirty hack. As mentioned +above, dynamic DNS solutions like `nip.io` are often helpful in this case. + +### Conclusion + +Path- and hostname-based routing are the two most common methods of HTTP traffic +routing. They can be used separately, but more often they are used in +conjunction. Due to HTTP's versatility, other forms of HTTP routing, for example +based on the `Content-Type` header, are also very common. diff --git a/content/en/docs/solution/tools/CNOE/idpbuilder/path-routing.png b/content/en/docs/solution/tools/CNOE/idpbuilder/path-routing.png new file mode 100644 index 0000000..0e5a576 Binary files /dev/null and b/content/en/docs/solution/tools/CNOE/idpbuilder/path-routing.png differ