Kubernetes at Home: Internal and external services


Disclaimer: Separating at a hardware level will always be better. But my home lab consists of exactly one server, so I focus on what I can do in software in Kubernetes.

So far, I have configured all my services to be exposed to the internet, whether they are meant for external or internal consumption, and used a Traefik middleware to limit access to the internal ones. That is probably sufficient for non-critical services – but can we do better?

As always – the answer is yes we can!

Internal DMZ

So far, I have only one DMZ network, managed by MetalLB. Of course, you can create firewall rules that allow only traffic from the internal network to the services meant for internal consumption, but that is error-prone. And as the load balancer IPs are dynamically provisioned, I run the risk of something that should not be exposed externally getting the IP of a decommissioned externally facing service where I have forgotten to remove the firewall rule.

This can be solved with an extra DMZ, an internal DMZ, which gets no firewall openings from the Internet, only from the internal network.

Components needed

VLAN

We need another VLAN on our physical network. I have called it kubernetesinternaldmz and given it VLAN ID 9.

I have given it the IP range 192.168.5.0/24, and Unifi has given it a public IPv6 network from my external range, 2001:db8:123:f10b::/64. Given that this is a network I never intend to expose to the Internet, it would of course have been possible to give it a private IPv6 range instead, but I have IPv6 addresses to spare. So far!

On my Kubernetes node, I need to bring this in on a new 802.1Q tagged interface. I have called my external DMZ interface kdmz01, so let's call this one kdmz02. I have covered how to do this in part 1 and part 2, so I won't go into the details of how it is configured.
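Just as a reminder of the shape of such a configuration, here is a minimal sketch of what the tagged interface could look like with netplan – the parent interface name (enp1s0) and the node address are assumptions, so adapt them to your own setup from the earlier parts:

network:
  version: 2
  vlans:
    kdmz02:
      id: 9                 # VLAN ID for kubernetesinternaldmz
      link: enp1s0          # physical NIC carrying the tagged traffic (assumption)
      addresses:
        - 192.168.5.2/24    # a node address in the internal DMZ range (assumption)

After a netplan apply, the kdmz02 interface is available for MetalLB to announce on.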

Load balancers

To get this to work, I need MetalLB to handle the new network, with a new pool. As I have a whole class C network with 256 (well, almost, for the pedants out there) IP addresses to spare, I'll create the pool as dual-stack, having both IPv6 and IPv4 assigned via the same LoadBalancer services.

apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: public-pool-internal
  namespace: metallb-system
  annotations:
    ipchanger.alpha.kubernetes.io/patch: "true"
spec:
  addresses:
    - 2001:db8:123:f10b:2::1-2001:db8:123:f10b:2::ffff
    - 192.168.5.10-192.168.5.200
  autoAssign: false # Prevents services from accidentally grabbing these IPs

As you can see, I haven't used the whole network in the ranges. There's no good reason other than that I don't foresee having that many services, and if I want to configure internal DMZ services through other means, I can put them on the same network but outside these ranges.

I need to have it announced on layer two, too:

apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: public-l2-adv-internal
  namespace: metallb-system
spec:
  interfaces:
    - kdmz02
  ipAddressPools:
    - public-pool-internal

Now, I can reconfigure my load balancer services to be exposed on the internal DMZ instead of the external one by specifying this pool, as here for my Portainer service:

apiVersion: v1
kind: Service
metadata:
  name: traefik-portainer
  namespace: traefik-internal
  annotations:
    metallb.universe.tf/address-pool: public-pool-internal
    external-dns.alpha.kubernetes.io/hostname: portainer.engen.priv.no
    external-dns.alpha.kubernetes.io/ttl: "300"
    ipchanger.alpha.kubernetes.io/patch: "true"
spec:
  externalTrafficPolicy: Local
  type: LoadBalancer
  ipFamilyPolicy: PreferDualStack
  ipFamilies:
    - IPv6
    - IPv4
  ports:
    - name: web
      port: 80
    - name: websecure
      port: 443
  selector:
    app: traefik

Applying this, we'll see:

hassio% kubectl describe -n traefik service traefik-portainer 
Name: traefik-portainer
Namespace: traefik
Labels: <none>
Annotations: external-dns.alpha.kubernetes.io/hostname: portainer.engen.priv.no
external-dns.alpha.kubernetes.io/ttl: 300
ipchanger.alpha.kubernetes.io/patch: true
metallb.io/ip-allocated-from-pool: public-pool-internal
metallb.universe.tf/address-pool: public-pool-internal
Selector: app=traefik
Type: LoadBalancer
IP Family Policy: PreferDualStack
IP Families: IPv6,IPv4
IP: fd43::7166
IPs: fd43::7166,10.43.199.112
LoadBalancer Ingress: 192.168.5.10 (VIP), 2001:db8:123:f10b:2::1 (VIP)
Port: web 80/TCP
TargetPort: 80/TCP
NodePort: web 30960/TCP
Endpoints: 10.151.253.242:80,10.151.253.237:80,[fd00::b8f3:e78:1a73:9c6e]:80 + 1 more...
Port: websecure 443/TCP
TargetPort: 443/TCP
NodePort: websecure 30316/TCP
Endpoints: 10.151.253.242:443,10.151.253.237:443,[fd00::b8f3:e78:1a73:9c6e]:443 + 1 more...
Session Affinity: None
External Traffic Policy: Local
Internal Traffic Policy: Cluster
HealthCheck NodePort: 30641
Events: <none>

As we can see, both IPv4 and IPv6 addresses are assigned. As covered in my external-dns blog post, the external-dns annotation even gets the DNS records updated automatically.

Traefik reverse proxy

As previously mentioned, giving each site a separate IP address doesn't really give us anything in terms of separation, except some obscurity. We still end up on the same Traefik endpoint, and can call it by any of the names it handles. There's nothing stopping someone from configuring their hosts file to point my portainer.engen.priv.no to the IP address of vegard.blog.engen.priv.no – except, of course, the Traefik middleware with the IP allow list, which will deny requests from external networks.

But we can do better. We can create separate traefik instances for external and internal traffic.

I have covered installing Traefik earlier, so I'll fast forward to having a separate installation of Traefik in the namespace traefik-internal, with the label app=traefik-internal on the pods, and with my security policies configured exactly as for the old traefik namespace.

This is of course not enough in itself, as an IngressRoute will be exposed to both Traefik setups. This can be fixed with the concept of an ingress class.

Ingress classes

An ingress class is more or less just a label you can use on both the Traefik end and the IngressRoute end. Traefik needs to be configured to handle only that ingress class, and the IngressRoute needs to specify it. I can create the ingress class through Helm when installing Traefik:

....
# -- Create a default IngressClass for Traefik
ingressClass: # @schema additionalProperties: false
  enabled: true
  isDefaultClass: true
  name: "traefik-internal"
....

In my traefik-internal StatefulSet, which I have created outside of Helm since the Helm chart cannot specify that Traefik should run as a StatefulSet, I can configure it to use this ingress class in the startup arguments to Traefik:

         - "--api.insecure=true"
- "--accesslog=true"
- "--log.level=DEBUG"
- "--providers.kubernetescrd"
- "--providers.kubernetescrd.allowCrossNamespace=true"
- "--providers.kubernetescrd.ingressclass=traefik-internal"
- "--entrypoints.web.address=:80"
- "--entrypoints.websecure.address=:443"
- "--entrypoints.metrics.address=:9100"

Now, if I just reconfigure this without touching my Portainer IngressRoute, Traefik won't pick up the IngressRoute, won't know that it should handle portainer.engen.priv.no, and will give a 404 Not Found when I hit it.

I need to reconfigure the IngressRoute and give it an annotation specifying the ingress class:

apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: portainer
  namespace: traefik-internal
  annotations:
    kubernetes.io/ingress.class: "traefik-internal"
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`portainer.engen.priv.no`)
      kind: Rule
      services:
        - name: portainer
          namespace: portainer
          port: 9000
  tls:
    certResolver: letsencrypt

And that's all we need for portainer.engen.priv.no to start working again. Almost. My network policy for Portainer specifies that only pods in a namespace with the label role=ingress can access it:

apiVersion: projectcalico.org/v3
kind: NetworkPolicy
metadata:
  name: allow-ingress-to-portainer-service
  namespace: portainer
spec:
  ingress:
    - action: Allow
      protocol: TCP
      source:
        namespaceSelector: role == "ingress" # Only allow from namespaces labeled as ingress
      destination:
        selector: app.kubernetes.io/name == "portainer"
        ports: [9000, 9443] # The Portainer HTTP and HTTPS ports
  selector: app.kubernetes.io/name == "portainer"

Labeling the traefik-internal namespace with role=ingress makes it work, and the network policy lets the Traefik traffic through. But it also still works if I point portainer.engen.priv.no to an external facing load balancer IP address, as the external facing Traefik instances haven't been reconfigured yet. To correct this, I need to reconfigure my external facing Traefik in the same way, with an external ingress class. For consistency, I have installed it as a traefik-external installation in a new traefik-external namespace, with a traefik-external ingress class, as sketched below.
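The Helm values for the external ingress class are the mirror image of the internal one – a sketch, where leaving isDefaultClass off for the external instance is my own choice:

....
# -- Create a default IngressClass for Traefik
ingressClass: # @schema additionalProperties: false
  enabled: true
  isDefaultClass: false
  name: "traefik-external"
....

The corresponding startup argument on the external Traefik then becomes --providers.kubernetescrd.ingressclass=traefik-external.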

Now, I just need to reconfigure all my IngressRoutes to target either the internal or the external Traefik – they won't actually work until I do, so if you care about downtime, it is a good idea to prepare the changes before starting the journey.
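An externally facing IngressRoute gets the other class – a sketch using my blog as an example, where the backend service name, namespace and port are assumptions:

apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: blog
  namespace: traefik-external
  annotations:
    kubernetes.io/ingress.class: "traefik-external"
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`vegard.blog.engen.priv.no`)
      kind: Rule
      services:
        - name: blog          # assumption – use the real service name and port
          namespace: blog
          port: 80
  tls:
    certResolver: letsencrypt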

Defence in depth

There are numerous potential holes in my reconfiguration so far:

  • I could misconfigure the IngressRoute with the traefik-external ingress class and have it published to the external facing Traefik instead of the internal one. Then anyone could again point portainer.engen.priv.no to an external facing IP address and gain access to Portainer.
  • There could be a security hole in Traefik. Even though the external Traefik doesn't know how to handle portainer.engen.priv.no, the process and the container can still reach the port, and if a malicious hacker gains access to the container through a security hole in Traefik, they could potentially access the Portainer service from there.
  • I could misconfigure the firewall and allow Internet traffic towards the internal DMZ.
  • I could misconfigure the load balancer service and have it published in the external DMZ instead of in the internal one.

The concept of defence in depth is always good to have in mind. Let’s create some more safety measures, protecting both against my own mistakes and against malicious actors.

We can begin by changing the labels on the traefik-internal and traefik-external namespaces to make them different. Let's call them role=ingress-internal and role=ingress-external. We can now target this in the network policy:

apiVersion: projectcalico.org/v3
kind: NetworkPolicy
metadata:
  name: allow-ingress-to-portainer-service
  namespace: portainer
spec:
  ingress:
    - action: Allow
      protocol: TCP
      source:
        namespaceSelector: role == "ingress-internal" # Only allow from the internal ingress namespace
      destination:
        selector: app.kubernetes.io/name == "portainer"
        ports: [9000, 9443] # The Portainer HTTP and HTTPS ports
  selector: app.kubernetes.io/name == "portainer"

Now, the traefik-external pods can no longer send traffic to Portainer, so that hole is plugged, both against malicious actors targeting my external Traefik instances and against one of my possible mistakes.

But I can still, by accident, point a load balancer in the external DMZ towards my internal Traefik instances by specifying the wrong MetalLB pool. We can protect against that too, by changing the ingress network policy of my traefik-internal namespace:

apiVersion: projectcalico.org/v3
kind: NetworkPolicy
metadata:
  name: allow-ingress-to-traefik-service
  namespace: traefik-internal
spec:
  ingress:
    - action: Allow
      protocol: TCP
      source:
        nets:
          - 192.168.0.0/16
      destination:
        selector: app == "traefik-internal"
        ports: [80, 443]
    - action: Allow
      protocol: TCP
      source:
        nets:
          - 2001:db8:123:f100::/56
      destination:
        selector: app == "traefik-internal"
        ports: [80, 443]
  selector: app == "traefik-internal"

The IPv6 network is my full ISP-assigned range; I might want to limit it if I have internal networks I don't fully trust, but this will do. 192.168.0.0/16 is where I put my VLAN networks. I don't use all of it, but it's safe enough, as 192.168.0.0/16 is not routable over the Internet, so traffic from it has to come from my internal network. If I want to limit it to only some networks, I just specify different ranges. Note that the policy translates to iptables and ip6tables rules on the underlying node, and mixing IPv4 and IPv6 in the same rule doesn't work – which is why there are two separate Allow rules, one per address family.

With this setup, I am as secure as I can be on the OS and Kubernetes level, but as I said at the start, it's never going to be as secure as separate hardware infrastructure. For me, not running much mission-critical stuff at home, it's more than good enough – but it's a fun exercise nevertheless!

I'll end this one with a little bonus change. I am still publishing records for my internal services to the external DNS. That's an unnecessary potential information leak, so let's try to fix it.

External and internal DNS with external-dns

With Unifi, I can create DNS records in the Unifi Gateway itself. Since my internal clients use the Unifi Gateway for DNS, these records are answered locally before Unifi consults public DNS. If I only configure the records there, and not in Linode, they won't be resolvable from the Internet at all.

I could do it manually, but that's boring and error-prone. Let's try to have external-dns manage two different DNS providers: Linode and my own Unifi gateway.

To achieve this, I need two installations of external-dns, targeting different providers. It turns out there is an external-dns provider for Unifi.

There is a good installation guide for it. If I just install it according to that, I will however end up with duplicate DNS entries, both externally and internally. I also need to install it in a different namespace, or name it differently through changes in the Helm values. Then I need to make sure each installation picks up only the external-dns annotations I want it to. I do this by setting an extra annotation on the resources that already have external-dns annotations:

If I want to publish it internally, I annotate it with

external-dns/internal: "true"

If I want to publish it externally:

external-dns/external: "true"
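As an example, on the Portainer load balancer Service from earlier, the internal marker simply sits next to the existing external-dns annotations – a sketch of the metadata only:

apiVersion: v1
kind: Service
metadata:
  name: traefik-portainer
  namespace: traefik-internal
  annotations:
    metallb.universe.tf/address-pool: public-pool-internal
    external-dns.alpha.kubernetes.io/hostname: portainer.engen.priv.no
    external-dns.alpha.kubernetes.io/ttl: "300"
    external-dns/internal: "true" # only the internal external-dns installation acts on this
....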

Then, I need to reconfigure the two external-dns installations to only act if the right annotation is present. I can do that in Helm, by adding to extraArgs:

# -- Extra arguments to provide to _ExternalDNS_.
extraArgs:
  - '--request-timeout=60s' # tried with and without this
  - '--ignore-ingress-tls-spec'
  - '--traefik-disable-legacy'
  - '--annotation-filter=external-dns/internal=true'

This one is for the internal-facing installation; I do the same for the external-facing one, filtering on external-dns/external instead, as sketched below.
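A sketch of the corresponding extraArgs for the external-facing installation, filtering on the other annotation:

# -- Extra arguments to provide to _ExternalDNS_.
extraArgs:
  - '--ignore-ingress-tls-spec'
  - '--traefik-disable-legacy'
  - '--annotation-filter=external-dns/external=true'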

And that is all there is to it; now the two external-dns installations will each happily handle only the resources that target their DNS provider through these annotations.

Summary

Separating external facing and internal facing services is considered good practice, unless you want to show off your openness by revealing your setup publicly. This is how far I get with my modest one-node Kubernetes cluster. Now, I can easily target either the world or the internal network by labeling and annotating my resources correctly, so that:

  • Traefik picks up IngressRoutes for either external or internal facing services
  • Load balancers are created in the correct MetalLB pool/DMZ
  • DNS records are created either externally or internally.

With correct security policies on the namespace where the services run, we have a quite tight setup, making sure external traffic does not end up in my internal services.

Hopefully, you have learnt as much by reading this blog post as I did by researching it – I am still a newbie when it comes to Kubernetes.


