Docker Networking Part 3 – removing the unintended escape routes.


At the end of part two I showed that all the docker networks I had created were, in fact, bridge interfaces, which would bridge traffic out of docker. When the host side of such a bridge, the interface that lives outside of docker, also has an ip address in that network, I can use that address to connect to any service on the docker host that listens on all network interfaces. This needs to stop!

There’s a couple of solutions, of course.

Choosing how to isolate your docker nets

There are a few ways to do this, probably more than the methods below, but I’ll quickly go through the choices I evaluated.

Firewalling on the docker host.

The underlying docker host likely has a built-in firewall that can be configured to block traffic from the docker networks. I quickly ruled this one out as too error-prone, and it might not work in all situations. The default docker network type is still inherently a bridge interface, so firewalling might not catch all the traffic you want it to catch. I didn’t go down this route.

Not adding the ip address to the bridge interface

You can have docker not add the ip address to the bridge interface. This plugs the trivial hole of just trying to contact the default gateway, but it’s still inherently a bridge, so it does not give much isolation at layer 2 at all. Nevertheless, it’s pretty simple to set up:

networks:
  wordpress_default:
    driver: bridge
    driver_opts:
      com.docker.network.bridge.inhibit_ipv4: "true"
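To check that the option took effect, one could look at the bridge from the host side. This is a minimal sketch, assuming the network is a user-defined bridge without a custom bridge name, in which case docker names the host interface br- followed by the first 12 characters of the network ID. The helper names bridge_name and bridge_ipv4, and the network name wordpress_default, are only there for illustration.

```shell
# Print the host-side bridge interface for a docker network.
# Assumes the default naming: "br-" + first 12 characters of the network ID.
bridge_name() {
    net_id=$(docker network inspect -f '{{.Id}}' "$1") || return 1
    printf 'br-%s\n' "$(printf '%s' "$net_id" | cut -c1-12)"
}

# List the IPv4 addresses on that bridge.
# Empty output means no IPv4 address is assigned to it.
bridge_ipv4() {
    br=$(bridge_name "$1") || return 1
    ip -4 -o addr show dev "$br"
}
```

Running bridge_ipv4 wordpress_default on the docker host should then print nothing if the address really was left off the bridge.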

Macvlan on a dummy interface

I have used macvlan to connect a docker network directly to an underlying interface. This works very well when you want to extend an external interface into docker, and it is what I have been using for my external-facing DMZ. If you use a dummy interface instead of a real one, the containers will not be able to communicate with anything outside docker over that interface at all. You need to create the interfaces outside of docker, but the advantage is that you get more control over naming. For example, I can name the interface for my wordpress network wp01. Configuration in /etc/network/interfaces for this might be:

auto wp01
iface wp01 inet manual
    pre-up ip link add wp01 type dummy
    post-down ip link del wp01
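Before committing this to /etc/network/interfaces, the same interface can be created by hand to experiment with. The sketch below uses the same ip commands as the stanza above and needs root. The function names are hypothetical, and bringing the link up explicitly is my assumption about what a macvlan parent needs rather than something taken from the stanza.

```shell
# Create a dummy interface to act as macvlan parent, and bring it up.
dummy_parent_up() {
    ip link add "$1" type dummy && ip link set "$1" up
}

# Remove it again (same command as the post-down line in the stanza).
dummy_parent_down() {
    ip link del "$1"
}
```

For example, dummy_parent_up wp01 followed by ip -d link show wp01 lets you inspect the result, and dummy_parent_down wp01 cleans it up.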

Configuration of the network in docker would be just the following (remember that I have already opted to hard code the ip address ranges)

networks:
  wordpress_default:
    driver: macvlan
    driver_opts:
      parent: wp01
    ipam:
      config:
        - subnet: "10.100.5.0/24"
          ip_range: "10.100.5.0/24"
          gateway: "10.100.5.1"
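For trying this out without compose, roughly the same network can be created from the command line. This is a sketch, assuming the dummy interface wp01 already exists on the host. The wrapper function and the network name wp_test are hypothetical, while the addresses are the ones from the compose example.

```shell
# Create a macvlan network on the given parent interface,
# using the hard-coded /24 from the compose example.
create_macvlan_net() {
    name="$1"
    parent="$2"
    docker network create \
        --driver macvlan \
        --opt parent="$parent" \
        --subnet 10.100.5.0/24 \
        --ip-range 10.100.5.0/24 \
        --gateway 10.100.5.1 \
        "$name"
}
```

Usage would be create_macvlan_net wp_test wp01, and docker network rm wp_test to remove it again. Use this instead of the compose-defined network rather than next to it, since docker will most likely refuse two networks with overlapping subnets.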

Ipvlan l2 (on a dummy interface)

There’s another type of vlan, called ipvlan, that is a bit less resource intensive (but probably not enough to let this be a decisive factor). You’d likely want ipvlan l2, which is the default. Ipvlan can be attached to a pre-created interface, like macvlan, and you could thus name those interfaces on the host side to something meaningful.

If you don’t specify an underlying interface, docker will create a dummy interface for you, so the configuration of this is a bit simpler and more dynamic. However, supposedly the level of isolation is a bit less with ipvlan.

The configuration is again pretty simple

networks:
  wordpress_default:
    driver: ipvlan
    ipam:
      config:
        - subnet: "10.100.5.0/24"
          ip_range: "10.100.5.0/24"
          gateway: "10.100.5.1"

It supports attaching to a pre-created interface if you use

networks:
  wordpress_default:
    driver: ipvlan
    driver_opts:
      parent: wp01

I had some challenges with routing with ipvlan, and still haven’t figured them out completely. Since macvlan has better isolation and provides for neatly named parent dummy interfaces, I decided not to follow that track any further for now.

My choice: macvlan

macvlan gives pretty good isolation, and makes for neatly named interfaces. With bridge networks, a ton of internally created interfaces show up in the list of interfaces on the host. With macvlan, it will just be the ones you create and name yourself.

Too tight?

Having achieved this, you have closed the loophole that let a docker container just break out to the host network. But what if containers actually need to access something on the internet? You might decide to leave those on bridge interfaces, but I opted for another macvlan network, this time on a tagged interface, which I configured in my infrastructure docker-compose setup.

networks:
  dockerdmz:
    driver: macvlan
    driver_opts:
      parent: dockerdmz
      macvlan_mode: private
    enable_ipv6: true
    ipam:
      config:
        - subnet: "192.168.28.0/24"
          ip_range: "192.168.28.0/24"
          gateway: "192.168.28.1"
        - subnet: "2a01:799:393:f10a::/64"
          ip_range: "2a01:799:393:f10a::/64"
          gateway: "2a01:799:393:f10a::1"

Note the extra macvlan_mode option. It isolates the hosts on this dmz from each other, so they can only talk to the gateway. The underlying dockerdmz is a tagged VLAN interface:

ip link add link eno1 name dockerdmz type vlan id 13

I have of course also added this network to my external network and configured the gateway. Now, I can control what the various containers I decide to connect to this network can access, by updating the firewall of my gateway.

In the docker-compose of my services, I need to add the external network

networks:
  infrastructure_dockerdmz:
    external: true

And in the services, I will specify

services:
  wordpress:
    ....
    networks:
      infrastructure_dockerdmz:
        ipv4_address: "192.168.28.12"
        ipv6_address: "2a01:799:393:f10a::3:1"

Intentionally reaching the host from docker – what do you do?

This eventually comes up to a few personal choices and also the traffic pattern.

In part 1 I created a bastion host that reaches the host on a macvlan on top of the ethernet interface. I also run home assistant in docker, and there might be services on the host it needs to reach, in addition to other services on the network.

I opted for running all this traffic out through the gateway. If this had consumed significant bandwidth, I might have reconsidered, as sending 100 Mbps out and then in again on a 1 Gbps interface would amount to 200 Mbps extra bandwidth used on that interface instead of just keeping the traffic internal to the host. In my case, I expect the bastion host to be used only for interactive sessions, and home assistant isn’t going to consume much bandwidth on a few polls here and there. For good measure I gave home assistant its own dmz:

    ha01:
      id: 12
      link: eno1

Then I have more control over what type of traffic home assistant can send. I could easily just have used the dockerdmz I created, though.
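Going back to the bandwidth consideration for a moment, the arithmetic is simple enough to put in numbers. This is only a sketch of the reasoning in the text, counting outbound and returning traffic together on the same interface. The function names are made up for illustration.

```shell
# Traffic hairpinned through the gateway crosses the host interface twice.
hairpin_mbps() {
    echo $(( 2 * $1 ))
}

# Share of the link this represents, in whole percent.
hairpin_percent() {
    echo $(( 2 * $1 * 100 / $2 ))
}

hairpin_mbps 100          # 100 Mbps of container traffic
hairpin_percent 100 1000  # on a 1 Gbps (1000 Mbps) interface
```

This prints 200 and 20, that is 200 Mbps or a fifth of the link, which is why it would matter for heavy traffic but not for a few polls and interactive sessions.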

For my bastion, I just removed the connection to the dmz_firewall network. It now reaches the server by its real ip address, going out on the DMZ interface and back in to the server. Of course I needed to open that in the firewall, as I have pretty tight control over what reaches my server.

Then I removed the previously configured macvlan directly on my ethernet interface, closing down that remaining way to reach directly to my host operating system from docker.
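A quick way to test the result is to try to reach the host from a throwaway container. This is a sketch, assuming the busybox image can be pulled. The function name is hypothetical, and the address you pass in is whichever address of the docker host you want to probe.

```shell
# Try a single ping from a temporary container on the given docker network.
# Exit status 0 means the target answered; non-zero means it did not
# (or that the container could not be started at all).
can_reach() {
    docker run --rm --network "$1" busybox ping -c 1 -W 2 "$2"
}
```

For example, can_reach wordpress_default 10.100.5.1 && echo reachable || echo "not reachable". Note that this only tests ICMP, so it is a smoke test and not proof that every port is closed.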

Summary and other considerations

As of writing this, I cannot see any glaring holes in my setup. Everything is locked up pretty tight, and nothing can really break out from docker and directly to my docker host, which was what I set out to achieve.

Are there better ways to do this? Sure! Using proxmox or other virtualization tools might have been better, but I like the ecosystem around docker, especially that docker images are readily available, which makes it extremely easy to spin up something that is nominally well configured. Pulling an updated docker image and just restarting the service makes it a breeze to keep things updated. I also like the volume approach to keeping persistent data. In fact, I am so happy with my setup there that I might even decide to do a separate blog post about how I run my dockerized services at home!

Was this worthwhile? Absolutely! This now resembles a proper server network – all within docker! I have done some hardening, and made sure things are pretty isolated from each other.

Network diagram

My drawing skills are poor, but I’ll try to draw a diagram of what I have created. I have more services running than wordpress and home assistant, and there are a few extra containers both in the home assistant network and in the wordpress network, but at least this is conceptually correct. A circle is a network, a square is a container.

