LXD with Linux bridges in a VMware environment – the right way

When using LXD containers in a VMware VM, you might want to toss away the default NAT configuration and switch to a more useful setup with bridges attached to the containers.

Unfortunately, this configuration does not work by default, because the vSwitch blocks all network traffic coming from MAC addresses other than those of the vNICs assigned to the VM.
There is a known workaround that involves configuring the vSwitch in promiscuous mode, but apart from being a security issue, it seems that it does not work well in some scenarios.

Do we have to abandon the dream of having containers directly exposed to the external network? No way! The solution is to abandon the bridged mode and assign a vNIC directly to the container.

Yes, I know it sounds weird, but LXD containers are usually deployed for thick-containers, pet-style scenarios… and, if you are skilled enough, you can always automate vNIC attach/detach and container creation/destruction.
Attaching the network interface to the container is trivial:

lxc config device add $ContainerName eth0 nic nictype=physical parent=$vNIC name=eth0
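If you create and destroy containers often, the attach step can be wrapped in a small helper. The sketch below is only an illustration: the function name attach_vnic and the DRY_RUN switch are hypothetical additions, and web01 and ens224 are placeholders for your container and for whatever name the extra vNIC gets inside the VM.

```shell
# Hypothetical helper: hand a dedicated vNIC of the VM over to an LXD container.
# With DRY_RUN=1 the command is only printed, so it can be reviewed first.
attach_vnic() {
    container="$1"   # LXD container name (placeholder in the example below)
    vnic="$2"        # name of the vNIC as seen inside the VM (placeholder)
    set -- lxc config device add "$container" eth0 nic \
        nictype=physical parent="$vnic" name=eth0
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Review the command without touching anything.
DRY_RUN=1 attach_vnic web01 ens224
```

Keep in mind that a physical NIC handed to a container disappears from the VM itself until the container is stopped or the device is removed.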

Have fun!

Network management with LXD and OpenVSwitch in Ubuntu 18.04

Just some quick notes about my homelab setup of LXD 3.0 with OpenVSwitch (OVS) in Ubuntu 18.04.

Why use OVS in a small homelab environment? Because it’s the most used SDN stack in the world, and you should learn it instead of relying on the traditional Linux bridges, especially if you are into virtualization/containerization or networking stuff.
Ever heard of whitebox switches? They are going to be the dominant platform in the hyperscaler datacenters… and maybe, also in the enterprise market.

A big warning: this is not a “best practices” configuration, the one with overlay switch and tunnel switch as shown in this great article, but I’m still working on that and this simpler one should be ok for some non-enterprise playground.

As of today (today is the first day of Ubuntu 18.04!) Netplan does not directly support OVS, so don’t even try to use it. I hope they will fix it soon, but for now just don’t configure your OVS NIC with Netplan and fall back to traditional configuration scripts instead. Or even to manual startup: they are servers and they are meant to be always on anyway… or not? (Thinking of MaaS)

I strongly suggest using a machine with multiple NICs; it makes everything a lot easier, because you will not be kicked off the network when you add your only NIC to the OVS bridge.

To begin, install OVS with apt install openvswitch-common openvswitch-switch and check the status of ovs-vswitchd.service and ovsdb-server.service. Don’t forget to enable packet forwarding in the kernel with the usual echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.conf, followed by sysctl -p /etc/sysctl.conf to reload the config.
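Running that echo twice leaves a duplicate line in the file. A slightly more careful variant, sketched here with a hypothetical function name (enable_ip_forward) and with the file path as a parameter so that you can try it on a copy first, appends the setting only when it is missing:

```shell
# Append the forwarding setting only if the file does not already enable it.
enable_ip_forward() {
    conf="${1:-/etc/sysctl.conf}"
    if grep -Eq '^[[:space:]]*net\.ipv4\.ip_forward[[:space:]]*=[[:space:]]*1' "$conf" 2>/dev/null; then
        echo "already enabled in $conf"
    else
        echo 'net.ipv4.ip_forward = 1' >> "$conf"
        echo "added to $conf"
    fi
}

# Typical use as root, followed by a reload:
#   enable_ip_forward /etc/sysctl.conf && sysctl -p /etc/sysctl.conf
```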

After that, do not create any switch in the initial lxd init configuration. Just create an OVS switch in LXD with the command lxc network create ovs-1 bridge.driver=openvswitch. It will automatically be added both to the LXD network profiles and to the OVS configuration, that you can check with ovs-vsctl show. That’s cool! Now it’s time to bind the physical interface to our ovs-1 switch; remember that this will KILL any connection established on the NIC that you choose, so be careful.

After choosing the NIC to bind (list them with the ip link command), type ovs-vsctl add-port ovs-1 eno4; eno4 is my fourth NIC, of course. Now it’s time to attach the network we just created to the default profile (but you can choose another one, of course) with lxc network attach-profile ovs-1 default eth0. This way, the first NIC of your LXD containers will be a veth port on the OVS switch ovs-1. Start a container in LXD with lxc launch ubuntu:18.04 and check that you got everything right with ovs-ofctl show ovs-1; some veth-stuff should appear. Now log into your container and play with its network configuration: it should look like it sits on the same L2 switch as the physical NIC eno4.
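The whole sequence can be collected in one place. This is just a sketch of the steps described above, not a hardened script: the run and setup_ovs_bridge names are hypothetical, and by default everything is only printed, because adding the NIC to the bridge cuts any connection that is using it.

```shell
# Print each command instead of running it, unless DRY_RUN=0 is set explicitly.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

setup_ovs_bridge() {
    bridge="$1"   # name of the LXD-managed OVS bridge, e.g. ovs-1
    nic="$2"      # physical NIC to add as uplink, e.g. eno4
    run lxc network create "$bridge" bridge.driver=openvswitch
    run ovs-vsctl add-port "$bridge" "$nic"   # kills connections on that NIC
    run lxc network attach-profile "$bridge" default eth0
    run ovs-vsctl show
}

# Dry run with the names used in this post.
setup_ovs_bridge ovs-1 eno4
```

When the printed commands look right, run the function again as root with DRY_RUN=0, preferably from a session that does not go through the NIC you are about to bind.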

What can you do now? Easy VLAN tagging, for example: ovs-vsctl set port vethM3WY7X tag=200. Don’t forget to configure the physical switch port connected to eno4 as a trunk for the VLAN tag that you choose.
You can also create NIC aliases and bind different OVS switches with different tags to the very same NIC, but I have not experimented with that yet.
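Going back to the VLAN tagging example: veth names are random, so it can be handy to generate the tagging commands from the port list. The filter below is a sketch with a hypothetical function name (tag_veth_ports); it reads port names on standard input, for instance from ovs-vsctl list-ports ovs-1, and prints one command per veth port without executing anything.

```shell
# Read port names on stdin and print one tagging command per veth port.
# Nothing is executed; pipe the output to sh once it looks right.
tag_veth_ports() {
    vlan="$1"
    while IFS= read -r port; do
        case "$port" in
            veth*) echo "ovs-vsctl set port $port tag=$vlan" ;;
        esac
    done
}

# Example with a hand-written port list instead of a live switch.
printf 'eno4\nvethM3WY7X\n' | tag_veth_ports 200
```

Note that this puts every container port on the same VLAN; filter the list further if your containers belong to different networks.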

Paranoia, XenServer and libvirt: Full Disk Encryption unlocking from virtual serial TTY

Everybody wants encryption today; encryption of network traffic, mainly.
But what about the encryption of data-at-rest? I mean, what if someone got physical access to your storage?
Of course you can protect specific folders with your tools of choice, but there’s always a chance that some file got saved outside your security fence by a zealous program, or just forgotten by you on the desktop… here comes the Full Disk Encryption (shortly, FDE) to the rescue!

Maybe you are already using it without being aware of it… if BitLocker on Windows and FileVault on Mac OS X sound familiar, you are on the right track.

The penguin within

But, what about our beloved Linux machines?
In the Linux world, FDE is mainly achieved through LUKS; this software can encrypt your disks with AES-XTS and unlock them with either passphrases or key files, the latter usually stored on a USB drive that is plugged in on demand.

I won’t tell you how to set up LUKS on your machine, because you are better off referring to the installation manual of your distro.

If you plan to use it on a workstation, you are already set: just type the password or plug in the USB drive on every boot.

And my VMs?

But, what if you want to use LUKS in a virtual environment, in a hosted VM?

I won’t talk about acrobatic USB-passthrough, focusing instead on the passphrase method.

Of course, you can manually type it in the VM console of your preferred hypervisor; but because having a very long and random passphrase is of utmost importance, typing 40 characters on every server reboot can be extremely tedious and can easily lead to bad habits, like short passphrases, skipping necessary update-reboot cycles or dropping LUKS altogether.
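Just to make “long and random” concrete, a passphrase of that kind can be produced with standard tools. This is a minimal sketch (gen_passphrase is a hypothetical name) that restricts the output to letters and digits so that it pastes cleanly into a console; it assumes a head that supports the -c option, as on any common Linux distribution.

```shell
# Generate a random passphrase of the requested length (default 40) from
# /dev/urandom, keeping only ASCII letters and digits.
gen_passphrase() {
    len="${1:-40}"
    LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c "$len"
    echo
}

gen_passphrase 40
```

Store the result in your keyring of choice before using it as a LUKS passphrase.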

Meeting the serial (killer?)

The best way to enter long keys would be to store them in an encrypted keyring and copy-and-paste them into the VM prompt as needed, but the graphical consoles AFAIK don’t support input methods like that, and of course we cannot use SSH & co. because no service is running in the VM before the disk is unlocked.

There’s another way to access the VM at “physical” level, at least with hypervisors like KVM and XEN: the glorious and vintage serial console!

Maybe you have already heard of it for router configuration; shortly, it was the leading way of talking to an operating system out-of-band before the video output we use today.

Every Linux distribution provides this capability; to enable it in a CentOS 7 guest, you just need to modify your /etc/default/grub to make it look like this:


GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_TERMINAL="console serial"
GRUB_SERIAL_COMMAND="serial --speed=115200 --unit=0 --word=8 --parity=no --stop=1"
GRUB_CMDLINE_LINUX_DEFAULT="console=tty1 console=ttyS0,115200"
GRUB_CMDLINE_LINUX="rd.lvm.lv=DONT_CHANGE_IT!/root rd.luks.uuid=DONT_CHANGE_IT! rd.lvm.lv=DONT_CHANGE_IT!/swap rhgb"

and regenerate your GRUB configuration with grub2-mkconfig -o /boot/grub2/grub.cfg.
Take a snapshot before making any change to your grub.cfg, you have been warned!
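Before rebooting, it does not hurt to double check that the file really contains both pieces: the serial terminal for GRUB itself and the ttyS0 console for the kernel. The check below is a sketch with a hypothetical function name (has_serial_console); it only reads the file and changes nothing.

```shell
# Return 0 if the given grub defaults file enables both the GRUB serial
# terminal and a kernel console on ttyS0, non-zero otherwise.
has_serial_console() {
    file="${1:-/etc/default/grub}"
    grep -Eq '^GRUB_TERMINAL=.*serial' "$file" &&
    grep -Eq '^GRUB_CMDLINE_LINUX(_DEFAULT)?=.*console=ttyS0' "$file"
}

# Example:
#   has_serial_console /etc/default/grub && echo "serial console configured"
```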

Now reboot your system and everything should work as before, except for…

Practical unlocking

Yes, now there is an active serial interface that can be used to access your VM from the hypervisor, in a completely textual way.
Really, you can stop typing that super-long random string one-character-at-time right now!

The thing is, how to access it from the virtualization host?

  • If you are using the libvirt/virsh framework, you are lucky: just type virsh console VMname and you are set! The escape sequence to exit is CTRL-].
  • If you are on XenServer with XAPI, use the xl console VMname command (same escape). Forget about the xe console command: it won’t help you if you are targeting an HVM guest. It is very likely that you are on HVM; you would know if you were still in PV mode. The xl solution is not documented in the XenServer docs, and it’s the actual reason why I wrote this post.
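If you hop between the two kinds of host, the choice can be hidden behind a tiny helper. This is a sketch only: console_cmd and the toolstack labels libvirt and xenserver are hypothetical names, and the function just prints the command to run.

```shell
# Print the serial console command for the given toolstack and VM name.
console_cmd() {
    toolstack="$1"   # "libvirt" or "xenserver" (labels made up for this sketch)
    vm="$2"
    case "$toolstack" in
        libvirt)   echo "virsh console $vm" ;;
        xenserver) echo "xl console $vm" ;;
        *) echo "unknown toolstack: $toolstack" >&2; return 1 ;;
    esac
}

console_cmd xenserver VMname
```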

Oh, XL is also the new default toolstack for XEN management!

Now you should be set and smooth, go and enjoy your very-long and very-random LUKS passphrases!

XenServer 7 software RAID with mail alert

Read this article in Italian too.

XenServer 7 is emerging as the reference implementation of the open source hypervisor Xen, providing advanced features for storage, networking, HA and management, all packaged in a self-contained, enterprise-level bundle that is often superior to VMware vSphere, Hyper-V and KVM, the other big players of the enterprise virtualization market.

It should also be noted that Xen is the hypervisor of choice for the majority of public clouds (AWS and the others), is completely FLOSS software and its XenServer flavor is managed directly by the Linux Foundation (no Citrix anymore), making it the de facto standard solution for bare metal open source virtualization.

Powerful batteries included!

The software stack that surrounds the Xen kernel has made a giant leap thanks to the adoption of a modified CentOS 7 as the dom0 in XenServer 7.
The standard installation of XS already includes every piece of software needed to completely skip a hardware RAID solution and adopt an enterprise-ready software RAID solution! As of today, MDADM is one of the few RAID solutions that can manage NVMe RAID or other niche configurations; thanks to the power of modern CPUs, MDADM (especially with very big arrays) plays in the same league as some ASIC-based controllers.

Furthermore, software RAID enables the creation of arrays using non-branded disks, a strategic feature of XS7 in comparison to products like VMware, which at most permits the creation of a RAIN using paid add-ons like vSAN. If you have ever compared the price of branded HDDs or SSDs (HP, IBM, Dell, etc.) with the COTS ones, you know what I’m talking about.

Dive into MDADM

This is the to-do list you have to follow in order to build a software RAID that will warn you in the event of malfunction:

  • Create an MDADM array;
  • Configure SSMTP to send email warnings;
  • Test if email alerts are working;
  • Add the array as an SR to XS7.

MDADM is a very battle-tested piece of software; shipped with Linux for many years, it has earned in the field its reputation as very stable and reliable software. Indeed, it is used in big critical systems (like the “big iron x86”) and in the majority of commercial NAS systems (Synology, QNAP, Buffalo, etc.).

Because it is already included in the standard installation of XS7, the creation of a RAID array is very simple, especially in the typical scenario in which you have installed the hypervisor on a USB drive that won’t be part of the array… as recommended by the best practices! I have spoken about that here (Warning, Italian inside™! Let me know if you want to read it in English).

For instance, creating a RAID 10 with the first four HDDs is as simple as

 mdadm --create /dev/md0 --run --level=10 --raid-devices=4 /dev/sd[a-d] 

To monitor the disk syncing, just do

 watch cat /proc/mdstat 

or the evergreen

 mdadm --detail /dev/md0

that will also give you various information about the status of the array.
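If you prefer a quick yes/no answer over reading the full output, the member map at the end of each array line in /proc/mdstat (something like [UUUU], where an underscore marks a missing or failed member) can be checked with a few lines of awk. The function below is a sketch with a hypothetical name (degraded_arrays); it takes the file to read as a parameter so it can be tried on a saved copy.

```shell
# Print the names of md arrays whose member map contains "_" (a missing or
# failed member), reading a file in /proc/mdstat format.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/       { name = $1 }
        /\[[U_]*_[U_]*\]/   { if (name != "") print name }
    ' "${1:-/proc/mdstat}"
}

# Example: an empty output means that no array is degraded.
#   degraded_arrays /proc/mdstat
```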

Would you trust a system that does not warn you in the event of failure?

We are going to set up array monitoring; to begin, copy the active array configuration to the MDADM config file with

 mdadm --verbose --detail --scan >> /etc/mdadm.conf 

This will enable the array monitoring by the MDADM daemon.
Now, configure SSMTP to send email warnings… just as an example, this configuration of /etc/ssmtp/ssmtp.conf can be used to send notifications from a Gmail address:


It’s important to set /etc/ssmtp/revaliases like this:
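A typical pair of files for a Gmail sender looks roughly like the sketch below. Treat it as an example built on assumptions rather than a verified setup: the option names are standard sSMTP settings, but every address and the password are placeholders, smtp.gmail.com:587 with STARTTLS is the usual Gmail submission endpoint, and the function name write_ssmtp_conf is hypothetical. The target directory is a parameter, so you can generate the files somewhere harmless and inspect them first.

```shell
# Write a Gmail-style sSMTP configuration into the given directory.
# All addresses and the password are placeholders to be replaced.
write_ssmtp_conf() {
    dir="${1:-/etc/ssmtp}"
    cat > "$dir/ssmtp.conf" <<'EOF'
root=your_mail@gmail.com
mailhub=smtp.gmail.com:587
AuthUser=your_mail@gmail.com
AuthPass=your_password
UseSTARTTLS=YES
FromLineOverride=YES
EOF
    # revaliases format: local_user:sender_address:mailhub[:port]
    echo 'root:your_mail@gmail.com:smtp.gmail.com:587' > "$dir/revaliases"
    # The file holds a password, so keep it away from other users.
    chmod 640 "$dir/ssmtp.conf"
}

# Example: write_ssmtp_conf /tmp/ssmtp-test
```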


Finally, the MAILADDR and MAILFROM parameters in the last rows of /etc/mdadm.conf need to be set to your receiver and sender addresses respectively, so mdadm.conf will look like this:

ARRAY /dev/md0 level=raid10 num-devices=4 metadata=1.2 name=your_hostname:0 UUID=UUID_ARRAY
MAILADDR destination_mail@provider.sth
MAILFROM your_mail@gmail.com

Don’t leave your chair without some testing!

Now that the configuration is completed, you need to do some checks; MDADM can send a test email with

 mdadm --monitor --scan --test --oneshot 

But, if you really want, you can put a disk in “failed” state (be aware of rebuilding times with big HDD) using

 mdadm --manage /dev/md0 --set-faulty /dev/sdb
 mdadm --manage /dev/md0 --remove /dev/sdb
 mdadm --manage /dev/md0 --add /dev/sdb

You have to wait a little for the failure to be detected, up to ten minutes, because the daemon is deliberately “lazy” at recognizing a failure, even a simulated one.

Now that the array is tested and ready to be used, just add it to the XS7 SR:

 xe sr-create content-type=user device-config:device=/dev/md0  name-label="Local RAID10" shared=false type=lvm 

In XS7, the array is recognized even after a reboot without manual module insertion or other configurations; everything is just ready to go!