XenServer 7 software RAID with mail alert


XenServer 7 is emerging as the reference implementation of the open-source Xen hypervisor, providing advanced storage, networking, HA and management features, all packaged in a self-contained, enterprise-level bundle that is often superior to VMware vSphere, Hyper-V and KVM, the other big players in the enterprise virtualization market.

It should also be noted that Xen is the hypervisor of choice for the majority of public clouds (AWS and others) and is completely FLOSS software; the Xen Project behind it is hosted by the Linux Foundation rather than by Citrix alone, making it the de facto standard for bare-metal open-source virtualization.

Powerful batteries included!

The software stack that surrounds the Xen kernel has made a giant leap thanks to the adoption of a modified CentOS 7 as the dom0 in XenServer 7.
The standard installation of XS already includes every piece of software needed to skip hardware RAID entirely and adopt an enterprise-ready software RAID solution! As of today, MDADM is one of the few RAID solutions that can manage NVMe RAID or other niche configurations; thanks to modern CPU power, MDADM (especially with very big arrays) competes on equal footing with some ASIC-based controllers.

Furthermore, software RAID enables the creation of arrays from non-branded disks, a strategic advantage of XS7 over products like VMware, which at most permits the creation of a RAIN using paid add-ons like vSAN. If you have ever compared the price of branded HDDs or SSDs (HP, IBM, Dell, etc.) with COTS ones, you know what I’m talking about.

Dive into MDADM

This is the to-do list to follow in order to build a software RAID that will warn you in the event of a malfunction:

  • Create an MDADM array;
  • Configure SSMTP to send email warnings;
  • Test that email alerts are working;
  • Add the array as an SR to XS7.

MDADM is a battle-tested piece of software: included with Linux for many years, it has earned in the field its reputation as very stable and reliable software; indeed, it is used in big critical systems (like “big iron” x86 servers) and in the majority of commercial NAS systems (Synology, QNAP, Buffalo, etc.).

Because it is already included in the standard installation of XS7, creating a RAID array is very simple, especially in the typical scenario in which the hypervisor is installed on a USB drive that won’t be part of the array… as recommended by best practices! I have written about that here (Warning, Italian inside™! Let me know if you want to read it in English).

For instance, creating a RAID 10 with the first four HDDs is as simple as

 mdadm --create /dev/md0 --run --level=10 --raid-devices=4 /dev/sd[abcd]

To monitor the disk syncing, just do

 watch cat /proc/mdstat 

or the evergreen

 mdadm --detail /dev/md0

which will also give you various details about the status of the array.
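If you would rather check the array from a script than watch it by eye, the status field in /proc/mdstat can be parsed. The following is only a sketch: the function name degraded_arrays is hypothetical, and it assumes the usual mdstat layout, where each array stanza starts with a line like "md0 : active ..." and the member status appears as [UUUU], with "_" marking a missing disk.

```shell
#!/bin/sh
# Print the names of md arrays whose status field shows a missing member.
# $1: path to an mdstat-format file (defaults to /proc/mdstat).
degraded_arrays() {
    awk '
        # Remember which array the current stanza describes
        /^md[0-9]+ :/ { name = $1 }
        # A status such as [UU_U] contains "_" for every missing disk
        /\[[U_]*_[U_]*\]/ && name != "" { print name; name = "" }
    ' "${1:-/proc/mdstat}"
}

# Example: degraded_arrays            (reads /proc/mdstat)
```

An empty output means that no array reports a missing member; a resync in progress is not flagged, since all members are still marked "U".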

Would you trust a system that does not warn you in the event of failure?

We are going to set up array monitoring; to begin, copy the active array configuration into the MDADM config file with

 mdadm --verbose --detail --scan >> /etc/mdadm.conf 

This will enable monitoring of the array by the MDADM daemon.
Now, configure SSMTP to send email warnings… as an example, this configuration of /etc/ssmtp/ssmtp.conf can be used to send notifications from a Gmail address:
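A minimal sketch of such a configuration, wrapped in a small shell function so it can be written to a scratch file and reviewed before it touches /etc. It assumes Gmail's submission endpoint smtp.gmail.com:587 with STARTTLS; your_mail@gmail.com and your_app_password are placeholders, and the function name write_ssmtp_conf is hypothetical.

```shell
#!/bin/sh
# Write a Gmail-oriented sSMTP configuration to the file given as $1.
# Every address and the password below are placeholders, not real values.
write_ssmtp_conf() {
    (
        # The file holds a password: new files are created owner-only
        umask 077
        cat > "$1" <<'EOF'
# Address that receives mail sent to local system accounts
root=your_mail@gmail.com
# Gmail SMTP submission endpoint (assumed: port 587 with STARTTLS)
mailhub=smtp.gmail.com:587
UseSTARTTLS=YES
AuthUser=your_mail@gmail.com
AuthPass=your_app_password
# Let the sending program (here mdadm) choose the From: header
FromLineOverride=YES
EOF
    )
}

# Example (as root on the dom0): write_ssmtp_conf /etc/ssmtp/ssmtp.conf
```

Since the password is stored in clear text, it is worth checking that an already existing /etc/ssmtp/ssmtp.conf is not world-readable.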


It’s important to set /etc/ssmtp/revaliases like this:
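As a sketch, each revaliases entry has the form local_account:outgoing_address:mailhub, so that mail generated by a local account leaves with a sender address Gmail accepts. The helper name revalias_line is hypothetical, and the address and mail hub are the same placeholders and assumptions used for ssmtp.conf.

```shell
#!/bin/sh
# Print one revaliases entry in the form local_account:outgoing_address:mailhub
# $1: local account, $2: outgoing address, $3: mail hub (host:port)
revalias_line() {
    printf '%s:%s:%s\n' "$1" "$2" "$3"
}

# Example (as root): map mail from the local root account to the Gmail sender
# revalias_line root your_mail@gmail.com smtp.gmail.com:587 >> /etc/ssmtp/revaliases
```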


Finally, the MAILADDR and MAILFROM parameters in the last rows of /etc/mdadm.conf need to be set to your receiver and sender addresses respectively, so mdadm.conf will look like this:

ARRAY /dev/md0 level=raid10 num-devices=4 metadata=1.2 name=your_hostname:0 UUID=UUID_ARRAY
MAILADDR destination_mail@provider.sth
MAILFROM your_mail@gmail.com
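If you prefer to script this step instead of editing the file by hand, the two keys can be set idempotently, so that running the script twice does not leave duplicate MAILADDR or MAILFROM lines. This is only a sketch; the helper name set_conf_key is hypothetical, and it moves the key to the end of the file.

```shell
#!/bin/sh
# Set "KEY value" in a config file: drop any existing KEY line, then append one.
# $1: file, $2: key (e.g. MAILADDR), $3: value
set_conf_key() {
    _tmp=$(mktemp) || return 1
    # Keep every line except an existing entry for this key
    grep -v "^$2[[:space:]]" "$1" > "$_tmp" 2>/dev/null
    printf '%s %s\n' "$2" "$3" >> "$_tmp"
    # Overwrite in place so the original file keeps its owner and mode
    cat "$_tmp" > "$1" && rm -f "$_tmp"
}

# Example (as root):
# set_conf_key /etc/mdadm.conf MAILADDR destination_mail@provider.sth
# set_conf_key /etc/mdadm.conf MAILFROM your_mail@gmail.com
```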

Don’t leave your chair without some testing!

Now that the configuration is complete, you need to run some checks; MDADM can send a test email with

 mdadm --monitor --scan --test --oneshot 
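If no message arrives, it can help to test SSMTP on its own, to tell a mail relay problem from an MDADM one. A minimal sketch: the function name test_message is hypothetical, and it only builds the message text, which is then piped to ssmtp with the recipient as its argument.

```shell
#!/bin/sh
# Build a minimal test message (headers, blank line, body) for recipient $1.
test_message() {
    printf 'To: %s\nSubject: %s\n\n%s\n' \
        "$1" \
        "sSMTP test from $(uname -n)" \
        "If you can read this, sSMTP can relay mail."
}

# Example (as root):
# test_message destination_mail@provider.sth | ssmtp destination_mail@provider.sth
```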

But, if you really want, you can put a disk in the “failed” state and then add it back to the array (be aware of rebuild times with big HDDs) using

 mdadm --manage /dev/md0 --set-faulty /dev/sdb
 mdadm --manage /dev/md0 --remove /dev/sdb
 mdadm --manage /dev/md0 --add /dev/sdb

You have to wait a little for the failure to be detected, up to ten minutes, because the daemon is deliberately “lazy” at recognizing failures, even simulated ones.

Now that the array is tested and ready to be used, just add it to XS7 as an SR:

 xe sr-create content-type=user device-config:device=/dev/md0 name-label="Local RAID10" shared=false type=lvm

In XS7, the array is recognized even after a reboot, without manual module loading or other configuration; everything is just ready to go!