25 January 2022

BIG DISASTER @ irgNET Datacenter

By H. Cemre Günay

Well, I think the title says a lot. Unfortunately, the RAID 5 array in my main server fell apart because of two defective HDDs.
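For anyone wondering why two defective disks are fatal: RAID 5 stores one parity block per stripe, so it can rebuild exactly one missing block and no more. The following is only a toy sketch of that idea in Python, not how a real controller is implemented; the block contents and the names `xor_blocks` and `rebuild_block` are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equally sized byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_block(stripe, failed):
    """Rebuild the single missing block of a stripe from the surviving blocks.

    `stripe` is a list of blocks (data blocks plus one parity block),
    `failed` is the set of disk indexes that are gone.
    """
    if len(failed) != 1:
        # With two or more blocks missing, the XOR equation has too many unknowns.
        raise ValueError("RAID 5 can only rebuild exactly one failed disk")
    survivors = [block for index, block in enumerate(stripe) if index not in failed]
    return xor_blocks(survivors)


# Toy stripe: three data blocks plus one parity block, i.e. four "disks".
data = [b"AAAA", b"BBBB", b"CCCC"]
stripe = data + [xor_blocks(data)]

# One failed disk: the missing block comes back from the other three.
recovered = rebuild_block(stripe, {1})

# A second failure during the rebuild: nothing left to compute from.
try:
    rebuild_block(stripe, {1, 2})
    second_failure_recoverable = True
except ValueError:
    second_failure_recoverable = False
```

That second case is exactly what happened to me during the rebuild.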

First, iLO Advanced, HPE's onboard management solution, informed me that one hard disk had failed. I then went to Datacenter 01 and replaced the defective disk with a new one. During the rebuild, another disk failed and my RAID 5 array fell apart. This resulted in data loss and looked like this on the ESXi host:

Like every company should, I have a backup that I can rely on. (Fortunately, or so I thought.) As a backup solution I use Veeam, a fairly simple product that runs as a VM on ESX 04.

So I connected to my backup server via RDP and opened the restore function of the Veeam appliance. When I tried to select and restore the affected VMs from the backup, I got an error message saying that the backup repository, a 2 TB WD Black USB 3.0 disk, was corrupt. (see Home Lab Stage V, 5.0 U2)

A total disaster: I lost not only my datastore but also my backup.

What do we learn from this? Everyone should keep an eye on their backup environment and run restore tests every now and then, so that there are no surprises in moments like this.
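A very small first step in that direction could look like the sketch below. It only checks that the backup files on the repository disk are still readable and unchanged since the last check; it is not a real restore test and does not use any Veeam API. The function names and the manifest format are my own assumptions for illustration, and the manifest should be stored somewhere other than the backup disk itself.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large backup files do not have to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(backup_dir, manifest_path):
    """Record one 'checksum  relative/path' line per file in the backup directory."""
    backup_dir = Path(backup_dir)
    lines = []
    for item in sorted(backup_dir.rglob("*")):
        if item.is_file():
            relative = item.relative_to(backup_dir).as_posix()
            lines.append(f"{sha256_of(item)}  {relative}")
    Path(manifest_path).write_text("\n".join(lines) + "\n", encoding="utf-8")


def verify_manifest(backup_dir, manifest_path):
    """Return (file, problem) tuples for files that are missing or have changed."""
    backup_dir = Path(backup_dir)
    problems = []
    for line in Path(manifest_path).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        expected, relative = line.split("  ", 1)
        target = backup_dir / relative
        if not target.is_file():
            problems.append((relative, "missing"))
        elif sha256_of(target) != expected:
            problems.append((relative, "checksum mismatch"))
    return problems
```

You would call `write_manifest` right after a backup job finishes and `verify_manifest` on a schedule; an empty list means the files are at least intact. Note that active backup chains change with every job, so the manifest has to be refreshed after each run. The only way to be really sure is still to restore a VM from time to time and boot it.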

I am currently in the process of restoring my entire environment by hand. Fortunately, my blog and some other services were running here in Datacenter 02, so I lost the infrastructure services but not other applications such as:

  • HomeBridge for home automation
  • Unifi Controller for my access points
  • The backup server
  • My reverse proxy
  • NextCloud

The infrastructure services I lost are:

  • VMware vCenter
  • VMware Horizon and the associated VDIs
  • VMware vRealize
  • AD
  • DNS
  • Backup Proxy
  • Ansible VMs

This loss hurts quite a bit, and rebuilding everything will take time. As you know, the connection between Datacenter 01 and Datacenter 02 runs via PowerLine adapters. Rebuilding over that link would take even longer, so I removed ESX 03 (my standalone server with infrastructure services) from Datacenter 01 and moved it to Datacenter 02, where it is connected via a switch instead of PowerLine adapters:

First, I will build a new backup server as a VM and mount the external hard disk. Then I will back up the VMs and applications that still exist, such as this blog, to prevent more disasters. The next step is to roll out the VMware vCenter appliance again, and so on.

Well then, I’ll get to work! 😀