The latest information on this issue is provided below by the staff working on it at the data centre. Our live chat staff do not have any further information, and we will update this announcement the moment we know more.
Update 18/11/2024 at 81m
All VMs have remained stable since the last update. Out of caution, all VMs have since been moved to another node, as the old node's logs still show some operating system errors indicating that corruption was caused when the RAID card initially failed.
Update at 4pm
As we type this from the system admins' office at the data centre, all virtual machines are online. The server came online and remained stable for a while after the chassis swap at 12 noon, but we were seeing many operating system errors, which indicate that corruption was caused when the initial RAID card failed earlier today. We had to take the server offline again to perform some more maintenance. Out of caution we completely replaced the RAM (as we had initially moved the RAM from the failed node), provisioned another new chassis, and fixed some kernel errors that we were seeing in the logs.
The server has now been online for some time and appears to be stable. We are still seeing some errors in the logs that concern us, but the node is online and the servers are up. We are 'hot migrating' the virtual machines to a hot spare box to move them off this node. That should be a permanent resolution, as we cannot leave a server online that is spewing errors to the logs.
We appreciate that only a very small handful of clients are affected (8 servers in total), but we also fully realise that for these clients this outage has been much more sporadic than we would have liked. We are confident we are nearly there, and as we are using the hot migrate feature in our cloud, the servers should remain online as we move them.
Saturday 16 November 2024 - 12.31pm
The server's network connectivity has been fixed after the chassis swap. The node is back online and the virtual machines are booting. A few are online and the others are coming online one at a time.
Saturday 16 November 2024 - 11.26am
We are still working on this. The server has booted up but has no network connectivity after the chassis swap.
Saturday 16 November 2024 - 10.39am
The server is showing the same error with a new RAID card. We are currently swapping the chassis. Please be assured we want to do all we can to bring the VMs back online without the need to start builds and restores. We appreciate your extended patience here.
Saturday 16 November 2024 - 9.54am
The server is failing to boot and is showing a specific RAID error. We are replacing the RAID card now and will attempt a reboot after importing the RAID configuration. Once we know more we will post it here.
Saturday 16 November 2024 - 9.17am
The team have arrived at the data centre and are currently removing the hardware from the rack to investigate. We will post an update as soon as we have one.
Saturday 16 November 2024 - 8.48am
We are aware that a node has failed in our data centre. The team are seeing an error that means we need to attend the data centre in person. A staff member is on their way now to investigate. Please be assured this is getting top priority.