XEN140 Issues

  • Friday, 8th November, 2019
  • 09:19am
Friday 8 November 2019 - 4.00pm
Server 30 has 30 more accounts to restore.  On Server 18 we are now processing the final wave of accounts manually, and as these are the largest accounts on the server they are taking some time.  We're sorry once again for this hardware failure.  Thanks to everyone for your extended patience.

Friday 8 November 2019 - 1.50pm
A new Server 30 has been created and we are now restoring the data and the accounts to it.  We are starting with the smallest accounts in order to bring as many people back online as quickly as possible.  In case you have not read the full thread: Server 30 also resided on the failed node.

Friday 8 November 2019 - 1.30pm
Server 18 has remained stable since we moved it to the new hardware.  We're still working through the restores, so we thank you for your extended patience if your website is still down.  We've got the vast majority of sites restored and are working through the final list now.  We anticipate that all restores will be completed by 8pm this evening, and we are completing them systematically, one account at a time.  The only remaining accounts are the larger ones, which by the nature of their size take longer to restore.

Friday 8 November 2019 - 10.30am
We have brought Server 18 back online on its new node and moved the IP across.  Our team are now working through the domain names to test websites, as we have defaulted the new node to PHP 7.3 to future-proof things.  Please note that if your domain was one of the largest on the server it is still restoring, and we sincerely apologise.  We restore accounts in size order, so over 90% of accounts are now restored.

Friday 8 November 2019 - 10am
We're currently working on removing the Server 18 IP from the old machine and adding it to the new machine.  The restore is at 85%, so not all websites will come back online straight away.  We restore in size order, so the smallest accounts are already restored.  The largest will complete one at a time as the morning progresses.  We'll have an update at 11am.

Friday 8 November 2019 - 9am

Yesterday we saw high load on Server 18, along with a lot of traffic directed at WordPress login pages from a very diverse range of IP addresses.  This caused the server to become unresponsive.  At the same time we saw slow performance on the main node's RAID array, despite the array presenting as healthy and all disks showing no problems.  Our senior admins worked on this issue for many hours, and at 10pm the decision was taken to deploy one of our hot spare servers and build a new Server 18.  That was completed and the restore started at 2am.  Unfortunately, as this is shared web hosting, it's common for clients not to maintain their sites properly, and many sites have very large disk usage (many have old backups dating back to 2014), so this is delaying the restore.  We aim to bring the new Server 18 back online at 10am with most sites recovered.

At the same time Server 30, which resides on the same physical node, is also showing signs of strain.  It's a much smaller server.  For your information, Server 30 (despite its number being higher than Server 18's) is our oldest server and is deployed for those people who need really end-of-life PHP versions.  We're hopeful that once we move Server 18 off and move its IP address, this will give Server 30 the breathing space it needs, but we are preparing plans for Server 30 in the background in case we need them.

We will continue to post updates here regularly throughout today.