VPS Disaster Recovery SLA

Bigwetfish - Disaster Recovery SLA for VPS Clients

Hardware faults are thankfully rare, but this document explains what we will do, and in what time frame, should a server go down because of a hardware fault.

All our VPS nodes are built with Intel Xeon processors and RAID10 disk arrays on Adaptec RAID cards, so faults are rare. The scenarios below set out what you can expect from us should a node develop a hardware fault.

Everything in this document is offered on a ‘best effort’ basis. If for any reason we are not going to meet these time frames, we will let clients know as soon as practically possible via the email address we hold on file for them.


Scenario One:  Fixable Hardware Fault (no data loss)

Where a piece of hardware develops a fault that is not disk related, so that the server can be restarted after repair (such as a faulty power supply or bad RAM), we keep spare parts in the data centre. Technicians are on duty 24/7 in the data centre. We host with DIMEnoc in Kent, and their standard SLA for the replacement of faulty hardware is 1-8 hours. Historically we have found that the data centre comes in closer to the lower end of that time frame, as we have used this company for many years and have built up an excellent relationship.

We therefore endeavour to have the hardware fixed within a maximum of 8 hours, and following a reboot your server should come back online. Please note that, as with all Linux servers, a reboot may cause the server to require an fsck (file system check), which can take anything from 30 minutes to a few hours.


Scenario Two:  Data Loss Scenario and client has bought CP Remote Backups

In this scenario our technicians on the ground will have been working on the server since it went offline to determine the fault. If they tell us the hardware fault will result in data loss, we enter a ‘server restore’ scenario in which the VPS clients on the failed node need to have their servers restored. Once we start the restore work for a client, the process is as follows:

  • Deploy new server for client (15 minutes)
  • Install CentOS Operating System (45 minutes)
  • Install cPanel Control Panel (60 minutes)
  • Harden cPanel (60 minutes)
  • Configure cPanel and recompile PHP (40 minutes)
  • Change name server IPs to point to new server (10 minutes)
  • Package up backups in batches of ten accounts (approx 30 minutes per batch)
  • Copy backups from backup server to new server (approx 30 minutes per batch)
  • Restore backups on new server (approx 30 minutes per batch)
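
As a rough illustration of how the steps above add up, the sketch below totals the fixed steps and then adds the three per-batch steps for each batch of ten accounts. It assumes batches are handled strictly one after another and that the work starts immediately, neither of which is guaranteed in practice; the function and variable names are hypothetical and the result is an estimate only, not a commitment.

```python
# Durations in minutes, copied from the restore steps listed above.
FIXED_STEPS = {
    "Deploy new server": 15,
    "Install CentOS": 45,
    "Install cPanel": 60,
    "Harden cPanel": 60,
    "Configure cPanel and recompile PHP": 40,
    "Change name server IPs": 10,
}

# Approximate durations in minutes for each batch of accounts.
PER_BATCH_STEPS = {
    "Package up backups": 30,
    "Copy backups to new server": 30,
    "Restore backups": 30,
}

BATCH_SIZE = 10  # accounts per batch, as stated above


def estimate_restore_minutes(account_count: int) -> int:
    """Rough estimate, assuming batches run one after another (an assumption)."""
    batches = -(-account_count // BATCH_SIZE)  # ceiling division
    fixed = sum(FIXED_STEPS.values())
    per_batch = sum(PER_BATCH_STEPS.values())
    return fixed + batches * per_batch


if __name__ == "__main__":
    for accounts in (10, 25, 50):
        minutes = estimate_restore_minutes(accounts)
        print(f"{accounts} accounts: about {minutes // 60}h {minutes % 60:02d}m")
```

Under these assumptions the fixed steps come to 230 minutes, so a server with ten accounts would take roughly 5 hours 20 minutes from the moment work on it begins.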

We promise best effort in this event, as on average 9 clients will need to be restored. Technicians will work across several screens to manage these restores as efficiently as possible.


Scenario Three:  Data Loss Scenario and client has a Backup Server

In this scenario our technicians on the ground will have been working on the server since it went offline to determine the fault, and they tell us the hardware fault will result in data loss, so we need to enter a ‘server restore’ scenario. Some clients rent a separate VPS with lower specifications (less RAM etc.) as a backup server. In this case the server is pre-built and a copy of their accounts already resides on it. A daily data sync script will already have been running to keep the accounts on the backup server up to date. If the client wants to switch to their backup server, the process is as follows, with these time frames applying once we start the work:

  • We increase the RAM on the backup server and reboot it (15 minutes)
  • We license cPanel on the backup server (5 minutes)
  • We change the nameserver IPs to point to the new server (10 minutes)

As the server will already have been pre-built and will have up-to-date data on it, the sites will start working again within an hour for some people, and certainly within a few hours for most people, once the new name server IPs propagate. This assumes the VPS client uses bwfdns.com name servers. If the client uses custom name servers, the client will need to make the name server IP change.

We will then check the backup server against the main server backup to see whether any accounts added to the live server since the last check by the client are missing from the backup server. Any such accounts can be restored on a domain-by-domain basis.
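
The comparison described above amounts to a set difference between two account lists. The following is a minimal sketch of that idea only; the function name and the example domains are hypothetical, and it says nothing about how the account lists are actually gathered from either server.

```python
def accounts_missing_from_backup(live_accounts, backup_accounts):
    """Return accounts found in the main server backup but not on the backup server."""
    return sorted(set(live_accounts) - set(backup_accounts))


if __name__ == "__main__":
    # Hypothetical account lists, for illustration only.
    live = ["alpha.example", "beta.example", "gamma.example"]
    backup = ["alpha.example", "beta.example"]
    for domain in accounts_missing_from_backup(live, backup):
        print(f"Needs restoring individually: {domain}")
```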


Scenario Four:  Data Loss Scenario and client has not purchased backup option

In this scenario our technicians on the ground will have been working on the server since it went offline to determine the fault, and they tell us the hardware fault will result in data loss, so we need to enter a ‘server restore’ scenario. Some clients, despite our strong recommendation, do not spend the money on a backup solution. In this situation we will build a new server for the affected client and provide it to them as it was the day they bought it, so they can restore from their own local backups. Our staff can assist with the restoration from local backups, but it is only fair to the clients who have purchased backups that their restore tickets are handled first. Once all our clients with backups are restored, staff will work diligently to assist those clients without backups in restoring their local backups.
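
The ordering of restore tickets described above can be pictured as a simple two-level sort. This is a sketch under stated assumptions: the `RestoreTicket` fields are hypothetical, and ordering tickets within each group by when they were opened is an assumption made for the example, not something this document specifies.

```python
from dataclasses import dataclass


@dataclass
class RestoreTicket:
    client: str
    has_backup_option: bool  # True if the client purchased a backup option
    opened_order: int        # hypothetical: position in which the ticket was opened


def order_tickets(tickets):
    """Clients with a purchased backup option first, then by opening order (assumed)."""
    return sorted(tickets, key=lambda t: (not t.has_backup_option, t.opened_order))


if __name__ == "__main__":
    queue = [
        RestoreTicket("client-a", has_backup_option=False, opened_order=1),
        RestoreTicket("client-b", has_backup_option=True, opened_order=2),
        RestoreTicket("client-c", has_backup_option=True, opened_order=3),
    ]
    for ticket in order_tickets(queue):
        print(ticket.client)
```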
