We use Enterprise drives, we do. We use Enterprise drives because… you know, we are an enterprise. But surely there has to be more to it than that? Of course there is: these drives are designed to take a pounding. Put another way, we value our data, we recognise that our data is our livelihood, we value engineer time and we dislike disruption to services, so it makes sense. It may come as no surprise, but there IS a difference – or rather, we do SEE a difference in failure rates. What follows is a tale of how we got here, and the lessons learned.
A Google paper on the failure rate of rotational drives was all over the internet a decade ago. These days there is just as much hard data highlighting specific designs, manufacturers and models as being more prone to failure than others, and it is just as hard to argue with, month after month.
While the rules of the game have changed, the reality is still very much the same: some drives are just not up to the job when pushed to their limits. The issues were easy enough to visualise with rotational drives, where speed, accuracy and heat dissipation were problems that required mechanical solutions such as precision engineering and higher-quality moving parts. Now SSDs (Solid State Drives) are the new kids in town, and for some applications we have seen certain non-enterprise models fail prematurely, crippled by the rate of wear.
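To put the wear point in rough numbers, here is a minimal sketch that estimates how long a drive lasts before it uses up its rated write endurance (commonly quoted as TBW, terabytes written). The function name and both endurance figures are hypothetical, chosen only for illustration; they are not measurements from our fleet or any particular model's specification.

```python
def estimated_lifetime_years(rated_tbw: float, daily_writes_gb: float) -> float:
    """Years until the rated endurance (terabytes written) is used up.

    Assumes a constant daily write volume and decimal units (1 TB = 1000 GB).
    """
    if daily_writes_gb <= 0:
        raise ValueError("daily_writes_gb must be positive")
    daily_writes_tb = daily_writes_gb / 1000
    return rated_tbw / daily_writes_tb / 365


# Hypothetical endurance ratings, for illustration only.
consumer_years = estimated_lifetime_years(rated_tbw=150, daily_writes_gb=500)
enterprise_years = estimated_lifetime_years(rated_tbw=1500, daily_writes_gb=500)

print(f"consumer:   {consumer_years:.1f} years")    # consumer:   0.8 years
print(f"enterprise: {enterprise_years:.1f} years")  # enterprise: 8.2 years
```

Under these assumed figures, the same write-heavy workload would exhaust the lower-rated drive in under a year. A real estimate would need the actual endurance rating and a measured write volume.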
The use of RAID arrays gives you a line of defence in terms of resilience to failure, provided the array doesn’t suffer multiple failures within a short time scale. This can happen, this does happen. A degraded array is obviously slower, a rebuilding array is slower still. Pick your poison: faster rebuilds or faster operation. Furthermore, arrays are no replacement for good backup policy and procedure. When a host ceases to function you have two things to consider: how long before NOW the last backup was taken, and how long from NOW it will be until that backup is back online (the figures your RPO and RTO targets govern, respectively). Data still takes a while to read, process, decompress, move and write. Having a backup is golden, but you really don’t want to have to use it, and if you do, the more data you need to restore, the longer the wait before that data is back (MTTR, mean time to recovery).
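The two questions above come down to simple arithmetic. The sketch below shows it; the function name, timestamps, data size and restore throughput are all made-up assumptions for illustration, and the single throughput figure is a simplification of the read, decompress and write steps.

```python
from datetime import datetime, timedelta


def recovery_figures(failure_time: datetime,
                     last_backup_time: datetime,
                     data_size_gb: float,
                     restore_rate_mb_s: float) -> tuple[timedelta, timedelta]:
    """Return (data loss window, estimated restore time).

    The first value is what an RPO target limits; the second is what an
    RTO target limits. Assumes 1 GB = 1024 MB and a steady restore rate.
    """
    data_loss_window = failure_time - last_backup_time
    restore_seconds = data_size_gb * 1024 / restore_rate_mb_s
    return data_loss_window, timedelta(seconds=restore_seconds)


# Hypothetical incident: nightly backup at 02:00, host fails at 14:30.
loss, restore = recovery_figures(
    failure_time=datetime(2024, 1, 10, 14, 30),
    last_backup_time=datetime(2024, 1, 10, 2, 0),
    data_size_gb=500,
    restore_rate_mb_s=100,
)

print(loss)     # 12:30:00
print(restore)  # 1:25:20
```

In this made-up case, twelve and a half hours of changes are gone and the restore itself takes close to an hour and a half. Restoring twice the data at the same rate would take twice as long.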
It is for reasons like these that our Hosting UK dedicated servers sport enterprise-grade SSDs* and a minimum configuration of RAID1 (a mirror: two drives with the same data on each). The same is true for shared email, web hosting, database servers and so on. Not all shared or dedicated servers are created equal, as you will have noticed; for example, we only use Dell Enterprise hardware. Something else to keep in mind when speaking to your current provider is the drives… are they up to the job? That and backups. Always the backups 😉
No one likes failures. No one likes making good afterwards. Even fewer people like them when they are avoidable. Enterprise for a reason.
*If you have an application that you believe will be particularly drive intensive, and are looking to deploy SSDs, do let us know – we can advise on the best tools for the job.