Many computer users take hard drive reliability for granted and never
consider the possibility of a drive crash. They assume that hard disk
manufacturers have greatly improved the reliability of their products.
And they have, but the reality is that all disks die eventually. Even
if you have a recent backup, sudden disk failure is a minor
catastrophe. How can we protect ourselves from a sudden hard drive
crash? One way is SMART (Self-Monitoring, Analysis and Reporting
Technology), which tries to predict future failures.
The essential point is that the user should understand how drives fail
and why. A hard disk can suffer two classes of failure: unpredictable
and predictable.
Unpredictable failures happen suddenly, without warning. They can be
caused by catastrophic events, handling damage, static electricity or
an electronic component burning out, and nothing can be done to foresee
or avoid them.
About 60% of predictable failures are mechanical, and they develop
gradually over time. Causes of this gradual degradation include head
crashes, head contamination, bad solder joints, bad circuit board
connections, motor breakdown, worn bearings, inability to spin up,
excessive run-out and bad servo positioning.
Most hard drives lose performance slowly, and the disk can monitor and
diagnose the condition of many of its components through SMART,
providing an early warning for many types of problems. When a potential
problem is detected, the drive can be repaired or replaced before any
data is lost.
This technology has become an industry standard for drive
manufacturers. It allows checking and reporting hard drive status and
gives some estimate of when a failure may occur. SMART is able to
detect gradual degradation of the disk. The original SMART spec
(SFF-8035i) was written by a group of disk drive manufacturers. In
1995, parts of SFF-8035i were merged into the ATA-3 standard. Starting
with the ATA-4 standard, the requirement for disks to maintain an
internal Attribute table was dropped. Instead, disks simply return an
OK or NOT OK response to an inquiry about their health. A negative
response indicates that the disk firmware has determined that the disk
is likely to fail. The ATA-5 standard added an ATA error log and
commands to run disk self-tests to the SMART command set.
Self-Monitoring, Analysis and Reporting Technology (SMART) systems are
built into most modern ATA and SCSI hard disks. SMART disk drives
internally monitor their own health and performance. SMART includes a
set of attributes, which are reliability-prediction parameters of the
drive whose limits should not be crossed under normal operation. Each
attribute has an identification number (ID). Some types of reliability
parameters are:
- Distance between the heads and the disk platters;
- Faulty sectors;
- Recalibration;
- Drive spin-up time;
- Drive temperature;
- Characteristics of the media;
- Motor and servomechanisms.
An attribute value is a positive integer, usually in the range from 1
to 253. Initially, all attributes have their maximum values. A value of
100 or 200 is often chosen as the "normal" value. Some attributes are
considered life-critical and others are just "informative". As the
drive wears or when some of its components are about to fail, the
attribute values decrease. Consequently, high values indicate high
reliability, and low values indicate low reliability or a high
probability of drive failure. A specific threshold is assigned to each
attribute. Once the value drops below this threshold, SMART considers
the disk to be faulty, which means it becomes very dangerous to store
data on this drive.
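
The value-versus-threshold rule described above can be sketched in a few
lines of Python. This is only an illustration, not how any particular
drive firmware works: the class name, field names and sample numbers
are all made up for the example, and the comparison treats a value
equal to its threshold as failed too, which is the convention used in
the smartmontools documentation rather than something stated above.

```python
from dataclasses import dataclass


@dataclass
class SmartAttribute:
    """Hypothetical container for one SMART attribute (names are illustrative)."""
    attr_id: int      # attribute identification number (ID)
    name: str
    value: int        # normalized value, usually 1..253; higher is healthier
    threshold: int    # limit assigned to this attribute
    critical: bool    # True for life-critical, False for "informative"


def has_failed(attr: SmartAttribute) -> bool:
    # Assumption: a value at or below the threshold counts as failed.
    return attr.value <= attr.threshold


def drive_verdict(attrs: list[SmartAttribute]) -> str:
    # The drive is reported NOT OK as soon as any life-critical attribute fails.
    if any(has_failed(a) and a.critical for a in attrs):
        return "NOT OK"
    return "OK"


# Made-up sample values, for illustration only.
healthy = [
    SmartAttribute(3, "Spin-up time", 100, 21, True),
    SmartAttribute(194, "Temperature", 110, 0, False),
]
worn = healthy + [SmartAttribute(5, "Reallocated sectors", 30, 36, True)]

print(drive_verdict(healthy))
print(drive_verdict(worn))
```

Keeping the life-critical flag separate from the comparison means an
informative attribute that crosses its threshold can still be reported
without declaring the whole drive faulty.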
Currently, the SMART system can detect about 70% of all hard drive
errors. Its main shortcoming is that it provides no direct mechanism
for informing the OS or the user when problems are found. In fact,
because disk SMART status is frequently not monitored, many disk
problems go undetected until they lead to a catastrophic failure.
By monitoring a drive's behavior, SMART aims to warn the user about the
threat of drive failure while there is still time to take preventive
action, such as backing up the data to a replacement device. So why not
use the SMART monitoring programs freely available on the Internet to
head these problems off at the pass?
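
As a rough idea of what such a monitoring program does, the sketch
below parses an attribute table in the column layout printed by
`smartctl -A` from the smartmontools package (one freely available
tool, chosen here only as an example) and lists the life-critical
attributes that have reached their threshold. The sample text is
invented for illustration and is not output from a real drive, and the
function names are hypothetical.

```python
# Invented sample in the column layout of `smartctl -A`; not real drive data.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  3 Spin_Up_Time            0x0027   100   100   021    Pre-fail  Always       -       1025
  5 Reallocated_Sector_Ct   0x0033   030   030   036    Pre-fail  Always   FAILING_NOW 1400
194 Temperature_Celsius     0x0022   110   095   000    Old_age   Always       -       37
"""


def parse_attributes(text):
    """Turn attribute-table lines into dictionaries, skipping everything else."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows start with the numeric attribute ID and have 10+ columns.
        if len(parts) < 10 or not parts[0].isdigit():
            continue
        rows.append({
            "id": int(parts[0]),
            "name": parts[1],
            "value": int(parts[3]),
            "worst": int(parts[4]),
            "threshold": int(parts[5]),
            "prefail": parts[6] == "Pre-fail",  # life-critical attribute
        })
    return rows


def failing_attributes(rows):
    """Names of life-critical attributes whose value is at or below threshold."""
    return [r["name"] for r in rows
            if r["prefail"] and r["value"] <= r["threshold"]]


rows = parse_attributes(SAMPLE)
for name in failing_attributes(rows):
    print("WARNING: back up now,", name, "has reached its threshold")
```

To check a real disk, the same parser could be fed the text printed by
running `smartctl -A` on a device; the device name and the privileges
needed depend on the system, so that step is left out here.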
Vital Data Recovery - Montreal, Quebec, Canada (http://www.vitaldata.ca).
Vital Data is a data recovery company that specializes in recovering
data from hard disk drives and other media. Vital Data Recovery offers
the most technologically advanced hard drive data recovery available.