Nash Soneta

Wednesday 7 December 2011


How to interpret S.M.A.R.T. data

Some attributes are flagged as performance-related, others relate to the actual fitness of the drive, and some have no special role. It is up to the manufacturer to set flags and thresholds accordingly. Attribute values range from 1 to 253; 0, 254 and 255 are invalid and should not be used. 253 is the highest value an attribute can take, and 100 is the initial value of any attribute before any data has been collected. Let's have a look at a sample report from a S.M.A.R.T.-enabled hard disk:
  • Attribute id is 4 ("Start/stop count")
  • Value is 253
  • Worst value is 253
  • Threshold is 0
  • Raw value is 1324
Since the device manufacturer set the threshold to 0, this is an informational attribute. The raw value indicates how many times the hard disk was started and stopped. The value is 253, which means that the health related to this attribute is at its best, and the worst value, also 253, shows that the drive has always been reported as healthy for this attribute.
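
The rules described so far can be captured in a few lines. The following is a minimal sketch in Python, not a real S.M.A.R.T. library: the `SmartAttribute` class and its method names are made up for illustration, and the fields simply mirror the sample report above.

```python
from dataclasses import dataclass


# Hypothetical container for illustration; the fields mirror the sample report.
@dataclass
class SmartAttribute:
    attr_id: int
    name: str
    value: int
    worst: int
    threshold: int
    raw: int

    def has_valid_value(self) -> bool:
        # Valid attribute values range from 1 to 253; 0, 254 and 255 are invalid.
        return 1 <= self.value <= 253

    def is_informational(self) -> bool:
        # A threshold of 0 marks an attribute that is informational only.
        return self.threshold == 0


start_stop = SmartAttribute(4, "Start/stop count", 253, 253, 0, 1324)
print(start_stop.has_valid_value())   # True
print(start_stop.is_informational())  # True
```

How the report is obtained from the drive is deliberately left out; the sketch only covers how to read the numbers once you have them.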



Now let's look at another attribute:
  • Attribute id is 5 ("Reallocated sector count")
  • Value is 253
  • Worst value is 253
  • Threshold is 63
  • Raw value is 0
This time, the non-zero threshold shows that this attribute is directly related to device reliability. If the value of this attribute drops to 63 or lower, the drive is expected to fail soon. This makes sense: modern hard disks include an area of spare sectors that are normally unused, and bad sectors are transparently remapped there when they are found. The number of spare sectors is fixed, and as they are used up by newly detected bad sectors, this attribute is updated. The raw value shows the number of reallocated sectors. When it is 0, no bad sector has been found and remapped. When it is higher, some bad sectors have been discovered. While the raw value is still low, there is no real threat to the reliability of the hard disk, but as that number grows we should seriously consider replacing the drive. All of this is reflected by the synthetic value associated with this attribute. In this example the value is 253, which means that everything is working perfectly as far as reallocated sectors are concerned.
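
The threshold comparison described above can be sketched as a small Python function. The name `is_failing` is made up for illustration; the rule itself (a value at or below a non-zero threshold means the drive is expected to fail soon) is the one stated in the text.

```python
def is_failing(value: int, threshold: int) -> bool:
    """Return True when a reliability attribute has reached its threshold."""
    if threshold == 0:
        # Informational attribute: it never signals an upcoming failure.
        return False
    return value <= threshold


print(is_failing(253, 63))  # False: the sample drive is healthy
print(is_failing(63, 63))   # True: the value has reached the threshold
```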

Now let's look at another sample for the same attribute, but from a different drive:
  • Attribute id is 5 ("Reallocated sector count")
  • Value is 85
  • Worst value is 85
  • Threshold is 63
  • Raw value is 37
This time the value is 85, which is lower than both 100 and 253. This means that this attribute is not in perfect shape. Since the manufacturer set the threshold to 63, we can still assume the drive is working properly and will keep doing so in the near future. Because of the nature of this attribute, the raw value is easy to decipher and we can try to infer something from it too. Keep in mind that this manufacturer decided that a raw value of 37 corresponds to a value of 85; another manufacturer might use different numbers. Since most IDE drives include 512 spare sectors, we can try to figure out how bad the situation is, but we should remember that this is not directly stated by the S.M.A.R.T. data. To be sure that the raw value actually represents the number of reallocated sectors, we should read the product manual for the drive, and we should do the same to confirm the figure of 512.

What the S.M.A.R.T. data does tell us is that the attribute with id 5 (we do not even need to know that it represents "Reallocated sector count") has a direct influence on reliability (because the threshold is higher than 0), that it is somewhat degraded (because its current value is lower than both 100 and 253) and that it is not failing (because its value is still above the threshold). This example helps us understand that the purpose of the threshold is not to flag something that has already failed, but something that is about to fail. If we assume that there are 512 spare sectors and that the raw value represents the number of spare sectors currently used to remap bad sectors, we might expect the value to equal the threshold when the raw value reaches, say, 300. This would mean that many bad sectors have been found and that the drive manufacturer considers this significant evidence that the hard disk is about to fail.
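
The back-of-the-envelope reasoning above can be written down as follows. This is only a sketch: the figure of 512 spare sectors and the reading of the raw value as a count of remapped sectors are the assumptions made in the text, not something the S.M.A.R.T. data states, and the function names are hypothetical.

```python
ASSUMED_SPARE_SECTORS = 512  # assumption from the text; check the drive's manual


def spare_usage(raw_reallocated: int, spares: int = ASSUMED_SPARE_SECTORS) -> float:
    """Fraction of the spare area in use, assuming raw counts remapped sectors."""
    return raw_reallocated / spares


def headroom(value: int, threshold: int) -> int:
    """Points the attribute value can still lose before reaching the threshold."""
    return value - threshold


print(f"{spare_usage(37):.1%}")  # 7.2%
print(headroom(85, 63))          # 22
```

Under these assumptions, only a small part of the spare area has been used and the value is still 22 points above the threshold, which matches the conclusion that the attribute is degraded but not failing.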
