The "Drive Health Indicators" are nothing more and nothing less than the specific SMART parameters the manufacturer of the drive chose to have the drive report. The ID number and the parameter name are specified in the SMART standards, and as you can see there are at least 199 possible parameters in the standard. Every manufacturer has the latitude to choose which parameters to collect and report and, more importantly, what the failure level is for each parameter. The problem with that is that manufacturers tend to set the limits such that by the time SMART reports a problem, the drive has already failed.

With the advent of SSDs that connect to the system and report SMART values through the SATA bus, additional parameters were added that gave a somewhat better picture of SSD health. Newer NVMe (Non-Volatile Memory Express) drives appear to have settled on a standardized set of parameters that seem to me to tell the tale pretty succinctly.

NOTE: Numbered categories are mine and not part of the standard.
  1. Drive Identification
    • PCI Vendor ID
    • Subsystem Vendor ID
    • Controller ID
    • Namespace ID
    • NVMe Interface Revision
  2. Statistics
    • Unsafe Shutdowns
    • Controller Busy Times
    • Max Data Transfer Size
  3. Early warning data
    • % of available spares
    • The available spare threshold — set by vendor
    • % used
    • % available
  4. Ultimate lifetime indicators
    • Data Units Read
    • Data Units Written
    • Host Read Commands
    • Host Write Commands
  5. Has the drive failed?
    • Available spare space below threshold: Y/N
    • Over Temperature threshold: Y/N
    • NVM Subsystem Reliability degraded: Y/N
    • Media in Read-Only Mode: Y/N
    • Volatile Memory Backup Device Failure: Y/N
As I see it, if any of the last five items is "Y", the drive should be replaced ASAP, if not sooner.
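
For anyone who wants to script that check: as far as I understand the NVMe spec, the five Y/N items in category 5 are reported together as individual bits of a single "Critical Warning" byte in the SMART / Health log, and tools such as smartctl and nvme-cli print that byte. Here is a minimal Python sketch of decoding it. The function names are my own, the bit order (bit 0 through bit 4, in the same order as the list above) is my reading of the spec, and getting the byte out of your tool's output is left to you.

```python
# Sketch only: decode the NVMe "Critical Warning" byte into the five
# yes/no items from category 5 above. The bit positions are an assumption
# based on my reading of the NVMe SMART / Health log layout.
CRITICAL_WARNING_BITS = {
    0: "Available spare space below threshold",
    1: "Over Temperature threshold",
    2: "NVM Subsystem Reliability degraded",
    3: "Media in Read-Only Mode",
    4: "Volatile Memory Backup Device Failure",
}


def decode_critical_warning(value):
    """Return the names of the warning items whose bit is set in value."""
    return [
        name
        for bit, name in CRITICAL_WARNING_BITS.items()
        if value & (1 << bit)
    ]


def should_replace(value):
    """True if any of the five items above is 'Y' (bits 0 through 4)."""
    return bool(value & 0x1F)


if __name__ == "__main__":
    # Hypothetical example value: bits 0 and 3 set.
    sample = 0x09
    for item in decode_critical_warning(sample):
        print("Y:", item)
    print("Replace drive:", should_replace(sample))
```

A value of 0 means all five items are "N". Anything else with one of the low five bits set is the "replace it" case described above.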

