Does "dd" actually re-map obviously bad sectors (same as Terminal > Erase > Zero Out Data) in addition to flagging their existence? (As far as I know, TTP, et al don't.)
Please clarify "By the time the drive reports the error to you, the sector is dead, dead, dead."
How do TechTool, et al differ from "dd"? (It sounds like you're saying that SpinRite is the only utility that actually deals with bad blocks as they should be dealt with [however that is].)
dd doesn't actually re-map anything. It just reads the data. It's the drive itself that does re-mapping, if any. If it remaps a sector, it does it transparently, without telling your computer that it did so. Your computer never knows which sectors have been remapped.
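To make that concrete, here is a sketch (not dd itself) of roughly all that any host-side read pass can ever observe. It assumes a Unix-like OS; the device path is a placeholder, and reading a raw device requires root. The only feedback the host gets is "read succeeded" or "I/O error"; remapping, retries, and error correction never show up.

```python
# A sketch (not dd itself) of what a host-side surface scan can see.
# /dev/rdisk2 is a placeholder device path; reading a raw device
# requires root. All the host ever observes is success or an I/O
# error -- remapping happens invisibly inside the drive.
import os

DEVICE = "/dev/rdisk2"   # placeholder; substitute your own raw device
SECTOR = 512             # logical sector size; many newer drives use 4096

fd = os.open(DEVICE, os.O_RDONLY)
offset, bad = 0, []
while True:
    try:
        data = os.pread(fd, SECTOR, offset)   # read one sector
        if not data:                          # end of device
            break
    except OSError:                           # the drive gave up entirely
        bad.append(offset // SECTOR)
    offset += SECTOR
os.close(fd)
print("unreadable sectors:", bad)
```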
The data as written on disk does not look anything like what you think it looks like. If you write all zeroes to a disk sector, what actually gets written might look like 376937693769.... That is, your data is translated into a "group code", so that there are never too many zeroes or ones in sequence. (It's hard for the drive to tell the difference between 20 zeroes in a row and 21 in a row. In fact, it probably has a hard time distinguishing between 3 zeroes in a row and 4 in a row. So it never writes more than 3 in a row.)
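To make the group-code idea concrete, here's a toy sketch. It is not any real drive's encoding (manufacturers use their own RLL/GCR codes); it just builds a 4-bit-to-5-bit mapping with the property described above: no more than 3 zeroes in a row ever hit the platter, even across codeword boundaries, even when the data is all zeroes.

```python
# Toy run-length-limited "group code": map each 4-bit nibble to a
# 5-bit codeword chosen so no run of more than three zeroes appears,
# even across codeword boundaries. Real drive codes are more
# sophisticated; this only illustrates the idea.

def ok(word: str) -> bool:
    return (not word.startswith("00")      # at most 1 leading zero
            and not word.endswith("000"))  # at most 2 trailing zeros
            # 1 + 2 = 3, so concatenation never yields 4 zeros in a row

codewords = [f"{i:05b}" for i in range(32) if ok(f"{i:05b}")]
assert len(codewords) >= 16   # enough codewords for all 16 nibble values
ENC = {i: codewords[i] for i in range(16)}

def encode(data: bytes) -> str:
    bits = []
    for byte in data:
        bits.append(ENC[byte >> 4])    # high nibble
        bits.append(ENC[byte & 0xF])   # low nibble
    return "".join(bits)

stream = encode(b"\x00\x00")           # all-zero data...
assert "0000" not in stream            # ...but never 4 zeroes on "disk"
print(stream)
```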
The data is also accompanied by a rather large number of parity bits, enough to recover data that was read incorrectly. I won't go into the mathematics of error correction. It's fascinating math, but the margin of this comment is too narrow to contain it. A perhaps more understandable analogy is spelling correction. Forum posts not infrequently contain speling and grammatical errors; when you read them, you usually take the errors in stride. Even if you notice them, you know what was meant, and bothering to post a correction just adds noise to the thread.
The disk drive does the same. It uses sophisticated math, just like you use sophisticated linguistics, to figure out what was meant when it notices an error. Its main concern is giving you the corrected data as quickly as possible. Taking the time to write the correction back to disk would slow the drive down considerably. There is a lot of error correction going on. Disk drive designers have always been pushing the limits of the technology to squeeze as much data onto the drive as possible. That drives up the raw error rate. To compensate, error correction technology keeps pace, so that most of the many many errors are corrected on the fly, and you never know about them.
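To make the parity-bit idea concrete, here is a minimal sketch of one of the simplest error-correcting codes, Hamming(7,4). Real drives use far stronger codes (Reed-Solomon and, more recently, LDPC) that correct long bursts, but the principle is the same: redundant bits let the reader locate and silently fix an error.

```python
# Hamming(7,4): 4 data bits plus 3 parity bits can locate and fix any
# single flipped bit. A minimal illustration of error correction, far
# weaker than what real drives use.

def hamming_encode(d):                 # d = [d1, d2, d3, d4]
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4                  # covers positions 1,3,5,7
    p2 = d1 ^ d3 ^ d4                  # covers positions 2,3,6,7
    p3 = d2 ^ d3 ^ d4                  # covers positions 4,5,6,7
    return [p1, p2, d1, p3, d2, d3, d4]   # codeword positions 1..7

def hamming_decode(c):
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]     # recompute each parity check
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3    # points at the bad position
    if syndrome:
        c[syndrome - 1] ^= 1           # flip the bad bit back
    return [c[2], c[4], c[5], c[6]]    # extract the data bits

word = hamming_encode([1, 0, 1, 1])
word[5] ^= 1                           # simulate a read error
assert hamming_decode(word) == [1, 0, 1, 1]   # corrected transparently
```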
If a sector goes bad, the drive can relocate it to a spare sector, but spare sectors are few and precious. The drive won't relocate a sector lightly. It's got to be really bad to warrant using up a spare.
When the drive sees an error, it will first try to correct it. Parity bits may allow it to detect and correct any error that is limited to within, say, 20 consecutive bits. (Surface defects tend to produce errors in bursts, and the correction algorithm is optimized for that.) If the error can be corrected, well and good. The drive returns the corrected data, and ignores the error.
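I won't claim to know which code any particular drive uses, but one classic ingredient for tolerating bursts is interleaving: stripe many codewords across the sector so that a physical burst lands as at most one bad bit in each codeword, which a single-error-correcting code can then fix. A toy sketch (the depth and codeword length are illustrative, not real drive parameters):

```python
# Interleaving: write codewords "striped" across the sector so a
# physical burst of bad bits damages many codewords slightly instead
# of one codeword fatally. Parameters here are purely illustrative.

DEPTH = 20          # interleave 20 codewords; a 20-bit burst then
CODE_LEN = 7        # hits each 7-bit codeword in at most one place

def interleave(codewords):
    # column-major: bit 0 of every codeword first, then bit 1, ...
    return [codewords[i % DEPTH][i // DEPTH] for i in range(DEPTH * CODE_LEN)]

def deinterleave(bits):
    return [[bits[j * DEPTH + i] for j in range(CODE_LEN)] for i in range(DEPTH)]

words = [[(i + j) % 2 for j in range(CODE_LEN)] for i in range(DEPTH)]
stream = interleave(words)
for k in range(40, 60):        # a 20-bit burst on the "platter"
    stream[k] ^= 1
damaged = deinterleave(stream)
# every codeword took at most one hit -- within reach of Hamming(7,4)
assert all(sum(a != b for a, b in zip(w, d)) <= 1
           for w, d in zip(words, damaged))
```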
If the drive can't correct the error, it'll try re-reading the sector. It may get different data on the next attempt, and this time the error might be correctable. Again, the drive returns the corrected data and ignores the error. In the meantime, though, the platter has had to make another revolution, and your read speed goes down considerably. You wonder why the drive seems to be so slow now, but there is no other indication that anything is amiss.
The drive will keep re-reading the sector until it gets it right. (Drives don't actually read sector by sector anymore. They read an entire track at a time, apply error correction to the track as a whole, and then give you sectors from the corrected data. I'll continue to talk about reading sectors, though.) Some threshold number of attempts will trigger the drive to take more drastic action. Suppose it can eventually read the sector, but it took more than, say, 5 tries. At this point, the drive may attempt re-writing the sector with the finally corrected data. This is going to take two more revolutions of the disk: one to re-write the corrected data, and one to re-read what it just wrote.
If it can successfully re-read what it re-wrote, it considers the problem solved, and moves on.
Only if it tries re-writing the data and discovers that it still cannot re-read it will it consider the sector bad. It'll relocate that sector, writing the correct data to the spare sector (and taking another revolution to do so). I repeat: errors are common, spare sectors are scarce. The drive won't use up a spare sector just because of an easily corrected error.
Whatever software was asking for the data remains oblivious to all of this, except possibly to notice that the read took an unusually long time. It gets the correct data, and moves on.
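Putting the last few paragraphs together, here's a toy simulation of the drive's private decision ladder, including the give-up case described next. The thresholds are the illustrative numbers from this post (5 tries before a re-write, 99 before giving up), not real firmware values, and each step is costed in platter revolutions, the only symptom the host can ever observe.

```python
# Toy model of the drive's private error-handling ladder. Thresholds
# are the illustrative numbers from the text, not real firmware values.
import random

REWRITE_AFTER, GIVE_UP_AFTER = 5, 99

def read_sector(error_rate):
    """Return (status, revolutions spent) for one host-visible read."""
    revolutions = 0
    for attempt in range(1, GIVE_UP_AFTER + 1):
        revolutions += 1                   # each read costs a revolution
        if random.random() > error_rate:   # read + ECC succeeded
            if attempt <= REWRITE_AFTER:
                return "ok", revolutions   # quiet fix; host sees nothing
            revolutions += 2               # re-write corrected data + verify
            if random.random() > error_rate:
                return "ok (re-written in place)", revolutions
            revolutions += 1               # still bad: relocate to a spare
            return "ok (relocated)", revolutions
    return "I/O error -- sector is dead", revolutions

for rate in (0.01, 0.5, 0.99):
    print(rate, *read_sector(rate))
```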
But it may be that after some larger number of re-reads, the drive has still been unable to read enough of the sector to successfully correct it. After 99 attempts, say, the drive may give up. Only then does it report a read error back to the computer. By this time, whatever data was originally written to the sector should be considered lost; even with error correction, the drive is unable to reproduce the data. That's what I meant by:
"By the time the drive reports the error to you, the sector is dead, dead, dead."I cannot speak for what TechTool etc. do. Disk drives contain internal preferences that are accessed using different commands than the commands that read and write data. SpinRite has several modes of operation. In some modes, it uses these commands to tell the drive "Report any errors, even if you can correct them. (But give me the corrected data if you can.)" In other modes, it tells the drive "Make one attempt to read the track. Give me the raw data, including the parity bits. I'll do my own error correction." In the latter mode, it may re-read the track multiple times, watching how even correctable errors move around to discern exactly where the defects are. Comparing multiple reads of the raw data, it can correct what the drive cannot: errors that are not limited to a single burst. In other modes, once it's able to read the data, it will write it back but complemented (changing 0s to 1s and vice versa), verify that it can read it, then write it back again uncomplemented, verifying that it can still read it back. (Some defects manifest themselves as "stuck" bits. They might always read back as 0 even if 1 was written. Writing zeroes won't detect such defects. Of course the proper way to do this is to choose data that complements what's actually written. If an all-zero sector actually gets written as 3769..., you need to choose data that will actually get written as c8a6... . That can require intimate knowledge of the drive, potentially more than is knowable.)
I know this has been a long post. If you're still reading, a relevant example from another topic of discourse might help.
One episode of Star Trek featured aliens with green skin. What looks green to the eye may not look green to the camera, so they'd paint the actors green, film them, and then look at the developed images. The skin tones came out normal; no hint of green. So they'd try another green paint. And another. Every time, the green failed to show up on film. Finally they talked to the technicians doing the post-production, who complained: "I don't know what you guys are doing with the lighting, but for some reason the aliens are all coming out green, and we're having a heck of a time getting them back to flesh tones."
The point is, auto-correction can get in your way. What you see is not always what was there to see. A sector may read OK, but that doesn't mean it is OK.