Wednesday, May 26, 2010

BadBlocks Hard Drive Validation or Destructive Wipe

Badblocks is an excellent tool for testing the endurance of a hard drive. Nothing is worse than getting a new or old drive and setting up your entire working environment, only to have the drive die soon after you are finished. I have worked with thousands of drives and, whether they are new or old, drives will die. Google ran a study of the drives they own and found that a drive is most likely to die either in the first 3 months of use or after 3 years of use, regardless of temperature or duty cycle. This is not a hard and fast rule, but it is something to remember. Whether the drive is new or old, it will fail, and it will normally do so at the most inopportune time. Think of the company mail server going down on a Friday at 5pm after you just spent all day Thursday setting it up.
What is needed is a simple test to work the drive for hours and see if it is stable or not. This badblocks tutorial will show you the basics and you can also find a very simple shell script at the end to help you with multiple drives. I suggest using a dedicated box in a back room with the sole purpose of testing your stock of drives.
You can choose to either do a non-destructive test to make sure the disk is working or you can use badblocks to wipe the disk of all data.
The destructive test is especially useful when you are getting rid of the disk or returning it to the manufacturer. It is also a good choice when you are deciding whether a drive is worth keeping as a spare: if the drive is going into shelf storage, run the destructive test first. If the drive passes, you at least know you have a good spare. If it fails, check whether the unit is still under the manufacturer's warranty; if it is, this is the perfect time to send it back.


Is badblocks available for OpenBSD?
Yes, it is. You will need to install the package called "e2fsprogs". The command "pkg_add -i e2fsprogs" should suffice. You will then find the binary installed at /usr/local/sbin/badblocks.
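
If you are not sure whether the package is already installed, a small check like the one below can tell you. This is only a sketch: find_badblocks is a made-up helper name, and it simply looks in your PATH first and then in the OpenBSD package location mentioned above.

```shell
#!/bin/sh
# find_badblocks is a hypothetical helper: it prints the path of the
# badblocks binary, checking PATH first and then the location used by
# the OpenBSD e2fsprogs package.
find_badblocks() {
    if command -v badblocks >/dev/null 2>&1; then
        command -v badblocks
    elif [ -x /usr/local/sbin/badblocks ]; then
        echo /usr/local/sbin/badblocks
    else
        echo "badblocks not found; install e2fsprogs" >&2
        return 1
    fi
}
```

Running "find_badblocks" prints the full path on success and returns a non-zero status if nothing is found.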


Option 1: NON-destructive BadBlock test

First we will consider a NON-destructive test. This will run through every sector of the drive and read the data. It will _not_ overwrite the data in any way.
The following line is for a NON-destructive BadBlock test in a machine with at least 1 gigabyte of RAM and a single drive on the first IDE chain.
NON-destructive BadBlock test (1gig ram in machine)
  badblocks -b 4096 -c 98304 -p 0 -s /dev/hda

With this test we use a block size "-b" of 4096 bytes and test 98304 blocks at once "-c". Having 1 gig of RAM in our machine allows us to use a high number of blocks "-c" and thus speed up our drive test significantly. The argument "-p" controls repeat scans: badblocks keeps scanning until that many consecutive scans find no new bad blocks. We leave it at the default ( -p 0 is run once and stop). The flag "-s" shows the progress of the scan. The last argument, /dev/hda, is the device we are targeting.
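
To see why 1 gig of RAM is enough for these "-b" and "-c" values, you can work out the buffer size yourself. The sketch below is only a rough estimate: buffer_mib is a made-up helper, and the multiplier reflects my reading of the badblocks man page, which says memory use is proportional to the number of blocks tested at once in read-only mode, twice that in write mode (-w) and three times that in non-destructive read-write mode (-n).

```shell
#!/bin/sh
# buffer_mib is a hypothetical helper, not part of e2fsprogs.
# Arguments: block size in bytes, blocks tested at once, buffer multiplier.
# Multiplier (assumption taken from the badblocks man page):
#   1 = read-only, 2 = write mode (-w), 3 = non-destructive read-write (-n)
buffer_mib() {
    block_size=$1
    block_count=$2
    multiplier=${3:-1}
    # 1048576 bytes = 1 MiB
    echo $(( block_size * block_count * multiplier / 1048576 ))
}

echo "read-only test:  $(buffer_mib 4096 98304 1) MiB"
echo "write test (-w): $(buffer_mib 4096 98304 2) MiB"
```

With the values used here the estimate comes to 384 MiB for the read-only test and 768 MiB for the write test, both of which fit in 1 gig of RAM. If your machine has less memory, lower "-c" until the estimate fits.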


Option 2: Destructive BadBlock test

Let's say we need to completely wipe the disk. Badblocks can be instructed to run a destructive drive test. This will also run through every sector of the drive, but instead of just reading the data, this test will overwrite all data. Badblocks writes a set of known patterns to the disk and then reads those patterns back, comparing the two. Errors will be printed to the screen if the test encounters any.
The following line is for a _DESTRUCTIVE_ BadBlock test in a machine with at least 1 gigabyte of RAM and a single drive on the first IDE chain.
Destructive BadBlock test (1gig ram in machine)
  badblocks -b 4096 -c 98304 -p 0 -w -s /dev/hda

With this test we use a block size "-b" of 4096 bytes and test 98304 blocks at once "-c". Having 1 gig of RAM in our machine allows us to use a high number of blocks "-c" and thus speed up our drive test significantly. The argument "-p" controls repeat scans: badblocks keeps scanning until that many consecutive scans find no new bad blocks. We leave it at the default ( -p 0 is run once and stop). The flag "-w" is the write flag, meaning we will be overwriting all data on the disk. The flag "-s" shows the progress of the scan. The last argument, /dev/hda, is the device we are targeting.
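
Because "-w" destroys everything on the target, it is worth putting a small speed bump in front of the command. The following is a minimal sketch of one way to do that: confirm_wipe is a made-up helper that only succeeds when you retype the exact device name.

```shell
#!/bin/sh
# confirm_wipe is a hypothetical helper: it returns 0 only when the
# operator retypes the exact device name on standard input.
confirm_wipe() {
    target=$1
    printf 'ALL data on %s will be destroyed. Retype the device name to continue: ' "$target" >&2
    read -r answer || return 1
    [ "$answer" = "$target" ]
}

# Usage (left commented out so that pasting this block wipes nothing):
# confirm_wipe /dev/hda && badblocks -b 4096 -c 98304 -p 0 -w -s /dev/hda
```

A typo in the device name is the easiest way to wipe the wrong disk, so having to type it twice catches most mistakes.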


How about a script?

To make our lives easier we can use scripts. This is a dead simple shell script to test two(2) drives on the first IDE chain. It runs the destructive test, so all data on both drives will be overwritten. Note: if you only have one drive then just attach the drive as hda; the test against hdb will simply fail when nothing is found there.
#!/bin/sh
#### badblocks_test.sh
# WARNING: the -w flag below overwrites all data on both drives.
clear 
echo " "
echo "Starting BadBlocks tests on /dev/hda and /dev/hdb ..."
echo " "
echo "drive test started on... /dev/hda"
time badblocks -b 4096 -c 98304 -w -p 0 -s /dev/hda
echo " "
echo "drive test started on... /dev/hdb"
time badblocks -b 4096 -c 98304 -w -p 0 -s /dev/hdb
echo " "
echo "badblocks tests finished."
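
If you test more than two drives at a time, the same idea can be written as a loop. This is only a sketch under a few assumptions: test_drives is a made-up function name, and BADBLOCKS is a made-up override variable so you can dry-run the loop (for example with BADBLOCKS=echo) without touching a disk.

```shell
#!/bin/sh
# Sketch: run the same destructive test on every device passed in.
# BADBLOCKS is a hypothetical override for dry runs, e.g. BADBLOCKS=echo.
BADBLOCKS=${BADBLOCKS:-badblocks}

test_drives() {
    failed=0
    for dev in "$@"; do
        # Skip anything that is not a block device instead of aborting.
        if [ ! -b "$dev" ]; then
            echo "skipping $dev: not a block device"
            failed=1
            continue
        fi
        echo "drive test started on... $dev"
        # -w overwrites ALL data on $dev.
        # A non-zero exit from the command marks the run as failed.
        "$BADBLOCKS" -b 4096 -c 98304 -w -p 0 -s "$dev" || failed=1
    done
    return "$failed"
}

# Example call (commented out because it is destructive):
# test_drives /dev/hda /dev/hdb
```

The function returns non-zero if any device was skipped or any badblocks run exited with an error, which makes it easy to check from another script.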


Questions you may have:
How long will the test run?
On a Pentium 4 3GHz machine with one(1) gigabyte of RAM and a single Seagate 80 gigabyte drive connected to a PATA 100 interface, badblocks in the script above will take around two(2) hours to complete. A 250 gigabyte Maxtor drive connected to a PATA 100 interface will take 9 hours, and a 750 gigabyte Seagate drive connected to a SATA 133 interface will take 13 hours.
What if I have a SATA connected drive?
SATA connected drives are normally recognized as SCSI devices. SCSI devices are known as /dev/sd? whereas PATA drives are /dev/hd?. So if your SATA drive is the first on the SATA bus then it might be /dev/sda.
What if I want to do a DoD (Department of Defense) style wipe?
A DoD wipe is a lot more rigorous than what badblocks will normally do. If you want to do a DoD-style wipe then increase the "-p" argument to no lower than eight(8). The argument would be "-p 8". On a drive with no bad blocks this should make badblocks repeat the whole scan eight times.
What if I want to wipe only one(1) partition?
If you have multiple partitions on a drive and you only want to wipe or test one partition then you need to tell badblocks which partition to target. Let's say the partition is the second one on the first PATA drive on the chain. You would replace the device "/dev/hda" with "/dev/hda2".
