
Hard Drive Burn-In Testing - Discussion Thread

Deleted47050

Guest
TL;DR
For drives 6 TB or bigger, use the following command:
badblocks -b 4096 -ws /dev/sdaX

That command would work for any hard drive with block size 4096, irrespective of its size. I have successfully used it on the latest 3 TB and 8 TB drives I have tested with this method.
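A rough sanity check on why the block size matters, as a sketch only. It assumes (this is commonly reported, not stated in this thread) that badblocks keeps its block numbers in a 32-bit value and defaults to 1024-byte blocks, so the drive must contain fewer than 2^32 blocks at the chosen -b size. The function name and the sample drive sizes are made up for illustration.

```shell
# Smallest power-of-two block size, starting at the assumed default of 1024,
# that keeps the number of blocks on the drive below 2^32.
# Assumption: badblocks stores block numbers in a 32-bit value.
min_badblocks_bs() {
    bytes=$1
    bs=1024
    limit=4294967296            # 2^32
    # Round the block count up so a partial trailing block is counted too.
    while [ $(( (bytes + bs - 1) / bs )) -ge "$limit" ]; do
        bs=$(( bs * 2 ))
    done
    echo "$bs"
}

min_badblocks_bs 3000592982016   # sample size for a 3 TB drive
min_badblocks_bs 8001563222016   # sample size for an 8 TB drive
```

Under that assumption the two sample sizes give 1024 and 2048, so -b 4096 sits comfortably inside the limit for both.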
 

Fattrain

Dabbler
Joined
Jul 22, 2013
Messages
23
Hey guys, just wanted to throw out a huge thank you for this thread.

This is my first time adding another pool of drives to my FreeNAS & I really wanted to stress test them first but wasn't sure how.

Using this guide I was able to run through all the tests with ZERO errors in about 3 days on 5x3tb WD Red drives.

THANKS AGAIN!
 

Mirfster

Doesn't know what he's talking about
Joined
Oct 2, 2015
Messages
3,215
Been running this for the past couple of days. Gotta say this was very helpful, and I'm really loving tmux. Got 12 windows open in PuTTY and running the gamut.

My only question is whether there is a reliable way to detect the appropriate hard drive block size. In this server, all 12 drives are Hitachi Ultrastar HUA723020ALA641 2TB. When I ran badblocks, I used "-b 4096". Not sure if I should have or not, but it's too late now.
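One possible way to look up the sector size, sketched under assumptions: it relies on smartctl -i printing its usual "Sector Size:" or "Sector Sizes:" line, and the helper name and the /dev/ada0 device in the usage comment are placeholders. A 4096-byte block is a whole multiple of a 512-byte sector, so "-b 4096" should be harmless on a 512-byte drive as well.

```shell
# Print the physical sector size found in `smartctl -i` output read from stdin.
# Assumes smartmontools' usual "Sector Size:" / "Sector Sizes:" lines.
physical_sector_size() {
    awk '
        # 512e drives: "Sector Sizes: 512 bytes logical, 4096 bytes physical"
        /^Sector Sizes:/ { for (i = 1; i <= NF; i++) if ($i == "physical") print $(i - 2) }
        # Native drives: "Sector Size: 512 bytes logical/physical"
        /^Sector Size:/  { print $3 }
    '
}

# Usage on a real system (device name is a placeholder):
#   smartctl -i /dev/ada0 | physical_sector_size
```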

A couple of observations:
  1. "smartctl -t conveyance" is not supported on these drives. I tried the command anyway, but it may be worth mentioning that not all drives support this test and that a lack of support should not be construed as a failure.
  2. An "inappropriate ioctl for device" message may appear during badblocks. As others have found, it can be safely ignored. It would be nice to have that noted in the original post.
  3. tmux: CTRL + B, then " is listed as the way to make new windows, but after a few of them I could not add more because they no longer fit in the PuTTY session. I discovered that CTRL + B, then % adds a vertical split. Using both, I was able to make 12 windows (3 columns and 4 rows) so that everything fit.
So far things are looking good with 0 errors; hopefully that is still the case in the end.
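For anyone who would rather script the 12-way split than press the key combos by hand, here is a minimal sketch. It only prints the tmux commands instead of running them; the session name "burnin" and the function name are placeholders. Re-applying the tiled layout after each split is one way to avoid running out of room for a new pane.

```shell
# Print (not run) the tmux commands for one detached session with N tiled panes.
# Re-tiling after every split keeps each pane large enough to split again.
tmux_grid_cmds() {
    session=$1
    panes=$2
    echo "tmux new-session -d -s $session"
    i=1
    while [ "$i" -lt "$panes" ]; do
        echo "tmux split-window -t $session"
        echo "tmux select-layout -t $session tiled"
        i=$(( i + 1 ))
    done
    echo "tmux attach-session -t $session"
}

tmux_grid_cmds burnin 12
```

Piping the output to sh (tmux_grid_cmds burnin 12 | sh) would actually create the session.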

Thanks for a great write-up!
 

alheim

Dabbler
Joined
Nov 19, 2014
Messages
22
I'm here with a question also. Pardon me if these are elementary.

I had a stable "production" system, but I never finished my burn-in. So, I made backups of all data on my FreeNAS system, and right now I am running destructive tests on my 4 drives, via SSH. So, right now the system is no longer a "production" system.

These drives were part of a live pool, with a CIFS share.

Once these tests are done, assuming the drives pass, what is going to happen next? Will my system still be live, just without data? Or, will my pool be lost? What errors will I see, and where?

I assume that I'll need to start over and create a new pool.

Edit: the 4 WD Red 3TB drives are 9% through in 28 minutes and I can still access the share via Windows.
 

Attachments

  • badblocks.png (14.5 KB)

alheim

Dabbler
Joined
Nov 19, 2014
Messages
22
And a follow-up: what happens if you forget to enable the kernel geometry debug flags before running destructive badblocks? Will badblocks throw an error?
 
Joined
Apr 9, 2015
Messages
1,258
Without the debug flags, my understanding is that it will not test the entire drive. I can't remember the exact section it skips, but it will miss a small amount of space.

Not sure about the pool. If you didn't destroy it before starting, the information could still be in the settings, but I doubt it would work correctly even if it was. Either way, all data written to the drives will be trashed.
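A small sketch for checking whether the flag is set before starting. The thread does not quote the exact setting, so the sysctl name kern.geom.debugflags and the 0x10 bit are assumptions taken from FreeBSD's geom(4) documentation; the function name is made up.

```shell
# Succeeds if the 0x10 bit is set in a kern.geom.debugflags value.
# Assumption: 0x10 is the bit that permits writes to in-use disks, per geom(4).
geom_footshoot_enabled() {
    [ $(( $1 & 16 )) -ne 0 ]
}

# Usage on FreeBSD/FreeNAS:
#   geom_footshoot_enabled "$(sysctl -n kern.geom.debugflags)" || echo "flag not set"
#   sysctl kern.geom.debugflags=0x10    # assumed way to set it
```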
 

alheim

Dabbler
Joined
Nov 19, 2014
Messages
22
It is tough diving into FreeNAS.

Amongst my friends & family I am an IT genius. My qualifications:

- As a kid in the early 90's, my dad bought old PC's from the classified ads, forcing me to learn BASIC, DOS. I used to transcribe games from magazines into our Tandy PC. We always had the worst computers, and I learned a lot.
- In High School I won 1st place in the Future Business Leaders club "Computer Science" competition, for all of New Jersey.
- I worked in IT / Tech support in college (Fordham University) for 3 years.
- Built & maintained many PC's through the years.
- In general I am a bit geeky.

But on these boards I am an IT nightmare! Ha.

FreeNAS is not for the faint of heart. In my spare time I enjoy learning the basics of FreeBSD, Unix, etc. But it is tough to find the time without working in the field.
 

Mirfster

Doesn't know what he's talking about
Joined
Oct 2, 2015
Messages
3,215
Without the debug flags, my understanding is that it will not test the entire drive. I can't remember the exact section it skips, but it will miss a small amount of space.

I believe that it will skip the MBR of the drive unless that is set. Not 100% sure but thinking I saw that somewhere on the forums...

But it is tough to force yourself to learn without working in the field.

Nah, a true geek does it because it's fun and a hobby. I just got lucky when I realized that there were companies that actually wanted to pay me for my hobby. ;)
 
Joined
Apr 9, 2015
Messages
1,258
Nah, a true geek does it because it's fun and a hobby. I just got lucky when I realized that there were companies that actually wanted to pay me for my hobby. ;)

Yeah, it's a bit of a pain to learn some of this stuff, but it's like anything else: the more you practice, the better you get. I used to do a whole lot more. I was one of those people who, instead of installing Windows 95, 98, or ME from a disc, would copy everything over to a folder and install from there. You never had to hunt for the disc when you made a change to the system.
 

alheim

Dabbler
Joined
Nov 19, 2014
Messages
22
Code:
2930266489
2930266490
2930266491
done
Testing with pattern 0x55: done
Reading and comparing:  18.35% done, 21:53:55 elapsed. (0/0/240616 errors)
----------------------------------------------------------------------------------------------------
2930266489
2930266490
2930266491
done
Testing with pattern 0x55: done
Reading and comparing:  20.78% done, 21:53:38 elapsed. (0/0/239088 errors)
----------------------------------------------------------------------------------------------------
2930266489
2930266490
2930266491
done
Testing with pattern 0x55: done
Reading and comparing:  4.96% done, 21:53:31 elapsed. (0/0/249164 errors)
----------------------------------------------------------------------------------------------------
2930266489
2930266490
2930266491
done
Testing with pattern 0x55:  96.78% done, 21:53:23 elapsed. (0/0/256400 errors)


4 drives; 22 hours into testing my 3TB WD Reds and I'm seeing these compare errors. Any ideas? My production system was running perfectly before I ran badblocks, and the drives were purchased from different vendors to get drives from different batches.
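To make the counters in those progress lines easier to read, here is a small sketch that splits them out. It assumes the three numbers are read errors, write errors, and comparison (corruption) errors, in that order; the function name is hypothetical.

```shell
# Split the "(a/b/c errors)" counters in badblocks progress lines (stdin)
# into read, write, and comparison errors.
# Assumption: badblocks prints the counters in that order.
badblocks_error_counts() {
    sed -n 's/.*(\([0-9]*\)\/\([0-9]*\)\/\([0-9]*\) errors).*/read=\1 write=\2 compare=\3/p'
}

echo 'Reading and comparing:  18.35% done, 21:53:55 elapsed. (0/0/240616 errors)' |
    badblocks_error_counts
```

If that reading is right, the lines above show no read or write errors at all, only data that came back different from the pattern that was written.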
 
Joined
Apr 9, 2015
Messages
1,258
Bad drive is my guess. Brand new ones will not always pass. When it's done, run a SMART long test and see if it passes or fails. I would print off the badblocks summary when it's done as well, then RMA the drive with the printouts attached.
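A hedged sketch of how the long test result could be checked once it has finished. It assumes the usual layout of the smartctl self-test log, with the newest entry listed first and a passing run marked "Completed without error"; the function name and /dev/ada0 are placeholders.

```shell
# Succeeds only if the newest "Extended offline" entry in `smartctl -l selftest`
# output (stdin) reads "Completed without error".
extended_test_passed() {
    awk '
        /^# *[0-9]+ +Extended offline/ {
            seen = 1
            ok = ($0 ~ /Completed without error/)
            exit                      # newest entry is listed first; stop here
        }
        END { exit ((seen && ok) ? 0 : 1) }
    '
}

# Usage (placeholder device):
#   smartctl -t long /dev/ada0        # start the test, then wait for it to finish
#   smartctl -l selftest /dev/ada0 | extended_test_passed && echo PASS
```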
 

alheim

Dabbler
Joined
Nov 19, 2014
Messages
22
Bad drive is my guess. Brand new ones will not always pass. When it's done, run a SMART long test and see if it passes or fails. I would print off the badblocks summary when it's done as well, then RMA the drive with the printouts attached.
Yes, but 4 new drives from 2 different vendors? The odds are slim.
 
Joined
Apr 9, 2015
Messages
1,258
If all four drives are getting the same errors, then it could be the controller or something else, possibly loose cables. But it still boils down to running a SMART long test afterwards to verify.

And I hate to say it, but it could also be user error. Someone else will have to weigh in and make a guess on what could go wrong there.
 

alheim

Dabbler
Joined
Nov 19, 2014
Messages
22
The controller is a possibility, but again, it seems unlikely with quality hardware, which I have. Also, note that I had this running as a production/test system for almost a year with no issues.

As part of my testing, to check data integrity in the test environment, I calculated a checksum for each file and appended it to a log. Months later I tested that same data, nearly 5TB, against the checksums with no errors. This should have been an unnecessary test thanks to ZFS, but I did the verification regardless.
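A minimal sketch of that kind of checksum log, not the exact procedure used here. It assumes a sha256sum command is available (on FreeBSD-based systems the tool may be named differently), and the function names are made up.

```shell
# Record a SHA-256 checksum for every file under DIR ($1) in MANIFEST ($2).
# MANIFEST must be an absolute path outside DIR.
make_manifest() {
    ( cd "$1" && find . -type f -exec sha256sum {} + ) > "$2"
}

# Re-check every file under DIR ($1) against MANIFEST ($2).
# Returns non-zero if any file is missing or no longer matches.
verify_manifest() {
    ( cd "$1" && sha256sum -c "$2" > /dev/null 2>&1 )
}

# Usage (paths are placeholders):
#   make_manifest /mnt/tank/share /root/share.sha256
#   verify_manifest /mnt/tank/share /root/share.sha256 && echo "all files match"
```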

Everything that I've read says that any error at all is unacceptable during badblocks testing, before even getting to the SMART long tests that follow. However, I will let badblocks run its course, then run a SMART long test, and will report back.

It absolutely could be user error, hence my posting here.
 

alheim

Dabbler
Joined
Nov 19, 2014
Messages
22
The drives were part of an existing volume. I deleted the volume and restarted badblocks -ws on each drive.
 

Revolution

Dabbler
Joined
Sep 8, 2015
Messages
39
Is it bad to run a badblocks test on two new drives (not in the pool) on a live system with an existing pool? Since you have to set this explicit flag I'm not sure if it's ok.
 

Revolution

Dabbler
Joined
Sep 8, 2015
Messages
39
Do you mean the two new drives? They are not in use right, but the main pool is in use. So I don't set the flag, start the badblocks test and it should run fine right? I just want to be very safe.
 

sirjorj

Dabbler
Joined
Jun 13, 2015
Messages
42
In the instructions on the first post, the first three steps are short, conveyance, and long. After short and conveyance, you explicitly say to wait until the test is done before continuing. After long, you do not. Is it okay to start badblocks while the long test is running or do we wait until long is done?
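If one does decide to wait for the long test, as the guide says to do for the other two, here is a sketch for checking whether a self-test is still running. It assumes smartctl -c reports "Self-test routine in progress" while a test is active; the function name and /dev/ada0 are placeholders.

```shell
# Succeeds while `smartctl -c` output (stdin) says a self-test is still running.
selftest_running() {
    grep -q 'Self-test routine in progress'
}

# Usage (placeholder device): poll every 10 minutes until the test is finished.
#   while smartctl -c /dev/ada0 | selftest_running; do sleep 600; done
```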
 