Hard Drive Burn-In Testing - Discussion Thread

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
I imagine that's fairly unusual, and the fact that the log shows a failed test should be grounds for RMA, regardless of the values of the various SMART attributes.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
It's unusual, but I've seen it before. When I saw it, the hard drives were overheating (50C+) and failed SMART, but once they cooled down they passed without errors. Of course, they all had much shortened lifespans; over 50% of them failed over the next 6 months or so. :(

Most companies won't do an RMA if the drive has subsequently passed a SMART test.
 

JoanTheSpark

Dabbler
Joined
May 11, 2015
Messages
14
Looks like something is writing to the pool, which is no surprise (e.g. FreeNAS does frequent small writes to whichever pool houses the syslog). This will make it very hard to interpret the results of badblocks, so you're probably best off killing this run, detaching the pool, and starting over.

That needs to be in the OP please, right before the section which talks about badblocks: detach the pool/volume from FreeNAS before running the badblocks test.

Also, are the times for smartctl -t short and smartctl -t conveyance the wrong way around? Short takes 2 minutes and conveyance takes 5 minutes on my system.

It couldn't hurt to add, right after the smartctl -t short section, that you need to use smartctl -a to see the results of the run.

All considered though - big thanks for the HowTo :rolleyes:
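
For reference, the SMART steps mentioned above can be sketched roughly as follows. This is only a dry-run helper: it prints the commands rather than running them, and `/dev/ada0` is a placeholder device name. `smartctl -c` is included because it reports the drive's own estimated duration for each self-test, which may explain why the short and conveyance timings differ from one system to another.

```shell
#!/bin/sh
# Print (not run) the smartctl commands for one self-test cycle.
# The device name is a placeholder; pass your own, e.g. /dev/da0.
smart_cycle() {
    disk="${1:-/dev/ada0}"
    test_type="${2:-short}"               # short, conveyance, or long
    echo "smartctl -c $disk"              # capabilities, incl. estimated test durations
    echo "smartctl -t $test_type $disk"   # start the self-test in the background
    echo "smartctl -a $disk"              # afterwards: full report, incl. the self-test log
}

smart_cycle /dev/ada0 short
```

The test itself runs inside the drive, so the last command is only useful once the estimated time has passed.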
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
  1. The thread title is "Hard Drive Burn-In Testing", which should be a pretty good clue that you don't run it on a live pool.
  2. The 1st post says, "THIS TEST WILL DESTROY ANY DATA ON THE DISK SO ONLY RUN THIS ON A NEW DISK WITHOUT DATA ON IT OR BACK UP ANY DATA FIRST".
Is a warning about detaching the pool really necessary?
Also the times for smartctl -t short and smartctl -t conveyance are the wrong way around?
They do look wrong, but maybe conveyance is shorter for some drives. Seems unlikely tho...
 

Ruff.Hi

Patron
Joined
Apr 21, 2015
Messages
271
I read the thread title. I also read the DESTROY section. As I was only 'testing' my system and the data on it was crap / duplicate, I didn't care if I destroyed it. So ... I, for one, didn't equate these to 'don't run on a live pool'.

That said, I wouldn't say that a warning is required ... just a short note that says something like 'this post assumes you have detached your pool'. You might even want to mention that running it on a live pool will result in error reports from badblocks and from FreeNAS.

BTW - you can do this test with a live pool and not lose your pool data. Say you have 5 disks in RAIDZ2, detach pool, test 1 HDD, connect pool, FreeNAS will say system is degraded, 'replace / reconnect' the degraded HDD (the one you tested) and FreeNAS will rebuild your pool. Then rinse, repeat for the other 4 drives. It takes 5 times as long ... but you shouldn't lose your data.
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
you can do this test with a live pool and not lose your pool data. Say you have 5 disks in RAIDZ2, detach pool, test 1 HDD, connect pool, FreeNAS will say system is degraded, 'replace / reconnect' the degraded HDD (the one you tested) and FreeNAS will rebuild your pool.
For that matter, you can run the non-destructive badblocks test if you want. Contrary to the badblocks man page and the information in this thread, it's actually faster than the (default) destructive test, though arguably not quite as thorough.
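
As a rough illustration of the difference between the two modes: per the badblocks man page, `-w` selects the destructive write test and `-n` the non-destructive read-write test, and the two cannot be combined. The helper below only builds the command line and does not run anything; the function name and device names are made up for the example.

```shell
#!/bin/sh
# Build (but do not run) a badblocks command line for the chosen mode.
# -w = destructive write test, -n = non-destructive read-write test.
# -s shows progress while the test runs.
badblocks_cmd() {
    mode="$1"; disk="$2"
    case "$mode" in
        destructive)     flag="-w" ;;
        non-destructive) flag="-n" ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    echo "badblocks $flag -s $disk"
}

badblocks_cmd non-destructive /dev/ada0
```

Even in non-destructive mode every block is rewritten in place, so it is probably wise to treat it as a test for disks whose contents you can afford to lose.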
 

JoanTheSpark

Dabbler
Joined
May 11, 2015
Messages
14
To me, having empty disks with no data on them in a pool created by the wizard doesn't imply that badblocks will pick up errors caused by the FreeNAS OS using those very same drives for log storage during the tests... no.
It's just that the title implies this is a 'newbie guide' from a newbie for newbies. I found out what was going on by reading all over the forums and in this very thread, but the OP himself said he wanted to put this into one concise spot so people could do this without reading for hours on end, as he had to himself ;-)

PS: I'm sorry that with success and demand your niche product FreeNAS goes mainstream and some less shell-savvy people like me turn up and take noob-how-to guides at face value - expect more of us and even worse than me - and I don't mean that as a threat.
 
Joined
Sep 13, 2014
Messages
149
I'm just about to set up my first FreeNAS server and I'm currently in the process of testing my disks. So far I've run the Short, Conveyance and Long S.M.A.R.T. tests and my 8x 3TB Reds all passed with flying colours. I'm at the stage where the next step is to enable the debug flags before running badblocks. I have a couple of questions first though.

1. Is it recommended that I use FreeNAS to perform the badblocks tests? Is the code below for FreeNAS and BSD-based OSes only, or will it work on Linux-based OSes (i.e. can I use my Ubuntu system to run the tests)?

2. What does the code below actually do? Am I right in saying that it enables you to read/write and thus test areas of the disk that are normally off limits?

3. What are the risks involved? I keep reading that it enables the user to "shoot themselves in the foot".

Code:
sysctl kern.geom.debugflags=0x10


Sorry if these are somewhat basic questions. I just want to make sure that I'm doing everything right and understanding as much of what I'm doing as I can.
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
There are areas of a disk that are protected and that you are not allowed to write to directly. Because of the way badblocks uses the device, it needs direct disk access so that it can write to every sector. For testing purposes there really isn't any risk involved. It just means you will overwrite any data on the disk and there are no safety checks.

Code:
0x10 (allow foot shooting)
  Allow writing to Rank 1 providers. This would, for example, allow the super-user to overwrite the MBR on the root disk or write random sectors elsewhere to a mounted disk. The implications are obvious.
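
A minimal sketch of how the flag might be set for the duration of a test and then put back. It is a dry run by default and only prints the commands; the `run` helper and the `DRY_RUN` variable are made up for this example. It assumes the flag was 0 beforehand, which you can confirm with the first command.

```shell
#!/bin/sh
# Dry run by default: prints the commands. Set DRY_RUN=0 in a FreeBSD/FreeNAS
# root shell to actually execute them.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

run sysctl kern.geom.debugflags          # show the current value first
run sysctl kern.geom.debugflags=0x10     # allow raw writes ("foot shooting")
# ... run the badblocks test here ...
run sysctl kern.geom.debugflags=0        # restore the safety check (assumes it was 0)
```

Turning the flag back off afterwards means the safety check is in place again before any pool is created on the disks.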
 
Joined
Sep 13, 2014
Messages
149
Thanks for the quick and succinct reply.

The disks are all brand new, they haven't even been formatted, so data loss is a non-issue. Still it's reassuring to know that any risk is minimal.


Yes, you can, and it's what I did because I have an Ubuntu box available.

That's good to know. My server is in my bedroom for the next few weeks (until it's moved to another room) but I have an Ubuntu box downstairs, so I'll be able to get started right away.
 

attilahooper

Dabbler
Joined
Jul 4, 2014
Messages
10
Newb here, I'd like to say thanks to qwertymodo. And how come this isn't stickied? The Building, burnin thread is great and all, but lacks a LOT of detail. Not complaining, just saying this is a valuable thread.
Anywho,
I have 4x 3TB Greens that have been running badblocks for about 60 hours now. Two look to be complete (tmux panes can be a little dodgy, it seems), and (0/0/0) looks to be the latest result, but I won't know till I smartctl them, I guess.
Also,
I would like to reiterate what Gilley and a couple of other posters mentioned. Woulda been nice if the -b 4096 switch had been edited into the first post. Also, the "Inappropriate ioctl for device" message was never discussed to a conclusion.

...
This was also asked earlier and I didn't see it addressed so just figured I would bring it up again, upon initializing the badblocks test I get this message reported:
Code:
Testing with pattern 0xaa: set_o_direct: Inappropriate ioctl for device

As a new user, and shame on me for not reading all the way to the bottom of the thread before running the test, it would be nice to move some of these improvements such as the -b 4096 option into the main guide.

EDIT: found the answer to ioctl for device here:
https://forums.freenas.org/index.php?threads/badblocks-testing-inappropriate-ioctl-for-device.26015/
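
On the -b 4096 switch: one commonly cited reason for it, beyond speed, is that badblocks keeps its block count in a 32-bit value, so with the default 1024-byte blocks a large enough drive has too many blocks to address. That limit is an assumption about the badblocks build in use, not something stated in this thread. The sketch below just does the arithmetic; the function name is made up for the example.

```shell
#!/bin/sh
# Does a disk of SIZE bytes fit in badblocks' block counter at block size BS?
# Assumption: the counter is 32-bit, i.e. fewer than 2^32 blocks.
fits_badblocks() {
    size_bytes="$1"; bs="$2"
    blocks=$(( size_bytes / bs ))
    if [ "$blocks" -lt 4294967296 ]; then
        echo "bs=$bs: $blocks blocks, fits"
    else
        echo "bs=$bs: $blocks blocks, too many"
    fi
}

fits_badblocks 3000000000000 1024   # 3 TB drive, default block size
fits_badblocks 8000000000000 1024   # 8 TB drive, default block size
fits_badblocks 8000000000000 4096   # 8 TB drive with -b 4096
```

By this arithmetic a 3 TB drive still fits at the default block size, so for the drives discussed here -b 4096 would mainly be about run time.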
 

HardChargin

Dabbler
Joined
Jul 19, 2015
Messages
49
First, thank you qwertymodo and others for a great guide and your contributions. It's been extremely helpful and informative. To beat a dead horse, regarding setting:
Code:
sysctl kern.geom.debugflags=0x10

Can anyone tell me what the consequences (if any) are of not setting it? Specifically on unformatted new disks (should badblocks have access to the entire disk in this case?). I ask because I ran badblocks, which seemed to run fine, but I may have forgotten to run that command beforehand. I'm hoping that didn't compromise the integrity of my drive burn-in. Thanks in advance.
 

HardChargin

Dabbler
Joined
Jul 19, 2015
Messages
49

Thank you for the answer (and link), I appreciate it. Not the answer I was hoping for, it seems, but an answer nonetheless. Now, the question of running badblocks again, nearly a three day process...uggghh. I want to be thorough while I'm in the RMA window and able to get a replacement drive.

One thing I'm still not clear on is, does the MBR/Boot Sector exist even on a brand new drive, before it's been partitioned, or has had an OS installed on it? In the link it seems to indicate that being the case. I assumed the OS created the MBR/Boot Sector on install, and only on the disk it's installed on. In my case I installed FreeNAS/FreeBSD on a separate boot device and all the drives I'm testing I hadn't done anything else with yet, short of list them and test them.

Any idea how badblocks handles the MBR when you don't run that command beforehand? Does it just skip it (no error/warning etc)?

Thanks again.
 

attilahooper

Dabbler
Joined
Jul 4, 2014
Messages
10
I feel your pain, 3 days to badblock 3TB greens. Although I didn't use -b 4096 switch.
Very good question(s), I would assume that the mbr/first sector would be skipped regardless. Question is, can badblocks be configured to test just the first sector ?
 

HardChargin

Dabbler
Joined
Jul 19, 2015
Messages
49
I like your thinking, it got me thinking ;). I haven't found a direct way to only test specific blocks, but I may have found an indirect way by using the -i switch for reading an input file which will cause badblocks to skip the "known bad" blocks listed in the file. At this point, I have a couple POAs (Plans of Attack).

1. Let it ride, comfortable that I tested pretty much the entire disk(s), with the hopes that because the disks didn't have an OS installed, all sectors were tested.
2. Let my OCD get the best of me and run a partial pass of badblocks instead of the full 8 run rigmarole, and then rerun smart again. There probably won't be a better time to beat up my disks than right now :D. TBD. Thanks again for your help.
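
On testing only specific blocks: the synopsis in the badblocks man page lists two optional arguments after the device, `last-block` and then `first-block`, which looks like a more direct way to restrict a run to a range than the -i file. The sketch below only builds such a command line and does not run it; the function name is made up, and whether the last block is treated as inclusive may depend on the badblocks version.

```shell
#!/bin/sh
# Build (but do not run) a badblocks command restricted to a block range.
# Synopsis per the man page: badblocks [options] device [last-block] [first-block]
# Note the order: the last block comes before the first block.
badblocks_range_cmd() {
    disk="$1"; first="$2"; last="$3"; bs="${4:-4096}"
    if [ "$first" -gt "$last" ]; then
        echo "first block must not exceed last block" >&2
        return 1
    fi
    echo "badblocks -b $bs -ws $disk $last $first"
}

badblocks_range_cmd /dev/da0 0 9      # blocks 0 through 9 of da0
```

Block numbers are counted in units of the -b block size, so the same range covers a different part of the disk if the block size changes.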
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
If you want to test a couple of blocks, just run a couple of dds. For something small, you could probably even do:

dd if=/dev/random of=/dev/adaX bs=512 seek=FIRST_LBA count=NUM_SECTORS

^
That's just pseudosyntax.
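
To make the idea concrete without touching a real disk, here is a practice run against a scratch file. It assumes 512-byte sectors and uses `seek=` and `count=` to pick the range; the sector numbers are arbitrary. `/dev/urandom` is used so that the example does not block on Linux.

```shell
#!/bin/sh
# Practice run on a scratch file, NOT a real disk. On real hardware, "of="
# would be the device node (e.g. /dev/ada0) and conv=notrunc is unnecessary.
img="$(mktemp)"
dd if=/dev/zero of="$img" bs=512 count=16 2>/dev/null      # 16 zeroed "sectors"

# Overwrite sectors 4..7: seek= skips that many output blocks of size bs,
# count= is how many blocks to write.
dd if=/dev/urandom of="$img" bs=512 seek=4 count=4 conv=notrunc 2>/dev/null

# Sectors 0..3 should still be all zeros; count their non-zero bytes.
untouched_nonzero=$(dd if="$img" bs=512 count=4 2>/dev/null | tr -d '\000' | wc -c | tr -d ' ')
img_size=$(wc -c < "$img" | tr -d ' ')
rm -f "$img"

echo "non-zero bytes in untouched sectors: $untouched_nonzero"
echo "image size after the write: $img_size bytes"
```

The same `bs`/`seek`/`count` arithmetic applies to a real device, where a wrong `of=` overwrites whatever disk it names, so double-check it before pressing Enter.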
 

attilahooper

Dabbler
Joined
Jul 4, 2014
Messages
10
Eric has the right idea, and here's another link for a thorough learning experience
http://www.cyberciti.biz/faq/howto-copy-mbr/

My disks are presented as da0, da1, da3... Not ada0, I suppose because they are on an integrated LSI 2308 SAS controller (X10SL7-F), not a SATA controller.
So, for sanity you could do
dd if=/dev/random of=/dev/da0 bs=512 count=1

Do that 4 times to each disk, rerun long smart test. That would satisfy OCD and patience :)
And what are the odds of doing a lengthy badblocks but there's an error in the first 512 bytes. 512bytes/4TB ?? Pretty lonely odds :)
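
For a rough sense of scale, the fraction of the disk that one 512-byte sector represents can be worked out directly. This treats a defect as equally likely anywhere on the disk, which is a simplification, and uses decimal terabytes; the function name is made up for the example.

```shell
#!/bin/sh
# Fraction of the disk that a single 512-byte sector represents.
# Sizes use decimal terabytes (1 TB = 10^12 bytes), as drive vendors do.
sector_fraction() {
    awk -v tb="$1" 'BEGIN { printf "%.2e\n", 512 / (tb * 1e12) }'
}

sector_fraction 3    # 3 TB drive
sector_fraction 4    # 4 TB drive
```

Either way the result is on the order of one in several billion.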
 

HardChargin

Dabbler
Joined
Jul 19, 2015
Messages
49
@Ericloewe and @attilahooper Thanks a bunch. I haven't messed with that tool yet so I'll have to read up on it. It sounds like with your suggestions I can meet in the middle, satisfy my limited patience and OCD, and not smoke another weekend waiting to start installing/configuring FreeNAS.

I was sort of thinking along the same lines about the odds of problems in the missed sectors, but I wasn't really sure what all got missed, and after you spend three days waiting/monitoring, and realize you may have blown it, it's one of those "you gotta be kidding me" moments :confused:. @attilahooper Your commands should work perfectly with my system. I have the same drive labels as you, except for one drive that I had in a normal SATA (not SAS) port, which was labeled ada instead of da. Again, thank you for your help, it's much appreciated.
 