
Hard Drive Burn-In Testing - Discussion Thread

attilahooper

Dabbler
Joined
Jul 4, 2014
Messages
10
Thank you for your thanks. :) I'm a newbie myself on this new build, but I have read a lot over the past months preparing for it. We are at about the same stage: I just finished running badblocks on four 3TB drives, and had a power failure due to lightning strikes. Thankfully I had the wherewithal to put the Supermicro on battery backup. Yet I think it interrupted pass 4 on 2 drives in the middle of the night when the battery gave out. But I'm not sweating it; the latest smartctl long test reports 0 errors in all tests.

I would think that if your drives are clean in SMART after the badblocks passes, the chance of that first sector being bad is next to nothing, trillionths of a nothing.

dd is nothing special, don't be afraid of it; it just writes some data. If your SMART data is all reporting 0 errors in the pertinent categories, you are pretty much golden: the drive didn't come with a manufacturing defect. That would stand out, as I have seen in the forum. The smartctl -a or -A output for all my drives shows big 0's in the important fields. They are recent purchases off Amazon.

Move forward, build your volumes and shares, set up SMART email notifications and enjoy. This is a great forum, kudos to FreeNAS and the community, I'll be paying creds soon enough.
 
Joined
Sep 13, 2014
Messages
149
I just wanted to say thank you to @qwertymodo for the guide.

Four days and three nights of having my server running in my bedroom was worth the disturbed sleep for the clean bill of health my disks have got. For any other noobs about to start the process, I highly recommend that you use something like PuTTY and learn and use tmux's shortcuts. I was having issues with FreeNAS' shell not opening whilst badblocks was running. I had no such issues with PuTTY.
 

qwertymodo

Contributor
Joined
Apr 7, 2014
Messages
144
Yeah, the WebGUI is fully synchronous, so it reeeeaally doesn't play nice with long blocking processes like badblocks. Use tmux.
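
For anyone following along, here is a minimal sketch of the tmux approach over SSH. The helper name `start_in_tmux`, the session name `burnin`, and the device `/dev/ada0` are illustrative assumptions, not something taken from the guide, and the badblocks invocation shown in the usage note is only an example of a destructive write test.

```shell
#!/bin/sh
# Hypothetical helper: run a long command inside a detached tmux session
# so it keeps going if the SSH connection or the WebGUI shell drops.
start_in_tmux() {
    session="$1"   # session name, e.g. "burnin" (assumed name)
    shift
    # -d starts the session detached, -s names it; the rest is the command line.
    tmux new-session -d -s "$session" "$*"
}
```

Usage might look like `start_in_tmux burnin "badblocks -ws /dev/ada0"` (this wipes the drive), then `tmux attach -t burnin` to watch progress and Ctrl-b followed by d to detach again without stopping the test.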

 

pratman2

Dabbler
Joined
Aug 12, 2015
Messages
28
No, smartctl is asynchronous, so you can just run one after another in a single shell, then you have to come back and run smartctl -a after the tests finish.

Just to be clear on the SMART tests. Are you saying that the tests are "asynchronous" across all of the drives in a system? So if I have 4 drives, I can queue up the tests in a single shell but the long test for example will not start on drive#2 until it is finished on drive#1? And if so, the long test will not start on drive#3 until it is finished on drive#2 etc.?

I ask because I have done this on 4 drives in a system, yet the estimated completion time for the long test is the same for all 4 drives. In my case about 7 hours. I would think if it were truly asynchronous it would be 4 x 7 hours which would be 28 hours total for the long test to complete on all 4 drives.

I would think that maybe the tests are "asynchronous" per drive, meaning as I type it into the shell the short test runs on drive#1, I can go ahead and punch in the conveyance test for drive#1 and it will not start until the short test is finished. I can also punch in the long test for drive#1 and it won't start until the conveyance test is finished.

Therefore if I have 4 drives in a system for example, I am basically "queuing" them up on each drive by typing each command into the shell one behind the other for each drive. All of the tests will run simultaneously ("asynchronous") on the 4 drives.

Having said that, I followed the tutorial in waiting for the test times to pass before starting the next test. I was just confused when I punched in the long tests for 4 drives and saw the test would complete at about the same time for each drive.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
SMART tests are not run by the host. The host tells the drive to run the test, which it does internally - status can be queried via SMART.

Due to the current design of FreeNAS (and individual shell sessions in general), this request by the host is issued synchronously, one after the other. However, it is practically instantaneous. The drives themselves will be merrily running their tests for hours or more. It is asinine to only tell one drive to do the test after the previous one is finished. The test runtime estimate is just an estimate and typically overly optimistic. You'll know a test is done when the SMART test log says "completed".
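
As a rough sketch of what that looks like in practice: the loop below issues the long-test request to each drive in turn, and each `smartctl -t long` call returns almost immediately while the drive carries on testing internally. The drive list and the function name `start_long_tests` are assumptions for illustration; substitute your own device names.

```shell
#!/bin/sh
# DRIVES is a hypothetical list of device names; adjust for your system.
DRIVES="${DRIVES:-ada0 ada1 ada2 ada3}"

start_long_tests() {
    for d in $DRIVES; do
        # The host only sends the request; the drive runs the test itself,
        # so all drives end up testing at the same time.
        smartctl -t long "/dev/$d"
    done
}
```

Calling `start_long_tests` as root should leave all four drives running their long tests in parallel, which is why the estimated completion times come out roughly the same.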
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
the short test runs on drive#1, I can go ahead and punch in the conveyance test for drive#1 and it will not start until the short test is finished
More likely, I think, if you tell a drive to run a SMART test it will abort any in-progress SMART test. By the way, there's no reason to run the conveyance test more than once, unless you move your system to a new location.
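
If that is how your drives behave, one way to avoid clobbering a running test is to check the status first. This is only a sketch: the function name `start_test_if_idle` is made up, and it assumes the drive reports "Self-test routine in progress" in its capabilities output the way the WD Red later in this thread does.

```shell
#!/bin/sh
# Hypothetical guard: only start a new self-test if none is running.
start_test_if_idle() {
    dev="$1"        # e.g. /dev/ada0
    test_type="$2"  # short, conveyance or long
    # smartctl -c prints the general SMART values, including the
    # self-test execution status line.
    if smartctl -c "$dev" | grep -q "Self-test routine in progress"; then
        echo "$dev: a self-test is already running, not starting $test_type" >&2
        return 1
    fi
    smartctl -t "$test_type" "$dev"
}
```

For example, `start_test_if_idle /dev/ada0 long` would refuse to start the long test while the short or conveyance test is still going.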
 

pratman2

Dabbler
Joined
Aug 12, 2015
Messages
28
So it sounds like more than one drive can run the same test at once, or they each can run a different test at the same time.

And since entering the next test on a single drive before the first test is finished will abort the first, then the tests on a single drive are not necessarily "asynchronous" since one does not depend on the previous test to finish before it starts.

So the only thing "asynchronous" about the burn-in process is that multiple drives can run different tests at the same time?

I was unsure what qwertymodo meant by "smartctl is asynchronous, so you can just run one after another in a single shell"
 

pratman2

Dabbler
Joined
Aug 12, 2015
Messages
28
The output from smartctl will show the percentage complete of an in-progress test.
Thanks for the reply. Forgive me for being new to this, but as I asked above, how do I look at this? Is this simply typing in "smartctl -A /dev/adax" to see the results? If so, I just get the SMART data off the disk, with nothing that I can tell showing the status of the test.
 

solarisguy

Guru
Joined
Apr 4, 2014
Messages
1,125
Indeed, when I executed
smartctl -t short /dev/ada0
the command
smartctl --log=selftest /dev/ada0
kept showing nothing until 2 minutes later, i.e. after the short test completed...
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
smartctl -A /dev/adax
The correct command would be smartctl -a /dev/xxxx. When I do this with a test in progress, it shows the percentage complete of that test.
Code:
[root@poweredge] ~# smartctl -t short /dev/da0
smartctl 6.3 2014-07-26 r3976 [FreeBSD 9.3-RELEASE-p16 amd64] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 2 minutes for test to complete.
Test will complete after Mon Aug 31 09:24:32 2015

Use smartctl -X to abort test.
[root@poweredge] ~# smartctl -a /dev/da0
smartctl 6.3 2014-07-26 r3976 [FreeBSD 9.3-RELEASE-p16 amd64] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD20EFRX-68EUZN0

Serial Number:   
LU WWN Device Id: 5 0014ee 26065a667
Firmware Version: 82.00A82
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Mon Aug 31 09:22:42 2015 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)Offline data collection activity was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 249)Self-test routine in progress...
90% of test remaining.
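
To pull just those last two lines out of the wall of output, something like the sketch below may help. The function name `selftest_status` is hypothetical, and it assumes the "remaining" percentage sits on the line right after the status line, as in the output above.

```shell
#!/bin/sh
# Hypothetical helper: print only the self-test execution status for a drive.
selftest_status() {
    # smartctl -c shows the general SMART values; grep -A 1 keeps the
    # status line plus the line after it (the "% of test remaining" part).
    smartctl -c "$1" | grep -A 1 "Self-test execution status"
}
```

Running `selftest_status /dev/da0` during a test should print the "in progress" line and the percentage remaining; once the test is done, the self-test log from `smartctl -a` shows the result.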
 

pratman2

Dabbler
Joined
Aug 12, 2015
Messages
28
Started badblocks in the GUI; it has logged me out and of course I cannot log back in.

3 of my 4 drives have finished the 4 pass run. The 4th one has completed 4 passes and is now on its sixth read/write somehow.
Is there a way to stop it? What happens if I just restart the machine, can I still complete the burn in process?
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
I'd just wait for the process to end. You can use top to see and kill the badblocks process, but it's not a good idea.
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
Because you don't want to kill a process that is running normally; it's a bit like unplugging your server to shut it down. And here he would need to restart badblocks from the beginning.

Now I wonder if there's a way to see the output of the script... :rolleyes:
 

solarisguy

Guru
Joined
Apr 4, 2014
Messages
1,125
Because you don't want to kill a process that is running normally; it's a bit like unplugging your server to shut it down. And here he would need to restart badblocks from the beginning. [...]
OK, thank you. I was thinking past that, along the lines of possible damage to the drives. And... I missed the original objective of having badblocks complete...
 

pratman2

Dabbler
Joined
Aug 12, 2015
Messages
28
Started on write/read #7. Lesson learned, this will be the last time I use the GUI for burn-ins. I have PuTTY set up to run SSH from here on out.

I hope it is only running 1 more badblocks test on this drive. I'm not sure how it decided to do what appears to be 2 full tests, which is 8 passes. Fingers crossed.
 