nightshade00013
Wizard
- Joined
- Apr 9, 2015
- Messages
- 1,258
By default badblocks runs four different patterns. If it passes that it should be fine to put into use.
[root@freenas] ~# smartctl -t long /dev/ada1 smartctl 6.3 2014-07-26 r3976 [FreeBSD 9.3-RELEASE-p31 amd64] (local build) Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org === START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION === Sending command: "Execute SMART Extended self-test routine immediately in off-line mode". Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful. Testing has begun. Please wait 694 minutes for test to complete. Test will complete after Thu Apr 7 23:26:28 2016 Use smartctl -X to abort test.
[root@freenas] ~# smartctl -A /dev/ada0 smartctl 6.3 2014-07-26 r3976 [FreeBSD 9.3-RELEASE-p31 amd64] (local build) Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org === START OF READ SMART DATA SECTION === SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0 3 Spin_Up_Time 0x0027 100 253 021 Pre-fail Always - 0 4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 4 5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0 7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0 9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 164 10 Spin_Retry_Count 0x0032 100 253 000 Old_age Always - 0 11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 4 192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 0 193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 9 194 Temperature_Celsius 0x0022 121 114 000 Old_age Always - 31 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0 197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0 198 Offline_Uncorrectable 0x0030 100 253 000 Old_age Offline - 0 199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0 200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 0
[root@freenas] ~# smartctl -A /dev/ada1 smartctl 6.3 2014-07-26 r3976 [FreeBSD 9.3-RELEASE-p31 amd64] (local build) Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org === START OF READ SMART DATA SECTION === SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0 3 Spin_Up_Time 0x0027 100 253 021 Pre-fail Always - 0 4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 4 5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0 7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0 9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 164 10 Spin_Retry_Count 0x0032 100 253 000 Old_age Always - 0 11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 4 192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 0 193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 8 194 Temperature_Celsius 0x0022 120 110 000 Old_age Always - 32 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0 197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0 198 Offline_Uncorrectable 0x0030 100 253 000 Old_age Offline - 0 199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0 200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 0
[root@freenas] ~# smartctl -A /dev/ada2 smartctl 6.3 2014-07-26 r3976 [FreeBSD 9.3-RELEASE-p31 amd64] (local build) Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org === START OF READ SMART DATA SECTION === SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0 3 Spin_Up_Time 0x0027 100 253 021 Pre-fail Always - 0 4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 4 5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0 7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0 9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 164 10 Spin_Retry_Count 0x0032 100 253 000 Old_age Always - 0 11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 4 192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 0 193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 9 194 Temperature_Celsius 0x0022 118 108 000 Old_age Always - 34 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0 197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0 198 Offline_Uncorrectable 0x0030 100 253 000 Old_age Offline - 0 199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0 200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 0
[root@freenas] ~# smartctl -A /dev/ada3 smartctl 6.3 2014-07-26 r3976 [FreeBSD 9.3-RELEASE-p31 amd64] (local build) Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org === START OF READ SMART DATA SECTION === SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0 3 Spin_Up_Time 0x0027 100 253 021 Pre-fail Always - 0 4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 4 5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0 7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0 9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 164 10 Spin_Retry_Count 0x0032 100 253 000 Old_age Always - 0 11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 4 192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 0 193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 9 194 Temperature_Celsius 0x0022 120 110 000 Old_age Always - 32 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0 197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0 198 Offline_Uncorrectable 0x0030 100 253 000 Old_age Offline - 0 199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0 200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 0
Keep testing them, but so far so good.So after several days - here are my results. All WD Red 6TB's. Does this look good to go?
I think so but figured I would wait for the experts. All "when failed" items are -
ADA0
Code:[root@freenas] ~# smartctl -A /dev/ada0 smartctl 6.3 2014-07-26 r3976 [FreeBSD 9.3-RELEASE-p31 amd64] (local build) Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org === START OF READ SMART DATA SECTION === SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0 3 Spin_Up_Time 0x0027 100 253 021 Pre-fail Always - 0 4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 4 5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0 7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0 9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 164 10 Spin_Retry_Count 0x0032 100 253 000 Old_age Always - 0 11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 4 192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 0 193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 9 194 Temperature_Celsius 0x0022 121 114 000 Old_age Always - 31 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0 197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0 198 Offline_Uncorrectable 0x0030 100 253 000 Old_age Offline - 0 199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0 200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 0
ADA1
Code:[root@freenas] ~# smartctl -A /dev/ada1 smartctl 6.3 2014-07-26 r3976 [FreeBSD 9.3-RELEASE-p31 amd64] (local build) Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org === START OF READ SMART DATA SECTION === SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0 3 Spin_Up_Time 0x0027 100 253 021 Pre-fail Always - 0 4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 4 5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0 7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0 9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 164 10 Spin_Retry_Count 0x0032 100 253 000 Old_age Always - 0 11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 4 192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 0 193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 8 194 Temperature_Celsius 0x0022 120 110 000 Old_age Always - 32 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0 197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0 198 Offline_Uncorrectable 0x0030 100 253 000 Old_age Offline - 0 199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0 200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 0
ADA2
Code:[root@freenas] ~# smartctl -A /dev/ada2 smartctl 6.3 2014-07-26 r3976 [FreeBSD 9.3-RELEASE-p31 amd64] (local build) Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org === START OF READ SMART DATA SECTION === SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0 3 Spin_Up_Time 0x0027 100 253 021 Pre-fail Always - 0 4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 4 5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0 7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0 9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 164 10 Spin_Retry_Count 0x0032 100 253 000 Old_age Always - 0 11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 4 192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 0 193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 9 194 Temperature_Celsius 0x0022 118 108 000 Old_age Always - 34 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0 197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0 198 Offline_Uncorrectable 0x0030 100 253 000 Old_age Offline - 0 199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0 200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 0
ADA3
Code:[root@freenas] ~# smartctl -A /dev/ada3 smartctl 6.3 2014-07-26 r3976 [FreeBSD 9.3-RELEASE-p31 amd64] (local build) Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org === START OF READ SMART DATA SECTION === SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0 3 Spin_Up_Time 0x0027 100 253 021 Pre-fail Always - 0 4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 4 5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0 7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0 9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 164 10 Spin_Retry_Count 0x0032 100 253 000 Old_age Always - 0 11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 4 192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 0 193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 9 194 Temperature_Celsius 0x0022 120 110 000 Old_age Always - 32 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0 197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0 198 Offline_Uncorrectable 0x0030 100 253 000 Old_age Offline - 0 199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0 200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 0
I've been working on a script to automate this burn-in process. The file is attached. Feedback is welcome. I don't recommend you run this on a production machine.
Built-in Safety Features:
Logging Features:
- The drive must exist
- The drive must not be in use (according to gpart)
- Will not start a second, simultaneous scan on the same drive
Convenience Features:
- All log files begin with the string "<DRIVE_MODEL>_<SERIAL_NUMBER>_<DATE>"
- SMART details are saved to disk before and after the testing process
- Log files generated to "at a glance" let you know which disks are in progress and which are completed.
Steps:
- Extensive automatic use of tmux results in "Fire and Forget" usability
- Enables and disables RAW Disk I/O as needed, being aware of other concurrent tests (works, kind-of, still in beta)
- tmux sessions named after drive name (da1, ada14, etc) so you know which drives are being tested
- Running the script against a new drive will list all active tmux sessions (equivalent of running 'tmux ls')
How to run it:
- Switch to bash
- Set save path
- Verify disk exists
- Get drive model number & serial number
- Verify disk not already in use (gpart)
- Verify disk not already under test
- Spawn tmux session. Name tmux session after disk device designation (ie. "da0")
- Create"In-Progress" status file
- Forcibly cancel previous SMART test
- Save SMART details to disk
- Start SMART short test. Write time stamp of completion to log file. Sleep until complete.
- Start SMART conveyance test. Write time stamp of completion to log file. Sleep until complete.
- Start SMART long test. Write time stamp of completion to log file. Sleep until complete.
- Enable RAW Disk I/O using sysctl
- Run destructive badblocks test (default settings)
- Start SMART long test. Write time stamp of completion to log file. Sleep until complete.
- Save SMART details to disk
- Remove "In-Progress" Status File
- Create "Completed" Status File
- If no other "In-Progress" files exist, then reset the RAW Disk IO flag to zero and remove the RAW Disk IO Flag file
- Done
During the writing of this post, I have 12 drives simultaneously running this very script on a non-production server.
- Give the server a drive to hold files. I mounted a USB drive and set it up as a standard volume with this path: /mnt/SystemDataset
- Put the script in /mnt/SystemDataset
- Open the script and edit the variable 'Save_Path' to fit your environment
- Set the script to be executable
- Run it like this:
- ./drive_burn_in.sh <drive_name>
- ./drive_burn_in.sh da1
/mnt/SystemDataset/drive_burn_in.sh: line 105: $Save_File: ambiguous redirect
Performing Conveyance Self-Test on HGST HGST S/N:PL2331LAH0VXJJ Sleeping until: /mnt/SystemDataset/drive_burn_in.sh: line 105: $Save_File: ambiguous redirect Failed conversion of ``'' using format ``%a %b %d %T %Z %Y'' date: illegal time format usage: date [-jnRu] [-d dst] [-r seconds] [-t west] [-v[+|-]val[ymwdHMS]] ... [-f fmt date | [[[[[cc]yy]mm]dd]HH]MM[.ss]] [+format] SMART conveyance test completed
HGST drives do not have a conveyance test so it seems. If you do a smartctl -a /dev/??? you will get a listing of what is actually supported.
For instance a WD green gets:
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
A HGST NAS drive gets:
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
Honestly a script really isn't much use when you are doing the smart tests since an arrow up a backspace and changing one number is all it takes to start the next test. Plus if you are doing the tests through SSH it's simple to copy and paste is just as easy when it comes to the actual badblocks. Shoot I did the last drives I got from about 100 miles away using my OpenVPN connection from my tablet and JuiceSSH. Unless you are doing a ton of drives a script is more likely to cause issues and you will always need to tailor it to your system/drives as you have already seen since the drives don't support a test and then the script fails.
Sorry, noob here.
I'm trying to follow the instructions, all works great until after I run the tmux command.
when running
"
freenas# badblocks -ws /dev/ada0
"
badlocks gives this message:
badblocks: Value too large to be stored in data type invalid end block (5860522584): must be 32-bit value
These are brand new drives, I didn't configure anything on them.
Thanks!
Did a quick search for you and it seems this might help: https://forums.freenas.org/index.php?threads/badblocks-test-for-wd60efrx.26708/
For large drives you must choose a larger block size. It's advantageous for all sizes, since the test is faster.Sorry, noob here.
I'm trying to follow the instructions, all works great until after I run the tmux command.
when running
"
freenas# badblocks -ws /dev/ada0
"
badlocks gives this message:
badblocks: Value too large to be stored in data type invalid end block (5860522584): must be 32-bit value
These are brand new drives, I didn't configure anything on them.
Thanks!
thanks Eric, ya I'll do that tomorrow when my SMART test completes again..
I forgot, do you guys recommend I run the badlocks simply through the web Interface (chrome browser) from another computer? or should I use a monitor on the freeNAS box and open up shell (option 9). thanks
tmux new-session -s ada0 -d tmux new-session -s ada1 -d tmux new-session -s ada2 -d tmux new-session -s ada3 -d tmux new-session -s ada4 -d
tmux ls
tmux attach -t ada0
top -zISmio
Close, but do yourself a favor and use only a single tmux session with multiple subwindows. tmux does stand for Terminal MUltipleXer. ;)Newb here as well, doing this for the first time. I open a putty session for each hard drive and run badblocks within a tmux terminal, that way I can monitor each one and if I lose connection they continue to run. Here's what I did:
Code:tmux new-session -s ada0 -d tmux new-session -s ada1 -d tmux new-session -s ada2 -d tmux new-session -s ada3 -d tmux new-session -s ada4 -d
Obviously replacing ada0-4 with your drive names. (And also do it more or less depending how many you have).
Then check if all of your TMUX sessions are running:
Code:tmux ls
Then in each putty window using SSH, I attached to each one:
Code:tmux attach -t ada0
Obviously replacing each drive name in each window.
Then in each window I run badblocks.
In the end, my desktop looks like this:
![]()
If you want to see the process information seen in the sixth window, use:
Code:top -zISmio
Close, but do yourself a favor and use only a single tmux session with multiple subwindows. tmux does stand for Terminal MUltipleXer. ;)
Check the manpage for the keyboard commands you need.