5 minute read. Last Modified 2023-11-30 10:15 EST
Hard drives and solid-state drives (SSDs) have a finite lifetime and can fail unexpectedly. When a disk fails in a Stripe (RAID0) pool, you must recreate the entire pool and restore all data backups. We always recommend creating non-stripe storage pools that have disk redundancy.
To prevent further redundancy loss or eventual data loss, always replace a failed disk as soon as possible! TrueNAS integrates new disks into a pool to restore it to full functionality.
TrueNAS requires you to replace a disk with another disk of the same or greater capacity as a failed disk. You must install the disk in the TrueNAS system. It should not be part of an existing storage pool. TrueNAS wipes the data on the replacement disk as part of the process.
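The two requirements above (equal or greater capacity, and not a member of an existing pool) can be expressed as a small pre-check. The following is a minimal sketch only: the `Disk` record and the `replacement_problems` function are hypothetical names made up for illustration and are not part of TrueNAS.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Disk:
    """Hypothetical disk record for illustration; not a TrueNAS data structure."""
    name: str
    size_bytes: int
    pool: Optional[str] = None  # name of the pool the disk belongs to, if any


def replacement_problems(failed: Disk, candidate: Disk) -> list[str]:
    """Return the reasons the candidate cannot replace the failed disk (empty if none)."""
    problems = []
    # The replacement must have the same or greater capacity as the failed disk.
    if candidate.size_bytes < failed.size_bytes:
        problems.append("candidate is smaller than the failed disk")
    # The replacement should not be part of an existing storage pool.
    if candidate.pool is not None:
        problems.append(f"candidate is already a member of pool '{candidate.pool}'")
    return problems
```

An empty list means the candidate passes both checks. It says nothing about whether the disk holds partitions or data, which TrueNAS handles separately during the wipe.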
Disk replacement automatically triggers a pool resilver.
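To follow a resilver from a shell rather than the UI, one option is to read the progress figure from the text printed by `zpool status`. The sketch below assumes the scan section contains the phrases `resilver in progress` and `% done`. The exact layout varies between OpenZFS versions, and the sample text is illustrative rather than captured from a real system.

```python
import re
from typing import Optional

# Matches a figure such as "26.04% done" in the scan section of zpool status.
_PERCENT_DONE = re.compile(r"(\d+(?:\.\d+)?)% done")


def resilver_percent(status_text: str) -> Optional[float]:
    """Return the resilver percentage, or None when no resilver is in progress."""
    if "resilver in progress" not in status_text:
        return None
    match = _PERCENT_DONE.search(status_text)
    return float(match.group(1)) if match else None


# Illustrative sample only; real output differs by version and pool layout.
SAMPLE_STATUS = """\
  pool: tank
 state: DEGRADED
  scan: resilver in progress since Thu Nov 30 10:15:00 2023
        1.50T scanned at 1.20G/s, 800G issued at 640M/s, 3.00T total
        200G resilvered, 26.04% done, 00:58:40 to go
"""
```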
If you configure your main SCALE Dashboard to include individual Pool widgets or the Storage widget, they show the status of your system pools as online, offline, degraded, or in an error condition.
The Storage Dashboard pool widgets also show the status of each of your pools.
From the main Dashboard, you can click on either the Pool or Storage widget to go to the Storage Dashboard screen, or you can click Storage on the main navigation menu to open the Storage Dashboard. Then locate the pool in the degraded state.
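The same check can be scripted from a shell session. The sketch below parses the output of `zpool list -H -o name,health`, which prints one pool per line with tab-separated fields and no header, and returns every pool whose health is not `ONLINE`. The function name and the sample text are assumptions for illustration, and the pool names are made up.

```python
def unhealthy_pools(zpool_list_output: str) -> dict[str, str]:
    """Map pool name to health for every pool that is not ONLINE.

    Expects the text printed by `zpool list -H -o name,health`.
    """
    pools = {}
    for line in zpool_list_output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, health = line.strip().split("\t", 1)
        if health != "ONLINE":
            pools[name] = health
    return pools


# Illustrative sample: two pools, one of them degraded.
SAMPLE_LIST = "boot-pool\tONLINE\ntank\tDEGRADED\n"
```

On a live system, the input could come from `subprocess.run(["zpool", "list", "-H", "-o", "name,health"], capture_output=True, text=True).stdout`.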
To replace a failed disk:
1. Locate the failed drive.
   a. Go to the Storage Dashboard and click Manage Devices on the Topology widget for the degraded pool to open the Devices screen for that pool.
   b. Click anywhere on the VDEV to expand it and look for the drive with the Offline status.
2. Take the disk offline. Click Offline on the ZFS Info widget. The button toggles to Online.
3. Pull the disk from your system and replace it with a disk of the same or greater capacity as the failed disk. Then:
   a. Click Replace on the Disk Info widget on the Devices screen for the disk you off-lined.
   b. Select the new drive from the Member Disk dropdown list on the Replacing disk diskname dialog.
4. Add the new disk to the existing VDEV. Click Replace Disk to add the new disk to the VDEV and bring it online.
Disk replacement fails when the selected disk has partitions or data present. To destroy any data on the replacement disk and allow the replacement to continue, select the Force option.
When the disk wipe completes, TrueNAS starts replacing the failed disk and resilvers the pool during the replacement process. For pools with large amounts of data, this can take a long time. When the resilver completes, the pool status returns to Online on the Devices screen.
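For readers who know ZFS from the command line, the UI procedure above corresponds roughly to the `zpool offline`, `zpool replace`, and `zpool status` commands. The sketch below only builds the argument lists and does not run anything. It is not what TrueNAS executes internally, and running these commands by hand bypasses the TrueNAS UI, so treat it as an aid to understanding. The function name and the device names are hypothetical. The `-f` flag of `zpool replace`, which forces use of a device that appears to be in use, is only loosely analogous to the Force option.

```python
def replacement_commands(pool: str, failed_dev: str, new_dev: str,
                         force: bool = False) -> list[list[str]]:
    """Build (but do not run) the ZFS commands that mirror the UI steps."""
    # Step 1: take the failed device offline.
    offline = ["zpool", "offline", pool, failed_dev]
    # Step 2: replace the failed device with the new one; this starts a resilver.
    replace = ["zpool", "replace"]
    if force:
        replace.append("-f")
    replace += [pool, failed_dev, new_dev]
    # Step 3: inspect pool state and resilver progress.
    status = ["zpool", "status", pool]
    return [offline, replace, status]
```

Calling `replacement_commands("tank", "sdb", "sdf")` returns the three commands in order, so they can be reviewed before anything is executed.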
We recommend taking a disk offline before starting the physical disk replacement. Off-lining a disk removes the device from the pool and can prevent swap issues.
There are situations where you can leave a disk that has not completely failed online to provide additional redundancy during the replacement procedure.
We do not recommend leaving failed disks online unless you know the exact condition of the failing disk.
Attempting to replace a heavily degraded disk without off-lining it significantly slows down the replacement process.
If the off-line operation fails with a Disk offline failed - no valid replicas message, go to the Storage Dashboard and click Scrub on the ZFS Health widget for the pool with the degraded disk. The Scrub Pool confirmation dialog opens. Select Confirm and then click Start Scrub.
When the scrub operation finishes, return to the Devices screen, click on the VDEV and then the disk, and try to off-line it again.
Click Manage Devices to open the Devices screen, then click anywhere on the VDEV to expand it and show its drives.
Click Offline on the ZFS Info widget. A confirmation dialog displays. Click Confirm and then Offline. The system begins taking the disk offline. When the process completes, the failed disk shows the Offline status and the button toggles to Online.
You can physically remove the disk from the system when the disk status is Offline. If the replacement disk is not already physically installed in the system, install it now.
Use Replace to bring the new disk online in the same VDEV.
After a disk fails, the hot spare takes over. To restore the hot spare to waiting status after replacing the failed drive, remove the hot spare from the pool, then re-add it to the pool as a new hot spare.