Setting Up Advanced Replication Tasks
14 minute read. Last Modified 2023-11-30 10:15 EST
TrueNAS SCALE advanced replication allows users to create one-time or regularly scheduled snapshots of data stored in pools, datasets, or zvols on their SCALE system as a way to back up stored data. When properly configured and scheduled, local or remote replication using the Advanced Replication Creation option takes regular snapshots of storage pools or datasets and saves them in the destination location on the same or another system.
The Advanced Replication Creation option opens the Add Replication Task screen. This screen provides access to the same settings found in the replication wizard but has more options to specify:
- Full file system replication
- Stream compression
- Replication speed
- Attempts to replicate data before the task fails
- Block size for data sent
- Log level verbosity
You can also:
- Change encrypted replication to allow an unencrypted dataset as the destination
- Create replication from scratch
- Include or exclude replication properties
- Replicate specific snapshots that match a defined creation time
- Prevent the snapshot retention policy from removing source system snapshots that failed to replicate
With the implementation of rootless login and the admin user, setting up replication tasks as an admin user differs in a few ways from setting up replication tasks when logged in as root. Setting up remote replication while logged in as the admin user requires selecting Use Sudo For ZFS Commands.
The first snapshot taken for a task creates a full file system snapshot, and all subsequent snapshots taken for that task are incremental to capture differences occurring between the full and subsequent incremental snapshots.
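The full-then-incremental pattern described above can be illustrated with a small model. This is a simplified sketch, not the TrueNAS replication engine: the function name `plan_sends` and the snapshot names are hypothetical, and the logic only assumes that an incremental send needs a snapshot both sides already share.

```python
def plan_sends(source_snaps, dest_snaps):
    """Return a list of (kind, base, target) send steps.

    source_snaps: snapshot names on the source, oldest first.
    dest_snaps: snapshot names already present on the destination.
    """
    dest = set(dest_snaps)
    # Find the newest source snapshot that the destination already holds.
    common = None
    for name in source_snaps:
        if name in dest:
            common = name

    steps = []
    base = None
    if common is None:
        # Nothing in common: the first snapshot is sent in full.
        pending = list(source_snaps)
        if pending:
            base = pending.pop(0)
            steps.append(("full", None, base))
    else:
        # Resume from the newest shared snapshot.
        base = common
        pending = source_snaps[source_snaps.index(common) + 1:]

    # Every remaining snapshot is sent as a difference from the previous one.
    for name in pending:
        steps.append(("incremental", base, name))
        base = name
    return steps


print(plan_sends(["auto-01", "auto-02", "auto-03"], []))
print(plan_sends(["auto-01", "auto-02", "auto-03"], ["auto-01", "auto-02"]))
```

The first call plans one full send followed by two incremental sends. The second call plans a single incremental send because the destination already holds the first two snapshots.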
Scheduling options allow users to run replication tasks daily, weekly, monthly, or on a custom schedule. Users also have the option to run a scheduled job on demand.
Replication tasks require a periodic snapshot task. Earlier releases of SCALE required creating a periodic snapshot task before the replication task, but SCALE 22.12 and newer automatically create the snapshot task when a scheduled replication task starts. To start a replication task using the Run Now option on the Replication Task widget or by selecting Run Once in the Replication Task Wizard, create a periodic snapshot task first.
Remote replication requires setting up an SSH connection in TrueNAS before creating a remote replication task.
This section provides a simple overview of setting up a replication task regardless of the type of replication, local or remote. It also covers the related steps you should take prior to configuring a replication task.
If using a TrueNAS SCALE Bluefin system on the early release (22.12.1), you must have the admin user correctly configured with:
- The Home Directory set to something other than /nonexistent
- The admin user in the builtin_admin group
- Passwordless sudo permission enabled for the admin user
Also verify the SSH service settings to make sure you have Log in as Root with Password, Log in as Admin with Password, and Allow Password Authentication selected to enable these capabilities.
Incorrect SSH service settings can prevent the admin user from establishing an SSH session during replication, and require you to obtain and paste a public SSH key into the admin user settings.
Set up the data storage for where you want to save replicated snapshots.
Make sure the admin user is correctly configured.
Create an SSH connection between the local SCALE system and the remote system for remote replication tasks. Local replication does not require an SSH connection. You can do this from either Credentials > Backup Credentials > SSH Connection and clicking Add or from the Replication Task Wizard using the Generate New option in the settings for the remote system.
Go to Data Protection > Replication Tasks and click Add to open the Replication Task Wizard where you specify the settings for the replication task.
Setting options change based on the source selections. Replicating to or from a local source does not require an SSH connection.
This completes the general process for all replication tasks.
Configure your SSH connection before you begin configuring the replication task through the Add Replication Task screen. If you have an existing SSH connection with the remote system, the option displays on the SSH Connection dropdown list.
Turn on the SSH service. Go to the System Settings > Services screen, verify the SSH service configuration, then enable it.
To access advanced replication settings, click Advanced Replication Creation at the bottom of the first screen of the Replication Task Wizard. The Add Replication Task configuration screen opens.
Before you begin configuring the replication task, first verify the destination dataset you want to use to store the replication snapshots is free of existing snapshots, or that snapshots with critical data are backed up before you create the task.
To create a replication task:
Create the destination dataset or storage location you want to use to store the replication snapshots. If using another TrueNAS SCALE system, create a dataset in one of your pools.
Verify the admin user home directory, auxiliary groups, and sudo setting on both the local and remote destination systems. Local replication does not require an SSH connection so this only applies to replication to another system.
If using a TrueNAS CORE system as the remote server, the remote user is always root.
If using a TrueNAS SCALE system on an earlier release like Angelfish, the remote user is always root.
If using an earlier TrueNAS SCALE Bluefin system (22.12.1) or you installed SCALE as the root user, then created the admin user after initial installation, you must verify the admin user is correctly configured.
a. Go to Credentials > Local User and click anywhere on the admin user row to expand it. Scroll down to the Home Directory setting. If set to /home/admin, select Create Home Directory, then click Save.
If set to /nonexistent, first create a dataset to use for home directories, like /tank/homedirs. Enter this in the Home Directory field, and make sure it is not read-only.
b. Select the sudo permission level you want the admin user to have. If you select Allow all sudo commands with no password, you do not need to make changes. If you select Allowed sudo commands with no password, enter /var/sbin/zfs in the Allowed sudo commands field.
c. Click Save.
Give the task a name and set the direction of the task. Unlike the wizard, the Name does not automatically populate with the source/destination task name after you set the source and destination for the task. Each task name must be unique, and we recommend you name it in a way that makes it easy to remember what the task is doing.
Select the direction of the task. Pull replicates data from a remote system to the local system. Push sends data from the local system to the remote.
Select the method of transfer for this replication from the Transport dropdown list. Select LOCAL to replicate data to another location on the same system. SSH is the standard option for sending or receiving data from a remote system. When using it, select the existing SSH Connection from the dropdown list. SSH+Netcat is available as a faster option for replications that take place within completely secure networks. SSH+Netcat requires defining netcat ports and addresses to use for the Netcat connection.
With SSH-based replications, select the SSH Connection to the remote system that sends or receives snapshots. To create a new connection to use for replication from a destination to this local system, select newpullssh.
The Use Sudo For ZFS Commands option controls whether the user used for SSH/SSH+NETCAT replication has passwordless sudo enabled to execute zfs commands on the remote host. If not selected, you must enter zfs allow on the remote system to grant the non-root user permission to perform ZFS tasks.
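As a rough illustration of the zfs allow alternative, the sketch below only assembles a command line and prints it. It does not run anything. The user name, dataset, and permission list are hypothetical values chosen for illustration, and the exact permissions a given replication task needs are an assumption to verify against your own systems.

```python
import shlex


def build_zfs_allow(user, dataset, permissions):
    """Build (but do not run) a `zfs allow` command that delegates
    the listed permissions on `dataset` to `user`."""
    if not permissions:
        raise ValueError("at least one permission is required")
    # Form: zfs allow -u <user> <perm[,perm...]> <dataset>
    return ["zfs", "allow", "-u", user, ",".join(permissions), dataset]


# Hypothetical values for illustration only.
cmd = build_zfs_allow("repluser", "tank/backups", ["create", "mount", "receive"])
print(shlex.join(cmd))  # zfs allow -u repluser create,mount,receive tank/backups
```

Printing the command before running it by hand on the remote system makes it easy to review exactly which permissions are being delegated.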
Specify the source and destination paths. Adding /name to the end of the path creates a new dataset in that location. Click the arrow to the left of each folder or dataset name to expand the options and browse to the dataset, then click on the dataset to populate the Source. Choose a preconfigured periodic snapshot task as the source of snapshots to replicate. Pulling snapshots from a remote source requires a valid SSH Connection before the file browser can show any directories.
A remote destination requires you to specify an SSH connection before you can enter or select the path. If the file browser shows a connection error after selecting the correct SSH Connection, you might need to log in to the remote system and configure it to allow SSH connections. Define how long to keep snapshots in the destination.
Remote sources require defining a snapshot naming schema to identify the snapshots to replicate. Local sources are replicated by snapshots that were generated from a periodic snapshot task and/or from a defined naming schema that matches manually created snapshots. DO NOT use zvols as remote destinations.
Select a previously configured periodic snapshot task for this replication task in Periodic Snapshot Tasks. The replication task selected must have the same values in Recursive and Exclude Child Datasets as the chosen periodic snapshot task. Selecting a periodic snapshot schedule removes the Schedule field.
If a periodic snapshot task does not exist, select Replicate Specific Snapshots to define specific snapshots from the periodic task to use for the replication. This displays the schedule options for the snapshot task. Enter the schedule. The only periodically generated snapshots included in the replication task are those that match your defined schedule.
Select the naming schema or regular expression option to use for this snapshot. A naming schema is a collection of strftime time and date strings and any identifiers that a user might have added to the snapshot name. For example, entering the naming schema custom-%Y-%m-%d_%H-%M finds and replicates snapshots like custom-2020-03-25_09-15. Enter multiple schemas by pressing Enter to separate each schema.
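To see how a strftime-style naming schema lines up with a snapshot name, the following sketch uses Python's standard `datetime.strptime`. It only illustrates the matching idea. The helper name `match_schema` is hypothetical, and the names passed in are the snapshot portion only (the part after `@` in a full ZFS snapshot name).

```python
from datetime import datetime


def match_schema(snapshot_name, schemas):
    """Return the timestamp encoded in snapshot_name if it fits any of the
    given strftime-style schemas, otherwise None."""
    for schema in schemas:
        try:
            return datetime.strptime(snapshot_name, schema)
        except ValueError:
            continue  # this schema does not fit; try the next one
    return None


schemas = ["custom-%Y-%m-%d_%H-%M"]
print(match_schema("custom-2020-03-25_09-15", schemas))  # 2020-03-25 09:15:00
print(match_schema("manual-backup", schemas))            # None
```

A name that does not follow the schema is simply skipped, which mirrors how only snapshots matching the entered schema are picked up for replication.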
Set the replication schedule to use and define when the replication task runs. Leave Run Automatically selected to use the snapshot task specified and start the replication immediately after the related periodic snapshot task completes. Select Schedule to display scheduling options for this replication task and automate the task according to its own schedule.
Selecting Schedule allows scheduling the replication to run at a separate time. Choose a time frame that gives the replication task enough time to finish and is during a time of day when network traffic for both source and destination systems is minimal. Use the custom scheduler (recommended) when you need to fine-tune an exact time or day for the replication.
Choosing a Presets option populates the rest of the fields. To customize a schedule, enter crontab values for the Minutes, Hours, and Days fields.
These fields accept standard cron values. The simplest option is to enter a single number in the field. The task runs when the time value matches that number. For example, entering 10 means that the job runs when the time is ten minutes past the hour.
An asterisk (*) means match all values.
You can set specific time ranges by entering hyphenated number values. For example, entering 30-35 in the Minutes field sets the task to run at minutes 30, 31, 32, 33, 34, and 35.
You can also enter lists of values. Enter individual values separated by a comma (,). For example, entering 1,14 in the Hours field means the task runs at 1:00 AM (0100) and 2:00 PM (1400).
A slash (/) designates a step value. For example, entering * in Days runs the task every day of the month. Entering */2 runs it every other day.
Combining the above examples creates a schedule running a task each minute from 1:30-1:35 AM and 2:30-2:35 PM every other day.
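The field rules above can be sketched as a small expander that turns one cron field into the list of values it matches. This is an illustrative model, not the scheduler TrueNAS uses. The function name `expand_field` is hypothetical, named values such as mon-fri are not handled, and treating a form like 0/8 as "start at 0, step by 8" is an assumption.

```python
def expand_field(expr, low, high):
    """Expand one numeric cron field into the sorted list of values it matches.

    low and high are the field's allowed bounds, e.g. 0-59 for minutes.
    """
    values = set()
    for part in expr.split(","):          # comma: list of items
        step, has_step = 1, False
        if "/" in part:                   # slash: step value
            part, step_text = part.split("/", 1)
            step, has_step = int(step_text), True
        if part == "*":                   # asterisk: every value
            start, end = low, high
        elif "-" in part:                 # hyphen: inclusive range
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:                             # single number
            start = int(part)
            # Assumption: "N/step" runs from N up to the field maximum.
            end = high if has_step else start
        if not (low <= start <= end <= high) or step < 1:
            raise ValueError(f"invalid cron field: {expr!r}")
        values.update(range(start, end + 1, step))
    return sorted(values)


print(expand_field("30-35", 0, 59))  # [30, 31, 32, 33, 34, 35]
print(expand_field("1,14", 0, 23))   # [1, 14]
print(expand_field("*/2", 1, 31))    # odd days: 1, 3, 5, ... 31
```

Taken together, the three results describe the combined schedule: minutes 30 through 35, in hours 1 and 14, on every other day of the month.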
TrueNAS has an option to select which Months the task runs. Leaving each month unset is the same as selecting every month.
The Days of Week option schedules the task to run on specific days of the week in addition to any days listed in Days. For example, entering 1 in Days and setting Wed for Days of Week creates a schedule that starts a task on the first day of the month and every Wednesday of the month.
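The "in addition to" behavior means the two day fields combine with OR rather than AND when both are restricted. A minimal sketch of that rule follows. The function name `runs_on` is hypothetical, and weekdays use Python's numbering (Monday is 0, so Wednesday is 2).

```python
from datetime import date


def runs_on(day, days_of_month, days_of_week):
    """Return True if a task restricted by both day fields runs on `day`.

    A day qualifies if it matches EITHER the day-of-month set OR the
    day-of-week set (Python numbering: Monday=0 ... Sunday=6).
    """
    return day.day in days_of_month or day.weekday() in days_of_week


# Days = 1, Days of Week = Wed, checked against December 2023.
december = [date(2023, 12, d) for d in range(1, 32)]
run_days = [d.day for d in december if runs_on(d, {1}, {2})]
print(run_days)  # [1, 6, 13, 20, 27]
```

December 1, 2023 is a Friday, so it appears only because of the Days value. The remaining dates are the Wednesdays of that month.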
The Schedule Preview shows when the task runs under the current settings.
| Syntax | Meaning | Examples |
|---|---|---|
| * | Every item. | * (minutes) = every minute of the hour. * (days) = every day. |
| */N | Every Nth item. | */15 (minutes) = every 15th minute of the hour. */3 (days) = every 3rd day. */3 (months) = every 3rd month. |
| Comma and hyphen/dash | Each stated item (comma). Each item in a range (hyphen/dash). | 1,31 (minutes) = on the 1st and 31st minute of the hour. 1-3,31 (minutes) = on the 1st to 3rd minutes inclusive, and the 31st minute, of the hour. mon-fri (days) = every Monday to Friday inclusive (every weekday). mar,jun,sep,dec (months) = every March, June, September, December. |
You can specify days of the month or days of the week.
TrueNAS lets users create flexible schedules using the available options. The table below has some examples:
| Desired schedule | Values to enter |
|---|---|
| 3 times a day (at midnight, 08:00 and 16:00) | months=*; days=*; hours=0/8 or 0,8,16; minutes=0 (Meaning: every day of every month, when hours=0/8/16 and minutes=0) |
| Every Monday/Wednesday/Friday, at 8:30 pm | months=*; days=mon,wed,fri; hours=20; minutes=30 |
| 1st and 15th day of the month, during October to June, at 00:01 am | months=oct-dec,jan-jun; days=1,15; hours=0; minutes=1 |
| Every 15 minutes during the working week, which is 8am - 7pm (08:00 - 19:00) Monday to Friday | Note that this requires two tasks to achieve: (1) months=*; days=mon-fri; hours=8-18; minutes=*/15 (2) months=*; days=mon-fri; hours=19; minutes=0. We need the second scheduled item to execute at 19:00, otherwise we would stop at 18:45. Another workaround would be to stop at 18:45 or 19:45 rather than 19:00. |
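The reason the working-week example needs two tasks can be checked by listing the run times each one produces in a day. This is a plain enumeration for illustration. The helper name `run_times` is hypothetical.

```python
def run_times(hours, minutes):
    """List HH:MM run times for every combination of the given hours and minutes."""
    return [f"{h:02d}:{m:02d}" for h in hours for m in minutes]


task1 = run_times(range(8, 19), range(0, 60, 15))  # hours=8-18; minutes=*/15
task2 = run_times([19], [0])                       # hours=19; minutes=0
combined = task1 + task2

print(task1[0], task1[-1])  # 08:00 18:45
print(combined[-1])         # 19:00
print(len(combined))        # 45
```

The first task alone ends at 18:45. Extending its hours to 8-19 would instead add 19:15, 19:30, and 19:45, which is why the single 19:00 run is scheduled separately.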
Options for compressing data, adding a bandwidth limit, or other data stream customizations are available. Stream Compression options are only available when using SSH. Before enabling Compressed WRITE Records, verify that the destination system also supports compressed write records.
Allow Blocks Larger than 128KB is a one-way toggle. Replication tasks using large block replication only continue to work as long as this option remains enabled.
By default, the replication task uses snapshots to quickly transfer data to the receiving system. Selecting Full Filesystem Replication means the task completely replicates the chosen Source, including all dataset properties, snapshots, child datasets, and clones. When using this option, we recommend allocating additional time for the replication task to run.
Leave Full Filesystem Replication unselected and select Include Dataset Properties to include just the dataset properties in the snapshots to replicate. Leave this option unselected on an encrypted dataset to replicate the data to another unencrypted dataset.
Select Recursive to recursively replicate child dataset snapshots or exclude specific child datasets or properties from the replication.
Enter newly defined properties in Properties Override to replace existing dataset properties with the newly defined properties in the replicated files.
List any existing dataset properties to remove from the replicated files in Properties Exclude.
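The combined effect of Properties Override and Properties Exclude can be pictured as a dictionary transformation. This is a conceptual sketch only. The function name `replicated_properties` and the sample property values are hypothetical, and letting an override win over an exclusion of the same property is an assumption, not documented TrueNAS behavior.

```python
def replicated_properties(source_props, override=None, exclude=None):
    """Return the properties the replicated dataset would carry.

    Excluded properties are dropped first, then overrides replace or add values.
    """
    excluded = set(exclude or [])
    result = {k: v for k, v in source_props.items() if k not in excluded}
    result.update(override or {})
    return result


source = {"compression": "lz4", "atime": "on", "mountpoint": "/mnt/tank/data"}
print(replicated_properties(
    source,
    override={"compression": "zstd"},
    exclude=["mountpoint"],
))
# {'compression': 'zstd', 'atime': 'on'}
```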
When a replication task is having difficulty completing, it is a good idea to select Save Pending Snapshots. This prevents the source TrueNAS from automatically deleting any snapshots that failed to replicate to the destination system.
By default, the destination dataset is set to be read-only after the replication completes. You can change the Destination Dataset Read-only Policy to only start replication when the destination is read-only (set to REQUIRE) or to disable it by setting it to IGNORE.
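The three policy values can be summarized as a small decision function. This sketch is a reading of the paragraph above, not the actual implementation. The function name `readonly_policy` is hypothetical, and SET is assumed to be the name of the default policy.

```python
def readonly_policy(policy, dest_is_readonly):
    """Return (may_start, set_readonly_after) for a destination dataset.

    SET:     always start, mark the destination read-only afterwards.
    REQUIRE: start only if the destination is already read-only.
    IGNORE:  always start, leave the read-only property alone.
    """
    if policy == "SET":
        return True, True
    if policy == "REQUIRE":
        return dest_is_readonly, False
    if policy == "IGNORE":
        return True, False
    raise ValueError(f"unknown policy: {policy!r}")


print(readonly_policy("SET", False))      # (True, True)
print(readonly_policy("REQUIRE", False))  # (False, False)
print(readonly_policy("IGNORE", False))   # (True, False)
```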
The Encryption option adds another layer of security to replicated data by encrypting the data before transfer and decrypting it on the destination system. Selecting Encryption adds the additional setting options HEX key or PASSPHRASE. You can store the encryption key either in the TrueNAS system database or in a custom-defined location.
Synchronizing Destination Snapshots With Source destroys any snapshots in the destination that do not match the source snapshots. TrueNAS also does a full replication of the source snapshots as if the replication task had never run, which can lead to excessive bandwidth consumption.
This can be a very destructive option. Make sure that any snapshots deleted from the destination are obsolete or otherwise backed up in a different location.
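Before enabling this option, it can help to list which destination snapshots have no counterpart on the source, since those are the ones at risk. The sketch below is a plain set comparison on snapshot names. The function name and the sample names are hypothetical, and it does not query or change any system.

```python
def snapshots_at_risk(source_snaps, dest_snaps):
    """Return destination snapshot names that do not exist on the source."""
    source = set(source_snaps)
    return sorted(name for name in dest_snaps if name not in source)


source = ["auto-2023-11-28", "auto-2023-11-29", "auto-2023-11-30"]
dest = ["auto-2023-11-01", "auto-2023-11-28", "auto-2023-11-29", "manual-before-upgrade"]
print(snapshots_at_risk(source, dest))
# ['auto-2023-11-01', 'manual-before-upgrade']
```

Anything in the printed list should be confirmed obsolete or backed up elsewhere first.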
Defining the Snapshot Retention Policy is generally recommended to prevent cluttering the system with obsolete snapshots. Choosing Same as Source keeps the snapshots on the destination system for the same amount of time as the defined Snapshot Lifetime from the source system periodic snapshot task.
You can use Custom to define your own lifetime for snapshots on the destination system.
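A custom lifetime amounts to comparing each snapshot's age against a fixed duration. The sketch below shows that comparison using timestamps parsed from snapshot names. It is illustrative only: the function name, the schema, and the two-week lifetime are hypothetical, and names that do not fit the schema are left untouched.

```python
from datetime import datetime, timedelta


def expired_snapshots(names, schema, lifetime, now):
    """Return the snapshot names whose encoded timestamp is older than `lifetime`."""
    expired = []
    for name in names:
        try:
            created = datetime.strptime(name, schema)
        except ValueError:
            continue  # name does not follow the schema; never expire it here
        if now - created > lifetime:
            expired.append(name)
    return expired


names = ["auto-2023-11-01_00-00", "auto-2023-11-20_00-00", "manual-keep"]
print(expired_snapshots(
    names,
    schema="auto-%Y-%m-%d_%H-%M",
    lifetime=timedelta(weeks=2),
    now=datetime(2023, 11, 30),
))
# ['auto-2023-11-01_00-00']
```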
Selecting Only Replicate Snapshots Matching Schedule restricts the replication to only replicate those snapshots created at the same time as the replication schedule.
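As an illustration of this filter, the sketch below keeps only the snapshots whose creation time falls on the replication schedule. It is a simplified model that checks hours and minutes only. The function name, the hourly snapshot set, and the twice-daily schedule are hypothetical.

```python
from datetime import datetime


def matching_snapshots(snapshots, hours, minutes):
    """Return names of snapshots created at a time the schedule matches.

    snapshots: dict mapping snapshot name -> creation datetime.
    hours, minutes: sets of values the replication schedule allows.
    """
    return sorted(
        name
        for name, created in snapshots.items()
        if created.hour in hours and created.minute in minutes
    )


# Hourly snapshots for one day, replicated on a twice-daily schedule.
hourly = {
    f"auto-2023-11-30_{h:02d}-00": datetime(2023, 11, 30, h, 0)
    for h in range(24)
}
print(matching_snapshots(hourly, hours={0, 12}, minutes={0}))
# ['auto-2023-11-30_00-00', 'auto-2023-11-30_12-00']
```

With the option selected, the other 22 hourly snapshots in this example stay on the source and are not sent to the destination.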