Overview

While fsr's file I/O filter driver operates at the kernel level, the file replication engine, which transfers replication data to the target and synchronizes files, is implemented at the application level. This engine is called Syncer.

Syncer is a cross-platform module that supports both Windows and Linux. It is the core engine of fsr and provides replication data transfer, file synchronization, and consistency checking.

Synchronization and consistency check

Both synchronization and consistency checking are performed by Syncer. Syncer first explores the synchronization (or consistency check) targets and then performs either synchronization or consistency checking, depending on the operation mode.

operation mode	function flags	description
full-sync	sync	Full synchronization. Applies the contents of all files without hash comparisons.
partial-sync	hash compare, sync	Partial synchronization. After comparing the hashes, applies the contents of the areas that differ.
verify	hash compare	One-time consistency verification. Used when not in a replication relationship or in a paused state.
advanced-verify	hash compare, wait replication sequence	Consistency verification during replication. Used to verify consistency between nodes while replication is in progress.
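
The table above can be read as a mapping from operation mode to a set of function flags. The following is a minimal sketch of that mapping; the names SyncFlag, MODE_FLAGS, and writes_to_target are hypothetical and chosen for illustration only, not taken from fsr itself.

```python
from enum import Flag, auto


class SyncFlag(Flag):
    # Hypothetical flag names mirroring the "function flags" column.
    SYNC = auto()
    HASH_COMPARE = auto()
    WAIT_REPLICATION_SEQUENCE = auto()


# Operation mode -> function flags, as listed in the table.
MODE_FLAGS = {
    "full-sync": SyncFlag.SYNC,
    "partial-sync": SyncFlag.HASH_COMPARE | SyncFlag.SYNC,
    "verify": SyncFlag.HASH_COMPARE,
    "advanced-verify": SyncFlag.HASH_COMPARE | SyncFlag.WAIT_REPLICATION_SEQUENCE,
}


def writes_to_target(mode: str) -> bool:
    """Return True if the mode applies data to the target (sync flag set)."""
    return SyncFlag.SYNC in MODE_FLAGS[mode]
```

Under this reading, only full-sync and partial-sync modify the target; the two verify modes only compare hashes.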

Explore phase

This phase explores the target directories and files that need to be synchronized (or checked for consistency).

Syncer builds a list of target files and compares the file lists and attributes of the local and remote locations. The comparison loops through the local file list, comparing each file in turn and determining which files exist only locally (missing files) or only remotely (orphaned files).

After the explore phase, Syncer moves directly to the action phase.
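
The file list comparison described above can be sketched as follows. This is an illustration only: it assumes each file list is a dictionary keyed by relative path whose value is an attribute tuple such as (size, mtime), and the function name classify_files is hypothetical.

```python
def classify_files(local: dict, remote: dict):
    """Compare two file lists keyed by relative path.

    Values are attribute tuples, e.g. (size, mtime); this layout is an
    assumption made for the sketch.
    """
    local_only = []   # present locally, missing on the remote side
    differing = []    # present on both sides with different attributes
    for path, attrs in local.items():
        if path not in remote:
            local_only.append(path)
        elif remote[path] != attrs:
            differing.append(path)
    # Orphaned files: present on the remote side only.
    remote_only = [path for path in remote if path not in local]
    return local_only, remote_only, differing
```

For example, classify_files({"a": (10, 1), "b": (5, 2)}, {"b": (7, 2), "c": (1, 3)}) yields "a" as local only, "c" as remote only, and "b" as differing.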

Action phase

The first step in the action phase is to calculate the estimated time for synchronization (or checking). Syncer cycles through the list of differences generated in the file list comparison step and sums the size of each file to estimate the total amount of data to process. If a file's local and remote sizes differ, the larger size is used. The size of a file block is determined by the size of the file, with a minimum of 128 KB and a maximum of 16 MB, usually in 1 MB increments. After the calculation is complete, the file blocks are compared sequentially, and either block data is transferred to the target side and applied, or differences in attributes are reconciled.

If the operation mode is a consistency check (verify or advanced-verify), only the file blocks are compared at this point.

The advanced-verify mode is an option for running a consistency check while the resource is online and the data is changing (that is, while replication is in progress). In this mode, for files that are being replicated while the check is running, the check waits for the replication of the block to complete before checking it.
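
The size estimate and block size limits can be sketched as follows. The 128 KB and 16 MB limits and the "use the larger size" rule come from the text; the rule that maps a file size to a block size is not stated, so the file_size // 1024 formula below is purely an assumption for illustration, as are all function names.

```python
KB = 1024
MB = 1024 * KB
MIN_BLOCK = 128 * KB   # minimum block size from the text
MAX_BLOCK = 16 * MB    # maximum block size from the text


def estimated_total(diffs):
    """Sum the larger of (local_size, remote_size) for each differing file.

    A file missing on one side is represented with size 0 on that side.
    """
    return sum(max(local_size, remote_size) for local_size, remote_size in diffs)


def block_size(file_size: int) -> int:
    """Pick a block size that grows with the file, clamped to the limits.

    The proportional rule (file_size // 1024) is a hypothetical choice;
    only the clamping range is taken from the document.
    """
    candidate = file_size // 1024
    return max(MIN_BLOCK, min(MAX_BLOCK, candidate))


def block_count(file_size: int) -> int:
    """Number of blocks needed to cover the file (ceiling division)."""
    return -(-file_size // block_size(file_size))
```

With this assumed rule, a 1 GiB file gets 1 MiB blocks (1024 blocks), while very small and very large files are pinned to the 128 KB and 16 MB limits respectively.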

Synchronization with replication

Synchronization and replication must be able to run simultaneously, because the source-side data may continue to change through I/O even during synchronization. If replication data is received for a block range that is scheduled for synchronization, that range is excluded from the synchronization target. If replication data arrives while the target is processing synchronization data for the same block, the replication data is queued and processed in order with the local/remote hashes or data as they are received, so that the file block is never accessed concurrently. If replication data that arrived while the target was waiting for synchronization data were written to an area first and the synchronization data were written afterwards, the older data would end up on disk. To avoid this, the replication write is deferred until after the synchronization data has been written. Since the replication data is always newer than the synchronization block, it makes sense to apply the replication data last.
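
The ordering rules above can be sketched with a small scheduler. This is a simplified illustration under stated assumptions: replication writes are assumed to cover whole blocks, and the class BlockScheduler with its methods is hypothetical rather than part of fsr.

```python
from collections import defaultdict


class BlockScheduler:
    """Toy model of how sync and replication writes to one file are ordered."""

    def __init__(self, block_count: int):
        self.pending = set(range(block_count))  # blocks still scheduled for sync
        self.in_flight = set()                  # blocks whose sync data is awaited
        self.deferred = defaultdict(list)       # replication data queued per block
        self.writes = []                        # (block, data) in the order written

    def start_sync(self, block: int) -> bool:
        """Begin syncing a block; return False if it was excluded earlier."""
        if block not in self.pending:
            return False
        self.pending.discard(block)
        self.in_flight.add(block)
        return True

    def on_replication(self, block: int, data) -> None:
        if block in self.in_flight:
            # Sync data for this block is on its way: defer the newer data.
            self.deferred[block].append(data)
            return
        # Replication already carries newer data, so skip syncing this block.
        self.pending.discard(block)
        self.writes.append((block, data))

    def finish_sync(self, block: int, data) -> None:
        """Write sync data first, then the deferred (newer) replication data."""
        self.writes.append((block, data))
        self.in_flight.discard(block)
        for queued in self.deferred.pop(block, []):
            self.writes.append((block, queued))
```

In this model, replication data for a block that is still only scheduled removes the block from the sync set, while replication data for a block in flight is always written after the sync data, so the newest data ends up on disk.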

Reconciliation resync

As shown in the figure, in n-node replication, if the primary node fails suddenly during real-time replication (for example, because of a power failure), data may become inconsistent between the secondary nodes that were receiving replication data. Even though the primary node delivers replication data in real time, it delivers the data to each node asynchronously, so some inconsistency between nodes is unavoidable. The problem is that the remaining nodes all report the data status UpToDate.

In this situation, FSR synchronizes from the surviving node that has the most recent data so that the data on all surviving nodes in the cluster matches. This is called reconciliation resynchronization.

By bringing the data of the remaining nodes up to date, reconciliation resynchronization serves to increase the availability of the cluster until the nodes that went down first are brought back up.
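
The choice of synchronization source can be sketched as follows. The document does not say how FSR decides which node has the most recent data; this sketch assumes each surviving node reports the sequence number of the last replication data it applied, and the function name pick_sync_source is hypothetical.

```python
def pick_sync_source(last_seq: dict):
    """Choose the surviving node with the highest applied sequence number.

    last_seq maps node name -> last applied replication sequence number
    (an assumed metric for "most recent data").
    Returns (source_node, nodes_that_need_resync).
    """
    source = max(last_seq, key=last_seq.get)
    targets = [
        node for node, seq in last_seq.items()
        if node != source and seq < last_seq[source]
    ]
    return source, targets
```

For example, with {"node2": 120, "node3": 135, "node4": 135}, node3 is chosen as the source and only node2 needs to be resynchronized.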
