FSR
FSR (File-level Sync and Replication) is a host-based, file-level replication and real-time backup solution that replicates critical enterprise data over the network.
Users can specify replication targets at the file and directory level and build flexible replication environments in configurations such as 1:1, 1:N, N:1, and shared replication. Replication can be deployed into an already running environment as a live migration, with no service downtime. FSR also provides functions such as automatic detection of split brain (which can occur in redundant configurations), backup of deleted data, compression and encryption, and snapshots.
Main Features
Synchronization
Before replication begins, FSR copies all data from the source node to the target node so that the two match. This process is called synchronization.
The initial synchronization copies the entire data set. If resynchronization is needed after a synchronization has completed once, FSR performs an incremental synchronization that transfers only the data changed on the source node.
Resynchronization is performed when the replication connection is re-established, for example after the replication network is disconnected and reconnected, or after a node is rebooted. In other words, after the initial synchronization, FSR automatically performs incremental synchronization whenever necessary.
Synchronization can also be started by the user, either with synchronization commands or through synchronization schedule settings.
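The difference between initial and incremental synchronization can be sketched as follows. This is a minimal illustration and not FSR's implementation: the function names and the use of file size and modification time as the change signal are assumptions made for the example.

```python
import os

def snapshot(root):
    """Map each file path (relative to root) to its (size, mtime_ns)."""
    state = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            state[os.path.relpath(full, root)] = (st.st_size, st.st_mtime_ns)
    return state

def files_to_sync(current, last_synced=None):
    """Return the relative paths that must be copied to the target.

    With no previous state (initial synchronization) every file is returned.
    Otherwise only new or changed files are returned (incremental).
    """
    if last_synced is None:
        return sorted(current)
    return sorted(p for p, meta in current.items() if last_synced.get(p) != meta)
```

In this sketch, a caller would keep the result of snapshot() from the last completed synchronization and pass it as last_synced after a reconnect or reboot.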
Replication
Data on the source node can change in real time, both after synchronization is complete and while it is still in progress. Reflecting these changes from the source node to the target node in real time is called replication. Synchronization and replication both serve to reflect data from source to target, but FSR treats them as separate operations.
Replication is either synchronous or asynchronous, depending on how it is handled.
- The synchronous method guarantees target consistency by completing a write I/O only after it has been reflected on the disks of both the source and the target. Because the response time of the target node adds to local I/O latency, deploying performance-critical services synchronously comes with a performance constraint.
- The asynchronous method considers replication complete when the disk write I/O has been reflected locally and the replicated data has been copied to the transfer buffer. This approach is well suited to long-distance replication with no bandwidth constraints, with the caveat that data still in the replication buffer can be lost when a failover occurs.
FSR natively supports asynchronous replication. Asynchronous replication performs internal buffering to minimize the impact of local I/O delays, so the size of the buffer used must be set appropriately for your operating environment. The buffer is provided as a memory buffer and a file buffer, and its size is determined based on the contents of the buffer configuration in Configurations.
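The asynchronous completion rule described above can be modeled with a bounded queue. This is a toy sketch under stated assumptions: the AsyncReplicator class, its in-memory lists standing in for disks, and the slot-based buffer size are all hypothetical and do not reflect FSR's buffer implementation or configuration keys.

```python
import queue
import threading

class AsyncReplicator:
    """Toy model: a write completes once it is stored locally and buffered."""

    def __init__(self, buffer_slots):
        self.local = []    # stand-in for the source disk
        self.target = []   # stand-in for the target disk
        self.buffer = queue.Queue(maxsize=buffer_slots)  # bounded transfer buffer
        self._sender = threading.Thread(target=self._send_loop, daemon=True)
        self._sender.start()

    def write(self, data):
        self.local.append(data)   # local write
        self.buffer.put(data)     # blocks only while the buffer is full
        # Returning here models "replication complete" for the caller.

    def _send_loop(self):
        while True:
            data = self.buffer.get()
            self.target.append(data)  # stand-in for the network transfer
            self.buffer.task_done()

    def drain(self):
        """Wait until everything buffered has reached the target."""
        self.buffer.join()
```

When the buffer is full, write() blocks until the sender frees a slot, which is why an undersized buffer shows up as local I/O delay. Anything still in the buffer when the source fails never reaches the target.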
Online File Verification
Online File Verification computes file-level hash digests for the replicated file sets on the source and target nodes, lists them, and compares them in real time. If the comparison finds a difference, FSR notifies the user, and the difference can be resolved by resynchronization.
Under normal operating conditions, FSR does not need to verify the integrity of the source and target. Online File Verification is useful in the following situation:
- When the files on the target are not protected and data may have been modified or deleted unintentionally, so that you need to compare the source and target for differences.
For more information, refer to Online Verification.
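A file-level hash comparison of the kind described above can be sketched as follows. The choice of SHA-256 and the helper names are assumptions for illustration. The hash algorithm that FSR uses is not stated here.

```python
import hashlib

def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def differing_files(source_digests, target_digests):
    """Compare two {relative_path: digest} maps.

    Returns the paths whose digests differ or that exist on only one side.
    """
    all_paths = source_digests.keys() | target_digests.keys()
    return sorted(
        p for p in all_paths if source_digests.get(p) != target_digests.get(p)
    )
```

Each node would build its own digest map and only the maps need to be exchanged, so file contents do not cross the network during verification.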
Optimization
Depending on the operating environment, it can be difficult to keep the target consistent when transmission bandwidth is low. Where the physical infrastructure cannot adequately support replication, FSR can reduce the transmission load by compressing the transmitted data and, if needed, encrypt it in real time at the same time.
It can also limit the bandwidth used for replication to a chosen value, according to the operating situation, so that a public network is used efficiently. For more information, refer to Etc.
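The effect of compression and of a bandwidth limit can be sketched with the standard library. The function names are hypothetical, zlib is used only as a stand-in for whatever compression FSR applies, and encryption is left out of the sketch.

```python
import zlib

def pack(payload: bytes, level: int = 6) -> bytes:
    """Compress a replication payload before it is transmitted."""
    return zlib.compress(payload, level)

def unpack(packet: bytes) -> bytes:
    """Restore the original payload on the receiving side."""
    return zlib.decompress(packet)

def send_delay(num_bytes: int, limit_bytes_per_sec: int) -> float:
    """Minimum seconds a sender must spend on num_bytes to respect the limit."""
    return num_bytes / limit_bytes_per_sec
```

A sender that sleeps for send_delay(len(packet), limit) between packets stays at or below the configured rate. Compressing first means the limit is spent on fewer bytes.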
Split-brain detection
FSR automatically detects split brain. This feature is unique to FSR and is not available in other commercial file replication solutions. Other file replication solutions leave split-brain detection to the HA solution. That approach depends on HA operations and, in some situations, can cause data loss through a reverse synchronization. To avoid this problem fundamentally, the replication solution itself must track file state and detect split brain automatically.
For more information, refer to Section Split-Brain.
Statistics
FSR provides statistics for monitoring replication operation and performance in real time. For more information, refer to Checking Status.
Snapshot
In addition to real-time replication to the target node, FSR provides a snapshot function that backs up data as of a specific point in time to a separate space. For more information, refer to 5.5. Snapshot.
HA/DR support
FSR provides a CLI and a REST API for HA/DR integration. Refer to System Manual, Appendix B. FSR Interface Guide.
Terms
Node
A node is a generic term for a device connected to a network, and a node that is a computer is called a host. The two terms are commonly used interchangeably, and this manual uses node with the same meaning as host.
Cluster
A cluster is a collection of computer nodes for special purposes. The cluster here is a replication cluster, which contains the source and target nodes that are configured to perform replication, and the FSR represents these replication clusters on a resource basis.
Resource
A resource is a replication resource, the unit that provides a replication service. A resource consists of a node, a connection, and a set of replication filesets, and can be described by an FSR configuration file. FSR reads the replication environment and settings from the configuration file and performs replication accordingly.
Source (Primary)
The node that holds the original data in a replication cluster is called the source node, or source. Its role in the replication cluster is Primary.
Terms such as primary, source, and active node are used in different ways depending on the replication or HA environment, but they are not usually strictly distinguished.
Target (Secondary)
The node that receives the original or incremental data in a replication cluster and maintains a copy is called the target node, or target. Its role in the replication cluster is Secondary.
Terms such as secondary, target, and standby node are used in different ways depending on the replication or HA environment, but they are not usually strictly distinguished.
For replication, the Primary is always the source and the Secondary is always the target. For synchronization, however, a Secondary can also be the source. This is because, even when only Secondary nodes exist in the replication cluster, they must be able to synchronize from the Secondary that holds the latest data.
File Set
A fileset is a unit within a resource that describes what is replicated. It names the files or directories to be replicated and contains an exclusion filter. An exclusion filter is a policy that excludes some files or directories from replication. It is written as matching expressions, such as wildcards.
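An exclusion filter based on wildcards can be sketched as follows. The function name and the pattern syntax (shell-style wildcards from Python's fnmatch module) are assumptions for illustration. FSR's actual filter syntax is defined by its configuration file.

```python
from fnmatch import fnmatchcase

def select_for_replication(paths, exclude_patterns):
    """Return the paths that match none of the exclusion patterns."""
    return [
        p for p in paths
        if not any(fnmatchcase(p, pattern) for pattern in exclude_patterns)
    ]
```

For example, with the patterns "*.tmp" and "logs/*", the path "data/a.tmp" and everything under "logs/" would be left out, while "data/a.db" would be replicated.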
Consistency
Consistency refers to data integrity: the data on the source and the target is exactly the same. File replication ensures byte-level data consistency.
Split-Brain
A split brain is a condition in which two or more nodes in the replication cluster hold the primary role at the same point in time, which can cause data loss. When a split brain occurs, the user decides which of the nodes holding the primary role to sacrifice, then resolves the split brain so that replication returns to normal.
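The definition above can be expressed as a simple check over node roles. This sketch only illustrates the condition itself. The role map and function names are hypothetical, and it does not show how FSR detects the condition in practice.

```python
def primaries(roles):
    """Return the names of all nodes currently holding the Primary role."""
    return sorted(node for node, role in roles.items() if role == "Primary")

def is_split_brain(roles):
    """A split brain exists when two or more nodes are Primary at once."""
    return len(primaries(roles)) >= 2
```

When is_split_brain() is true, primaries() gives the candidates from which the user picks the node to keep. The changes on the others are the ones sacrificed.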
RID
FSR maintains and tracks a ULID-based unique identifier that represents the file state of each replicated fileset. This value is called the revision identifier (RID). FSR uses the RID to determine the direction of synchronization and to identify split brain.
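For readers unfamiliar with ULIDs, the sketch below builds one by hand: a 48-bit millisecond timestamp followed by 80 random bits, encoded as 26 Crockford Base32 characters. The make_ulid function is hypothetical and shows only the general ULID layout. How FSR derives, stores, and compares RIDs is not described here.

```python
import os
import time

# Crockford Base32 alphabet (no I, L, O, U); its characters are in ASCII order.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

def make_ulid(timestamp_ms=None, randomness=None):
    """Build a 26-character ULID from a ms timestamp and 10 random bytes."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000
    if randomness is None:
        randomness = os.urandom(10)
    value = (timestamp_ms << 80) | int.from_bytes(randomness, "big")
    chars = []
    for _ in range(26):
        chars.append(CROCKFORD[value & 0x1F])  # take the lowest 5 bits
        value >>= 5
    return "".join(reversed(chars))
```

Because the timestamp occupies the leading characters and the alphabet is in ASCII order, plain string comparison orders ULIDs created in different milliseconds by creation time. That property makes ULID-style identifiers convenient for telling which of two recorded states is newer.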
Topology
FSR can configure various connections between nodes, depending on the replication configuration. It typically operates in a mesh or star topology.
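The two topologies differ in which node pairs are connected. The sketch below lists the connections for each. The function names are hypothetical and the node names are placeholders.

```python
from itertools import combinations

def mesh_links(nodes):
    """Full mesh: every node is connected to every other node."""
    return list(combinations(nodes, 2))

def star_links(hub, nodes):
    """Star: every other node is connected only to the hub."""
    return [(hub, n) for n in nodes if n != hub]
```

For N nodes, a full mesh needs N * (N - 1) / 2 connections, while a star needs N - 1.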