Kernel Filter Driver
Overview
FSR implements file-level replication. File replication rests on the premise that data on a volume is managed as files and paths laid out by a filesystem. FSR inserts a kernel filter driver on top of the filesystem layer that examines every write I/O issued to the volume and determines whether it is a replication target. The path-matching logic that makes this decision inside the kernel filter driver is referred to here as the path filter. The path filter checks whether the destination path of an incoming write I/O is a replication target, and the driver then either buffers the I/O or passes it through to the lower layer. If the write I/O is buffered, the driver forwards the I/O data to the FSR Syncer engine through a shared memory area or a file. The Syncer engine is then responsible for sending the data to the target.
This section describes the key elements of this process: the path filter, the file stream context, and the shared buffers.
Path filters
A path filter is the component that holds the replication target paths specified for a resource and searches them. Because the path filter runs in the context of local, real-time I/O, its search time has a direct impact on local I/O latency. Although the internal structure of the path filter is optimized around a binary tree, the search cost inevitably grows as the number of registered target paths grows. Latency also grows with the number of replicated resources, because a path filter search must be performed for each resource. For local performance, it is therefore recommended to keep the number of path filter entries to a minimum when configuring replication targets. Overly complex replication path settings are detrimental to performance.
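The lookup cost described above can be illustrated with a minimal sketch. It is not FSR's implementation: it uses a sorted list with binary search as a stand-in for the binary-tree structure, assumes POSIX-style paths, and treats a path as a target if it or any ancestor directory is configured. The names PathFilter, is_target, and matching_resources are hypothetical.

```python
import bisect
from typing import Dict, List


class PathFilter:
    """Hypothetical per-resource path filter (illustrative names only)."""

    def __init__(self, target_paths: List[str]) -> None:
        # Normalize trailing slashes and keep targets sorted for binary search.
        self._targets = sorted({p.rstrip("/") or "/" for p in target_paths})

    def _contains(self, path: str) -> bool:
        # O(log n) membership test on the sorted target list.
        i = bisect.bisect_left(self._targets, path)
        return i < len(self._targets) and self._targets[i] == path

    def is_target(self, path: str) -> bool:
        # Check the path itself, then each ancestor directory up to the root.
        path = path.rstrip("/") or "/"
        while True:
            if self._contains(path):
                return True
            if path == "/" or "/" not in path:
                return False
            path = path.rsplit("/", 1)[0] or "/"


def matching_resources(filters: Dict[str, PathFilter], path: str) -> List[str]:
    # One search per resource: the cost grows with the number of resources.
    return [name for name, f in filters.items() if f.is_target(path)]
```

In this sketch each write costs one binary search per path component per resource, which is why fewer target paths and fewer resources both reduce the work done on the local I/O path.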
File Stream Context
Standard file I/O includes the following operations: open, read, write, delete, lookup, set, and close. FSR must examine each operation and take the appropriate action for it. Looking up the I/O type and related information from the local I/O context on every operation would impose a significant load and become a bottleneck. FSR avoids this by allocating a context per file stream, caching the file's path and other relevant information there, and referencing it whenever necessary.
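The idea of a per-stream context can be sketched as follows. This is an illustration under assumptions, not the driver's actual data structure: the names StreamContext and StreamContextTable are hypothetical, streams are identified by an integer ID, and the sketch assumes the path filter result is one of the cached items.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class StreamContext:
    """Hypothetical cached information for one open file stream."""
    path: str
    is_target: bool
    bytes_written: int = 0


class StreamContextTable:
    def __init__(self, is_target_fn: Callable[[str], bool]) -> None:
        self._is_target = is_target_fn
        self._contexts: Dict[int, StreamContext] = {}
        self.lookups = 0  # counts how often the path filter is consulted

    def on_open(self, stream_id: int, path: str) -> StreamContext:
        # Resolve the path once at open time and cache the result.
        self.lookups += 1
        ctx = StreamContext(path, self._is_target(path))
        self._contexts[stream_id] = ctx
        return ctx

    def on_write(self, stream_id: int, data: bytes) -> bool:
        # Later operations only read the cached context; no path search here.
        ctx = self._contexts.get(stream_id)
        if ctx is None or not ctx.is_target:
            return False  # bypass to the lower layer
        ctx.bytes_written += len(data)
        return True  # buffer for replication

    def on_close(self, stream_id: int) -> None:
        # Release the context when the stream is closed.
        self._contexts.pop(stream_id, None)
```

With this arrangement, repeated writes to the same stream reuse the cached decision, so the number of path searches tracks the number of opens rather than the number of writes.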
Shared buffers
Memory buffers
File I/O data is first copied to a memory buffer. The user can set the size of the memory buffer, which is typically a few hundred MB to a few GB. In normal operation, replicated data in the memory buffer is sent to the target as soon as it is queued, so buffer usage stays roughly constant rather than growing. If the transport layer lacks bandwidth, or a bottleneck delays transmission, data is dequeued from the memory buffer more slowly than it arrives, so buffer usage gradually increases and the buffer eventually overflows. When the FSR engine detects this buffer congestion, it can no longer maintain the replication state and switches to the resynchronization state.
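The relationship between buffer usage and the replication state can be sketched as below. This is a simplified model, not the FSR engine: the names MemoryBuffer and State are hypothetical, capacity is counted in bytes, and the sketch assumes the switch to resynchronization happens on the first write that does not fit.

```python
from collections import deque
from enum import Enum
from typing import Deque, Optional


class State(Enum):
    REPLICATING = "replicating"
    RESYNC = "resync"


class MemoryBuffer:
    """Hypothetical bounded queue between the driver and the sender."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity = capacity_bytes
        self.used = 0
        self.state = State.REPLICATING
        self._queue: Deque[bytes] = deque()

    def enqueue(self, data: bytes) -> bool:
        # Once in resync, writes are no longer buffered in this model.
        if self.state is State.RESYNC:
            return False
        if self.used + len(data) > self.capacity:
            # Overflow: replication can no longer be maintained.
            self.state = State.RESYNC
            return False
        self._queue.append(data)
        self.used += len(data)
        return True

    def dequeue(self) -> Optional[bytes]:
        # Called by the sender; frees space as data leaves for the target.
        if not self._queue:
            return None
        data = self._queue.popleft()
        self.used -= len(data)
        return data
```

As long as dequeue keeps pace with enqueue, used stays small. When the sender stalls, used climbs toward capacity and the next oversized write flips the state to RESYNC.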
File buffers
If a file buffer is configured in addition to the memory buffer, buffering continues in the file once the memory buffer overflows, so buffering uses a mix of memory and file buffers. File buffers have an operational advantage over memory buffers: they can be as large as disk space allows, so relatively large buffers are practical. However, because file buffers add more delay than memory buffers (due to file I/O), you must account for degraded local I/O performance while file buffering is in progress. In general, file buffering can cost about twice as much as conventional I/O, because each write is performed twice: once for the original I/O and once for buffering.
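The mixed memory and file buffering can be sketched as below. This is an assumption-laden illustration rather than FSR's actual buffer format: the class name SpillBuffer is hypothetical, records are stored in the file with a 4-byte length prefix, a temporary file stands in for the configured file buffer, and new writes keep going to the file until it is drained so that ordering is preserved.

```python
import struct
import tempfile
from collections import deque
from typing import Deque, Optional


class SpillBuffer:
    """Hypothetical memory buffer that spills to a file when full."""

    def __init__(self, mem_capacity: int) -> None:
        self.mem_capacity = mem_capacity
        self.mem_used = 0
        self._mem: Deque[bytes] = deque()
        self._file = tempfile.TemporaryFile()  # stand-in for the file buffer
        self._read_pos = 0
        self._write_pos = 0
        self.file_writes = 0  # each one is an extra write on the local I/O path

    def enqueue(self, data: bytes) -> str:
        # While undrained records remain in the file, keep appending there
        # so that FIFO order across memory and file is preserved.
        spilled = self._write_pos > self._read_pos
        if not spilled and self.mem_used + len(data) <= self.mem_capacity:
            self._mem.append(data)
            self.mem_used += len(data)
            return "memory"
        self._file.seek(self._write_pos)
        self._file.write(struct.pack("<I", len(data)) + data)
        self._write_pos = self._file.tell()
        self.file_writes += 1
        return "file"

    def dequeue(self) -> Optional[bytes]:
        # Memory holds the oldest records, so drain it before the file.
        if self._mem:
            data = self._mem.popleft()
            self.mem_used -= len(data)
            return data
        if self._read_pos < self._write_pos:
            self._file.seek(self._read_pos)
            (length,) = struct.unpack("<I", self._file.read(4))
            data = self._file.read(length)
            self._read_pos = self._file.tell()
            return data
        return None

    def close(self) -> None:
        self._file.close()
```

Every enqueue that returns "file" adds a second disk write on top of the original write, which is the source of the roughly doubled I/O cost described above.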