Purpose

When a write is issued to disk, bsr writes the block to the local volume and, at the same time, transfers it over the network. Replication consists of these two actions being carried out together. At a given moment, however, the local write may already be complete while the network transfer has not yet taken place.

If the active node fails at this moment and a failover takes place, the data blocks on the two nodes will be out of sync, because the failed node wrote the block to disk but the switchover happened before replication completed. Therefore, when the failed active node is restored, the blocks that were written only on that node must be removed from its data set during the subsequent synchronization. Otherwise, the failed node would be at least one write ahead of the surviving node, violating the "all or nothing" principle of storage replication (either every copy matches or the write is given up). This issue is not limited to bsr; it affects almost all storage replication solutions. Some storage solutions require a full synchronization after recovery whenever the active side fails.

bsr stores an activity log (AL) in the metadata area to keep track of recently written blocks. These areas are called hot extents. If the active node shuts down unexpectedly and is synchronized after it reboots and the resource is restarted, only the hot extents recorded in the AL need to be synchronized, and a full synchronization is not required. This can drastically reduce the synchronization time for the active node that failed.
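As a rough illustration of why tracking hot extents limits the resync work, the sketch below maps a set of hot extent numbers to the byte ranges that would need to be resynchronized. It assumes the 4 MiB extent size used in this guide; the function names and the idea of merging adjacent extents into one range are illustrative assumptions, not bsr's actual implementation.

```python
EXTENT_SIZE = 4 * 1024 * 1024  # bytes; assumes each AL extent covers 4 MiB


def resync_ranges(hot_extents):
    """Return merged (start, end) byte ranges covered by the given extent numbers.

    Hypothetical helper: adjacent extents are merged into one range.
    """
    ranges = []
    for extent in sorted(set(hot_extents)):
        start = extent * EXTENT_SIZE
        end = start + EXTENT_SIZE
        if ranges and ranges[-1][1] == start:
            # This extent directly follows the previous range, so extend it.
            ranges[-1] = (ranges[-1][0], end)
        else:
            ranges.append((start, end))
    return ranges


def resync_bytes(hot_extents):
    """Total amount of data to resynchronize for the given hot extents."""
    return len(set(hot_extents)) * EXTENT_SIZE


hot = [2, 0, 1, 5]
print(resync_ranges(hot))
print(resync_bytes(hot) // (1024 * 1024), "MiB")  # prints "16 MiB"
```

Only the four hot extents (16 MiB) are resynchronized in this example, regardless of the size of the volume.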

Active extents

The number of active extents in the activity log is configurable. The amount of data retransmitted by a resync after a primary failure is the number of active extents multiplied by 4 MiB. This parameter should be set to a good compromise between the following conflicting characteristics:

Many active extents - Keeping a large number of active extents improves write throughput. Whenever a new extent is activated, an existing extent is reset to the inactive state, and this transition is recorded in the metadata area. If the number of active extents is large, existing active extents are swapped out less often, which reduces writes to the metadata area and improves performance.
Few active extents - Keeping the number of active extents small reduces the synchronization time for recovery after an active node failure.
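The effect of the number of active extents on metadata writes can be illustrated with a small simulation. The sketch below assumes that the active set behaves like a least-recently-used (LRU) cache and that each activation of a new extent costs one metadata update; both are modeling assumptions made for illustration, and `count_metadata_updates` is a hypothetical name, not a bsr function.

```python
from collections import OrderedDict

EXTENT_SIZE = 4 * 1024 * 1024  # bytes; assumes each extent covers 4 MiB


def count_metadata_updates(write_offsets, max_active_extents):
    """Count activations of new extents under an assumed LRU replacement policy."""
    active = OrderedDict()  # extent number -> True, oldest entry first
    updates = 0
    for offset in write_offsets:
        extent = offset // EXTENT_SIZE
        if extent in active:
            # Extent is already hot: no metadata update is needed.
            active.move_to_end(extent)
            continue
        if len(active) >= max_active_extents:
            # Deactivate the least recently used extent to make room.
            active.popitem(last=False)
        active[extent] = True
        updates += 1
    return updates


# 80 writes cycling over 8 different extents.
offsets = [(i % 8) * EXTENT_SIZE for i in range(80)]
print(count_metadata_updates(offsets, 4))  # prints 80
print(count_metadata_updates(offsets, 8))  # prints 8
```

With only 4 active extents, every write in this pattern activates a new extent, whereas 8 active extents hold the whole working set and only the first 8 writes trigger an update. The price is that up to 8 × 4 MiB rather than 4 × 4 MiB would have to be resynchronized after a failure.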

This setting should be tuned primarily for performance, but you should also take the resulting increase in synchronization time into account.

Recommended AL size

The number of extents should be chosen based on the desired synchronization time at a given synchronization rate. The number of active extents can be calculated as follows:

E = (R × t_sync) / 4

R - Specified synchronization rate, in MiB/s
t_sync - Target synchronization time, in seconds
E - Number of active extents (each extent covers 4 MiB)

For example, assume that the cluster has an I/O subsystem with a throughput of 200 MiB/s, a synchronization rate of 60 MiB/s (R = 60), and a target synchronization time of 4 minutes, or 240 seconds (t_sync = 240). The number of active extents is then E = (60 × 240) / 4 = 3600.

3600 is the exact result, but the bsr hash function that implements the AL works best when the number of extents is set to a prime number, so a prime number close to this value is preferable.
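The calculation can be sketched in Python as follows. The relation E = R × t_sync / 4 follows from the fact that a resync transfers the number of active extents multiplied by 4 MiB. The function names are hypothetical, and rounding up to the next prime is only one possible way to pick a prime near the exact value; this guide does not prescribe a rounding rule.

```python
EXTENT_MIB = 4  # each active extent covers 4 MiB


def al_extents(sync_rate_mib_s, target_sync_seconds):
    """Exact extent count E = R * t_sync / 4, rounded down to a whole number."""
    return int(sync_rate_mib_s * target_sync_seconds // EXTENT_MIB)


def is_prime(n):
    """Trial-division primality check, sufficient for values of this size."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def next_prime(n):
    """Smallest prime >= n (assumption: round up rather than down)."""
    while not is_prime(n):
        n += 1
    return n


exact = al_extents(60, 240)      # R = 60 MiB/s, t_sync = 240 s
print(exact, next_prime(exact))  # prints "3600 3607"
```

Rounding up makes the worst-case resync slightly longer than the target (3607 × 4 MiB at 60 MiB/s is about 240.5 seconds); choosing the nearest prime below the exact value would keep it within the target instead.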
