Basics
BSR replicates data in the following ways:
Replication happens in real time, as the application writes data to the block device.
Real-time replication does not affect other application services or system elements.
Replication is either synchronous or asynchronous.
The synchronous method treats replication as complete when the replication data has been written to both the local disk and the target host's disk.
The asynchronous method treats replication as complete when the replication data has been written to the local disk and transmitted to the target host.
Kernel driver
The core engine of BSR is implemented as a kernel driver.
The kernel driver sits at the disk volume layer and provides block-by-block control over write I/O from the filesystem. Because it sits below the filesystem, it provides a transparent replication environment that is independent of both the filesystem and the application, which makes it well suited to building high-availability systems. The same position means it has no control over file-level operations. For example, it cannot detect filesystem corruption or interpret file data; it simply replicates each block as it is written to disk.
BSR provides Active-Passive clustering by default, not Active-Active clustering.
Synchronization and Replication
Before replication can begin, the data on both volumes must match. Synchronization is the process of copying data from the source to the target in disk blocks of a certain size. Replication, by contrast, applies changes to the target in real time as data on the source changes.
Synchronization and replication are separate operations, but they can run at the same time. On nodes in production, assume that they almost always run concurrently, so it is important to control how bandwidth is divided between them. For information on setting the synchronization bandwidth, see Adjust sync speed.
Replication is the real-time reflection of all disk write operations from resources in the primary role to the secondary node, while resynchronization is the process of matching data from a block device perspective, excluding real-time write I/O. Replication and synchronization operate separately, but can also be processed in parallel.
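The difference between synchronization and replication can be illustrated with a toy model. The sketch below is not BSR code: the `Volume` class and the `synchronize` and `replicated_write` functions are hypothetical names, and a Python list stands in for a block device. It only shows that a block-range copy (synchronization) and mirrored live writes (replication) can be interleaved and still leave both sides identical.

```python
class Volume:
    """Toy block device: a fixed-length list of blocks (hypothetical model)."""

    def __init__(self, block_count, fill=b"\x00"):
        self.blocks = [fill] * block_count


def synchronize(source, target, start, count):
    """Synchronization: copy a range of existing blocks from source to target."""
    end = min(start + count, len(source.blocks))
    for index in range(start, end):
        target.blocks[index] = source.blocks[index]


def replicated_write(source, target, index, data):
    """Replication: apply a live write locally and mirror it to the target."""
    source.blocks[index] = data
    target.blocks[index] = data


source = Volume(8, fill=b"A")   # the source already holds data
target = Volume(8)              # the target starts out empty

synchronize(source, target, start=0, count=4)   # first half of the volume
replicated_write(source, target, 6, b"B")       # live write arrives mid-sync
replicated_write(source, target, 1, b"C")       # live write to a synced block
synchronize(source, target, start=4, count=4)   # second half of the volume

in_sync = source.blocks == target.blocks
```

In this model the order of the interleaved steps does not matter for the final result, because every live write is applied to both sides and every block is eventually copied.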
Administration tools
BSR provides administrative tools for configuring and managing resources: bsradm, bsrsetup, bsrmeta, and bsrcon, each described below. Administrator-level privileges are required to use the management commands.
bsradm
A utility that provides high-level commands that abstract away the detailed functionality of BSR. You can control most of the behaviour of BSR through bsradm.
bsradm reads all of its configuration parameters from the configuration file etc\bsr.conf and passes commands to bsrsetup and bsrmeta with the appropriate options. The actual work is therefore done by bsrsetup and bsrmeta.
bsradm can be run in dry-run mode with the -d option. This provides a way to see what combinations of options bsradm will run with, without actually invoking the bsrsetup and bsrmeta commands.
For more information about the bsradm command options, see bsradm under Commands in the Appendix.
bsrsetup
bsrsetup sets the values required by the BSR kernel engine. All parameters to bsrsetup must be passed as text arguments.
The separation of bsradm and bsrsetup provides a flexible command scheme.
bsradm translates the parameters it accepts into the more complex parameters needed to call bsrsetup.
bsradm prevents user mistakes by checking resource configuration files for syntax errors and similar problems. bsrsetup performs no such checks.
In most cases you will not need to use bsrsetup directly; use it only when you need individual control between nodes or access to special functions.
For more information about the bsrsetup command options, see bsrsetup under Commands in the Appendix.
bsrmeta
bsrmeta creates, dumps, restores, and modifies the metadata used by replication configurations. As with bsrsetup, most users do not need to use bsrmeta directly; they manage metadata through the commands provided by bsradm.
For more information about the bsrmeta command options, see bsrmeta under Commands in the Appendix.
bsrcon
bsrcon displays BSR-related information and adjusts other settings as needed.
For more information about the bsrcon command options, see bsrcon under Commands in the Appendix.
Resource
A resource is an abstraction of everything you need to construct a replication dataset. You configure resources and control them to operate your replication environment.
To configure a resource, you must specify at least the following: resource name, volume, and network connection.
Resource name
Specify a name in US-ASCII format without spaces.
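The naming rule above can be checked before a name is written into a configuration file. The following is a minimal sketch, assuming the rule is exactly "non-empty, US-ASCII only, no whitespace"; the function name `is_valid_resource_name` is hypothetical and BSR itself may apply additional restrictions not covered here.

```python
def is_valid_resource_name(name: str) -> bool:
    """Return True if name is non-empty, pure US-ASCII, and has no whitespace.

    This mirrors only the rule stated in the text; it is an illustrative
    check, not the validation BSR performs.
    """
    if not name:
        return False
    if not name.isascii():          # rejects any character outside US-ASCII
        return False
    return not any(ch.isspace() for ch in name)


candidates = ["r0", "my resource", "データ", ""]
results = {name: is_valid_resource_name(name) for name in candidates}
```

Here only `"r0"` passes: the second name contains a space, the third contains non-ASCII characters, and the fourth is empty.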
Volume
A resource is a replication group consisting of one or more volumes that share a common replication stream, which ensures the consistency of all volumes within the resource.
A volume is described as a single device and is specified by a drive letter in Windows.
A replica set requires one volume for data replication and a separate volume to store metadata associated with the volume. The meta volume is used to store and manage internal information for replication.
Metadata is divided into external and internal meta types based on where it is stored. For example, if the metadata is located on the disk of the volume being replicated, it is internal meta; if it is located on another device or another disk, it is external meta.
External metadata performs better than internal metadata because replication I/O and metadata writes can proceed in parallel during operation. Because the I/O performance of the metadata disk directly affects replication performance, place the metadata on the fastest disk available.
The metadata volume should not be formatted with a filesystem such as NTFS; configure it as RAW.
Network Connections (Connection)
A Connection is the communication link for a replica dataset between two hosts.
A resource can span multiple hosts, in which case connections are set up as a full mesh between all of its hosts.
The Connection Name is automatically assigned as the Resource Name at the bsradm level unless you specify otherwise.
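Since each connection links two hosts and a multi-host resource uses a full mesh, a resource with n hosts has n(n-1)/2 connections. The sketch below enumerates them with the standard library; the host names and the `full_mesh` helper are placeholders for illustration, not BSR identifiers.

```python
from itertools import combinations


def full_mesh(hosts):
    """Return every unordered pair of hosts, i.e. one connection per pair."""
    return list(combinations(hosts, 2))


hosts = ["node1", "node2", "node3"]      # hypothetical host names
connections = full_mesh(hosts)

for left, right in connections:
    print(f"{left} <-> {right}")
```

With three hosts this yields three connections; a fourth host would raise the count to six, which is worth keeping in mind when sizing the replication network.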
Resource roles
A resource has a role of either Primary or Secondary.
A resource in the Primary role can be read and written without restriction.
A resource in the Secondary role receives and records all changes to the disk from the other node and does not allow access to the volume. Applications therefore cannot read from or write to a Secondary volume.
The role of a resource can be changed with the BSR utility commands. Changing the role of a resource from Secondary to Primary is called promotion, and the reverse is called demotion.
Main features
Replication clusters
BSR defines the set of nodes that take part in replication as a replication cluster. It supports single-primary mode, in which only one node in the replication cluster can hold the Primary role at a time; it does not support multiple-primary mode. Single-primary mode, also known as the active-passive model, is the standard approach to handling data storage in a highly available failover cluster.
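The role rules and the single-primary constraint can be summarized as a small state model. This is a conceptual sketch only: `Role`, `ReplicationCluster`, and the node names are hypothetical, and the real role changes are made with the BSR utility commands rather than through any Python interface.

```python
from enum import Enum


class Role(Enum):
    PRIMARY = "Primary"
    SECONDARY = "Secondary"


class ReplicationCluster:
    """Toy model of a single-primary replication cluster (illustrative only)."""

    def __init__(self, nodes):
        # Every node starts as Secondary until one is promoted.
        self.roles = {node: Role.SECONDARY for node in nodes}

    def promote(self, node):
        """Secondary -> Primary, refused if another node is already Primary."""
        for other, role in self.roles.items():
            if other != node and role is Role.PRIMARY:
                raise RuntimeError(f"{other} is already Primary; demote it first")
        self.roles[node] = Role.PRIMARY

    def demote(self, node):
        """Primary -> Secondary."""
        self.roles[node] = Role.SECONDARY

    def can_access_volume(self, node):
        """Only the Primary may read or write the replicated volume."""
        return self.roles[node] is Role.PRIMARY


cluster = ReplicationCluster(["node1", "node2"])
cluster.promote("node1")

try:
    cluster.promote("node2")        # violates single-primary mode
    second_promotion_refused = False
except RuntimeError:
    second_promotion_refused = True

# A planned switchover: demote the old Primary, then promote the new one.
cluster.demote("node1")
cluster.promote("node2")
```

After the switchover, `node2` can access the volume and `node1` cannot, which matches the active-passive model described above.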
Replication methods
BSR supports three replication methods:
Protocol A. Asynchronous
The asynchronous method considers replication complete when the primary node has finished writing to its local disk and has placed the replication data in the TCP send buffer. In the event of a failover, data that was written locally but was still in the buffer may therefore not have reached the standby node. After a failover, the data on the standby node is consistent, but the most recent updates that had not yet been delivered may be lost. This method offers good local I/O responsiveness and is suitable for long-distance replication environments.
Protocol B. Semi-Synchronous
The semi-synchronous method considers replication complete when the local disk write has occurred on the primary node and the replication packet has been received by the other node.
A forced failover typically does not result in data loss. However, the most recently written data on the Primary may be lost if both nodes lose power at the same time or if the Primary's storage suffers irreparable damage.
Protocol C. Synchronous
The synchronous method considers replication complete on the primary node when the writes to both the local and remote disks are complete, which ensures that no data is lost if a single node fails.
Of course, if both nodes (or the nodes' storage subsystems) suffer irreversible damage at the same time, data loss is inevitable.
In general, Protocol C is the method most commonly used with BSR.
Choose the replication method based on the factors that drive your operational policy: data consistency, local I/O latency, and throughput.
Info |
---|
Synchronous replication fully guarantees consistency between the production and standby nodes, but at the cost of higher local I/O latency: each write I/O completes locally only after the write to the standby node has completed. |
For an example of configuring replication mode, see Configuration examples.
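The three protocols differ only in which event on the standby side is required before a write is reported as complete. The sketch below encodes the completion conditions described above as a predicate; the `Protocol`, `WriteProgress`, and `is_complete` names are hypothetical and exist only to make the comparison concrete.

```python
from dataclasses import dataclass
from enum import Enum


class Protocol(Enum):
    A = "asynchronous"
    B = "semi-synchronous"
    C = "synchronous"


@dataclass
class WriteProgress:
    """How far a single write has progressed (illustrative model)."""
    local_disk_written: bool = False    # primary finished its local disk write
    in_send_buffer: bool = False        # data placed in the TCP send buffer
    received_by_peer: bool = False      # replication packet reached the peer
    remote_disk_written: bool = False   # peer finished its disk write


def is_complete(protocol: Protocol, progress: WriteProgress) -> bool:
    """Return True once the write counts as replicated under the protocol."""
    if not progress.local_disk_written:
        return False                    # all protocols require the local write
    if protocol is Protocol.A:
        return progress.in_send_buffer
    if protocol is Protocol.B:
        return progress.received_by_peer
    return progress.remote_disk_written


# A write that is on the local disk and queued for sending, but not yet delivered.
progress = WriteProgress(local_disk_written=True, in_send_buffer=True)
status = {p.name: is_complete(p, progress) for p in Protocol}
```

For this in-flight write only Protocol A reports completion. That is exactly the window in which an asynchronous setup can lose the latest updates on failover, and the reason the stricter protocols trade latency for safety.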
Transport protocols
BSR's replication transport network supports the TCP/IP transport protocol.
TCP (IPv4/v6)
This is the default transport protocol for BSR and is a standard protocol that can be used on any system that supports IPv4/v6.
Efficient synchronization
In BSR, replication and (re)synchronization are distinct concepts.
As long as the replication connection between the primary and secondary is maintained, replication is performed continuously. However, if the replication connection is interrupted for any reason, such as a primary or secondary node failing, or the replication network being disconnected, synchronization between the primary and secondary is required.
...