Environments


DRX requires sufficient free physical memory for buffering when installed locally on the production server. If free physical memory is insufficient, consider increasing the production server's physical memory.

Also, if you use compression, be aware that compression adds CPU load on the production server. If the local I/O load is light and the performance impact of compression on the production node is minimal, this is not a problem; however, if the compression load affects the performance of the local system as a whole, you should reconsider using compression. Compression can add roughly 20-30% to the local I/O load, and this load can be offloaded by running the DRX that performs compression on a dedicated machine, separate from the production environment.

The DRX operating policy should be established based on preliminary measurements of the local I/O load. Investigate the following items in your configuration environment to determine the appropriate DRX specifications and buffer operating policy.

Specifications


Operating Systems

Supports Windows 2008 and later (64-bit), CentOS 6.4 and later, and Ubuntu 12.04 LTS and later (64-bit) platforms.

Minimum Specifications

  • 1 GHz or faster x86/x64-compatible processor (2 GHz or faster recommended); 4 or more cores recommended
  • At least 4 GB of physical memory
  • At least 10 GB of disk space

Recommended Memory Specifications

DRX's physical memory requirement varies with the number of resources, the maximum I/O, and the transfer bandwidth. The following formula gives DRX's physical memory requirement when the maximum I/O is greater than the transmission bandwidth.

  • Number of resources * (Average I/O (MB/s) - Bandwidth (MB/s)) * I/O duration (seconds) + compression/encryption buffer (2 GB)

If the peak I/O is always low compared to the bandwidth, use a buffer size of 1 GB per resource to determine the physical memory requirement.

  • Number of resources * 1 GB + compression/encryption buffer (2 GB)

WAN bandwidth is not a guaranteed bandwidth; it fluctuates with network conditions. In general, WAN bandwidth falls in the 10 to 100 Mbps range, so estimating 1 MB/s to 10 MB/s is appropriate from a buffering perspective. Also, do not disregard DRX buffering just because the maximum I/O measured over some period is lower than the replication bandwidth. Depending on the characteristics of the applications in the operating environment, maximum I/O may spike at unpredictable times, so configure the DRX buffer with headroom for such situations.
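As a rough sanity check, the two sizing formulas above can be combined into a small calculator. This is a minimal sketch: the function name is an assumption, and the 2 GB of system memory (which the sizing tables in this section also count toward the total) is included as a separate parameter.

```python
def drx_memory_gb(resources, io_mbs, bandwidth_mbs, duration_s,
                  system_gb=2.0, comp_enc_gb=2.0):
    """Estimate DRX physical memory (GB) using the formulas above.

    io_mbs and bandwidth_mbs are in MB/s; 1 GB is treated as 1000 MB,
    matching the document's arithmetic (e.g. 90 MB/s * 30 s = 2.7 GB).
    system_gb covers the base system memory counted in the sizing tables.
    """
    if io_mbs > bandwidth_mbs:
        # BAB size: resources * (I/O - bandwidth) * duration
        bab_gb = resources * (io_mbs - bandwidth_mbs) * duration_s / 1000.0
    else:
        # Peak I/O always below bandwidth: 1 GB of buffer per resource
        bab_gb = resources * 1.0
    return system_gb + bab_gb + comp_enc_gb

# 1 resource, 100 MB/s I/O, 100 Mbps (~10 MB/s) WAN, 30 s duration
print(round(drx_memory_gb(1, 100, 10, 30), 1))   # 6.7
```

Note that a 100 Mbps WAN is treated as roughly 10 MB/s, per the estimate above.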

The following is an example of determining the physical memory specification when DRX buffers 30 seconds of I/O averaging 100 MB/s over a 100 Mbps WAN.

| I/O load level | Resource count | System memory | BAB size | Compression/encryption buffer | Memory requirement | Recommended memory |
|---|---|---|---|---|---|---|
| Normal speed environment (100 MB/s) | 1 | 2 GB | 1 * (100 MB - 10 MB) * 30 sec = 2.7 GB | max 2 GB | 6.7 GB | 16 GB |
| | 10 | 2 GB | 10 * (100 MB - 10 MB) * 30 sec = 27 GB | max 2 GB | 31 GB | 32 GB |
| | 50 | 2 GB | 50 * (100 MB - 10 MB) * 30 sec = 135 GB | max 2 GB | 139 GB | 160 GB |
| | 100 | 2 GB | 100 * (100 MB - 10 MB) * 30 sec = 270 GB | max 2 GB | 274 GB | 320 GB |

The following is an example of determining the physical memory specification assuming 100 Mbps WAN bandwidth, a maximum I/O of 500 MB/s, and a maximum I/O duration of 30 seconds. In this example, the memory required to operate one replication resource is 18.7 GB (32 GB recommended).

| I/O load level | Resource count | System memory | BAB size | Compression/encryption buffer | Memory requirement | Recommended memory |
|---|---|---|---|---|---|---|
| High speed environment (500 MB/s) | 1 | 2 GB | 1 * (500 MB - 10 MB) * 30 sec = 14.7 GB | 2 GB | 18.7 GB | 32 GB |
| | 10 | 2 GB | 10 * (500 MB - 10 MB) * 30 sec = 147 GB | 2 GB | 151 GB | 160 GB |
| | 50 | 2 GB | 50 * (500 MB - 10 MB) * 30 sec = 735 GB | 2 GB | 739 GB | 800 GB |
| | 100 | 2 GB | 100 * (500 MB - 10 MB) * 30 sec = 1.47 TB | 2 GB | 1.51 TB | 1.6 TB |


Number of resources

You can configure as many resources as your memory allows. The maximum is 100 resource channels, though this can vary with available memory; if you need to operate more than a few dozen channels, consider deploying multiple dedicated DRX nodes.

Preliminary investigation of server I/O load

Use the following procedure to measure the I/O load of the production server.

  • Measure the read/write I/O data of the production server replication target disk (average I/O, maximum I/O over a period of at least 1-4 weeks).
  • Measurement method
    • Windows: Use Performance Monitor tool to collect disk I/O statistics data
    • Linux: Use utilities such as iostat to collect disk I/O statistics data
  • Consider buffer size, compression, and buffer operation policies based on the measurement results

Replication bandwidth

Replication requires a bandwidth of at least 10 Mbps to 100 Mbps.

Policy

Determine your configuration based on the I/O load of your environment and whether you are using compression. Local configurations are common, but dedicated configurations are recommended when replication loads are high and acceleration across the WAN is required.

Buffer policy

  • A preliminary survey of the network bandwidth and the production machine's I/O load is required to determine DRX's physical buffer specifications.
  • Pre-survey items
    • Average I/O per resource on the production machine
    • Maximum I/O volume
    • Maximum I/O duration
  • The average I/O and maximum I/O figures of the production machine are the basis for building an appropriate buffering environment.

| Case | Condition | Buffer | Remarks |
|---|---|---|---|
| 1 | average I/O < maximum I/O < network bandwidth | 1 GB or more recommended | Ex) 1 Gbps bandwidth, 1 GB buffer: up to 100 MB/s I/O can be sustained for about 10 seconds |
| 2 | average I/O < network bandwidth < maximum I/O | (maximum I/O - bandwidth) * maximum I/O duration | Ex) average 50 MB/s I/O, 100 Mbps bandwidth, maximum 200 MB/s I/O lasting 10 seconds: (200 MB/s - about 10 MB/s) * 10 sec = about 2 GB |
| 3 | network bandwidth < average I/O < maximum I/O | - | Consider expanding network bandwidth and using compression. |

DRX's buffer should be set to an appropriate size to handle the I/O load of the production node, based on the preliminary survey. If I/O data from a preliminary survey is not available, tune the buffer size after configuring and piloting with the recommended buffer size from case 1.
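The three cases above can be expressed as a small classifier. This is a sketch: the function name, return shape, and sample numbers are assumptions, and all rates are in MB/s as in the table.

```python
def recommend_buffer_mb(avg_io, max_io, bandwidth, burst_s):
    """Classify the I/O load into the three cases above (rates in MB/s).

    Returns (case number, recommended buffer in MB, or None for case 3).
    """
    if max_io < bandwidth:                   # case 1: avg < max < bandwidth
        return 1, 1000                       # 1 GB or more recommended
    if avg_io < bandwidth:                   # case 2: avg < bandwidth < max
        return 2, (max_io - bandwidth) * burst_s
    return 3, None                           # case 3: expand bandwidth / compress

# Hypothetical case 2 load: 5 MB/s average, 10 MB/s (100 Mbps) WAN,
# 200 MB/s peaks lasting 10 seconds -> (200 - 10) * 10 = 1900 MB (~2 GB)
print(recommend_buffer_mb(5, 200, 10, 10))   # (2, 1900)
```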

If the I/O load on the production node is excessively large and peak I/O persists for long periods (minutes to tens of minutes), DRX buffering may not be able to absorb it. In that case, consider data compression.


BSR

Congestion policy

The congestion state means that the replication load has increased to the point where the DRX buffer has no free space and can no longer buffer data. In this state, DRX takes no special action and concentrates on transmitting the buffered replication data to the remote side; the response to congestion is left to DRBD's congestion policy.

The congestion policy is DRBD's policy for responding when the DRX buffer enters the congestion state. It is configured as follows.

drbd.conf
resource r0 {
	proxy {
		memlimit 1G; # DRX buffer size
	}
	net {
		on-congestion pull-ahead; # Congestion policy setting (Ahead mode)
		congestion-fill 950M; # Congestion detection point (congestion is recognized when 950 MB of data has been buffered)
	}
}

BSR has the following three congestion policies, and Ahead mode is recommended for asynchronous replication across the WAN.

  • block: Wait for I/O until the buffer is empty (until it can be queued into the buffer). This is the default when no congestion policy is set.
  • disconnect: Disconnect the replication connection and enter the StandAlone state.
  • pull-ahead: Enter lazy replication (Ahead) mode. The replication connection is maintained, but replication stops and local I/O is recorded as out-of-sync; when congestion is relieved, the recorded out-of-sync blocks are resynchronized.

Specifying buffer size

  • BSR coupling assumes an asynchronous replication configuration with Ahead mode (lazy replication).
  • Take measurements only over segments where the replication connection is maintained; I/O measured over disconnected segments is not counted.
  • Aggregate the number of times BSR enters Ahead mode (congestion entry count) using the following methods:
    • Count occurrences of "Congestion-fill threshold reached" in the BSR log.
    • Check the number of Ahead entries via the bsrsetup events2 command.
  • Resize the buffer based on the collected congestion entry counts. If congestion occurs frequently, increase the buffer size further.
  • If congestion does not become less frequent after increasing the buffer, consider compression.
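Aggregating congestion entries from the BSR log can be as simple as counting matching lines. A sketch: the log content below is hypothetical, and only the "Congestion-fill threshold reached" phrase is taken from the procedure above.

```python
def count_congestion_entries(log_lines):
    """Count how many times BSR reported reaching the congestion-fill threshold."""
    return sum(1 for line in log_lines
               if "Congestion-fill threshold reached" in line)

# Hypothetical log excerpt for illustration
log = [
    "bsr r0: Congestion-fill threshold reached",
    "bsr r0: entered Ahead mode",
    "bsr r0: congestion relieved, resync started",
    "bsr r0: Congestion-fill threshold reached",
]
print(count_congestion_entries(log))   # 2
```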


FSR

For FSR, you only need to specify the size of DRX's buffer. FSR does not support congestion mode, so no congestion policy needs to be set. If the DRX buffer becomes full while buffering, FSR immediately empties the buffer and automatically switches to a synchronization state.

When sizing the buffer, refer to BSR's buffer policy for the buffer operating criteria and size it accordingly. The buffer can be as large as memory allows, typically several GB.