Environments
When installed locally on the production server, DRX requires enough free physical memory for buffering. If the server does not have enough, consider adding physical memory.
If you use compression, also account for the CPU load it places on the production server. When the local I/O load is light and compression barely degrades the production node, this is not a problem; if the compression load affects the performance of the local system as a whole, reconsider using compression. Compression can add a load of about 20-30% of the local I/O load. It can be offloaded by running the DRX that performs compression on a dedicated machine, separate from the production environment.
The operating policy for DRX should be based on preliminary data on the local I/O load. Investigate the following items in your environment to determine the appropriate specifications and buffer operating policy for DRX.
Specifications
Operating Systems
Supported platforms: Windows 2008 and later (64-bit), Linux CentOS 6.4 and later, and Ubuntu 12.04 LTS and later (64-bit).
Minimum Specifications
- x86/x64-compatible processor at 1 GHz or faster (2 GHz or faster recommended); 4 or more cores recommended
- At least 4 GB of physical memory
- At least 10 GB of disk capacity
Recommended Memory Specifications
DRX's physical memory requirement varies with the number of resources, the maximum I/O, and the transfer bandwidth. When the I/O rate exceeds the transfer bandwidth, use the following formula to size physical memory.
- Number of Resources * (Average I/O (MB/s) - Bandwidth (MB/s)) * I/O duration (seconds) + compression/encryption buffer (2GB)
If the peak I/O always stays below the bandwidth, allow a buffer of 1 GB per resource when sizing physical memory.
- Number of resources * 1 GB + compression/encryption buffer (2 GB)
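The two formulas above can be sketched as a single helper. This is a minimal illustration, not part of DRX: the function name, its parameters, the decimal unit conversion (1 GB = 1000 MB), and the optional `system_gb` term (the 2 GB of system memory that the sizing tables add on top of the formula) are all assumptions made for the example.

```python
def drx_memory_gb(resources, io_mb_s, bandwidth_mb_s, duration_s,
                  codec_buffer_gb=2.0, system_gb=0.0):
    """Estimate DRX physical memory in GB (hypothetical helper, 1 GB = 1000 MB)."""
    if io_mb_s > bandwidth_mb_s:
        # I/O exceeds what the link can drain: buffer the surplus for the
        # whole duration of the burst, for every resource.
        buffer_gb = resources * (io_mb_s - bandwidth_mb_s) * duration_s / 1000
    else:
        # I/O stays below the bandwidth: flat 1 GB of buffer per resource.
        buffer_gb = resources * 1.0
    # Add the compression/encryption buffer and, optionally, system memory.
    return buffer_gb + codec_buffer_gb + system_gb


# One resource, 100 MB/s I/O for 30 s over a link of about 10 MB/s.
print(round(drx_memory_gb(1, 100, 10, 30), 1))
# Ten resources whose I/O never exceeds the link bandwidth.
print(round(drx_memory_gb(10, 5, 10, 30), 1))
```

Passing `system_gb=2` gives the total memory requirement for a server that also has to host the operating system, which is how the sizing tables in this section are built.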
WAN bandwidth is not guaranteed; it fluctuates with network conditions. WAN bandwidth is generally in the range of 10 to 100 Mbps, so estimating 1 MB/s to 10 MB/s is appropriate for buffer sizing. Do not disregard DRX buffering just because the maximum I/O measured over some period is lower than the replication bandwidth. Depending on the applications in the operating environment, the maximum I/O may spike at unpredictable times, so configure the DRX buffer with extra headroom for such situations.
The following example determines the physical memory DRX needs to buffer 30 seconds of I/O averaging 100 MB/s over a 100 Mbps (about 10 MB/s) WAN link.
I/O load level | Resource count | System memory | BAB size | Compression/encryption buffer | Memory requirement | Recommended memory |
---|---|---|---|---|---|---|
Normal-speed environment (100 MB/s) | 1 | 2 GB | 1 * (100 MB/s - 10 MB/s) * 30 s = 2.7 GB | max 2 GB | 6.7 GB | 16 GB |
Normal-speed environment (100 MB/s) | 10 | 2 GB | 10 * (100 MB/s - 10 MB/s) * 30 s = 27 GB | max 2 GB | 31 GB | 32 GB |
Normal-speed environment (100 MB/s) | 50 | 2 GB | 50 * (100 MB/s - 10 MB/s) * 30 s = 135 GB | max 2 GB | 139 GB | 160 GB |
Normal-speed environment (100 MB/s) | 100 | 2 GB | 100 * (100 MB/s - 10 MB/s) * 30 s = 270 GB | max 2 GB | 274 GB | 320 GB |
The following example determines the physical memory specification assuming a WAN bandwidth of 100 Mbps (about 10 MB/s), a maximum I/O of 500 MB/s, and a maximum I/O duration of 30 seconds. In this example, operating one replication resource requires 18.7 GB of memory (32 GB recommended).
I/O load level | Resource count | System memory | BAB size | Compression/encryption buffer | Memory requirement | Recommended memory |
---|---|---|---|---|---|---|
High-speed environment (500 MB/s) | 1 | 2 GB | 1 * (500 MB/s - 10 MB/s) * 30 s = 14.7 GB | 2 GB | 18.7 GB | 32 GB |
High-speed environment (500 MB/s) | 10 | 2 GB | 10 * (500 MB/s - 10 MB/s) * 30 s = 147 GB | 2 GB | 151 GB | 160 GB |
High-speed environment (500 MB/s) | 50 | 2 GB | 50 * (500 MB/s - 10 MB/s) * 30 s = 735 GB | 2 GB | 739 GB | 800 GB |
High-speed environment (500 MB/s) | 100 | 2 GB | 100 * (500 MB/s - 10 MB/s) * 30 s = 1.47 TB | 2 GB | about 1.47 TB (1,474 GB) | 1.6 TB |
Number of resources
You can configure as many resources as your memory allows. The maximum is 100 resources (channels), though the practical limit depends on available memory. If you need to operate more than a few dozen resources, consider deploying multiple dedicated DRX nodes.
Investigate preliminary server I/O load
Use the following procedure to measure the I/O load of the production server.
- Measure the read/write I/O of the replication target disk on the production server (average I/O and maximum I/O) over a period of at least 1 to 4 weeks.
- Measurement method
- Windows: Use Performance Monitor tool to collect disk I/O statistics data
- Linux: Use utilities such as iostat to collect disk I/O statistics data
- Consider buffer size, compression, and buffer operation policies based on the measurement results
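Once the statistics have been collected, the figures the buffer policy needs (average I/O, maximum I/O, and how long the I/O stays above the link bandwidth) can be derived from the raw samples. The sketch below assumes the samples have already been exported as a list of write throughput values in MB/s taken at a fixed interval; the function name and the sample data are hypothetical, and parsing the output of Performance Monitor or iostat is left out.

```python
def summarize_io(samples_mb_s, bandwidth_mb_s, interval_s=1):
    """Return (average MB/s, maximum MB/s, longest burst above bandwidth in s)."""
    if not samples_mb_s:
        raise ValueError("no samples to summarize")
    longest = current = 0
    for value in samples_mb_s:
        # Count consecutive samples that exceed the replication bandwidth.
        current = current + 1 if value > bandwidth_mb_s else 0
        longest = max(longest, current)
    average = sum(samples_mb_s) / len(samples_mb_s)
    return average, max(samples_mb_s), longest * interval_s


# Made-up one-second samples with a three-second burst above a 10 MB/s link.
samples = [5, 8, 120, 200, 150, 6, 30, 4]
average, maximum, burst_s = summarize_io(samples, bandwidth_mb_s=10)
print(average, maximum, burst_s)
```

The longest burst is only a lower bound on the maximum I/O duration when the sampling interval is coarse, so shorter intervals give a safer estimate.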
Replication bandwidth
The replication link requires a bandwidth of at least 10 Mbps to 100 Mbps.
Policy
Determine your configuration based on the I/O load of your environment and whether you are using compression. Local configurations are common, but dedicated configurations are recommended when replication loads are high and acceleration across the WAN is required.
Buffer policy
- A preliminary survey of the network bandwidth and the I/O load of the production machine is required to determine DRX's physical buffer specifications.
- Pre-survey items
- Average I/O per resource on the production machine
- Maximum I/O volume
- Maximum I/O duration
- The average I/O and maximum I/O figures of the production machine are the basis for building an appropriate buffering environment.
Case | Condition | Buffer | Remarks |
---|---|---|---|
1 | average I/O < maximum I/O < network bandwidth | Recommended buffer size: 1 GB or more | Ex) With 1 Gbps bandwidth and a 1 GB buffer, I/O of up to 100 MB/s can be sustained for about 10 seconds |
2 | average I/O < network bandwidth < maximum I/O | (Maximum I/O - bandwidth) * maximum I/O duration | Ex) Average I/O of 50 MB/s, 100 Mbps bandwidth, and a maximum I/O of 200 MB/s lasting 10 seconds: (200 MB/s - about 10 MB/s) * 10 seconds = about 2 GB |
3 | network bandwidth < average I/O < maximum I/O | Consider expanding the network bandwidth and using compression. | |
Size DRX's buffer to handle the I/O load of the production node as determined by the preliminary survey. If no survey data is available, start from the recommended buffer size in case 1 and tune it after configuration and a pilot run.
If the I/O load on the production node is excessively large and peak I/O lasts for a long time (minutes to tens of minutes), DRX buffering may not be able to absorb it. In this case, consider data compression.
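To see why long peaks defeat buffering, it helps to estimate how long a given buffer lasts once the I/O rate exceeds the link bandwidth. The helper below is a hypothetical sketch that assumes constant rates in MB/s and ignores compression.

```python
def seconds_until_full(buffer_mb, io_mb_s, bandwidth_mb_s):
    """Seconds an empty buffer can absorb I/O before it fills up."""
    surplus = io_mb_s - bandwidth_mb_s
    if surplus <= 0:
        # The link drains data at least as fast as it arrives.
        return float("inf")
    return buffer_mb / surplus


# A 2 GB buffer under 200 MB/s of I/O over a link of about 10 MB/s.
print(round(seconds_until_full(2000, 200, 10), 1))
```

Under these assumptions a 2 GB buffer fills in roughly ten seconds, while absorbing a ten-minute peak at the same rates would take (200 - 10) * 600 = 114,000 MB, about 114 GB of buffer.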
BSR
Congestion policy
A congestion state occurs when the replication load increases until the DRX buffer has no free space and further buffering is impossible. In this state, DRX takes no special action and concentrates on transmitting the replication data already in the buffer to the remote node. The response to congestion is left to the congestion policy of BSR.
The congestion policy defines how BSR responds when the DRX buffer enters the congestion state. A congestion policy is configured as follows.

    resource r0 {
        proxy {
            memlimit 1G;               # DRX buffer size
        }
        net {
            on-congestion pull-ahead;  # Congestion policy (Ahead mode)
            congestion-fill 950M;      # Congestion threshold: congestion starts once 950 MB is buffered
        }
    }

BSR has the following three congestion policies, and Ahead mode is recommended for asynchronous replication across the WAN.
- block: Block I/O until space becomes available in the buffer (until the data can be queued). This is the default when no congestion policy is set.
- disconnect: Disconnect the replication connection and enter the StandAlone state.
- pull-ahead: Enter lazy replication mode. The replication connection is maintained, but replication is paused and local I/O is recorded as out-of-sync. The recorded out-of-sync data is resynchronized when the congestion is released.
Specifying buffer size
- Integration with BSR assumes an asynchronous replication configuration in Ahead mode (lazy replication).
- Take measurements only over periods in which the replication connection is maintained; I/O during disconnected periods is not considered.
- Count the number of times BSR enters Ahead mode (congestion entry count) using the following methods
- Count the occurrences of the "Congestion-fill threshold reached" message in the BSR log.
- Check the number of Ahead entries via the bsrsetup events2 command
- Resize the buffer based on the collected congestion entry counts. If congestion occurs frequently, increase the buffer size further.
- If congestion does not become less frequent after increasing the buffer, consider compression.
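Counting congestion entries from the log can be scripted. The sketch below only relies on the "Congestion-fill threshold reached" message text mentioned above; the function name is hypothetical, and the sample lines are made-up placeholders, not the real BSR log format. The location of the BSR log is also not assumed here.

```python
def count_congestion_entries(lines, marker="Congestion-fill threshold reached"):
    """Count log lines that report entry into the congestion state."""
    return sum(1 for line in lines if marker in line)


# Placeholder lines standing in for a BSR log; only the marker text matters.
sample_log = [
    "<timestamp> Congestion-fill threshold reached",
    "<timestamp> some unrelated message",
    "<timestamp> Congestion-fill threshold reached",
]
print(count_congestion_entries(sample_log))  # prints 2

# With a real log, pass the open file object instead of a list:
#     with open(path_to_bsr_log) as log:   # path_to_bsr_log is a placeholder
#         print(count_congestion_entries(log))
```

Comparing the count before and after a buffer resize, over measurement periods of the same length, shows whether the larger buffer actually reduced the congestion frequency.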
FSR
For FSR, you only need to specify the size of DRX's buffer. FSR does not support congestion mode, so no congestion policy needs to be set. If the DRX buffer becomes full while buffering, it is emptied immediately and replication automatically moves to a synchronized state.
When specifying the buffer size, refer to BSR's buffer policy to set the criteria for buffer operation and size the buffer accordingly. You can specify a buffer as large as your memory allows, typically several GB.