...
- Verify that the BSR IP and the DRX IP are on the allowlist of the node's firewall policy. If the IP and port used by the resource are not allowed, do the following:
CentOS 6
Add the rule you need to the /etc/sysconfig/iptables file.
Code Block
-A INPUT -p tcp -s {source ip} -d {destination ip} --dport {allowed port} -j ACCEPT
CentOS 7
Code Block
firewall-cmd --permanent --zone=public --add-port={allowed port}/tcp
firewall-cmd --reload
firewall-cmd --zone=public --list-all
- Check the loopback ping
- If there is a ping response to the loopback address (127.0.0.1) but no ping response to the local IP address, there is a problem with the configuration of your network environment. In this case, you should contact your network administrator.
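The CentOS 6 allowlist rule from the firewall step above is easy to mistype, so a small helper can generate the rule line from its three parameters. This is a minimal sketch and not part of BSR or DRX: the function name `iptables_allow_rule` and the example addresses are hypothetical, and the function only prints the rule text so that it can be reviewed before it is added to /etc/sysconfig/iptables.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with BSR/DRX): print an iptables rule line
# in the same form as the CentOS 6 example, after a basic port sanity check.
iptables_allow_rule() {
    local src=$1 dst=$2 port=$3

    # Reject empty or non-numeric ports.
    case $port in
        ''|*[!0-9]*) echo "invalid port: $port" >&2; return 1 ;;
    esac

    # Reject ports outside the valid TCP range.
    if [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
        echo "port out of range: $port" >&2
        return 1
    fi

    printf -- '-A INPUT -p tcp -s %s -d %s --dport %s -j ACCEPT\n' \
        "$src" "$dst" "$port"
}
```

For example, `iptables_allow_rule 192.168.100.2 192.168.100.3 7792` prints `-A INPUT -p tcp -s 192.168.100.2 -d 192.168.100.3 --dport 7792 -j ACCEPT`.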
...
Versions
Check for version compatibility.
- drbd 8.4.8 or later
- drbd util 8.9.10 or later
- fsr 1.4 or later
- bsr 1.0 or later
- Verify that the local DRX and remote DRX have the same version
Code Block
[root@c65-3 build_files]# lsmod | grep drbd
drbd 374888 3
[root@c65-3 build_files]#
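Minimum-version checks like the ones listed above are often done by eye, and a plain string comparison gets them wrong (for example, it ranks 8.9.9 above 8.9.10). The sketch below shows one way to compare dotted version strings with GNU `sort -V`. The function names `version_ge` and `same_version` are hypothetical, and how the installed version strings are obtained is left out because the source does not specify it.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for comparing dotted version strings.
# Assumes GNU coreutils `sort -V` (version sort) is available.

# Succeeds when version $1 is greater than or equal to version $2.
version_ge() {
    # After a version sort, the smaller (or equal) version comes first.
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Succeeds when the local and remote version strings are identical,
# as recommended for the local and remote DRX.
same_version() {
    [ "$1" = "$2" ]
}
```

A call such as `version_ge "$installed" 8.9.10` can then gate the remaining checks, where `$installed` is whatever version string was read from the node.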
...
fsr: 1.2 or later
...
Checking the DRX version
Ensure that the DRX version on the local node and the DRX version on the remote node are the same. Although DRX provides backward compatibility between versions, it is recommended that you configure with the same version of DRX whenever possible.
Check resource settings
...
Replication configuration
BSR Configuration
- Ensure that the resource configuration file is saved in ANSI or UTF8 format (UTF8 with BOM is not supported).
- If you made any changes to the hostname, make sure they are also applied to the configuration file.
- Verify that there are no duplicate communication ports in the configuration file.
- Verify with bsrsetup show that the IP loaded into the BSR is the same as the IP set in the resource file.
- Check whether wfc-timeout is set in the global entry. If it is not set, set wfc-timeout to 1.
- Add ping-timeout to the "net" entry of the resource. The default is 500 ms; to be generous, set it to 30 (3 seconds).
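Two of the configuration checks above, the BOM check and the duplicate-port check, can be done mechanically. The following is a sketch under stated assumptions: the function names are hypothetical, and the duplicate check assumes that endpoints appear in the configuration as IPv4 `ip:port` pairs. It reports only identical `ip:port` pairs, because the same port number on two different IPs is not necessarily a conflict.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for checking a resource configuration file.

# Succeeds when the file starts with the UTF-8 byte order mark (EF BB BF).
has_utf8_bom() {
    [ "$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]
}

# Prints every IPv4 ip:port pair that occurs more than once in the given
# files. Assumption: endpoints are written as ip:port in the configuration.
duplicate_endpoints() {
    grep -Eho '[0-9]+(\.[0-9]+){3}:[0-9]+' "$@" | sort | uniq -d
}
```

If `has_utf8_bom` succeeds for a resource file, re-save the file without the BOM. Any output from `duplicate_endpoints` points to an address and port that is configured more than once.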
DRX Configuration
...
Check connections step by step
Connection between local and remote DRX
- Change all resources in the BSR to STANDALONE: bsradm disconnect r0
- Install DRX and start the DRX service (drxsvc), then check the connection between the two DRXs.
- In the netstat output, check the state of the DRX IP and port (LISTEN, ESTABLISHED, or TIME_WAIT). A successful connection is shown as ESTABLISHED.
- If the connection is normal, the connection state of the resource is BRIDGED.
- At this point the BSR is still STANDALONE, and the DRX changes to the CONNECTING/WAITING state while it tries to connect to the BSR.
- If both DRXs remain in the BRIDGING state, they are still trying to connect to each other. If nothing changes after a period of time, check connectivity on the WAN segment first.
- ICMP ping is often blocked by firewall policy, so do not rely on ping to determine connectivity. Use a network connectivity checker, such as the drxsim tool included with DRX, to check TCP connectivity between local and remote.
- Change the BSR resource configuration to connect the BSRs directly, without involving DRX, and see whether they connect normally. If they do, the problem is with the DRX connection.
- Change the state of the BSR resource from STANDALONE to CONNECTING with bsradm connect.
- Check the kernel log (cat /proc/kmsg) to confirm that the resource state changes to Connecting.
- In the normal case, the BSR and DRX connect and the connection becomes ESTABLISHED.
- If the BSR status is CONNECTING and the connection is not established, check the netstat output to see whether the BSR IP is in the LISTEN state.
- Verify that the local DRX is attempting to connect to the local BSR IP (SYN_SENT state).
- Because TCP state changes can happen quickly, netstat may not catch the SYN_SENT state.
- Continuously monitor the output of netstat with a script such as the following.
Code Block
$> while(true); do date; netstat -nap | grep 779 | sort -k 3; sleep 1; clear; done
Thu Aug 23 08:51:23 PDT 2018
tcp  0 0 192.168.100.3:35814 192.168.100.3:7792  ESTABLISHED -
tcp  0 0 192.168.100.3:7791  0.0.0.0:*           LISTEN      -
tcp  0 0 192.168.100.3:7792  192.168.100.3:35814 ESTABLISHED 8033/drx
tcp  0 0 192.168.100.3:7793  192.168.100.2:60676 ESTABLISHED 8033/drx
tcp  0 0 192.168.100.3:7795  0.0.0.0:*           LISTEN      8033/drx
tcp  0 0 192.168.100.3:7796  192.168.100.2:43684 ESTABLISHED 8033/drx
tcp  0 1 10.10.0.182:50460   31.1.1.2:7793       SYN_SENT    8033/drx
tcp  0 1 10.10.0.182:57966   31.1.1.2:7796       SYN_SENT    8033/drx
unix 3 [ ] STREAM CONNECTED 18779 2477/gconfd-2
unix 3 [ ] STREAM CONNECTED 20779 2512/gnome-panel
- Once the BSR and DRX are connected, verify in the netstat output that the connection between the resource's BSR IP and the DRX IP is in the ESTABLISHED state.
- Check the DRX logs for failure messages (e.g. connection refused).
- If you get to this stage, collect the support files and the following information, and have them analyzed:
- Output of cat /etc/sysconfig/network-scripts/ifcfg-*
- /var/log/messages
- Output of service iptables status
- Output of ip a
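Because states such as SYN_SENT can be short-lived, it can help to reduce each netstat snapshot to a per-state count instead of reading every line. The sketch below is one possible approach, not a DRX tool: the function name `count_tcp_states` is hypothetical, and it assumes the usual netstat column order (protocol, Recv-Q, Send-Q, local address, foreign address, state).

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `netstat -nat` / `netstat -nap` lines on stdin and
# print "<state> <count>" for TCP sockets whose local or foreign port starts
# with the given prefix (for example 779).
count_tcp_states() {
    awk -v p="$1" '
        $1 ~ /^tcp/ {
            # The port is the last colon-separated field of each address.
            n = split($4, l, ":")
            m = split($5, f, ":")
            if (index(l[n], p) == 1 || index(f[m], p) == 1)
                c[$6]++
        }
        END { for (s in c) print s, c[s] }
    ' | sort
}
```

In a monitoring loop this could be used as `netstat -nat | count_tcp_states 779`. Unlike a plain `grep 779`, it ignores unrelated lines such as unix sockets whose inode number happens to contain 779.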
VIP unreachable
If both the Active and Standby nodes bind sockets to the same VIP, communication between the two nodes may interfere. When operating with a VIP (SDR, MDR, etc.), the DRX of the Standby node must be stopped.
When failing over to the Standby node, the order is reversed: the DRX of the Active must be brought down and the DRX of the Standby must be started before the resources of the Active are started, to ensure a smooth connection.