
Troubleshooting guides for issues that may arise in the process of configuring DRX.


Install errors

  • The Visual C++ Redistributable Package for Visual Studio 2013 (hereafter "VS2013 Redistributable Package") fails to install when installing DRX for Windows
    • Symptom
      • The VS2013 Redistributable Package, which is installed automatically after the DRX installation, fails with an installation error.
      • Cause: An inherent flaw in the VS2013 Redistributable Package.
    • Solution

      • Windows Server 2012 R2

        • Description: On Windows Server 2012 R2, the VS2013 Redistributable Package requires Windows Update KB2883200.
        • Solution: Ensure that Windows Update KB2883200 is installed. If it is not installed, install it through Windows Update.
      • Windows Server 2008 R2 SP1
        • Description: Error 0x800b010a occurs.

          • Code Block
            [0AD8:05C0][2018-07-26T15:33:04]e000: Error 0x800b010a: Failed authenticode verification of payload: C:\ProgramData\Package Cache\.unverified\vcRuntimeMinimum_x64
            [0AD8:05C0][2018-07-26T15:33:04]e000: Error 0x800b010a: Failed to verify signature of payload: vcRuntimeMinimum_x64
            [0AD8:05C0][2018-07-26T15:33:04]e310: Failed to verify payload: vcRuntimeMinimum_x64 at path: C:\ProgramData\Package Cache\.unverified\vcRuntimeMinimum_x64, error: 0x800b010a. Deleting file.


        • Solution: In Windows Update, install the additional updates for the ".NET Framework 3.5.1" entries.

...

Unable to start a resource

  • Configuration files fail to load because they are saved in UTF-8 with BOM format
    • Symptom

      • Failed to read drx.conf.

        Code Block
        DRX log
        E1120 16:37:02.690660 t42053 config] Failed to load [/opt/DRX/drx.conf]. /opt/DRX/drx.conf(1): '=' character not found in line


      • Failed to read BSR settings

        Code Block
        DRX log
        E1120 16:37:52.810044 t42132 config] Failed to get drbd configuration: Can't get drbd configuration. (exit_code: 2560)
        E1120 16:37:52.810068 t42132 config] Output: drbd.d/1/r0.res:1: Parse error: 'global | common | resource | skip | include' expected,
        E1120 16:37:52.810070 t42132 config] Output: but got '▒'


      • Cause: Configuration file parsing fails because of the BOM (byte order mark) at the start of the file.
    • Solution

      • CentOS 6, 7
        • Check the file's encoding with the file command.

          Code Block
          [root@drxdev1 test]# file r1.res
          r1.res: UTF-8 Unicode (with BOM) text, with CRLF line terminators


        • Re-encoding via vi
          Open the file with vi, type the following, and save it.
          :set nobomb
      • Windows
        • Open the file with notepad and change the encoding to 'ANSI' via 'Save As'.
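
As a complement to the vi procedure above, the BOM check and removal can also be scripted on Linux. The following is a minimal sketch, not a DRX tool: has_bom and strip_bom are hypothetical helper names, and the in-place edit assumes GNU sed as shipped with CentOS. It only handles the three UTF-8 BOM bytes (EF BB BF) and leaves CRLF line terminators untouched.

```shell
#!/bin/sh
# Sketch only: has_bom and strip_bom are hypothetical helpers, not part of DRX.

# Succeeds (exit status 0) if the file starts with the UTF-8 BOM bytes EF BB BF.
has_bom() {
    [ "$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]
}

# Removes a leading BOM in place. Assumes GNU sed (-i and \xHH escapes).
# LC_ALL=C makes sed treat the file as raw bytes.
strip_bom() {
    LC_ALL=C sed -i '1s/^\xEF\xBB\xBF//' "$1"
}
```

For example, `has_bom /opt/DRX/drx.conf && strip_bom /opt/DRX/drx.conf` would rewrite the file only when a BOM is present. Back up the file first if in doubt.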

Unable to connect

There are many possible reasons why a DRX connection might not be established, so follow the order of the replication connection configuration procedure below and check each item carefully. The following troubleshooting sequence is based on Linux and is equally applicable to Windows environments.

Network environment

  1. Verify that the BSR IP and the DRX IP are on the allowlist of the node's firewall policy. If the policy has not been applied for the IP and port used by the resource, do the following.
    1. CentOS 6

      Add the required settings to the /etc/sysconfig/iptables file.

      Code Block
      -A INPUT -p tcp -s {source IP} -d {destination IP} --dport {allowed port} -j ACCEPT


    2. CentOS 7

      Code Block
      Command to add a port: firewall-cmd --permanent --zone=public --add-port={allowed port}/tcp
      Command to reload the firewall: firewall-cmd --reload
      Command to list open ports: firewall-cmd --zone=public --list-all


  2. Check ping
    1. If the loopback address (127.0.0.1) responds to ping but the local IP address does not, there is a problem with the configuration of your network environment. In this case, contact your network administrator.
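
The ping comparison above can be scripted as a quick first check. This is a rough sketch under stated assumptions: check_local_ping is a hypothetical helper name, and the -c and -W options are those of the Linux iputils ping (one probe, one-second timeout). Pass the node's own IP address as the argument.

```shell
#!/bin/sh
# Sketch only: check_local_ping is a hypothetical helper, not part of DRX.
# Assumes Linux iputils ping (-c count, -W timeout in seconds).
check_local_ping() {
    local_ip=$1
    # Loopback must answer first; if it does not, the IP stack itself is suspect.
    if ! ping -c 1 -W 1 127.0.0.1 >/dev/null 2>&1; then
        echo "loopback ping failed"
        return 2
    fi
    # Loopback answers but the local IP does not: network configuration problem.
    if ! ping -c 1 -W 1 "$local_ip" >/dev/null 2>&1; then
        echo "loopback responds but $local_ip does not: check the network configuration"
        return 1
    fi
    echo "loopback and $local_ip both respond"
}
```

A return status of 1 corresponds to the case described above, where the network administrator should be contacted.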

Versions

...

Check for version compatibility.

  • drbd 8.4.8 or later
  • drbd util 8.9.10 or later
  • fsr 1.4 or later
  • bsr 1.0 or later
  • Verify that the local DRX and remote DRX have the same version
Code Block
[root@c65-3 build_files]# lsmod | grep drbd
drbd                  374888  3 
[root@c65-3 build_files]# 

DRX version

Make sure that the DRX version on the local node is the same as the DRX version on the remote node. DRX provides backward compatibility between versions, but we recommend running the same version on both nodes whenever possible.
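
The "or later" checks in the compatibility list can be automated with a small comparison helper. The sketch below assumes GNU coreutils, whose sort supports version ordering with -V. version_ge is a hypothetical helper name, and how you obtain the installed version string is left open because it depends on the component.

```shell
#!/bin/sh
# Sketch only: version_ge is a hypothetical helper, not part of DRX.
# version_ge A B succeeds when version A is the same as or newer than version B.
# Relies on sort -V (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

For example, `version_ge "$installed" 8.4.8 || echo "drbd is older than 8.4.8"`, where `installed` holds the version string reported on your node. The same helper can confirm that two DRX versions are equal by calling it in both directions.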


Replication configuration

  • Ensure that the resource configuration file is saved in ANSI or UTF-8 format (UTF-8 with BOM is not supported).
  • If you made any changes to the hostname, make sure they are also applied to the configuration file.
  • Verify that there are no duplicate communication ports in the configuration file.
  • Verify with bsrsetup show that the IP loaded into the BSR is the same as the IP set in the resource file.
  • Check whether wfc-timeout is set in the global entry. If it is not set, set the wfc-timeout value to 1.
  • Add the value of ping-timeout to the "net" entry of the resource. The default value is 500 ms; set it to 30 (3 seconds) to be generous.
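
The duplicate-port check in the list above can be approximated with a text search. This is a sketch under assumptions, not an official check: find_duplicate_endpoints is a hypothetical helper name, it only looks for IPv4 address:port pairs written as in the netstat examples in this guide (for example 192.168.100.3:7791), and it relies on grep -o, which GNU grep provides. It reports an address:port pair that appears more than once across the given files. The same port on different hosts is not reported.

```shell
#!/bin/sh
# Sketch only: find_duplicate_endpoints is a hypothetical helper, not part of DRX.
# Prints every IPv4 address:port pair that occurs more than once in the given
# files. Assumes endpoints are written as a.b.c.d:port in the configuration.
find_duplicate_endpoints() {
    grep -ohE '[0-9]{1,3}(\.[0-9]{1,3}){3}:[0-9]+' "$@" | sort | uniq -d
}
```

Run it over all resource files at once, for example `find_duplicate_endpoints drbd.d/*.res`. Empty output means no repeated endpoint was found. Pairs that are repeated intentionally (for instance in comments) will also be listed, so review each hit by hand.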

...


Check connections step by step


  1. Connection between local and remote DRX

    1. Change all resources in the BSR to STANDALONE (for example, bsradm disconnect r0).
    2. Install DRX and start drxsvc to check DRX connectivity.
    3. In the netstat output, check whether the DRX IP and port are in the LISTEN, ESTABLISHED, or TIME_WAIT state.
    4. Start the DRX service on both nodes to connect the two DRXs.
    5. Check the connection status of the DRX IP and port in netstat (the connection status should be ESTABLISHED).
    6. If everything is normal, the connection status of the resource is BRIDGED.
      1. At this point, the DRX changes to the CONNECTING/WAITING state and tries to connect to the BSR, while the BSR is still STANDALONE.
    7. If both DRXs are still BRIDGING, they are attempting to connect to each other. If there is no change after a certain period of time, check the connectivity on the WAN leg first.
      1. ICMP ping is likely blocked by firewall policies, so don't rely on ping to determine connectivity status. Use a network connectivity checker tool, such as drxsim included with DRX, to check for TCP connectivity between local and remote.
    8. Change the BSR resource configuration to connect the BSRs directly, without involving DRX, and see whether they connect normally. If they do, the problem is with the DRX connection.
  2. Connecting between BSR and DRX
    1. Change the state of a BSR resource from STANDALONE to CONNECTING (bsradm connect).
      1. Check in the kernel log (cat /proc/kmsg) that the status of the resource changes to the connecting state (WFConnection on DRBD).
      2. In normal cases, the BSR and DRX will be connected as ESTABLISHED.
    2. If the status of the BSR is CONNECTING and the connection is not established, check the netstat output to see if the BSR IP is in the LISTEN state.
    3. Verify that the local DRX is attempting to connect to the local BSR IP (SYN_SENT state).
      1. Because TCP state changes can happen quickly, netstat may not catch the SYN_SENT state in its output.
      2. Monitor the netstat output continuously with a script of the following form.
        Code Block
        $> while true; do date; netstat -nap | grep 779 | sort -k 3; sleep 1; clear; done
        Thu Aug 23 08:51:23 PDT 2018
        tcp        0      0 192.168.100.3:35814         192.168.100.3:7792          ESTABLISHED -
        tcp        0      0 192.168.100.3:7791          0.0.0.0:*                   LISTEN      -
        tcp        0      0 192.168.100.3:7792          192.168.100.3:35814         ESTABLISHED 8033/drx
        tcp        0      0 192.168.100.3:7793          192.168.100.2:60676         ESTABLISHED 8033/drx
        tcp        0      0 192.168.100.3:7795          0.0.0.0:*                   LISTEN      8033/drx
        tcp        0      0 192.168.100.3:7796          192.168.100.2:43684         ESTABLISHED 8033/drx
        tcp        0      1 10.10.0.182:50460           31.1.1.2:7793               SYN_SENT    8033/drx
        tcp        0      1 10.10.0.182:57966           31.1.1.2:7796               SYN_SENT    8033/drx
        unix  3      [ ]         STREAM     CONNECTED     18779  2477/gconfd-2
        unix  3      [ ]         STREAM     CONNECTED     20779  2512/gnome-panel
    4. Once the BSR and DRX are connected, verify in the netstat output that the resource's BSR IP and DRX IP are in the ESTABLISHED state.
    5. Verify that the DRX log contains no failures (e.g. connection refused).
  3. Collect logs. If you get to this stage, collect the support files and the output of the following, and have someone analyse them.
    1. cat /etc/sysconfig/network-scripts/ifcfg-*
    2. /var/log/messages
    3. service iptables status
    4. ip a
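
When checking netstat output for a particular TCP state, such as the SYN_SENT or ESTABLISHED checks in the steps above, filtering by the state column makes short-lived entries easier to spot. The following sketch assumes the Linux netstat -nap column layout shown in this guide, with the state in the sixth column. tcp_in_state is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: tcp_in_state is a hypothetical helper, not part of DRX.
# Reads netstat -nap style output on stdin and prints
# "local-address remote-address state" for TCP lines in the requested state.
tcp_in_state() {
    awk -v state="$1" '$1 ~ /^tcp/ && $6 == state { print $4, $5, $6 }'
}
```

For example, `netstat -nap | tcp_in_state SYN_SENT` lists only the connection attempts that are still waiting for a reply.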


VIP unreachable

In a configuration that uses a VIP, if both the Active and Standby nodes bind sockets to the same VIP, the two nodes may interfere with each other's communication. For VIP-based configurations (SDR, MDR, etc.) to work smoothly, the DRX on the standby node must be stopped while its DRBD resources are down and started when the DRBD resources come up.

When failing over to the standby node, the reverse applies: bring down the DRX on the Active node and start the DRX on the Standby node before the resources of the Active are started, to ensure a smooth connection.