...

If all heartbeat communication is disconnected, the two nodes cannot exchange their state with each other.
How service recovery proceeds depends on whether the opposite node is declared failed or the communication paths between the nodes are simply disconnected.

Split Brain

If the heartbeat links are disconnected at intervals that exceed the limit defined in the cluster attribute, instability of the entire heartbeat network should be suspected rather than a node failure.
In that case, MCCS deems the node status reported over heartbeat untrustworthy, so it does not declare a system failure and instead maintains the current status. When heartbeat communication is restored, the nodes in the cluster restart the MCCS service and return to the running status.
Otherwise, the node waits in the INITING state until heartbeat communication is restored.
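
The timing rule above can be sketched as follows. This is only an illustration, not MCCS code: the function name, the verdict names, and the `limit_seconds` parameter (standing in for the limit in the cluster attribute) are assumptions, and how the exact boundary value is treated is a guess.

```python
from enum import Enum
from typing import Dict, Optional


class HeartbeatVerdict(Enum):
    LINK_AVAILABLE = "link_available"            # at least one heartbeat link is still up
    NETWORK_UNSTABLE = "network_unstable"        # links dropped far apart in time
    PEER_FAILURE_SUSPECTED = "peer_failure_suspected"  # links dropped close together


def classify_heartbeat_loss(
    disconnect_times: Dict[str, Optional[float]],
    limit_seconds: float,
) -> HeartbeatVerdict:
    """Classify a heartbeat outage from per-link disconnect times.

    disconnect_times maps a (hypothetical) link name to the time in seconds
    at which that link went down, or None if the link is still connected.
    """
    if not disconnect_times:
        raise ValueError("at least one heartbeat link is required")

    # While any link is up, the nodes can still exchange their state.
    if any(t is None for t in disconnect_times.values()):
        return HeartbeatVerdict.LINK_AVAILABLE

    times = [t for t in disconnect_times.values() if t is not None]
    spread = max(times) - min(times)

    # Links that failed far apart suggest an unstable heartbeat network,
    # so the current status is kept instead of declaring a node failure.
    if spread > limit_seconds:
        return HeartbeatVerdict.NETWORK_UNSTABLE

    # Links that failed close together point to a possible node failure.
    return HeartbeatVerdict.PEER_FAILURE_SUSPECTED
```

For example, with a 5-second limit, two links that drop 30 seconds apart are classified as `NETWORK_UNSTABLE`, while two links that drop half a second apart are classified as `PEER_FAILURE_SUSPECTED`.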

Isolation

Even when all heartbeats are disconnected within a certain time period, MCCS checks whether the local node is disconnected from the whole network before declaring the opposite node failed.
If the local node can communicate with authorized network points such as the gateway or a DNS server, the local node is not disconnected, so it can be concluded that the opposite node is in a failure state and will not try to recover.
If that is not the case, the local node is considered to be in an isolation state.
Because the standby node considers the local node failed while the local node is still trying to recover the service, the service on the local node needs to be terminated as soon as possible.
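
The isolation check described above could look roughly like the sketch below. It is not the MCCS implementation: `evaluate_isolation`, `LocalVerdict`, and `tcp_probe` are hypothetical names, and probing a network point over TCP is an assumption made for illustration (the source does not say how reachability is tested).

```python
import socket
from enum import Enum
from typing import Callable, Iterable


class LocalVerdict(Enum):
    PEER_FAILED = "peer_failed"  # local node still reaches the network
    ISOLATED = "isolated"        # local node reaches no network point


def tcp_probe(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds (assumed probe method)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def evaluate_isolation(probes: Iterable[Callable[[], bool]]) -> LocalVerdict:
    """Decide between peer failure and local isolation after all heartbeats are lost.

    Each probe is a zero-argument callable that returns True when one
    authorized network point (for example the gateway or a DNS server)
    is reachable from the local node.
    """
    for probe in probes:
        try:
            if probe():
                # One reachable network point is enough: the local node
                # is not disconnected, so the opposite node is the one at fault.
                return LocalVerdict.PEER_FAILED
        except OSError:
            # A probe that errors out is treated as unreachable.
            continue
    # No network point answered: the local node is isolated and should
    # stop its service as soon as possible.
    return LocalVerdict.ISOLATED
```

A caller might build the probe list with `functools.partial(tcp_probe, "<gateway address>", <port>)` for each authorized network point and stop the local service when the result is `LocalVerdict.ISOLATED`.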

...