bsr uses Generation Identifiers (GIs) to identify generations of replicated data. They are used by bsr's internal mechanisms for the following purposes.
When deciding whether two nodes are members of the same cluster (to detect accidental connections between nodes that are not in the same cluster)
When determining the direction of background resynchronization between nodes
When deciding whether a full or a partial resynchronization is necessary
When identifying split brain
Generation Identifiers
bsr considers that a new data generation has started in the following cases and creates a new GI.
At initial full sync
When a disconnected resource is promoted to the primary role
When the primary role resource is disconnected
The new GI is actually created at the point when write I/O occurs on the disk and the data really changes.
Consequently, if the resource is in the Connected state and the disk status of both nodes is UpToDate, the current GI of both nodes is the same. The converse is also true. Every new data generation is identified by an 8-byte universally unique identifier (UUID). The least significant bit of the UUID represents the role of the node: it is set to 1 for Primary and to 0 for Secondary.
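The role bit described above can be illustrated with a minimal sketch. It assumes the identifier is simply a random 64-bit value whose lowest bit is overwritten with the role; the names `new_generation_uuid` and `is_primary_uuid` are hypothetical and are not part of bsr.

```python
import secrets

ROLE_BIT = 0x1  # least significant bit: 1 = Primary, 0 = Secondary


def new_generation_uuid(is_primary: bool) -> int:
    """Return a random 64-bit identifier whose lowest bit encodes the role.

    Assumption: the remaining 63 bits are plain random data.
    """
    value = secrets.randbits(64)
    value &= ~ROLE_BIT & 0xFFFFFFFFFFFFFFFF  # clear the role bit
    if is_primary:
        value |= ROLE_BIT
    return value


def is_primary_uuid(uuid: int) -> bool:
    """Report whether the identifier was created in the Primary role."""
    return bool(uuid & ROLE_BIT)


if __name__ == "__main__":
    u = new_generation_uuid(is_primary=True)
    print(f"{u:016X}", "Primary" if is_primary_uuid(u) else "Secondary")
```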
GI tuple
bsr keeps current and past data generation information in the local resource metadata as a set of four identifiers (a tuple).
Current UUID - Generation identifier (GI) for the current data generation of the local node. If the resource is Connected and synchronized, this Current UUID is the same for both nodes.
Bitmap UUID - The UUID of the generation against which the on-disk sync bitmap tracks changes. The sync bitmap identifier is only relevant in disconnected mode. If the resource is Connected, this UUID is always zero.
Two Historical UUIDs - The identifiers of the two data generations that preceded the Current UUID.
All four are collectively called the "generation identifier tuple", or "GI tuple" for short.
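As an illustration only, the four identifiers can be modeled as a small record. The class name `GITuple`, its field names, and the use of zero for an empty UUID are assumptions made for this sketch; the actual on-disk metadata layout of bsr is not shown here.

```python
from dataclasses import dataclass, field
from typing import List

EMPTY = 0  # assumption: an unset UUID is stored as zero


@dataclass
class GITuple:
    """Hypothetical in-memory model of the GI tuple."""
    current: int = EMPTY
    bitmap: int = EMPTY
    history: List[int] = field(default_factory=lambda: [EMPTY, EMPTY])

    def as_list(self) -> List[int]:
        """Return the four identifiers in order: current, bitmap, two historical."""
        return [self.current, self.bitmap, *self.history]


# A connected, synchronized node: only the Current UUID is set.
gi = GITuple(current=0x1A2B3C4D5E6F7081)
print(":".join(f"{u:016X}" for u in gi.as_list()))
```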
GI change process
Start a new GI
When the connection between nodes is lost (whether through a network error or a manual operation)
When, in the connected state, the disk of the peer node can no longer apply the replicated data (peer-disk: Outdated)
In both of these cases, bsr modifies the local GI as follows.
Create a new UUID to identify new data. This UUID becomes the new Current UUID for the Primary node.
The old UUID now identifies the generation against which bitmap changes are tracked, so it becomes the new Bitmap UUID for the Primary node.
On the Secondary node, the GI tuple does not change.
A new UUID is created only when write I/O occurs on the volume. More precisely, therefore, a new GI is generated only when a write to the volume occurs in addition to one of the conditions above, such as a disconnection.
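The steps above can be sketched as follows. This is a simplified model, not bsr code: `GITuple`, `start_new_generation`, and the `write_seen` flag are hypothetical names, and the new identifier is assumed to be a random 64-bit value with the Primary role bit set.

```python
import secrets
from dataclasses import dataclass, field
from typing import List


@dataclass
class GITuple:
    current: int
    bitmap: int = 0
    history: List[int] = field(default_factory=lambda: [0, 0])


def start_new_generation(gi: GITuple, is_primary: bool, write_seen: bool) -> bool:
    """Apply the 'start a new GI' step once; return True if the tuple changed.

    The tuple changes only on the Primary node, and only once a write
    has actually reached the volume after the triggering condition.
    """
    if not is_primary or not write_seen:
        return False
    gi.bitmap = gi.current                 # old generation now anchors the bitmap
    gi.current = secrets.randbits(64) | 1  # new generation, Primary role bit set
    return True


gi = GITuple(current=0xAAAA)
print(start_new_generation(gi, is_primary=True, write_seen=False))  # no write yet
print(start_new_generation(gi, is_primary=True, write_seen=True))
print(f"current={gi.current:016X} bitmap={gi.bitmap:016X}")
```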
Begin resync
There is no change in the GI tuple when resynchronization begins.
End resync
When resynchronization is complete, the synchronization source rotates its Bitmap UUID, and the synchronization target receives and adopts the entire GI tuple from the synchronization source.
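A minimal sketch of this step follows. The text does not spell out what "rotate" means, so the sketch assumes that the Bitmap UUID moves into the first historical slot, the previous first historical UUID moves into the second slot, and the Bitmap UUID is cleared. `GITuple` and `finish_resync` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GITuple:
    current: int
    bitmap: int = 0
    history: List[int] = field(default_factory=lambda: [0, 0])


def finish_resync(source: GITuple, target: GITuple) -> None:
    """Update both tuples at the end of a resynchronization."""
    # Assumed meaning of "rotate": bitmap -> history[0] -> history[1].
    source.history = [source.bitmap, source.history[0]]
    source.bitmap = 0
    # The target adopts the whole tuple of the source.
    target.current = source.current
    target.bitmap = source.bitmap
    target.history = list(source.history)


source = GITuple(current=0xC1, bitmap=0xB0, history=[0xA1, 0xA0])
target = GITuple(current=0xB0)
finish_resync(source, target)
print(source)
print(target)
```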
GI status identification
When a connection is established between nodes, the two nodes exchange their currently available GIs and act according to the result. The possible cases are as follows.
Both nodes' Current UUIDs are empty
If both Current UUIDs are empty, the resource has usually just been configured and synchronization has never been started. Synchronization does not start automatically in this case; you must start it manually.
The Current UUID of one node is empty
If the peer's Current UUID is empty and the local one is not, a full synchronization was usually requested right after the resource was configured, with the local node selected as the synchronization source. bsr internally sets all used-area bits of the on-disk sync bitmap and then starts syncing as the source. In the opposite case (the local Current UUID is empty and the peer's is not), bsr follows the same procedure, except that the local node becomes the synchronization target.
Both nodes' Current UUIDs are the same
If the local and the peer's Current UUIDs are not empty and are equal, the resource was usually disconnected while in the Secondary role and no node was promoted while it was disconnected. No synchronization is required, and none is performed.
Bitmap UUID matches the Current UUID of the other node
If the local Bitmap UUID is the same as the peer's Current UUID and the peer's Bitmap UUID is empty, this is the normal situation after the peer has failed while the local node was in, or was promoted to, the Primary role. It means that the peer node has never been Primary in the meantime and has been working with the same data generation all along. bsr sets the local node as the synchronization source and prepares for background resynchronization. Conversely, if the local node's Bitmap UUID is empty and the peer's Bitmap UUID equals the local Current UUID, this is also normal and occurs after a failure of the local node. bsr sets the local node as the synchronization target and prepares for background resynchronization.
Current UUID matches a Historical UUID of the peer node
If the local Current UUID matches one of the peer's Historical UUIDs, the two data sets share a common ancestor generation and the peer has the more recent data, but the bitmap information is outdated and cannot be used. A normal, bitmap-based resynchronization is therefore not sufficient. In this case, bsr marks the entire device out-of-sync and starts a full background resynchronization with the local node as the synchronization target. Conversely, if one of the local node's Historical UUIDs matches the peer's Current UUID, the same procedure is performed with the local node as the synchronization source.
Bitmap UUID matches, Current UUID does not match
If the local Current UUID differs from the peer's Current UUID but the Bitmap UUIDs match, a split brain has occurred. Both nodes remain disconnected, and the administrator must resolve the split brain manually unless auto-resolve is enabled.
Neither Current nor Bitmap UUIDs match, but Historical UUIDs match
The local node detects that its Current UUID differs from the peer's Current UUID and that the Bitmap UUIDs do not match, but the Historical UUIDs match. This is a split brain with unrelated ancestor generations, so auto-recovery strategies, even if configured, are moot. bsr disconnects and waits for manual split brain resolution.
None of the UUIDs match
Finally, if nothing in the GI tuples of the two nodes matches, bsr disconnects with a warning that the data is unrelated. This is a safety measure against connecting nodes that have nothing to do with each other.
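The cases above can be summarized as a single decision function. This is a simplified sketch under several assumptions: the checks run in the order in which the cases are listed, an empty UUID is zero, and the role bit is ignored in comparisons. `GITuple`, `Action`, and `decide` are hypothetical names, not bsr internals.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List


@dataclass
class GITuple:
    current: int = 0
    bitmap: int = 0
    history: List[int] = field(default_factory=lambda: [0, 0])


class Action(Enum):
    MANUAL_START = auto()        # both empty: start sync manually
    FULL_SYNC_SOURCE = auto()
    FULL_SYNC_TARGET = auto()
    IN_SYNC = auto()             # no synchronization needed
    RESYNC_SOURCE = auto()       # bitmap-based background resync
    RESYNC_TARGET = auto()
    SPLIT_BRAIN = auto()         # auto-resolve may apply if enabled
    SPLIT_BRAIN_MANUAL = auto()  # unrelated ancestors: manual resolution
    UNRELATED = auto()           # disconnect with a warning


def decide(local: GITuple, peer: GITuple) -> Action:
    """Compare two GI tuples from the local node's point of view."""
    if local.current == 0 and peer.current == 0:
        return Action.MANUAL_START
    if peer.current == 0:
        return Action.FULL_SYNC_SOURCE
    if local.current == 0:
        return Action.FULL_SYNC_TARGET
    if local.current == peer.current:
        return Action.IN_SYNC
    if local.bitmap == peer.current and peer.bitmap == 0:
        return Action.RESYNC_SOURCE
    if peer.bitmap == local.current and local.bitmap == 0:
        return Action.RESYNC_TARGET
    if local.current in peer.history:
        return Action.FULL_SYNC_TARGET
    if peer.current in local.history:
        return Action.FULL_SYNC_SOURCE
    if local.bitmap != 0 and local.bitmap == peer.bitmap:
        return Action.SPLIT_BRAIN
    if any(h != 0 and h in peer.history for h in local.history):
        return Action.SPLIT_BRAIN_MANUAL
    return Action.UNRELATED


# The local node kept writing as Primary while the peer was away.
print(decide(GITuple(current=7, bitmap=5), GITuple(current=5)))
```

Returning an enum value rather than acting directly keeps the comparison logic separate from whatever the caller does next, such as starting a resync or dropping the connection.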