5.3 Dependency Management
If you want to represent many H/W or S/W elements as resources, you need to define the order in which they operate.
For instance, an IP address is meaningless without a physical network adapter to assign it to.
Likewise, a DB connection client cannot run unless the DB engine service is running.
In MCCS, this logical time order is defined as a hierarchical relationship between a parent and a child. The parent resource is above the child resource.
When you start a group in MCCS, its resources go online from the bottom to the top.
The parent resource can go online only after all of its child resources are online. When you terminate a group, its resources go offline from the top to the bottom.
[Figure] Online/Offline Sequence of Dependent Resources When Starting or Terminating a Group
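The ordering rule above can be sketched as a topological sort. This is only an illustration, not MCCS code: the resource names (nic, ip, app) and the dictionary layout are assumptions made for the example.

```python
from graphlib import TopologicalSorter

# Hypothetical group: each parent maps to the children it depends on.
DEPENDS_ON = {
    "app": {"ip"},   # application (parent) depends on the network address
    "ip": {"nic"},   # network address (parent) depends on the NIC
    "nic": set(),    # the NIC has no children
}

def online_order(depends_on):
    """Children first: a parent is listed only after all of its children."""
    return list(TopologicalSorter(depends_on).static_order())

def offline_order(depends_on):
    """Reverse of the online order: parents are stopped before children."""
    return list(reversed(online_order(depends_on)))

print(online_order(DEPENDS_ON))   # ['nic', 'ip', 'app']
print(offline_order(DEPENDS_ON))  # ['app', 'ip', 'nic']
```

Deriving the offline order by reversing the online order keeps the two sequences consistent with each other.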
Dependency Configuration
To define this dependency among resources, configure this link in the ‘Resource Dependency’ view.
[Figure] Dependency relationship
In MCCS, you can configure the dependency relationship of resources by using a tree.
The above screen shows the dependency relationship among the resources.
The network address resource as a parent depends on the NIC resource as its child. The application resource as a parent depends on the network address resource as its child.
MCCS performs failover on a per-group basis. Thus, all the resources needed to provide a meaningful service should be put into a single group and connected through this dependency relationship.
If two resources are set as a parent and a child, they form two vertical levels. The maximum level can be checked in the 'MaxDependencyLevel' value in the group attribute menu.
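As a rough illustration of how vertical levels can be counted, the sketch below walks the longest parent-to-child chain. The function name and data layout are hypothetical, and how MCCS itself evaluates 'MaxDependencyLevel' is not described here. The sketch assumes the dependency tree has no cycles.

```python
def dependency_levels(depends_on):
    """Return the number of vertical levels in the longest parent-to-child chain."""
    cache = {}

    def levels(resource):
        # A resource with no children counts as one level.
        if resource not in cache:
            children = depends_on.get(resource, ())
            cache[resource] = 1 + max((levels(c) for c in children), default=0)
        return cache[resource]

    return max((levels(r) for r in depends_on), default=0)

# One parent and one child form two levels, as stated above.
print(dependency_levels({"ip": ["nic"], "nic": []}))  # 2
```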
Link
[Figure] Setting Link
The above screen shows how to set the link between the process resource (parent) and the network address resource (child).
The process resource (parent) depends on the network address resource (child).
- Select the group relationship editor screen.
- On the right palette screen, select 'Create Dependency'.
- The mouse cursor changes shape to indicate that linking is enabled.
- First, select the parent resource (process resource) and then select the child resource (network address resource) to complete the link.
※ When linking, the resource selected first becomes the parent resource and the one selected second becomes the child resource.
Dependency Removal
[Figure] Setting Unlink
- To remove a dependency relationship, press 'Select' in the palette.
- Select the link to remove.
- The selected link shows large points at both ends to indicate the selection.
- Right-click the selected link and select 'Dependency Removal' in the popup menu.
Basic Dependency Cases among resources
The following are general cases of dependency.
Network Interface Card (NIC) Resource, Network Address Resource
Before an IP address can be assigned to the network address resource, a physical network card must be present. Without a network card, you cannot assign an IP address.
If you set the NIC resource as a child and the network address resource as a parent, MCCS checks whether the network card is working normally when the network address resource tries to go online.
Disk Resource, Network Address Resource, Application Resource (ex: DB)
Many applications record data in a disk or storage. Thus, applications must depend on disks in the dependency relationship.
If you use a DB, you need to designate the disk on which it records data. Thus, you must set the dependency relationship so that the DB, as a parent, depends on the disk as a child.
If clients need an IP address to access the DB, you must set the network address as a child and the DB as a parent.
Based on this dependency relationship, when a group goes online, MCCS first checks the network address resource and confirms that the disk is unlocked properly before running the DB.
Thus, the DB should be set to depend on both the network address resource and the disk resource.
The dependency relationships discussed so far lead to the following configuration.
[Figure] Dependency of database application
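The database case can be sketched as follows. The resource names and the idea of grouping startup into "waves" are assumptions made for illustration; the actual scheduling inside MCCS is not described in this section. The only rule taken from the text is that a parent goes online after all of its children are online.

```python
# Hypothetical database group: parent -> set of children it depends on.
DB_GROUP = {
    "db": {"ip", "disk"},  # DB depends on both the network address and the disk
    "ip": {"nic"},         # network address depends on the NIC
    "nic": set(),
    "disk": set(),
}

def ready_to_start(depends_on, online):
    """Offline resources whose children are all online already."""
    return sorted(r for r, children in depends_on.items()
                  if r not in online and children <= online)

def start_waves(depends_on):
    """Group resources into successive waves, children before parents."""
    online, waves = set(), []
    while len(online) < len(depends_on):
        wave = ready_to_start(depends_on, online)
        if not wave:
            raise ValueError("dependency cycle: no resource can start")
        waves.append(wave)
        online.update(wave)
    return waves

print(start_waves(DB_GROUP))  # [['disk', 'nic'], ['ip'], ['db']]
```

The DB appears in the last wave because it has to wait for both of its children.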
Actions due to dependency
A group goes online from bottom to top according to the dependency, and goes offline from top to bottom.
The following examples show how MCCS manages linked resources when one of them fails. The attributes and states of the resources are defined as in the figure below.
[Figure] Resource State Information Charts
Critical Attributes
- A failure occurs on a resource on which a critical resource depends.
[Figure] Example of Failure Occur 1
- Because a failure has occurred on resource r2, resource r1 is taken offline. (Since resource r1 depends on the failed resource r2, resource r1 may not be able to stay online properly.)
- Since resource r1 has the critical attribute, resources r3 and r4 are taken offline in order to fail over the group.
- As a result, resource r2 is considered failed and all resources in the group are taken offline.
Non-critical Resources
[Figure] Example of Failure Occur 2
- Resource r1, which is not critical, is online.
- When resource r2 fails, resource r1 is taken offline because it depends on resource r2.
- Since resource r1 is not critical, group failover is not performed. Therefore, the group remains in the partial online state.
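The two examples above can be sketched as a small decision function. The layout of the group (r1 depending on r2, r3 depending on r4) and all function names are assumptions made for the example, since the figures are not reproduced here. The case where the failed resource itself is critical is not covered by the examples and is left out.

```python
def dependents_of(depends_on, failed):
    """All resources that depend on `failed`, directly or through other parents."""
    affected, frontier = set(), [failed]
    while frontier:
        current = frontier.pop()
        for parent, children in depends_on.items():
            if current in children and parent not in affected:
                affected.add(parent)
                frontier.append(parent)
    return affected

def handle_failure(depends_on, critical, failed):
    """Decide which resources go offline and whether the group fails over."""
    affected = dependents_of(depends_on, failed)
    if affected & critical:
        # A critical resource lost its child: take the whole group offline.
        return {"failover": True, "offline": set(depends_on),
                "group_state": "offline"}
    # Only the failed resource and its parents go offline.
    return {"failover": False, "offline": affected | {failed},
            "group_state": "partial online"}

# Assumed layout: r1 depends on r2, r3 depends on r4.
GROUP = {"r1": {"r2"}, "r2": set(), "r3": {"r4"}, "r4": set()}

print(handle_failure(GROUP, critical={"r1"}, failed="r2")["group_state"])  # offline
print(handle_failure(GROUP, critical=set(), failed="r2")["group_state"])   # partial online
```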
RestartLimit Attribute
RestartLimit is an attribute of the resource type. This value determines how many times recovery is attempted before the resource is finally confirmed as failed.
(Please refer to "6. Resource Type" for more details.)
[Figure] Example of Failure Occur 3
- Assume that the value of RestartLimit is 1 for resource r2 and that the first failure has occurred.
- MCCS brings resource r2 online again, as allowed by the RestartLimit value. At this time, resource r1, which depends on resource r2, is taken offline.
- Resource r2 restarts.
- Resource r1 goes online.
- When another failure occurs on resource r2, all of the resources are taken offline in order, from resource r1 to r3.
- Resource r3 is taken offline.
- As a result, resource r2 is considered failed and all resources in the group are offline on the node.
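The RestartLimit sequence above can be traced with a short simulation. This is a sketch under assumptions: the event strings and the function name are invented for the example, and it assumes the restart count is never reset between failures, which this section does not specify.

```python
def simulate_failures(restart_limit, failures,
                      failed="r2", parents=("r1",), others=("r3",)):
    """Return the sequence of actions for repeated failures of one resource."""
    events = []
    for attempt in range(1, failures + 1):
        if attempt <= restart_limit:
            # Recovery attempt: parents go offline, the resource restarts,
            # then the parents come back online.
            events += [f"offline {p}" for p in parents]
            events.append(f"restart {failed}")
            events += [f"online {p}" for p in parents]
        else:
            # Limit exceeded: take the group offline and mark the failure.
            events += [f"offline {r}" for r in (*parents, *others)]
            events.append(f"fault {failed}")
            break
    return events

for event in simulate_failures(restart_limit=1, failures=2):
    print(event)
```

With RestartLimit set to 1, the first failure produces a restart and the second one takes the group offline, matching the steps listed above.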