Appendix G. Agent

An agent is the internal MCCS program that manages a resource.
MCCS uses an agent to command and monitor the resource, and each agent performs the function appropriate to its resource type.
For instance, bringing a shared disk online means mounting and unlocking the disk and allowing write access to it.
Bringing a process online, by contrast, means running the program at the designated path.
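
As a rough illustration of how the same "online" operation maps to different actions per resource type, the following Python sketch models agents as classes. The names `Agent`, `SharedDiskAgent`, `ProcessAgent` and `online` are hypothetical and are not part of MCCS, and the actions are only returned as strings rather than executed.

```python
from abc import ABC, abstractmethod
from typing import List


class Agent(ABC):
    """Hypothetical base class: one agent manages one resource."""

    @abstractmethod
    def online(self) -> List[str]:
        """Return the actions taken to bring the resource online."""


class SharedDiskAgent(Agent):
    def __init__(self, disk: str) -> None:
        self.disk = disk

    def online(self) -> List[str]:
        # For a shared disk, "online" means mount, unlock, and allow writes.
        return [
            f"mount {self.disk}",
            f"unlock {self.disk}",
            f"allow write access to {self.disk}",
        ]


class ProcessAgent(Agent):
    def __init__(self, path: str) -> None:
        self.path = path

    def online(self) -> List[str]:
        # For a process, "online" means running the program at the given path.
        return [f"run {self.path}"]
```

The point of the sketch is only that the caller issues one command and the agent decides what that command means for its resource type.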

 

 


Agent State Change

Agent State

The agent state is shown as the agent state value in the resource state view of the detailed information panel.

Agent State

| State | Description |
|---|---|
| Detached | The agent is disabled: the agent thread has not started yet, so the resource is not monitored. In this state the Enabled value of the resource attribute is false. |
| Opening | The resource has been enabled and the agent is preparing to monitor it. The Enabled value of the resource attribute has been changed to true. |
| Probing | The agent is probing the resource and verifying that it is ready to be used. |
| Online | The resource is in the Online state and is being monitored. |
| Offline | The resource is in the Offline state and is being monitored. When a resource goes offline because of a failure, the agent also considers its state to be Offline. |
| GoingOffline | The agent is taking offline a resource that is in the online state. |
| GoingOfflineWait | The command to take the resource offline has completed, and the agent is waiting for the resource to reach the offline state. |
| GoingOnline | The agent is bringing online a resource that is in the offline state. |
| GoingOnlineWait | The command to bring the resource online has completed, and the agent is waiting for the resource to reach the online state. |
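
The states listed above can be captured in a small enumeration. This is a minimal Python sketch for illustration only: the `AgentState` class, the `TRANSITIONAL` set and the `is_transitional` helper are hypothetical names, not MCCS identifiers. Only the state names themselves come from the table.

```python
from enum import Enum


class AgentState(Enum):
    """Agent states, using the names from the table as values."""
    DETACHED = "Detached"
    OPENING = "Opening"
    PROBING = "Probing"
    ONLINE = "Online"
    OFFLINE = "Offline"
    GOING_OFFLINE = "GoingOffline"
    GOING_OFFLINE_WAIT = "GoingOfflineWait"
    GOING_ONLINE = "GoingOnline"
    GOING_ONLINE_WAIT = "GoingOnlineWait"


# The "Going~" states: a command is in progress or its result is awaited.
TRANSITIONAL = {
    AgentState.GOING_OFFLINE,
    AgentState.GOING_OFFLINE_WAIT,
    AgentState.GOING_ONLINE,
    AgentState.GOING_ONLINE_WAIT,
}


def is_transitional(state: AgentState) -> bool:
    """True while the agent is in one of the 'Going~' states."""
    return state in TRANSITIONAL
```

Looking a state up by its displayed name, for example `AgentState("GoingOnlineWait")`, returns the matching member.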

Agent State Change

When a resource is enabled, its agent starts monitoring the resource after probing it.
The agent determines the resource state from the monitoring results; when it then receives a command, its own state changes accordingly.
The flow of agent state changes is shown below.

[Figure] Cycle of Agent State

  1. When a resource is first added, it is in the 'Disabled' state and its corresponding agent is in the 'Detached' state.
  2. When the resource is enabled, the agent goes through 'Opening' and 'Probing' and then starts monitoring the resource.
  3. If a resource in the online state is taken offline, the agent becomes 'GoingOffline'; when the command is done, it becomes 'GoingOfflineWait'.
  4. If the monitoring result is Offline, the agent state becomes Offline.
    If the monitoring result is still Online, the agent repeats monitoring at the configured interval until the resource becomes Offline.
    If the offline command takes longer than the value defined in the 'OfflineTimeout' attribute, the offline process is canceled and the resource goes back to the online state.
  5. If a resource in the offline state is brought online, the agent becomes 'GoingOnline'.
  6. When the online command is done, the agent becomes 'GoingOnlineWait' and monitoring begins. Monitoring repeats at the interval set by the 'OfflineMonitorInterval' attribute, within the limit set by the 'OnlineWaitLimit' attribute, until the resource comes online.
    If the monitoring result shows that the resource is online, the agent state also becomes Online; if the resource does not come online within 'OnlineWaitLimit', it is considered a failure. The agent state changes and cycles as shown in the figure above, according to the resource state. However, if the resource state is changed by an external cause rather than by an operation through MCCS, this cycle does not apply.
  7. If a resource state changes directly from Online to Offline without an operation by MCCS, it is considered a failure or an abnormal state.
    When MCCS itself sends a command, the agent state changes along with the monitoring results and passes through the 'Going~' states. These transitional states occur only for operations performed through MCCS.
  8. For example, if an online application is terminated by an external error, it is considered a failure.
    Also, if a resource is forcibly brought online on a standby node while the group that contains it is already online on the active node of a failover-mode group, MCCS treats this as an abnormal state and takes the resource offline on the standby node.
  9. If the state does not change in the way shown in the figure above, it is considered a failure or an abnormal state.
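
The steps above can be sketched as a transition table plus a polling loop. This is an illustrative Python sketch under stated assumptions, not the MCCS implementation: `ALLOWED`, `classify` and `wait_for_online` are hypothetical names, the table contains only the transitions described in the steps (the figure may show more), and 'OnlineWaitLimit' is assumed here to be a number of monitoring cycles, which the text does not state explicitly.

```python
import time
from typing import Callable

# Transitions described in the steps above (an interpretation of the text).
ALLOWED = {
    "Detached": {"Opening"},
    "Opening": {"Probing"},
    "Probing": {"Online", "Offline"},
    "Online": {"GoingOffline"},
    "GoingOffline": {"GoingOfflineWait"},
    # Back to Online if the offline command exceeds OfflineTimeout (step 4).
    "GoingOfflineWait": {"Offline", "Online"},
    "Offline": {"GoingOnline"},
    "GoingOnline": {"GoingOnlineWait"},
    "GoingOnlineWait": {"Online"},
}


def classify(old: str, new: str) -> str:
    """Label a state change as part of the cycle or as a failure/abnormal state."""
    if new in ALLOWED.get(old, set()):
        return "expected"
    return "failure or abnormal"


def wait_for_online(probe: Callable[[], str],
                    online_wait_limit: int,
                    interval_seconds: float) -> str:
    """Poll while in GoingOnlineWait (step 6).

    Assumption: online_wait_limit is a count of monitoring cycles and
    interval_seconds plays the role of OfflineMonitorInterval.
    """
    for _ in range(online_wait_limit):
        if probe() == "Online":
            return "Online"
        time.sleep(interval_seconds)
    return "Failure"
```

For example, `classify("Online", "Offline")` returns `"failure or abnormal"`, matching step 7, because a direct change from Online to Offline skips the 'Going~' states that an MCCS-issued command would pass through.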