Patent attributes
In a server cluster, a system and method are provided for mitigating redundant resource failure notifications and other problems that result from late handling of messages. Traditional resource management can generate redundant resource failure notifications that trigger unnecessary recovery actions, or cause other cluster problems such as repeating an action that was already handled as part of failure recovery. The present invention tracks resource failures and eliminates recovery actions for redundant resource failure notifications. An incarnation number is passed to a resource each time it is called, and is incremented whenever a resource failure notification is delivered. Failure notifications having an incarnation number lower than the current incarnation number are discarded. Message processing similarly uses an incarnation number to distinguish between queued messages sent by a currently healthy node and those sent by a previous incarnation of the node, which no longer have meaning.
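The incarnation-number check described above can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `ResourceTracker`, `FailureNotification`, `call_resource`, `deliver_failure`, and `live_messages` are hypothetical. The starting value of 0 and the tuple layout of queued messages are likewise assumptions made only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FailureNotification:
    resource_id: str
    incarnation: int  # incarnation number the resource was given when called


@dataclass
class ResourceTracker:
    # Hypothetical bookkeeping: current incarnation number per resource.
    current: dict = field(default_factory=dict)
    # Record of recovery actions actually performed (for illustration only).
    recoveries: list = field(default_factory=list)

    def call_resource(self, resource_id: str) -> int:
        """Return the incarnation number to pass along with a call."""
        return self.current.setdefault(resource_id, 0)

    def deliver_failure(self, note: FailureNotification) -> bool:
        """Process a failure notification; return True if recovery ran."""
        now = self.current.setdefault(note.resource_id, 0)
        if note.incarnation < now:
            # Stale: this failure was already handled, so discard it.
            return False
        # First notification for this incarnation: bump the number, recover.
        self.current[note.resource_id] = now + 1
        self.recoveries.append((note.resource_id, note.incarnation))
        return True


def live_messages(queued, node_incarnations):
    """Keep queued (node, incarnation, body) messages from current incarnations.

    Assumed layout: each message carries the sender's incarnation number;
    messages from an earlier incarnation of a node are dropped.
    """
    return [
        (node, inc, body)
        for node, inc, body in queued
        if inc >= node_incarnations.get(node, 0)
    ]
```

In this sketch, two notifications carrying the same incarnation number trigger only one recovery: the first increments the current number, so the second compares lower and is discarded. The same comparison filters a message queue after a node has failed and rejoined.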