In an approach, a processor obtains a target base strategy for selecting actions of a target agent. A processor obtains an adversarial base strategy for selecting adversarial actions of an adversarial agent. A processor calculates, for each candidate action among a plurality of candidate actions of the target agent, a risk measure of the candidate action based on the adversarial base strategy and a payoff to the target agent in a case where the target agent takes the candidate action and the adversarial agent takes an adversarial action based on the adversarial base strategy. A processor generates a target strategy by adjusting the target base strategy based on the risk measure for each candidate action.
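The four steps above can be illustrated with a minimal sketch. The abstract does not say how strategies are represented, which risk measure is used, or how the adjustment is made, so everything specific here is an assumption made for illustration: strategies are probability lists, payoffs are a matrix indexed by (candidate action, adversarial action), the risk measure is the negated mean payoff over the worst `alpha` fraction of adversarial outcomes (an expected-shortfall style measure), and the adjustment is an exponential reweighting with a hypothetical risk-aversion parameter `lam`. The function names are likewise hypothetical.

```python
import math


def lower_tail_mean(payoffs, probs, alpha):
    """Mean payoff over the worst `alpha` probability mass (assumed risk basis)."""
    remaining = alpha
    total = 0.0
    # Walk outcomes from lowest payoff upward until `alpha` mass is consumed.
    for payoff, p in sorted(zip(payoffs, probs)):
        take = min(p, remaining)
        total += take * payoff
        remaining -= take
        if remaining <= 0:
            break
    return total / alpha


def risk_measures(payoff_matrix, adversarial_base, alpha=0.5):
    """One risk value per candidate action; larger means riskier (assumption)."""
    return [-lower_tail_mean(row, adversarial_base, alpha) for row in payoff_matrix]


def adjust_strategy(target_base, risks, lam=1.0):
    """Down-weight risky actions exponentially, then renormalize (assumption)."""
    weights = [p * math.exp(-lam * r) for p, r in zip(target_base, risks)]
    norm = sum(weights)
    return [w / norm for w in weights]


# Illustrative inputs only; none of these numbers come from the source.
target_base = [0.5, 0.5]        # target base strategy over two candidate actions
adversarial_base = [0.5, 0.5]   # adversarial base strategy over two adversarial actions
payoff_matrix = [
    [1.0, 1.0],    # candidate action 0: same payoff whatever the adversary does
    [3.0, -2.0],   # candidate action 1: higher upside, but a loss in one case
]

risks = risk_measures(payoff_matrix, adversarial_base, alpha=0.5)
target_strategy = adjust_strategy(target_base, risks, lam=1.0)
print(risks, target_strategy)
```

In this toy setup the second candidate action has the worse tail payoff, so the adjusted target strategy shifts probability toward the first action. A different risk measure (for example, variance or worst-case payoff) or a different adjustment rule could be substituted without changing the overall flow.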