Academic Paper attributes
Human behaviors are regularized by a variety of norms or regulations, either to maintain order or to enhance social welfare. If artificially intelligent (AI) agents make decisions on behalf of human beings, we would hope that they also follow established regulations while interacting with humans or other AI agents. However, an AI agent may opt to disobey the regulations (i.e., to defect) out of self-interest. In this paper, we aim to answer the following question: consider a decentralized multi-agent environment in which agents make decisions in complete isolation from one another; each agent knows the state of its own MDP and its own actions, but not the states or actions of the other agents; there is a set of regulations for all agents to follow; given that most agents are benign and will comply with the regulations, but not all agents are compliant at first, can we develop a framework such that it eventually becomes in the self-interest of non-compliant agents to comply? We first introduce this problem as Regulation Enforcement and formulate it using reinforcement learning and game theory under the scenario where agents make decisions in complete isolation from one another. We then propose a solution based on the key idea that, although we cannot alter how defective agents choose to behave, we can leverage the aggregated power of compliant agents to boycott the defective ones. We conduct simulated experiments in two scenarios, the Replenishing Resource Management Dilemma and Diminishing Reward Shaping Enforcement, using deep multi-agent reinforcement learning algorithms. We further use empirical game-theoretic analysis to show that the method alters the resulting empirical payoff matrices in a way that promotes compliance (making mutual compliance a Nash equilibrium).
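As a minimal sketch of the kind of check involved in the empirical game-theoretic analysis mentioned above, the following Python snippet tests whether mutual compliance is a pure-strategy Nash equilibrium of a 2x2 empirical payoff matrix. The payoff values, matrix layout, and function names here are purely illustrative assumptions, not numbers or code from the paper's experiments.

```python
import numpy as np

# Hypothetical 2x2 empirical payoff matrix for a symmetric two-player game.
# Rows/columns: 0 = Comply, 1 = Defect. Values are illustrative only and are
# not taken from the paper's experiments.
# payoff_row[a, b] = payoff to the row player when it plays a and the opponent plays b.
payoff_row = np.array([
    [3.0, 1.5],   # row player complies vs. opponent (Comply, Defect)
    [2.0, 1.0],   # row player defects vs. opponent (Comply, Defect)
])
# In a symmetric game, the column player's payoff matrix is the transpose.
payoff_col = payoff_row.T


def is_pure_nash(a_row: int, a_col: int) -> bool:
    """Return True if (a_row, a_col) is a pure-strategy Nash equilibrium:
    neither player can gain by unilaterally deviating."""
    row_best = payoff_row[a_row, a_col] >= payoff_row[:, a_col].max()
    col_best = payoff_col[a_row, a_col] >= payoff_col[a_row, :].max()
    return bool(row_best and col_best)


# Under a boycott-style enforcement scheme, compliant agents depress defectors'
# payoffs, so (Comply, Comply) should pass this check in the empirical matrix.
print("Mutual compliance is a Nash equilibrium:", is_pure_nash(0, 0))
```

In this sketch the check simply verifies best responses in the estimated payoff table; in the paper's setting the payoff entries would come from simulated rollouts of the learned policies rather than being specified by hand.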