Learning to Play Stackelberg Security Games
In this chapter, we present an alternative, learning-theoretic approach for dealing with uncertainty in Stackelberg security games. In order to paint a cohesive picture, we focus on one type of uncertainty: unknown attacker utilities. Learning will take place in a repeated Stackelberg security game, where the defender gathers information about the attacker purely by observing the attacker’s responses to mixed strategies played by the defender. In more detail, we wish to learn a good strategy for the defender without any initial information about the utility function of the attacker (Section 1); when given a distribution over attacker types (Section 2); and when faced with an unknown sequence of attackers (Section 3). In each section we present, in some generality, the relevant learning-theoretic techniques: optimization with membership queries, Monte Carlo tree search, and no-regret learning, respectively. In Section 4 we briefly discuss additional work at the intersection of machine learning and Stackelberg security games.