Markov decision processes with Kusuoka-type conditional risk mappings

This was part of Dynamic Assessment Indices

Ziteng Cheng, University of Toronto

Friday, May 13, 2022

Abstract: Under suitable conditions, the Kusuoka representation of law-invariant coherent risk measures allows one to express them in terms of average value-at-risk. Here, we introduce the notion of Kusuoka-type conditional risk mappings and use it to define a dynamic risk measure. We use such dynamic risk measures to study infinite-horizon Markov decision processes (MDPs) with random costs and random actions. Under mild assumptions, we derive a dynamic programming principle and prove the existence of an optimal policy. This contributes to the risk-aware MDP framework of Ruszczyński (2010). Furthermore, we provide a sufficient condition under which deterministic actions are optimal. We also propose a sample-based solver for MDPs with Kusuoka-type conditional risk mappings and finite state and action spaces.
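
For readers unfamiliar with the Kusuoka representation, the following is a minimal static sketch and not the solver proposed in the talk. It assumes a cost convention (larger values are worse), estimates average value-at-risk from a finite sample via the Rockafellar-Uryasev formula AVaR_alpha(X) = min_t { t + E[(X - t)^+] / (1 - alpha) }, and approximates the supremum over mixing measures by a maximum over a finite, user-supplied family of discrete mixtures. The function names `avar` and `kusuoka_risk` and the sample data are hypothetical, chosen only for illustration.

```python
from typing import Sequence, Tuple

# A mixture is a list of (alpha, weight) pairs whose weights sum to 1.
Mixture = Sequence[Tuple[float, float]]


def avar(sample: Sequence[float], alpha: float) -> float:
    """Empirical average value-at-risk of a cost sample at level alpha in [0, 1).

    Uses the Rockafellar-Uryasev formula. For an empirical distribution the
    objective is convex and piecewise linear with kinks at the sample points,
    so it is enough to search over t in the sample itself.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    if len(sample) == 0:
        raise ValueError("sample must be non-empty")
    n = len(sample)

    def objective(t: float) -> float:
        mean_excess = sum(max(x - t, 0.0) for x in sample) / n
        return t + mean_excess / (1.0 - alpha)

    return min(objective(t) for t in sample)


def kusuoka_risk(sample: Sequence[float], mixtures: Sequence[Mixture]) -> float:
    """Maximum over a finite family of mixtures of weighted AVaR values.

    Assumption: the finite family stands in for the (generally infinite) set
    of mixing measures in the Kusuoka representation.
    """
    return max(
        sum(weight * avar(sample, alpha) for alpha, weight in mixture)
        for mixture in mixtures
    )


if __name__ == "__main__":
    costs = [1.0, 2.0, 3.0, 4.0]           # hypothetical cost sample
    family = [
        [(0.0, 0.5), (0.5, 0.5)],          # half expectation, half AVaR at 0.5
        [(0.75, 1.0)],                     # pure AVaR at level 0.75
    ]
    print(avar(costs, 0.5))
    print(kusuoka_risk(costs, family))
```

At level 0 the estimate reduces to the sample mean, and as alpha approaches 1 it approaches the sample maximum, which matches the interpretation of average value-at-risk as interpolating between expectation and worst case. The conditional, state-dependent version used for MDPs in the talk is not captured by this sketch.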