We compare methods for estimating optimal dynamic decision rules from observational data, with particular focus on estimating the regret functions defined by Murphy (in J. R. Stat. Soc., Ser. B, Stat. Methodol. 65:331-355, 2003). We formulate a doubly robust version of the regret-regression approach of Almirall et al. (in Biometrics 66:131-139, 2010) and Henderson et al. (in Biometrics 66:1192-1201, 2010) and demonstrate that it is equivalent to a reduced form of Robins' efficient g-estimation procedure (Robins, in Proceedings of the Second Symposium on Biostatistics. Springer, New York, pp. 189-326, 2004). Simulation studies suggest that the regret-regression approach is most efficient when there is no model misspecification, whereas the efficient g-estimation procedure is more robust when models are misspecified. The g-estimation method can be difficult to apply in complex circumstances, however. We illustrate the ideas and methods through an application to the control of blood clotting time in patients on long-term anticoagulation.
Statistics in Biosciences, 2014, Vol 6, Issue 2, p. 244-260