Causal bandits with propagating inference

Akihiro Yabe, Daisuke Hatano, Hanna Sumita, Shinji Ito, Naonori Kakimura, Takuro Fukunaga, Ken Ichi Kawarabayashi

Research output: Chapter in Book/Report/Conference proceedingConference contribution

4 Citations (Scopus)


Bandit is a framework for designing sequential experiments, where a learner selects an arm A ϵ A and obtains an observation corresponding to A in each experiment. Theoretically, the tight regret lower-bound for the general bandit is polynomial with respect to the number of arms |A|, and thus, to overcome this bound, the bandit problem with side-information is often considered. Recently, a bandit framework over a causal graph was introduced, where the structure of the causal graph is available as side-information and the arms are identified with interventions on the causal graph. Existing algorithms for causal bandit overcame the Ω(√\A\/T) simple-regret lower-bound; however, their algorithms work only when the interventions A are localized around a single node (i.e., an intervention propagates only to its neighbors). We then propose a novel causal bandit algorithm for an arbitrary set of interventions, which can propagate throughout the causal graph. We also show that it achieves O(√γ log(|A|T)/T) regret bound, where γ is determined by using a causal graph structure. In particular, if the maximum in-degree of the causal graph is a constant, then γ = O(N2), where N is the number of nodes.

Original languageEnglish
Title of host publication35th International Conference on Machine Learning, ICML 2018
EditorsJennifer Dy, Andreas Krause
PublisherInternational Machine Learning Society (IMLS)
Number of pages21
ISBN (Electronic)9781510867963
Publication statusPublished - 2018
Event35th International Conference on Machine Learning, ICML 2018 - Stockholm, Sweden
Duration: 2018 Jul 102018 Jul 15

Publication series

Name35th International Conference on Machine Learning, ICML 2018


Other35th International Conference on Machine Learning, ICML 2018

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Human-Computer Interaction
  • Software


Dive into the research topics of 'Causal bandits with propagating inference'. Together they form a unique fingerprint.

Cite this