Causal bandits with propagating inference

Akihiro Yabe, Daisuke Hatano, Hanna Sumita, Shinji Ito, Naonori Kakimura, Takuro Fukunaga, Ken Ichi Kawarabayashi

Research output: Conference contribution

4 citations (Scopus)


The bandit problem is a framework for designing sequential experiments, in which a learner selects an arm A ∈ A and obtains an observation corresponding to A in each experiment. Theoretically, the tight regret lower bound for the general bandit problem is polynomial in the number of arms |A|; to overcome this bound, bandit problems with side information are often considered. Recently, a bandit framework over a causal graph was introduced, in which the structure of the causal graph is available as side information and the arms are identified with interventions on the causal graph. Existing algorithms for the causal bandit problem overcome the Ω(√(|A|/T)) simple-regret lower bound; however, these algorithms work only when the interventions A are localized around a single node (i.e., an intervention propagates only to its neighbors). We propose a novel causal bandit algorithm for an arbitrary set of interventions, which can propagate throughout the causal graph. We also show that it achieves an O(√(γ log(|A|T)/T)) regret bound, where γ is determined by the structure of the causal graph. In particular, if the maximum in-degree of the causal graph is a constant, then γ = O(N²), where N is the number of nodes.
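The experiment loop described above can be illustrated with a minimal baseline that the paper's lower bound speaks to: uniform exploration over |A| arms with budget T, followed by recommending the empirically best arm, whose simple regret scales as Ω(√(|A|/T)). This is a hedged sketch, not the paper's algorithm; the arm set, Bernoulli reward model, and function name are illustrative assumptions.

```python
import random


def simple_regret_uniform(means, T, seed=0):
    """Uniform-exploration baseline for the simple-regret bandit setting.

    `means` holds the (hypothetical) Bernoulli reward probability of each
    arm in A; the budget T is split round-robin across the arms, and the
    arm with the best empirical mean is recommended at the end.
    Returns the simple regret of that recommendation.
    """
    rng = random.Random(seed)
    n = len(means)
    totals = [0.0] * n  # summed rewards per arm
    counts = [0] * n    # number of pulls per arm
    for t in range(T):
        a = t % n  # round-robin: each arm gets roughly T/|A| pulls
        reward = 1.0 if rng.random() < means[a] else 0.0
        totals[a] += reward
        counts[a] += 1
    estimates = [totals[a] / max(counts[a], 1) for a in range(n)]
    recommended = max(range(n), key=lambda a: estimates[a])
    return max(means) - means[recommended]


# Example: with a moderate budget, uniform exploration usually
# identifies the best of three well-separated arms.
regret = simple_regret_uniform([0.2, 0.5, 0.8], T=3000)
```

The causal-bandit algorithms discussed in the abstract improve on this baseline by exploiting the causal-graph structure, replacing the |A| dependence with the graph-dependent quantity γ.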

Host publication title: 35th International Conference on Machine Learning, ICML 2018
Editors: Jennifer Dy, Andreas Krause
Publisher: International Machine Learning Society (IMLS)
Publication status: Published - 2018
Event: 35th International Conference on Machine Learning, ICML 2018 - Stockholm, Sweden
Duration: Jul 10 2018 - Jul 15 2018



ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Human-Computer Interaction
  • Software

