TY - GEN
T1 - Advantage Mapping
T2 - 10th Conference on Human-Agent Interaction, HAI 2022
AU - Hasegawa, Rintaro
AU - Fukuchi, Yosuke
AU - Okuoka, Kohei
AU - Imai, Michita
N1 - Funding Information:
This work was supported by JST CREST Grant number JPMJCR19A1, Japan.
Publisher Copyright:
© 2022 ACM.
PY - 2022/12/5
Y1 - 2022/12/5
AB - When a user manipulates a system, an input given through the interface (an "operation") is converted into the user's intended action according to the mapping that links operations to actions, which we call an "operation mapping". Although most operation mappings are created by designers based on assumptions about how a typical user would operate the system, the optimal operation mapping may vary from user to user, and a designer cannot prepare every possible operation mapping in advance. One approach to this problem is to learn an operation mapping autonomously during operation; however, existing methods require the manual preparation of scenes for learning mappings. We propose advantage mapping, which enables efficient learning of operation mappings. Working from the idea that scenes in which the user's desired action is predictable are useful for learning operation mappings, advantage mapping extracts scenes according to the entropy of the output of the action-value function acquired through reinforcement learning. In our experiment, the user's ideal operation mapping was obtained more accurately from the scenes selected by advantage mapping than from learning through actual play.
KW - adaptive systems
KW - intelligent user interfaces
KW - reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=85144603965&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85144603965&partnerID=8YFLogxK
U2 - 10.1145/3527188.3561917
DO - 10.1145/3527188.3561917
M3 - Conference contribution
AN - SCOPUS:85144603965
T3 - HAI 2022 - Proceedings of the 10th Conference on Human-Agent Interaction
SP - 95
EP - 103
BT - HAI 2022 - Proceedings of the 10th Conference on Human-Agent Interaction
PB - Association for Computing Machinery, Inc
Y2 - 5 December 2022 through 8 December 2022
ER -