TY - GEN
T1 - Multi-objective Reinforcement Learning for Energy Harvesting Wireless Sensor Nodes
AU - Shresthamali, Shaswot
AU - Kondo, Masaaki
AU - Nakamura, Hiroshi
N1 - Funding Information:
ACKNOWLEDGMENT This work was partially supported by JST CREST Grant Number JPMJCR20F2 and JSPS KAKENHI Grant Number 18J20946.
Publisher Copyright:
© 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - Modern Energy Harvesting Wireless Sensor Nodes (EHWSNs) need to intelligently allocate their limited and unreliable energy budget among multiple tasks to ensure long-term uninterrupted operation. Traditional solutions are ill-equipped to deal with multiple objectives and execute a posteriori tradeoffs. We propose a general Multi-objective Reinforcement Learning (MORL) framework for Energy Neutral Operation (ENO) of EHWSNs. Our proposed framework consists of a novel Multi-objective Markov Decision Process (MOMDP) formulation and two novel MORL algorithms. Using our framework, EHWSNs can learn policies to maximize multiple task-objectives and perform dynamic runtime tradeoffs. The high computation and learning costs, usually associated with powerful MORL algorithms, can be avoided by using our comparatively less resource-intensive MORL algorithms. We evaluate our framework on a general single-task and dual-task EHWSN system model through simulations and show that our MORL algorithms can successfully trade off between multiple objectives at runtime.
AB - Modern Energy Harvesting Wireless Sensor Nodes (EHWSNs) need to intelligently allocate their limited and unreliable energy budget among multiple tasks to ensure long-term uninterrupted operation. Traditional solutions are ill-equipped to deal with multiple objectives and execute a posteriori tradeoffs. We propose a general Multi-objective Reinforcement Learning (MORL) framework for Energy Neutral Operation (ENO) of EHWSNs. Our proposed framework consists of a novel Multi-objective Markov Decision Process (MOMDP) formulation and two novel MORL algorithms. Using our framework, EHWSNs can learn policies to maximize multiple task-objectives and perform dynamic runtime tradeoffs. The high computation and learning costs, usually associated with powerful MORL algorithms, can be avoided by using our comparatively less resource-intensive MORL algorithms. We evaluate our framework on a general single-task and dual-task EHWSN system model through simulations and show that our MORL algorithms can successfully trade off between multiple objectives at runtime.
KW - DDPG
KW - Energy Harvesting Wireless Sensor Nodes
KW - Multi-objective Reinforcement Learning
KW - Reinforcement Learning
UR - http://www.scopus.com/inward/record.url?scp=85126713810&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85126713810&partnerID=8YFLogxK
U2 - 10.1109/MCSoC51149.2021.00022
DO - 10.1109/MCSoC51149.2021.00022
M3 - Conference contribution
AN - SCOPUS:85126713810
T3 - Proceedings - 2021 IEEE 14th International Symposium on Embedded Multicore/Many-Core Systems-on-Chip, MCSoC 2021
SP - 98
EP - 105
BT - Proceedings - 2021 IEEE 14th International Symposium on Embedded Multicore/Many-Core Systems-on-Chip, MCSoC 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 14th IEEE International Symposium on Embedded Multicore/Many-Core Systems-on-Chip, MCSoC 2021
Y2 - 20 December 2021 through 23 December 2021
ER -