TY - GEN
T1 - Can Linux be rejuvenated without reboots?
AU - Yoshimura, Takeshi
AU - Yamada, Hiroshi
AU - Kono, Kenji
N1 - Copyright 2012 Elsevier B.V., All rights reserved.
PY - 2011
Y1 - 2011
AB - Operating systems (OSes) are crucial for achieving high availability of computer systems. Even if the applications running on the operating system are highly available, a bug inside the kernel may result in a failure of the entire software stack. Rejuvenating OSes is a promising approach to prevent and recover from transient errors. Unfortunately, OS rejuvenation takes a lot of time because we do not have any method other than rebooting the entire OS. In this paper we explore the possibility of rejuvenating Linux without reboots. In our previous research, we investigated the scope of error propagation in Linux. The propagation scope is process-local if the error is confined in the process context that activated it. The scope is kernel-global if the error propagates to other processes' contexts or global data structures. If most errors are process-local, we can rejuvenate the Linux kernel without rebooting the entire kernel because the kernel goes back to a consistent and clean state simply by killing and revoking the resources of the faulting process. Our conclusion is that Linux can be rejuvenated without reboots with high probability. Linux is coded in a defensive way and thus, most of the manifested errors (96%) were process-local and only one error was kernel-global.
KW - Fault Injection
KW - Operating System Dependability
KW - Rejuvenation
KW - Scope of Error Propagation
KW - Software Faults
UR - http://www.scopus.com/inward/record.url?scp=84857174776&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84857174776&partnerID=8YFLogxK
U2 - 10.1109/WoSAR.2011.12
DO - 10.1109/WoSAR.2011.12
M3 - Conference contribution
AN - SCOPUS:84857174776
SN - 9780769546162
T3 - Proceedings - 2011 3rd International Workshop on Software Aging and Rejuvenation, WoSAR 2011
SP - 50
EP - 55
BT - Proceedings - 2011 3rd International Workshop on Software Aging and Rejuvenation, WoSAR 2011
T2 - 3rd International Workshop on Software Aging and Rejuvenation, WoSAR 2011
Y2 - 29 November 2011 through 1 December 2011
ER -