Peter Enderborg, an engineer from Sony, today proposed a new "soft watchdog" in a post to the Linux kernel mailing list (LKML). Watchdogs, also known as watchdog timers or Computer Operating Properly (COP) timers, are generally used in computers to detect unrecoverable errors and reset the system when they occur. Some watchdogs perform the same function when the system runs out of memory (OOM).
However, the new soft watchdog, according to Peter Enderborg, will not perform a hard reboot and will instead take a "pre defined action" to try to kill any unimportant process that is causing the low-memory situation. This could be done using the "oom_score_adj": the watchdog could kill the processes that have the highest oom_score_adj.
For those wondering, the "oom_score" is a score the Linux kernel assigns to each running process, based largely on how much memory the process uses; the higher the score, the more likely the process is to be killed when memory runs out. The "oom_score_adj", by contrast, is a per-process adjustment to that score. It helps the system decide which process to kill so that no important process is killed off in an OOM scenario.
The oom_score_adj ranges from -1000 to 1000, where a higher number means the process is less important to the system at that time and may be terminated in an OOM situation.
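To make these two values concrete, here is a minimal userspace sketch in Python that reads them from /proc and lists processes with the highest oom_score_adj first. It is only an illustration of the files described above, not the code from Enderborg's proposal; the helper names read_int and oom_candidates, and the idea of ranking by oom_score_adj and then oom_score, are assumptions made for this example.

```python
import os


def read_int(path):
    """Return the integer stored in a procfs-style file, or None if unreadable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        # The process may have exited, or we may lack permission.
        return None


def oom_candidates(proc_root="/proc"):
    """Return (pid, oom_score_adj, oom_score) tuples, highest oom_score_adj first.

    Hypothetical helper for illustration; ties on oom_score_adj are
    broken by the higher oom_score.
    """
    rows = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        base = os.path.join(proc_root, entry)
        adj = read_int(os.path.join(base, "oom_score_adj"))
        score = read_int(os.path.join(base, "oom_score"))
        if adj is None or score is None:
            continue
        rows.append((int(entry), adj, score))
    rows.sort(key=lambda r: (r[1], r[2]), reverse=True)
    return rows


if __name__ == "__main__":
    # Print the five processes that rank highest under this assumed ordering.
    for pid, adj, score in oom_candidates()[:5]:
        print(f"pid={pid} oom_score_adj={adj} oom_score={score}")
```

On a typical Linux system, most processes will show an oom_score_adj of 0, with a few system services set to negative values to protect them.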
This proposal is still a Request for Comments (RFC), so it'll be interesting to see where it leads. You can find the LKML post about the soft watchdog here.