HWRF  trunk@4391
sigsafety.py
"""!Sets up signal handlers to ensure a clean exit.

This module is a workaround for a deficiency of Python. When Python
receives a fatal signal other than SIGINT, it exits immediately
without freeing utilized resources or otherwise cleaning up. This
module causes Python to raise a fatal exception that does NOT derive
from Exception if a fatal signal is received. Note that there is a
critical flaw in this design: raising an exception in a signal handler
only raises it in the main (initial) thread. Other threads must call
the produtil.sigsafety.checksig function as frequently as possible to
check if a signal has been caught. That function will raise the
appropriate exception if a signal was caught, or return immediately
otherwise.

The reason this HAD to be added to produtil is that the lack of proper
signal handling caused major problems. In particular, it completely
broke file locking on Lustre and Panasas. Both filesystems will
sometimes keep treating a file lock as held if the process that held
it exited abnormally. There were also unverified cases of this
happening with GPFS. Correctly handling SIGTERM, SIGQUIT, SIGHUP
and SIGINT has solved that problem thus far.

The base class of any exception thrown due to a signal is CaughtSignal.
It has two subclasses: FatalSignal, which is raised when a fatal
signal is received, and HangupSignal. HangupSignal is raised by
SIGHUP, unless the call to install_handlers requests otherwise.
Scripts should catch HangupSignal if the program is intended to
ignore hangups. However, nothing should ever catch FatalSignal. Only
__exit__ methods and finally blocks should be run in that situation,
and they should run as quickly as possible.

The install_handlers function installs the signal handlers:
term_handler and optionally hup_handler. The raise_signals option
specifies the list of signals that will raise FatalSignal, defaulting
to SIGTERM, SIGINT and SIGQUIT. If SIGHUP is added to that list, then
it will raise FatalSignal as well. Otherwise, the ignore_hup option
controls the SIGHUP behavior: if True, SIGHUP is simply ignored;
otherwise it raises HangupSignal.

One can call install_handlers directly, though it is recommended to
call produtil.setup.setup instead."""

import signal

# Both modules are used by term_handler below.
import produtil.locking
import produtil.pipeline

##@var defaultsigs
#Default signals for which to install terminal handlers.
defaultsigs=[signal.SIGTERM,signal.SIGINT,signal.SIGQUIT]

##@var modifiedsigs
#List of signals modified by install_handlers
modifiedsigs=list()

##@var __all__
# List of symbols exported by "from produtil.sigsafety import *"
__all__=['CaughtSignal','HangupSignal','FatalSignal','install_handlers','checksig']

class CaughtSignal(KeyboardInterrupt):
    """!Base class of the exceptions thrown when a signal is caught.
    Note that this does not derive from Exception, to ensure it is not
    caught accidentally. At present, it derives directly from
    KeyboardInterrupt, though that may be changed in the future to
    BaseException."""
    def __init__(self,signum):
        """!CaughtSignal constructor
        @param signum the signal that was caught (an int)"""
        BaseException.__init__(self)
        self.signum=signum
        ##@var signum
        # The integer signal number.

    def __str__(self):
        """! A string description of this error."""
        return 'Caught signal %d'%(self.signum,)

class HangupSignal(CaughtSignal):
    """!With the default settings to install_handlers, this is raised
    when a SIGHUP is caught. Note that this does not derive from
    Exception."""

class FatalSignal(CaughtSignal):
    """!Raised when a fatal signal is caught, as defined by the call to
    install_handlers. Note that this does not derive from
    Exception."""

##@var caught_signal
#The signal number of the signal that was caught or None if no
#signal has been caught. This is initialized by the signal handlers,
#and used by checksig to raise exceptions due to caught signals.
caught_signal=None

##@var caught_class
#The class that should be raised due to the caught signal, or None
#if no signal has been caught. This is initialized by the signal
#handlers, and used by checksig to raise exceptions due to caught
#signals.
caught_class=None

def checksig():
    """!This should be called frequently from worker threads to
    determine if the main thread has received a signal. If a signal
    was caught this function will raise the appropriate subclass of
    CaughtSignal. Otherwise, it returns None."""
    global caught_signal,caught_class
    cs=caught_signal
    cc=caught_class
    if cs is not None and cc is not None:
        raise cc(cs)
    return None

def uninstall_handlers():
    """!Resets all signal handlers to their system-default settings
    (SIG_DFL). Does NOT restore the original handlers.

    This function is a workaround for a design flaw in Python
    threading: you cannot kill a thread. This workaround restores
    default signal handlers after a signal is caught, ensuring the
    next signal will entirely terminate Python. Only the term_handler
    calls this function, so repeated hangups will still be ignored if
    the code desires it.

    Some may note you can kill a Python thread on Linux using a
    private function but it is not available on all platforms and
    breaks GC. Another common workaround in Python is to use
    Thread.daemon, but that kills the thread immediately, preventing
    the thread from killing external processes or cleaning up other
    resources upon parent exit."""
    for isig in modifiedsigs:
        signal.signal(isig,signal.SIG_DFL)

def hup_handler(signum,frame):
    """!This is the signal handler for raising HangupSignal: it is used
    only for SIGHUP, and only if that is not specified in
    raise_signals and ignore_hup=False.
    @param signum,frame signal information"""
    global caught_signal,caught_class
    caught_signal=signum
    caught_class=HangupSignal

    raise HangupSignal(signum)

def term_handler(signum,frame):
    """!This is the signal handler for raising FatalSignal.
    @param signum,frame signal information"""
    global caught_signal,caught_class
    caught_signal=signum
    caught_class=FatalSignal

    produtil.locking.disable_locking() # forbid file locks
    produtil.pipeline.kill_all() # kill all subprocesses
    uninstall_handlers() # the next signal will terminate Python outright
    raise FatalSignal(signum)

def install_handlers(ignore_hup=False,raise_signals=defaultsigs):
    """!Installs signal handlers that will raise exceptions.

    @param ignore_hup If True, SIGHUP is ignored, else SIGHUP will
      raise HangupSignal

    @param raise_signals List of signals that will raise
      FatalSignal. If SIGHUP is in this list, that overrides
      any decision made through ignore_hup."""
    global modifiedsigs
    if ignore_hup:
        signal.signal(signal.SIGHUP,signal.SIG_IGN)
    elif signal.SIGHUP not in raise_signals:
        signal.signal(signal.SIGHUP,hup_handler)
    for sig in raise_signals:
        signal.signal(sig,term_handler)
    modifiedsigs=list(raise_signals)
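
The interaction between ignore_hup and raise_signals is easy to misread, so here is a minimal sketch of the decision rules that install_handlers applies. The helper name plan_handlers, the DEFAULT_SIGS constant, and the string labels are hypothetical and exist only for illustration. Nothing is actually installed, and a POSIX platform is assumed because SIGHUP and SIGQUIT are used.

```python
import signal

# Mirrors defaultsigs from the listing (POSIX signals assumed).
DEFAULT_SIGS = [signal.SIGTERM, signal.SIGINT, signal.SIGQUIT]

def plan_handlers(ignore_hup=False, raise_signals=DEFAULT_SIGS):
    """Hypothetical helper: return {signal: action} using the same
    branching order as install_handlers, without touching any
    real signal handlers."""
    plan = {}
    if ignore_hup:
        plan[signal.SIGHUP] = 'ignore'
    elif signal.SIGHUP not in raise_signals:
        plan[signal.SIGHUP] = 'HangupSignal'
    # Applied last, so SIGHUP in raise_signals overrides ignore_hup.
    for sig in raise_signals:
        plan[sig] = 'FatalSignal'
    return plan
```

Because the raise_signals loop runs last, plan_handlers(ignore_hup=True, raise_signals=[signal.SIGHUP]) maps SIGHUP to 'FatalSignal', which matches the documented override.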
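
Since the exception is only raised in the main thread, worker threads have to poll. The following is a minimal, self-contained sketch of that polling pattern. It uses stand-in copies of FatalSignal, checksig, and the two module-level flags rather than importing produtil. The worker and demo functions are hypothetical, and the signal is simulated by setting the flags directly instead of delivering a real SIGTERM.

```python
import signal
import threading
import time

class FatalSignal(KeyboardInterrupt):
    """Stand-in for produtil.sigsafety.FatalSignal."""
    def __init__(self, signum):
        KeyboardInterrupt.__init__(self)
        self.signum = signum

# Stand-ins for the module-level flags set by the signal handlers.
caught_signal = None
caught_class = None

def checksig():
    """Stand-in for produtil.sigsafety.checksig."""
    cs, cc = caught_signal, caught_class
    if cs is not None and cc is not None:
        raise cc(cs)

def worker(log):
    """Hypothetical worker: polls checksig between short units of work."""
    try:
        while True:
            checksig()        # raises FatalSignal once a signal is recorded
            time.sleep(0.01)  # placeholder for one short unit of real work
    finally:
        # FatalSignal is not caught; only cleanup runs, and it is quick.
        log.append('cleaned up')

def demo():
    """Start a worker, simulate a recorded SIGTERM, and report the result."""
    global caught_signal, caught_class
    caught_signal, caught_class = None, None
    log = []
    thread = threading.Thread(target=worker, args=(log,))
    thread.start()
    # Simulate what term_handler records in the main thread.
    caught_signal, caught_class = int(signal.SIGTERM), FatalSignal
    thread.join(5.0)
    return log, thread.is_alive()
```

The worker deliberately uses try/finally rather than an except clause, in keeping with the rule that nothing should catch FatalSignal. As a consequence the exception propagates out of the thread, and Python's threading machinery reports a traceback on stderr when the thread ends.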