Signals as a Linux Debugging Tool

Eugenia Loli 2005-12-01 Linux 10 Comments

This is an interesting method of speeding up your debugging phase. This article gives a background on Linux signals with examples specifically tested on PPC Linux, then goes on to show how to design your handlers to output information that lets you quickly home in on failed portions of code.

10 Comments

2005-12-01 7:24 pm
ma_d
Whoever finds all these IBM articles, thank you, they’re almost always good…

2005-12-01 8:33 pm
Anonymous
printf is not safe to call in a signal handler.

2005-12-01 8:25 pm
Anonymous
“Whoever finds all these IBM articles, thank you, they’re almost always good…”
Do I smell irony here?

2005-12-01 8:28 pm
Anonymous
Couldn’t all the info dumped when the process receives a signal in this article be garnered by examining the core file? I don’t see a point …

2005-12-02 1:44 am
james_parker
“Couldn’t all the info dumped when the process receives a signal in this article be garnered by examining the core file?”
Yes, if (a) a core file is immediately generated, and (b) a debugger is available on the target system.
In some cases, it is preferable to log errors and keep running rather than generate a core file. In others, such as at customer sites (for those who distribute their software), a debugger may not be available on the machine running the software. The code might also be running in an environment (embedded or quasi-embedded) where writing a complete core file is not reasonable, even if the process does exit immediately.
This technique might also be used to repair some errors or gather additional information about the process while it still runs (for example, a handler might log the point from which it was called and be used to determine whether a program or thread is in an infinite loop).
It might also be used to determine the frequency of certain operations or requests without slowing the code under normal conditions (for example, by sending a signal every 10 ms or so and letting the handler log the location it was interrupted from). This adds zero overhead to an executable under normal processing while still providing the functionality when needed.

2005-12-02 12:30 pm
Anonymous
Those are good points, but you have to add this debugging on a per application basis.
In the firmware environment I’ve run in, we log this type of information (registers and stack) in a very low level fashion from the kernel when bad things happen. Our philosophy is crash early, crash often.

2005-12-02 5:40 pm
Anonymous
“Crash early, crash often…”
Agreed. Being forced to fix it immediately is the only way to end up with reliable code.

2005-12-02 10:58 pm
james_parker
This could be added to a library rather than to individual applications, but that does not eliminate the usefulness of the technique.
As for “crash early, crash often”, that is true in most development and test environments, but not necessarily in every environment. Post-deployment, frequent crashing is often not a reasonable alternative, and in some development and test environments, it is preferable to leave things running.
Not every technique works well in every situation, and it is useful for developers to have as wide a toolkit of techniques as possible to deal with such cases.

2005-12-02 2:06 am
Anonymous
many errors in this writeup. but you can also see these errors if you google for ‘printf signal handler’. many people giving the same bad, bad, bad advice.
you cannot call printf in a signal handler.
signal handlers are like interrupt handlers; they could happen at any time while the main code block is doing who knows what; even internally inside printf, malloc, stdio, who knows what.
there is a definite set of rules as to what you can do in a signal handler.
1) POSIX and SUS (the Single UNIX Specification) define a list of safe functions. some other systems have additional functions which are safe; depending on how portable you want to be.
2) any global variable accessed from inside the signal handler should be of type ‘volatile sig_atomic_t’ to ensure that reads and writes to it happen safely and atomically without compiler-level caching. this matters even more on RISC processors.
for further explanations, please see the following manual page which makes an attempt to detail some of the issues:
http://www.openbsd.org/cgi-bin/man.cgi?query=signal&apropos=0&sekti…
(One interesting thing is that OpenBSD has a signal handler safe snprintf, which would allow you to actually replace many printf type problems with a snprintf + write combination).
it should scare us all when vendors publish articles which show they know so little about programming in the unix environment. but a google search will show how many other people get this wrong.
and please go read some actual code to see how bad the real situation is…

2005-12-02 10:52 pm
james_parker
This is generally true, although a version of printf could be written that may be called safely. (I’ve done that in inherited code to get past such issues until I could clean it up properly.)
I took its use to be “generally illustrative” rather than detailed recommended practice (I suspect the author wasn’t considering the printf call to be part of the technique but used it as shorthand for saying “print information here”).
The author should have recognized this, though, and included a warning about this practice, or simply left the print example as a comment.
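
As a rough sketch of what a handler-safe replacement for one printf use could look like (not the version mentioned above, which isn’t shown): the code below formats an unsigned integer into a stack buffer by hand and emits it with write(2), avoiding stdio and malloc entirely. The function names fmt_ulong, safe_format_line and safe_log_ulong are hypothetical.

```c
#include <stddef.h>
#include <unistd.h>

/* Write value as decimal digits into buf (no terminating NUL).
 * Returns the number of characters stored, or 0 if buf is too small. */
size_t fmt_ulong(unsigned long value, char *buf, size_t buflen)
{
    char tmp[3 * sizeof(unsigned long)];   /* enough digits for any unsigned long */
    size_t n = 0;

    do {                                    /* digits come out least significant first */
        tmp[n++] = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0);

    if (n > buflen)
        return 0;
    for (size_t i = 0; i < n; i++)          /* reverse into the caller's buffer */
        buf[i] = tmp[n - 1 - i];
    return n;
}

/* Build "<label><value>\n" in out. Returns the number of bytes stored. */
size_t safe_format_line(char *out, size_t outlen, const char *label,
                        unsigned long value)
{
    size_t pos = 0;

    while (*label != '\0' && pos < outlen)  /* manual copy, no string.h calls */
        out[pos++] = *label++;
    pos += fmt_ulong(value, out + pos, outlen - pos);
    if (pos < outlen)
        out[pos++] = '\n';
    return pos;
}

/* Format on the stack and emit with write(2); intended for use in a handler. */
void safe_log_ulong(int fd, const char *label, unsigned long value)
{
    char line[128];
    size_t len = safe_format_line(line, sizeof line, label, value);
    ssize_t rc = write(fd, line, len);
    (void)rc;
}
```

Inside a handler, a call such as safe_log_ulong(STDERR_FILENO, "signal=", (unsigned long)signo) would stand in for the printf call. A handler that must preserve errno should still save and restore it around the write.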