[sf-lug] Pointers for how to track down source of system hang

Mark K. Zanfardino mark at zanfardinoconsulting.com
Fri Feb 1 08:36:09 PST 2008


Rick, et al.,

It's reassuring that I am pursuing the right train of thought.  I've
started a list and will proceed to catalog this-and-that through my
regular work week.  Maybe something will show up in the details.

I appreciate the feedback from everyone.  I'm currently looking to see
if I might be able to leverage LKCD
(http://www.faqs.org/docs/Linux-HOWTO/Linux-Crash-HOWTO.html)
utilities.  It appears that Ubuntu has in its repos some of the utils
mentioned in the article:
 apt-cache search lkcd
 crash - kernel debugging utility, allowing gdb like syntax
 dumputils - simple configuration and dump recovery utilities for LKCD
 lcrash - debugger to analyze and debug LKCD kernel crash dumps
 lcrash-dev - development files required to analyze LKCD kernel crash dumps
 crash-whitepaper - Whitepaper for crash kernel debugging utility

Of course, as stated in the article and on this list, if the issue is in
fact hardware and it does not generate a panic state, then these utils
will not yield any results.  Only time will tell.
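
Before leaning on crash-dump tools, it may be worth checking whether the
kernel logged a panic or oops at all.  The following is only a rough
sketch: the marker strings are an illustrative, non-exhaustive guess at
what such traces look like, the function names are hypothetical, and the
default log path is an assumption that varies by distribution.

```python
import re

# Strings the kernel commonly prints on a panic or oops.
# Illustrative assumption only; not an exhaustive list.
PANIC_MARKERS = re.compile(r"Kernel panic|Oops:|\bBUG:")


def find_panic_lines(lines):
    """Return (line_number, text) pairs for lines that look like crash traces."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if PANIC_MARKERS.search(line)
    ]


def scan_log(path="/var/log/kern.log"):
    """Scan a kernel log file; the default path is an assumption."""
    with open(path, errors="replace") as handle:
        return find_panic_lines(handle)
```

An empty result from scan_log() would be consistent with a hang that
never reached a panic state, though it does not by itself prove a
hardware cause.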

Cheers!

Mark

Rick Moen wrote:
> Quoting Mark K. Zanfardino (mark at zanfardinoconsulting.com):
>
> [Trying to track down mysterious system hangs, you've run your system
> with and without Compiz, and with and without some sound driver or other:]
>
>   
>> As an aside, it has at times remained very stable over as long as a week 
>> or more with both Compiz running and the sound driver installed, which 
>> makes me think it's going to be something else.  The key for me is to be 
>> sure I'm checking all relevant logs and that I have all possible logging 
>> enabled.
>>     
>
> Well, whether you'd find anything useful in logfiles depends on the root
> cause of your symptom.  _Some_ root causes, including most hardware
> glitches, wouldn't leave much, or might leave nothing, at the software
> level.   
>
> You've definitely looked at all the places (system logfiles) I'd check
> if I suspected some system-wide problem such as a major hardware glitch.
> Since that didn't help (yielded no information), a change of tactics may
> be in order.
>
> The really annoying bit is that diagnosis in such cases, although it
> always reaches a definitive conclusion if you persist and are careful
> about your assumptions and judgements, takes time away from, y'know,
> using the computer for doing something useful.  For example, in your
> shoes, I might be tempted to use one or two live-CD Linux distributions
> for a few days, deliberately pushing the hardware hard, e.g., chewing up
> lots of RAM, banging on the CPU, etc., and see if the symptom persists.
>
> If it doesn't, then it's something about the installed software.  If it
> does, then it's probably a hardware problem.
>
> It's helpful to maintain a list, either on a notepad or in your mind (if
> you can) of all the suspects that _could_ be causing your problem.  As
> you try various things, watch carefully to see if you've ruled any of
> them out.
>
> There are also a couple of ways to torture-test ("burn-in") your
> hardware in cases where you suspect a problem there, which I can post
> about separately.
>
>   
>> I'm still relatively new to Linux....
>>     
>
> Ah, that explains the relative sanity.  We can help with _that_ problem,
> at least.  ;->
>
>
> _______________________________________________
> sf-lug mailing list
> sf-lug at linuxmafia.com
> http://linuxmafia.com/mailman/listinfo/sf-lug
>
>
>   
