direct-io.hg

changeset 11507:3b045a00e703

[POWERPC][XEN] Detect bad spurious interrupt condition and panic instead of hang

When handing off the MPIC from Xen to Dom0, which is the current but
not permanent design, the MPIC can cause the processor to assert an
external interrupt when none is available. Rather than simply hanging
in this condition, we now panic so the user can see that there is
indeed a problem and identify it as this one.

This condition seems to be related to temperature, and the probability
of it occurring decreases if the machine is allowed to stay idle (not
in the Xen panic loop) for a minute or two.
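
The detection described above amounts to counting consecutive spurious
interrupts and giving up once the count passes a limit. The following is
a minimal standalone sketch of that idea, not the code in this changeset:
the names spurious_tracker, spurious_tracker_record and SPURIOUS_LIMIT are
hypothetical, and only the threshold of 100 is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical limit; 100 mirrors the threshold used in the patch. */
#define SPURIOUS_LIMIT 100u

/* Hypothetical helper type: tracks spurious interrupts seen in a row. */
struct spurious_tracker {
    unsigned consecutive;
};

/*
 * Record the outcome of one external interrupt.
 * `handled` is true when a real vector was found and serviced.
 * Returns true once more than SPURIOUS_LIMIT spurious interrupts have
 * arrived back to back, i.e. when the caller should stop and report
 * the problem instead of looping forever.
 */
static bool spurious_tracker_record(struct spurious_tracker *t, bool handled)
{
    assert(t != NULL);

    if (handled) {
        /* A real interrupt breaks the streak. */
        t->consecutive = 0;
        return false;
    }

    ++t->consecutive;
    return t->consecutive > SPURIOUS_LIMIT;
}
```

A caller would invoke spurious_tracker_record() once per external
interrupt and panic when it returns true. Resetting the count on every
serviced interrupt means that only an unbroken run of spurious
interrupts trips the limit, so occasional isolated ones are tolerated.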

Signed-off-by: Jimi Xenidis <jimix@watson.ibm.com>
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
author Jimi Xenidis <jimix@watson.ibm.com>
date Tue Sep 12 06:47:22 2006 -0400 (2006-09-12)
parents 3bd92176890c
children 13e406c85c8b
files xen/arch/powerpc/external.c
diff
--- a/xen/arch/powerpc/external.c	Fri Sep 08 12:28:49 2006 -0500
+++ b/xen/arch/powerpc/external.c	Tue Sep 12 06:47:22 2006 -0400
@@ -75,6 +75,7 @@ void deliver_ee(struct cpu_user_regs *re
 void do_external(struct cpu_user_regs *regs)
 {
     int vec;
+    static unsigned spur_count;
 
     BUG_ON(!(regs->msr & MSR_EE));
     BUG_ON(mfmsr() & MSR_EE);
@@ -87,6 +88,14 @@ void do_external(struct cpu_user_regs *r
         do_IRQ(regs);
 
         BUG_ON(mfmsr() & MSR_EE);
+        spur_count = 0;
+    } else {
+        ++spur_count;
+        if (spur_count > 100)
+            panic("Too many (%d) spurrious interrupts in a row\n"
+                  "  Known problem, please halt and let machine idle/cool "
+                  "  then reboot\n",
+                  100);
     }
 }