ia64/xen-unstable

changeset 16599:514d450ad729

Fix gdb debugging of hypervisor.

This patch:
* enables the gdbstubs to properly access hypervisor memory;
* prevents an assertion failure in __spurious_page_fault's call
to map_domain_page if such accesses fail, by testing in_irq();
* prints some additional helpful messages;
* fixes the endianness of register transfers from the gdbstubs
so that gdb is much less confused;
* fixes the documentation in docs/misc/crashdb.txt.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
author Keir Fraser <keir.fraser@citrix.com>
date Wed Dec 12 11:27:15 2007 +0000 (2007-12-12)
parents f2f7c92bf1c1
children d54bcd738247
files docs/misc/crashdb.txt xen/arch/x86/gdbstub.c xen/arch/x86/traps.c xen/common/gdbstub.c xen/common/keyhandler.c xen/include/xen/gdbstub.h
line diff
     1.1 --- a/docs/misc/crashdb.txt	Wed Dec 12 11:08:21 2007 +0000
     1.2 +++ b/docs/misc/crashdb.txt	Wed Dec 12 11:27:15 2007 +0000
     1.3 @@ -5,31 +5,46 @@ Xen has a simple gdb stub for doing post
     1.4  you've crashed it, you get to poke around and find out why.  There's
     1.5  also a special key handler for making it crash, which is handy.
     1.6  
     1.7 -You need to have crash_debug=y set when compiling to enable the crash
     1.8 -debugger (so go ``export crash_debug=y; make'', or ``crash_debug=y
     1.9 -make'' or ``make crash_debug=y''), and you also need to enable it on
    1.10 -the Xen command line, by going e.g. cdb=com1.  If you need to have a
    1.11 -serial port shared between cdb and the console, try cdb=com1H.  CDB
    1.12 -will then set the high bit on every byte it sends, and only respond to
    1.13 -bytes with the high bit set.  Similarly for com2.
    1.14 +You need to have crash_debug=y set when compiling, and you also need
    1.15 +to enable it on the Xen command line, e.g. by gdb=com1.
    1.16  
    1.17 -The next step depends on your individual setup.  This is how to do
    1.18 -it for a normal test box in the SRG:
    1.19 +If you need to have a serial port shared between gdb and the console,
    1.20 +you can use gdb=com1H.  CDB will then set the high bit on every byte
    1.21 +it sends, and only respond to bytes with the high bit set.  Similarly
    1.22 +for com2.  If you do this you will need a demultiplexing program on
    1.23 +the debugging workstation, such as perhaps tools/misc/nsplitd.
    1.24 +
    1.25 +The next step depends on your individual setup.  This is how to do it
    1.26 +if you have a simple null modem connection between the test box and
    1.27 +the workstation, and aren't using a H/L split console:
    1.28  
    1.29 --- Make your test machine crash.  Either a normal panic or hitting
    1.30 -   'C-A C-A C-A %' on the serial console will do.
    1.31 --- Start gdb as ``gdb ./xen-syms''
    1.32 --- Go ``target remote serial.srg:12331'', where 12331 is the second port
    1.33 -   reported for that machine by xenuse. (In this case, the machine is
    1.34 -   bombjack)
    1.35 --- Go ``add-symbol-file vmlinux''
    1.36 --- Debug as if you had a core file
    1.37 --- When you're finished, go and reboot your test box.  Hitting 'R' on the
    1.38 -   serial console won't work.
    1.39 +  * Set debug=y in Config.mk
    1.40 +  * Set crash_debug=y in xen/Rules.mk
    1.41 +  * Make the changes in the attached patch, and build.
    1.42 +  * Arrange to pass gdb=com1 as a hypervisor command line argument
    1.43 +    (I already have com1=38400,8n1 console=com1,vga sync_console)
    1.44 +    
    1.45 +  * Boot the system with minicom (or your favourite terminal program)
    1.46 +    connected from your workstation via a null modem cable in the
    1.47 +    usual way.
    1.48 +  * In minicom, give the escape character (^A by default) three times
    1.49 +    to talk to Xen (Xen prints `(XEN) *** Serial input -> Xen...').
    1.50 +  * Press % and observe the messages
    1.51 +     (XEN) '%' pressed -> trapping into debugger
    1.52 +     (XEN) GDB connection activated.
    1.53 +     (XEN) Waiting for GDB to attach...
    1.54 +  * Disconnect from minicom without allowing minicom to send any
    1.55 +    modem control sequences.
    1.56 +  * Start gdb with   gdb /path/to/build/tree/xen/xen-syms  and then
    1.57 +      (gdb) set remotebaud 38400
    1.58 +      Remote debugging using /dev/ttyS0
    1.59 +      0xff124d61 in idle_loop () at domain.c:78
    1.60 +      78              safe_halt();
    1.61 +      (gdb)
    1.62  
    1.63 -At one stage, it was sometimes possible to resume after entering the
    1.64 -debugger from the serial console.  This seems to have rotted, however,
    1.65 -and I'm not terribly interested in putting it back.
    1.66 +There is code which was once intended to make it possible to resume
    1.67 +after entering the debugger.  However this does not presently work; it
    1.68 +has been nonfunctional for quite some time.
    1.69  
    1.70  As soon as you reach the debugger, we disable interrupts, the
    1.71  watchdog, and every other CPU, so the state of the world shouldn't
    1.72 @@ -44,7 +59,5 @@ Reasons why we might fail to reach the d
    1.73     you're screwed.
    1.74  -- If the page tables are wrong, you're screwed
    1.75  -- If the serial port setup is wrong, badness happens
    1.76 --- We acquire the console lock at one stage XXX this is unnecessary and
    1.77 -   stupid
    1.78  -- Obviously, the low level processor state can be screwed in any
    1.79     number of wonderful ways
     2.1 --- a/xen/arch/x86/gdbstub.c	Wed Dec 12 11:08:21 2007 +0000
     2.2 +++ b/xen/arch/x86/gdbstub.c	Wed Dec 12 11:27:15 2007 +0000
     2.3 @@ -71,18 +71,20 @@ gdb_arch_read_reg(unsigned long regnum, 
     2.4      gdb_send_reply("", ctx);
     2.5  }
     2.6  
     2.7 -/* Like copy_from_user, but safe to call with interrupts disabled.
     2.8 -   Trust me, and don't look behind the curtain. */
     2.9 +/*
    2.10 + * Use __copy_*_user to make us page-fault safe, but not otherwise restrict
    2.11 + * our access to the full virtual address space.
    2.12 + */
    2.13  unsigned int
    2.14  gdb_arch_copy_from_user(void *dest, const void *src, unsigned len)
    2.15  {
    2.16 -    return copy_from_user(dest, src, len);
    2.17 +    return __copy_from_user(dest, src, len);
    2.18  }
    2.19  
    2.20  unsigned int 
    2.21  gdb_arch_copy_to_user(void *dest, const void *src, unsigned len)
    2.22  {
    2.23 -    return copy_to_user(dest, src, len);
    2.24 +    return __copy_to_user(dest, src, len);
    2.25  }
    2.26  
    2.27  void 
     3.1 --- a/xen/arch/x86/traps.c	Wed Dec 12 11:08:21 2007 +0000
     3.2 +++ b/xen/arch/x86/traps.c	Wed Dec 12 11:27:15 2007 +0000
     3.3 @@ -783,8 +783,8 @@ asmlinkage void do_invalid_op(struct cpu
     3.4      predicate = is_kernel(bug_str.str) ? (char *)bug_str.str : "<unknown>";
     3.5      printk("Assertion '%s' failed at %.50s:%d\n",
     3.6             predicate, filename, lineno);
     3.7 +    show_execution_state(regs);
     3.8      DEBUGGER_trap_fatal(TRAP_invalid_op, regs);
     3.9 -    show_execution_state(regs);
    3.10      panic("Assertion '%s' failed at %.50s:%d\n",
    3.11            predicate, filename, lineno);
    3.12  
    3.13 @@ -912,6 +912,14 @@ static int __spurious_page_fault(
    3.14      l1_pgentry_t l1e, *l1t;
    3.15      unsigned int required_flags, disallowed_flags;
    3.16  
    3.17 +    /*
    3.18 +     * We do not take spurious page faults in IRQ handlers as we do not
    3.19 +     * modify page tables in IRQ context. We therefore bail here because
    3.20 +     * map_domain_page() is not IRQ-safe.
    3.21 +     */
    3.22 +    if ( in_irq() )
    3.23 +        return 0;
    3.24 +
    3.25      /* Reserved bit violations are never spurious faults. */
    3.26      if ( regs->error_code & PFEC_reserved_bit )
    3.27          return 0;
     4.1 --- a/xen/common/gdbstub.c	Wed Dec 12 11:08:21 2007 +0000
     4.2 +++ b/xen/common/gdbstub.c	Wed Dec 12 11:27:15 2007 +0000
     4.3 @@ -43,6 +43,7 @@
     4.4  #include <xen/smp.h>
     4.5  #include <xen/console.h>
     4.6  #include <xen/errno.h>
     4.7 +#include <asm/byteorder.h>
     4.8  
     4.9  /* Printk isn't particularly safe just after we've trapped to the
    4.10     debugger. so avoid it. */
    4.11 @@ -215,8 +216,7 @@ void
    4.12  gdb_write_to_packet_hex(unsigned long x, int int_size, struct gdb_context *ctx)
    4.13  {
    4.14      char buf[sizeof(unsigned long) * 2 + 1];
    4.15 -    int i = sizeof(unsigned long) * 2;
    4.16 -    int width = int_size * 2;
    4.17 +    int i, width = int_size * 2;
    4.18  
    4.19      buf[sizeof(unsigned long) * 2] = 0;
    4.20  
    4.21 @@ -233,6 +233,8 @@ gdb_write_to_packet_hex(unsigned long x,
    4.22          break;
    4.23      }
    4.24  
    4.25 +#ifdef __BIG_ENDIAN
    4.26 +	i = sizeof(unsigned long) * 2;
    4.27      do {
    4.28          buf[--i] = hex2char(x & 15);
    4.29          x >>= 4;
    4.30 @@ -242,6 +244,17 @@ gdb_write_to_packet_hex(unsigned long x,
    4.31          buf[--i] = '0';
    4.32  
    4.33      gdb_write_to_packet(&buf[i], width, ctx);
    4.34 +#elif defined(__LITTLE_ENDIAN)
    4.35 +	i = 0;
    4.36 +	while (i < width) {
    4.37 +		buf[i++] = hex2char(x>>4);
    4.38 +		buf[i++] = hex2char(x);
    4.39 +		x >>= 8;
    4.40 +	}
    4.41 +	gdb_write_to_packet(buf, width, ctx);
    4.42 +#else
    4.43 +# error unknown endian
    4.44 +#endif
    4.45  }
    4.46  
    4.47  static int
    4.48 @@ -512,7 +525,7 @@ int
    4.49  
    4.50      if ( gdb_ctx->serhnd < 0 )
    4.51      {
    4.52 -        dbg_printk("Debugger not ready yet.\n");
    4.53 +        printk("Debugging connection not set up.\n");
    4.54          return -EBUSY;
    4.55      }
    4.56  
     5.1 --- a/xen/common/keyhandler.c	Wed Dec 12 11:08:21 2007 +0000
     5.2 +++ b/xen/common/keyhandler.c	Wed Dec 12 11:27:15 2007 +0000
     5.3 @@ -275,6 +275,7 @@ extern void perfc_reset(unsigned char ke
     5.4  
     5.5  static void do_debug_key(unsigned char key, struct cpu_user_regs *regs)
     5.6  {
     5.7 +    printk("'%c' pressed -> trapping into debugger\n", key);
     5.8      (void)debugger_trap_fatal(0xf001, regs);
     5.9      nop(); /* Prevent the compiler doing tail call
    5.10                               optimisation, as that confuses xendbg a
     6.1 --- a/xen/include/xen/gdbstub.h	Wed Dec 12 11:08:21 2007 +0000
     6.2 +++ b/xen/include/xen/gdbstub.h	Wed Dec 12 11:27:15 2007 +0000
     6.3 @@ -53,6 +53,7 @@ void gdb_write_to_packet(
     6.4      const char *buf, int count, struct gdb_context *ctx);
     6.5  void gdb_write_to_packet_hex(
     6.6      unsigned long x, int int_size, struct gdb_context *ctx);
     6.7 +    /* ... writes in target native byte order as required by gdb spec. */
     6.8  void gdb_send_packet(struct gdb_context *ctx);
     6.9  void gdb_send_reply(const char *buf, struct gdb_context *ctx);
    6.10
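
Note on the byte-order change in gdb_write_to_packet_hex(): on a little-endian target the register value has to go out least significant byte first, two hex digits per byte. The standalone sketch below mirrors the little-endian branch of the patch so the resulting digit order can be checked outside the hypervisor. The function name encode_le_hex is hypothetical, and hex2char here is a stand-in that is assumed to mask its argument to the low nibble, since the patch calls hex2char without showing its definition.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the stub's hex2char(): ASSUMED to use only the low nibble. */
static char hex2char(unsigned long x)
{
    return "0123456789abcdef"[x & 15];
}

/*
 * Hypothetical helper: encode the low int_size bytes of x as hex digits,
 * least significant byte first (target byte order on little-endian).
 * buf must have room for int_size * 2 digits plus a terminating NUL.
 */
static void encode_le_hex(unsigned long x, int int_size, char *buf)
{
    int i = 0, width = int_size * 2;

    assert(int_size > 0 && (size_t)int_size <= sizeof(unsigned long));

    while (i < width) {
        buf[i++] = hex2char(x >> 4);  /* high nibble of the current byte */
        buf[i++] = hex2char(x);       /* low nibble of the current byte */
        x >>= 8;                      /* move on to the next byte */
    }
    buf[width] = '\0';
}
```

For example, encoding the 32-bit value 0xff124d61 with int_size = 4 gives the string "614d12ff", whereas the old most-significant-digit-first loop would have sent "ff124d61".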