Thread: Re: recent dom0 kernels reboot on loading?




Re: recent dom0 kernels reboot on loading?
user name
2007-08-23 07:50:08
On Wed, 15 Aug 2007 12:56:11 +1000 Daniel Carosone wrote:

> I tried to upgrade my dom0 kernel:

> /netbsd_dom0
>         NetBSD 4.99.28 (_oenone_dom0_) #42: Wed Aug 15 12:24:58 EST 2007
> /netbsd_dom0.old
>         NetBSD 4.99.25 (_oenone_dom0_) #36: Thu Jul 26 18:21:00 EST 2007

> but the new one seems to cause the machine to reboot on loading
> without generating any output past the Xen VMM dmesg.

> Anyone else seeing anything like this?

I have experienced the same problem with yesterday's current (the previous
kernel, a half-year-old current, booted OK). In my grub config I have:

  module (hd0,a)/netbsd-XEN3_DOM0 root=/dev/hda1 ro console=tty0

After removing "console=tty0" the system no longer reboots, but the boot
process stops somewhere later: the host cannot be reached over the network.
Unfortunately, I do not see any NetBSD kernel messages; it appears they go
to the serial console. I have no serial cable and can't say exactly what
was going on, but I suspect it could not mount the root device. After
removing "root=/dev/hda1" the system booted and it is running now,
although without a VGA console. XEN3_DOMU booted OK.

-- 
Mikolaj Golub

Re: recent dom0 kernels reboot on loading?
user name
2007-08-31 08:02:27
Maybe this will be helpful for resolving the issue. Booting the Xen kernel
with a serial console and DOM0 with VGA, I get this on the serial console:

(XEN) Xen is relinquishing VGA console.
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen).
(XEN) traps.c:390:d0 Unhandled general protection fault fault/trap [#13] in domain 0 on VCPU 0 [ec=0000]
(XEN) domain_crash_sync called from entry.S (ff16a0fd)
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) ----[ Xen-3.1.0  x86_32  debug=y  Not tainted ]----
(XEN) CPU:    0
(XEN) EIP:    e019:[<c04d4f84>]
(XEN) EFLAGS: 00000246   CONTEXT: guest
(XEN) eax: c0964264   ebx: 000003cf   ecx: 00000001   edx: c0964264
(XEN) esi: 00000000   edi: c0964260   ebp: c0a66b0c   esp: c0a66ad0
(XEN) cr0: 8005003b   cr4: 000006d0   cr3: 1ea63000   cr2: 00000000
(XEN) ds: e021   es: e021   fs: 0000   gs: 0000   ss: e021   cs: e019
(XEN) Guest stack trace from esp=c0a66ad0:
(XEN)    00000000 c04d4f84 0001e019 00010046 c043ae6b c0964264 6f63206f 6c6f736e
(XEN)    74743d65 00003079 00000000 c0964264 000003cf 000003c0 c0964260 c0a66b4c
(XEN)    c043baea 00000000 00000000 00000000 00000000 00000000 00000000 00000000
(XEN)    00000000 00000000 00000000 00000000 00000000 0000002d 00000000 c0a66b7c
(XEN)    c04fcefa c0964260 000003c0 00000010 00000000 00000000 00000000 c0964260
(XEN)    000003c0 00000000 00000010 c0a66bbc c02e488a 00000000 000003c0 00000010
(XEN)    00000000 c0a66bac 00000000 00000000 00000000 00000000 00000000 00000000
(XEN)    00000001 00000000 00000001 c0a66bfc c02e3561 00000000 00000001 00000000
(XEN)    00000000 00000000 c0964340 c0a66c0c c043bde7 c096434c c0848d1f 00000009
(XEN)    c0a66c24 00000001 ffffffff c0a66c4c c04fd7e7 00000000 00000001 ffffffff
(XEN)    00000001 c082ff5e 00000000 ffffffff c08ddfe0 30797474 00000000 00000000
(XEN)    00000000 00000000 00000006 00000001 00000000 00100000 00000002 c0a66c7c
(XEN)    c04c9eee c0920f20 00000000 00000000 00000000 00000000 00000000 00000000
(XEN)    c0a63c10 00a63000 c0a6c000 00000000 c01001df c0a6d000 00000000 00000000
(XEN)    00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
(XEN)    00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
(XEN)    00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
(XEN)    00000000 00000000 00000000 00a63000 00000000 00000000 00000000 00000000
(XEN)    00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
(XEN)    00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
(XEN) Domain 0 crashed: rebooting machine in 5 seconds.

-- 
Mikolaj Golub

Re: recent dom0 kernels reboot on loading?
user name
2007-09-03 02:27:46
On Fri, 31 Aug 2007 21:01:39 +0200 Manuel Bouyer wrote:

 MB> Next would be to find where 0xc04d4f84 is in the kernel. The easiest is to
 MB> build a kernel with makeoptions DEBUG="-g",
 MB> reboot with it and see what the EIP is when it crashes. Then use
 MB> gdb on netbsd.gdb to get the source info:
 MB> list *0xc04d4f84 (or whatever EIP is when the debug kernel crashes).

Sorry for the delay; I had no access to the Xen box during the weekend.

I tried it today, but without success.

netbsd-XEN3_DOM0.gdb was built with the -g option.

Crash info:

(XEN) ----[ Xen-3.1.0  x86_32  debug=n  Not tainted ]----
(XEN) CPU:    0
(XEN) EIP:    e019:[<c04cc344>]
(XEN) EFLAGS: 00000246   CONTEXT: guest
(XEN) eax: c0955404   ebx: 000003cf   ecx: 00000001   edx: c0955404
(XEN) esi: 00000000   edi: c0955400   ebp: c0a57b0c   esp: c0a57ad0
(XEN) cr0: 8005003b   cr4: 000006d0   cr3: 1ea54000   cr2: 00000000
(XEN) ds: e021   es: e021   fs: 0000   gs: 0000   ss: e021   cs: e019
(XEN) Guest stack trace from esp=c0a57ad0:
(XEN)    00000000 c04cc344 0001e019 00010046 c043535b c0955404 00000000 00000000
(XEN)    00000000 00000000 00000000 c0955404 000003cf 000003c0 c0955400 c0a57b4c

gdb session:

-bash-3.2$ gdb netbsd-XEN3_DOM0.gdb
GNU gdb 6.5
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386--netbsdelf"...
(gdb) list *0xc04cc344
No source file for address 0xc04cc344.

This is just to confirm that I am using a kernel with gdb symbols:

(gdb) list
238     
239     /*
240      * System startup; initialize the world, create process 0, mount root
241      * filesystem, and fork to create init and pagedaemon.  Most of the
242      * hard work is done in the lower-level initialization routines including
243      * startup(), which does memory initialization and autoconfiguration.
244      */
245     void
246     main(void)
247     {

-- 
Mikolaj Golub

Re: recent dom0 kernels reboot on loading?
user name
2007-09-17 07:57:39
On Mon, 03 Sep 2007 10:27:46 +0300 Mikolaj Golub wrote:

 MG> (gdb) list *0xc04cc344
 MG> No source file for address 0xc04cc344.

(gdb) disassemble 0xc04cc344
Dump of assembler code for function mutex_enter:
0xc04cc340 <mutex_enter+0>:     mov    0x4(%esp),%edx
0xc04cc344 <mutex_enter+4>:     mov    %fs:0x18,%ecx
0xc04cc34b <mutex_enter+11>:    xor    %eax,%eax
0xc04cc34d <mutex_enter+13>:    cmpxchg %ecx,0x0(%edx)
0xc04cc351 <mutex_enter+17>:    jne,pn 0xc04161b0 <mutex_vector_enter>
0xc04cc358 <mutex_enter+24>:    ret    
0xc04cc359 <mutex_enter+25>:    lea    0x0(%esi),%esi
End of assembler dump.

Am I right in interpreting this as follows: `list *0xc04cc344' printed no
source because the code at address 0xc04cc344 belongs to the assembler
function mutex_enter? So the crash is in mutex_enter?

-- 
Mikolaj Golub

Re: recent dom0 kernels reboot on loading?
user name
2007-09-17 14:30:13
On Mon, Sep 17, 2007 at 03:57:39PM +0300, Mikolaj Golub wrote:
> 
> (gdb) disassemble 0xc04cc344
> Dump of assembler code for function mutex_enter:
> 0xc04cc340 <mutex_enter+0>:     mov    0x4(%esp),%edx
> 0xc04cc344 <mutex_enter+4>:     mov    %fs:0x18,%ecx
> 0xc04cc34b <mutex_enter+11>:    xor    %eax,%eax
> 0xc04cc34d <mutex_enter+13>:    cmpxchg %ecx,0x0(%edx)
> 0xc04cc351 <mutex_enter+17>:    jne,pn 0xc04161b0 <mutex_vector_enter>
> 0xc04cc358 <mutex_enter+24>:    ret    
> 0xc04cc359 <mutex_enter+25>:    lea    0x0(%esi),%esi
> End of assembler dump.
> 
> Am I right in interpreting this as follows: `list *0xc04cc344' printed no
> source because the code at address 0xc04cc344 belongs to the assembler
> function mutex_enter? So the crash is in mutex_enter?

Yes, on the mov %fs:0x18,%ecx instruction, or maybe the previous one.
I'm not sure if it's OK for %fs to be 0 at this point. To me it looks like
it should not be.

Could you try to see what c043535b and c0955404 point to in your sources?

-- 
Manuel Bouyer <bouyerantioche.eu.org>
     NetBSD: 26 years of experience will always make the difference
--

Re: recent dom0 kernels reboot on loading?
user name
2007-09-18 00:54:08
On Mon, 17 Sep 2007 21:30:13 +0200 Manuel Bouyer wrote:

 MB> On Mon, Sep 17, 2007 at 03:57:39PM +0300, Mikolaj Golub wrote:
 >> 
 >> (gdb) disassemble 0xc04cc344
 >> Dump of assembler code for function mutex_enter:
 >> 0xc04cc340 <mutex_enter+0>:     mov    0x4(%esp),%edx
 >> 0xc04cc344 <mutex_enter+4>:     mov    %fs:0x18,%ecx
 >> 0xc04cc34b <mutex_enter+11>:    xor    %eax,%eax
 >> 0xc04cc34d <mutex_enter+13>:    cmpxchg %ecx,0x0(%edx)
 >> 0xc04cc351 <mutex_enter+17>:    jne,pn 0xc04161b0 <mutex_vector_enter>
 >> 0xc04cc358 <mutex_enter+24>:    ret    
 >> 0xc04cc359 <mutex_enter+25>:    lea    0x0(%esi),%esi
 >> End of assembler dump.
 >> 
 >> Am I right in interpreting this as follows: `list *0xc04cc344' printed no
 >> source because the code at address 0xc04cc344 belongs to the assembler
 >> function mutex_enter? So the crash is in mutex_enter?

 MB> Yes, on the mov %fs:0x18,%ecx instruction, or maybe the previous one.
 MB> I'm not sure if it's OK for %fs to be 0 at this point. To me it looks like
 MB> it should not be.

 MB> Could you try to see what c043535b and c0955404 point to in your sources?

(gdb) list *0xc043535b
0xc043535b is in extent_alloc_region_descriptor (/usr/src/sys/kern/subr_extent.c:148).
143             /*
144              * XXX Make a static, create-time flags word, so we don't
145              * XXX have to lock to read it!
146              */
147             mutex_enter(&ex->ex_lock);
148             exflags = ex->ex_flags;
149             mutex_exit(&ex->ex_lock);
150     
151             if (exflags & EXF_FIXED) {
152                     struct extent_fixed *fex = (struct extent_fixed *)ex;
(gdb) disassemble 0xc043535b
Dump of assembler code for function extent_alloc_region_descriptor:
....
0xc0435356 <extent_alloc_region_descriptor+38>: call   0xc04cc340 <mutex_enter>
0xc043535b <extent_alloc_region_descriptor+43>: mov    0x24(%edi),%ebx
0xc043535e <extent_alloc_region_descriptor+46>: mov    0xfffffff0(%ebp),%eax
0xc0435361 <extent_alloc_region_descriptor+49>: mov    %eax,(%esp)
0xc0435364 <extent_alloc_region_descriptor+52>: call   0xc04cc360 <mutex_exit>
....
(gdb) list *0xc0955404
No source file for address 0xc0955404.
(gdb) disassemble 0xc0955404
Dump of assembler code for function ioport_ex_storage:
0xc0955400 <ioport_ex_storage+0>:       add    %al,(%eax)
0xc0955402 <ioport_ex_storage+2>:       add    %al,(%eax)
0xc0955404 <ioport_ex_storage+4>:       add    %al,(%eax)
0xc0955406 <ioport_ex_storage+6>:       add    %al,(%eax)
0xc0955408 <ioport_ex_storage+8>:       add    %al,(%eax)
.....

-- 
Mikolaj Golub

Re: recent dom0 kernels reboot on loading?
user name
2007-09-23 08:52:36
On Mon, Sep 17, 2007 at 09:30:13PM +0200, Manuel Bouyer wrote:
> [...]
> > (gdb) disassemble 0xc04cc344
> > Dump of assembler code for function mutex_enter:
> > 0xc04cc340 <mutex_enter+0>:     mov    0x4(%esp),%edx
> > 0xc04cc344 <mutex_enter+4>:     mov    %fs:0x18,%ecx
> > 0xc04cc34b <mutex_enter+11>:    xor    %eax,%eax
> > 0xc04cc34d <mutex_enter+13>:    cmpxchg %ecx,0x0(%edx)
> > 0xc04cc351 <mutex_enter+17>:    jne,pn 0xc04161b0 <mutex_vector_enter>
> > 0xc04cc358 <mutex_enter+24>:    ret    
> > 0xc04cc359 <mutex_enter+25>:    lea    0x0(%esi),%esi
> > End of assembler dump.
> > 
> > Am I right in interpreting this as follows: `list *0xc04cc344' printed no
> > source because the code at address 0xc04cc344 belongs to the assembler
> > function mutex_enter? So the crash is in mutex_enter?
> 
> Yes, on the mov %fs:0x18,%ecx instruction, or maybe the previous one.
> I'm not sure if it's OK for %fs to be 0 at this point. To me it looks like
> it should not be.

So %fs has to point to a segment descriptor pointing to the cpu_info for
the local CPU, and we're trying to use it before it has been initialised.
Basically consinit() has to be called after initgdt(); the attached patch
does this (it calls initgdt() ASAP and then consinit(), because consinit()
has to be called very early too).

Can someone please try this patch and see if it solves the problem?
It doesn't seem to have bad effects on my systems, but I didn't see
the crash before either ...

-- 
Manuel Bouyer <bouyerantioche.eu.org>
     NetBSD: 26 years of experience will always make the difference
--

  
Re: recent dom0 kernels reboot on loading?
user name
2007-09-24 03:18:19
On Sun, 23 Sep 2007 15:52:36 +0200 Manuel Bouyer wrote:

 MB> Can someone please try this patch and see if it solves the problem?
 MB> It doesn't seem to have bad effects on my systems, but I didn't see
 MB> the crash before either ...

Yes, it works fine for me too. Thank you.

Below are some effects I have run into with the new system; they are not
related to your patch.

1) It appears that some changes were made to the console driver in recent
versions of the Xen kernel. Before the upgrade the Xen kernel booted in text
mode (I hope I am naming it correctly: the standard 80x25 text screen or
something similar). Now it boots in some VGA mode; the fonts look rather
ugly and have a different resolution. When the NetBSD kernel boots it uses
only the upper part of the screen, which looks ugly. My workaround to get
the old behavior: specify com1 as the only destination for the Xen console
(console=com1) and use console=tty0 for the NetBSD kernel. Then Xen does not
switch to this ugly VGA mode and the NetBSD kernel output looks as it did
before. I suppose there is a proper way to get the same effect, maybe some
boot options, but I can't find them in the Xen manual.

2) During boot I see these messages from the Xen kernel on the console:

(XEN) mm.c:503:d0 Could not get page ref for pfn bff02
(XEN) mm.c:2324:d0 Could not get page for normal update
(XEN) mm.c:503:d0 Could not get page ref for pfn bff02
(XEN) mm.c:2324:d0 Could not get page for normal update
(XEN) mm.c:503:d0 Could not get page ref for pfn bff02
(XEN) mm.c:2324:d0 Could not get page for normal update
(XEN) mm.c:503:d0 Could not get page ref for pfn bff02
(XEN) mm.c:2324:d0 Could not get page for normal update
(XEN) mm.c:503:d0 Could not get page ref for pfn bff02
(XEN) mm.c:2324:d0 Could not get page for normal update

They do not appear to do any harm, though.

-- 
Mikolaj Golub

Re: recent dom0 kernels reboot on loading?
user name
2007-09-24 09:47:17
On Mon, 24 Sep 2007 10:09:57 -0400 Thor Lancelot Simon wrote:

 TLS> You can put "vga=text-80x25" on the xen.gz line of your grub configuration.

Oh, now I see the description of this option in the users' manual that comes
with xen-3.1.0-src.tgz. Earlier I was looking in the wrong place (the manual
for xen-3.0).

Thank you.

-- 
Mikolaj Golub
