List Info

Thread: Re: yamt-idlelwp fallout for mips/cobalt?




Re: yamt-idlelwp fallout for mips/cobalt?
country flaguser name
United States
2007-06-14 12:54:39
On Fri, 25 May 2007, Izumi Tsutsui wrote:

> The following patch makes a LOCKDEBUG kernel work,
> but I don't know if it's really correct.

   I don't think it's correct.  This was changed several
years ago
(starting with revision 1.175) because kernel threads would
start
and run with interrupts disabled.  I had the same problem
with my
amiga (m68k) not all that long ago because I had added a
raidframe
drive and was getting lots of clock skew when parity
rebuilding was
going on.  I finally figured out that the raidframe thread
was running
with interrupts blocked, and started looking at several
other ports
to see how they started kthread processes and found that
they had
the same problem, but had fixed for several years.

   Looking back in the mail archives, this seems to be an
attempt
to fix a locking against myself panic, so I suspect it's
more likely
a locking error somewhere.  I remember that the m68k port
had a similar
problem when running a DIAGNOSTIC kernel (which I had not
tested at
the time), and I tracked down a small section of code I had
missed
for the idlelwp changes.


> Index: arch/mips/mips/vm_machdep.c
>
============================================================
=======
> RCS file:
/cvsroot/src/sys/arch/mips/mips/vm_machdep.c,v
> retrieving revision 1.117
> diff -u -r1.117 vm_machdep.c
> --- arch/mips/mips/vm_machdep.c	17 May 2007 14:51:25
-0000	1.117
> +++ arch/mips/mips/vm_machdep.c	25 May 2007 14:47:42
-0000
>  -170,7 +170,9 
> 	pcb->pcb_context[MIPS_CURLWP_CARD - 16] =
(intptr_t)l2;/* S? */
> 	pcb->pcb_context[8] = (intptr_t)f;		/* SP */
> 	pcb->pcb_context[10] = (intptr_t)lwp_trampoline;/*
RA */
> +#if 0
> 	pcb->pcb_context[11] |= PSL_LOWIPL;		/* SR */
> +#endif
> #ifdef IPL_ICU_MASK
> 	pcb->pcb_ppl = 0;	/* machine dependent interrupt
mask */
> #endif
>

--
Michael L. Hitch			mhitchmontana.edu
Computer Consultant
Information Technology Center
Montana State University	Bozeman, MT	USA

Re: yamt-idlelwp fallout for mips/cobalt?
country flaguser name
United Kingdom
2007-06-14 15:37:48
On Thu, Jun 14, 2007 at 11:54:39AM -0600, Michael L. Hitch
wrote:

> On Fri, 25 May 2007, Izumi Tsutsui wrote:
> 
> >The following patch makes a LOCKDEBUG kernel work,
> >but I don't know if it's really correct.
> 
>   I don't think it's correct.  This was changed several
years ago
> (starting with revision 1.175) because kernel threads
would start
> and run with interrupts disabled.  I had the same
problem with my
> amiga (m68k) not all that long ago because I had added
a raidframe
> drive and was getting lots of clock skew when parity
rebuilding was
> going on.  I finally figured out that the raidframe
thread was running
> with interrupts blocked, and started looking at several
other ports
> to see how they started kthread processes and found
that they had
> the same problem, but had fixed for several years.

> >Index: arch/mips/mips/vm_machdep.c
>
>========================================================
===========
> >RCS file:
/cvsroot/src/sys/arch/mips/mips/vm_machdep.c,v
> >retrieving revision 1.117
> >diff -u -r1.117 vm_machdep.c
> >--- arch/mips/mips/vm_machdep.c	17 May 2007
14:51:25 -0000	1.117
> >+++ arch/mips/mips/vm_machdep.c	25 May 2007
14:47:42 -0000
> > -170,7 +170,9 
> >	pcb->pcb_context[MIPS_CURLWP_CARD - 16] =
(intptr_t)l2;/* S? */
> >	pcb->pcb_context[8] = (intptr_t)f;		/* SP */
> >	pcb->pcb_context[10] =
(intptr_t)lwp_trampoline;/* RA */
> >+#if 0
> >	pcb->pcb_context[11] |= PSL_LOWIPL;		/* SR */
> >+#endif

I'm not yet sure what we need to set into SR in this case,
but post
yamt-idlelwp, LWPs should start up at IPL_SCHED as far as MD
code is
concerned. cpu_switchto() is expected to maintain the IPL at
IPL_SCHED or
above across the switch. If I read the code correctly,
PSL_LOWIPL enables
all interrupts.

> Looking back in the mail archives, this seems to be an
attempt
> to fix a locking against myself panic, so I suspect
it's more likely
> a locking error somewhere. 

The call into lwp_startup() does an spl0(). Before it does
that, it also
unlocks the previous LWP if any. If we enable interrupts
before unlocking
the previous LWP, we can end up taking an interrupt and
trying to acquire
a spinlock that is already held.

Andrew

[1-2]

about | contact  Other archives ( Real Estate discussion Medical topics )