Thread: Re: seems I finally found what upset kqemu on amd64 SMP... shared gdt! (please test patch :)

2008-05-10 07:28:53
On Sat, 10 May 2008, Juergen Lock wrote:

> On Thu, May 08, 2008 at 09:59:57PM +1000, Bruce Evans wrote:

>> The message in npx.c is actually about violation of an even more
>> fundamental invariant -- the invariant that owning the FPU includes
>> having the TS flag clear so that DNA traps cannot occur.  The bug in
>> kqemu seems to be mismanagement of the TS flag related to this.  I
>> forget if it is the host or the target TS flag that seems to be mismanaged.
>> For the target, it would take a bug in the virtualization of the TS flag
>> to break this invariant (assuming no related bugs in the target kernel).
>>
> Well the `fpcurthread == curthread' bug has been fixed quite a while
> ago already, or do you mean another one?

I didn't know what is already fixed.

>> The message in amd64/machdep.c is about violation of the invariant
>> that the kernel cannot cause DNA traps.  Spurious DNA traps in the
>> ...
>>
> Okay I _think_ I know a little more about this now...  kqemu itself
> doesn't use the fpu, but the guest code it runs can, and in that case the
> DNA trap is just used for (host) lazy fpu context switching like as if the
> code was running in userland regularly.  And I just tested the following
> patch that should get rid of the message by calling fpudna/npxdna directly
> (files/patch-fpucontext is the interesting part

This seems reasonable.  Is the following summary of my understanding of
kqemu's implementation of this and your change correct?:
- kqemu runs in kernel mode on the host and needs to have exactly the
  same effect as a DNA exception on the target.
- having exactly the same effect requires calling the host DNA exception
  handler.
- now it uses a software int $7 (dna) to implement the above, but this is
  not permitted in kernel mode (although the software int could be permitted,
  it is hard to distinguish from a hardware exception for unintentional use).
- your change makes it call the DNA trap handler directly.  This gives the
  same effect as a permitted software int $7.  It is also faster.
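
For illustration only, the lazy FPU switching described in the points above
can be modelled in user space roughly as follows.  This is a simplified
sketch, not code from kqemu or the FreeBSD kernel: every name in it
(model_cpu, model_thread, model_switch, model_dna, model_fpu_add) is
hypothetical, a single int stands in for the whole FPU state, and the
handler reports an invariant violation through its return value instead of
printing a message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct model_thread {
    int fpu_saved;                  /* saved FPU state (one int stands in for it) */
};

struct model_cpu {
    bool ts;                        /* models CR0.TS: when set, FPU use traps */
    struct model_thread *fpowner;   /* models fpcurthread */
    struct model_thread *cur;       /* models curthread */
    int fpu_regs;                   /* models the live FPU registers */
    int dna_count;                  /* number of DNA traps handled */
};

/* Context switch: save the outgoing owner's state, then set TS so the
 * next FPU use by the new thread raises a DNA trap. */
static void model_switch(struct model_cpu *c, struct model_thread *next)
{
    assert(next != NULL);
    if (c->fpowner != NULL) {
        c->fpowner->fpu_saved = c->fpu_regs;
        c->fpowner = NULL;
    }
    c->cur = next;
    c->ts = true;
}

/* DNA handler: clear TS and load the current thread's state.  Returns
 * false if the current thread already owns the FPU, since owning the
 * FPU is supposed to imply that TS is clear. */
static bool model_dna(struct model_cpu *c)
{
    if (c->fpowner == c->cur)
        return false;
    c->ts = false;
    c->fpu_regs = c->cur->fpu_saved;
    c->fpowner = c->cur;
    c->dna_count++;
    return true;
}

/* An "FPU instruction": traps into the DNA handler first if TS is set. */
static bool model_fpu_add(struct model_cpu *c, int v)
{
    if (c->ts && !model_dna(c))
        return false;
    c->fpu_regs += v;
    return true;
}
```

In this model, calling model_dna() directly has the same effect on the
state as reaching it through the TS check in model_fpu_add(), which is the
property the change relies on.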

It would be better to use an official API for this, but none exists.

> ...
> +Index: kqemu-freebsd.c
> +@@ -33,6 +33,11 @@
> +
> + #include <machine/vmparam.h>
> + #include <machine/stdarg.h>
> ++#ifdef __x86_64__
> ++#include <machine/fpu.h>
> ++#else
> ++#include <machine/npx.h>
> ++#endif
> +
> + #include "kqemu-kernel.h"
> +
> +@@ -172,6 +177,15 @@
> + {
> + }
> +
> ++void CDECL kqemu_loadfpucontext(unsigned long cpl)
> ++{
> ++#ifdef __x86_64__
> ++    fpudna();
> ++#else
> ++    npxdna();
> ++#endif
> ++}

Just be sure that the system state is not too different from that of
trap() (directly below a syscall or trap from userland) when this is
called.  Better not have any interrupts disabled or locks held, though
I think npxdna() doesn't care.  The FPU must not be owned already at
this point.
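
As a rough illustration of these conditions, a caller could classify its
own state before invoking the handler directly.  The sketch below is a
user-space model under stated assumptions: the struct, the enum and
classify_dna_call() are hypothetical names, not kernel or kqemu
interfaces, and the three fields simply mirror the three conditions named
above.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the state at the call site; not a kernel type. */
struct call_site_state {
    bool fpu_owned;            /* models fpcurthread == curthread */
    bool interrupts_enabled;
    int  locks_held;
};

enum dna_call_check {
    DNA_CALL_OK,        /* state resembles trap() entered from userland */
    DNA_CALL_SUSPECT,   /* interrupts off or locks held: unlike trap(),
                           though the handler is thought not to care */
    DNA_CALL_INVALID    /* FPU already owned: the handler must not be called */
};

/* The ownership check comes first because it is the only hard requirement. */
static enum dna_call_check classify_dna_call(const struct call_site_state *s)
{
    if (s->fpu_owned)
        return DNA_CALL_INVALID;
    if (!s->interrupts_enabled || s->locks_held > 0)
        return DNA_CALL_SUSPECT;
    return DNA_CALL_OK;
}
```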

> ++
> + #if __FreeBSD_version < 500000
> + static int
> + curpriority_cmp(struct proc *p)

I guess kqemu duplicates this old mistake instead of calling it because it
is static.  npxdna() is already public so it can be abused easily.

Bruce
_______________________________________________
freebsd-emulation@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-emulation
To unsubscribe, send any mail to "freebsd-emulation-unsubscribe@freebsd.org"

2008-05-11 05:33:56
On Sat, May 10, 2008 at 10:28:53PM +1000, Bruce Evans wrote:
> On Sat, 10 May 2008, Juergen Lock wrote:
> 
>> On Thu, May 08, 2008 at 09:59:57PM +1000, Bruce Evans wrote:

>>> The message in amd64/machdep.c is about violation of the invariant
>>> that the kernel cannot cause DNA traps.  Spurious DNA traps in the
>>> ...
>>> 
>> Okay I _think_ I know a little more about this now...  kqemu itself
>> doesn't use the fpu, but the guest code it runs can, and in that case the
>> DNA trap is just used for (host) lazy fpu context switching like as if the
>> code was running in userland regularly.  And I just tested the following
>> patch that should get rid of the message by calling fpudna/npxdna directly
>> (files/patch-fpucontext is the interesting part
> 
> This seems reasonable.  Is the following summary of my understanding of
> kqemu's implementation of this and your change correct?:
> - kqemu runs in kernel mode on the host and needs to have exactly the
>   same effect as a DNA exception on the target.
> - having exactly the same effect requires calling the host DNA exception
>   handler.
> - now it uses a software int $7 (dna) to implement the above, but this is
>   not permitted in kernel mode (although the software int could be permitted,
>   it is hard to distinguish from a hardware exception for unintentional use).
> - your change makes it call the DNA trap handler directly.  This gives the
>   same effect as a permitted software int $7.  It is also faster.
> 
Yup, that's basically it.

> It would be better to use an official API for this, but none exists.
> 
 
>> ...
>> +Index: kqemu-freebsd.c
>> +@@ -33,6 +33,11 @@
>> +
>> + #include <machine/vmparam.h>
>> + #include <machine/stdarg.h>
>> ++#ifdef __x86_64__
>> ++#include <machine/fpu.h>
>> ++#else
>> ++#include <machine/npx.h>
>> ++#endif
>> +
>> + #include "kqemu-kernel.h"
>> +
>> +@@ -172,6 +177,15 @@
>> + {
>> + }
>> +
>> ++void CDECL kqemu_loadfpucontext(unsigned long cpl)
>> ++{
>> ++#ifdef __x86_64__
>> ++    fpudna();
>> ++#else
>> ++    npxdna();
>> ++#endif
>> ++}
> 
> Just be sure that the system state is not too different from that of
> trap() (directly below a syscall or trap from userland) when this is
> called.  Better not have any interrupts disabled or locks held, though
> I think npxdna() doesn't care.  The FPU must not be owned already at
> this point.
> 
 Yes, all of that is true.

>> ++
>> + #if __FreeBSD_version < 500000
>> + static int
>> + curpriority_cmp(struct proc *p)
> 
> I guess kqemu duplicates this old mistake instead of calling it because it
> is static.  npxdna() is already public so it can be abused easily.

 Well this (curpriority_cmp) is code for 4.x anyway.  (Yes I guess I
could axe it, but maybe there are still some poor souls out there that
still need it...)

	Juergen
