List Info

Thread: Locking in xenevt.c




Locking in xenevt.c
country flaguser name
United States
2007-06-06 22:58:20
It seems to me that xenevt_fpoll needs to take the same spl
and lock
as xenevt_donotify and xenevt_fread to avoid races.  In
particular, it
seems that the event handler could be called after the check
for an
empty ring but before the corresponding selrecord, thus
causing the
calling thread to miss the selnotify() and sleep when it
shouldn't.

I'd had a problem with domUs hanging while probing devices,
which I
eventually traced down to xenstored not returning from
select even
though the xenevt instance it was waiting on had an unread
event.  I
could unstick it by doing "call wakeup(selwait)"
in ddb a few times,
or by doing other things that use select, like disconnecting
and
reconnecting to the domU console.  (This also means that
enough
unrelated select()ing in the dom0 could mask the problem.)

The change, which appears to fix this problem for me, seems
fairly
obvious, but as I'm not very familiar with the code
involved, I'm
posting it here.

-- 
(let ((C call-with-current-continuation)) (apply (lambda (x
y) (x y)) (map
((lambda (r) ((C C) (lambda (s) (r (lambda l (apply (s s)
l))))))  (lambda
(f) (lambda (l) (if (null? l) C (lambda (k) (display (car
l)) ((f (cdr l))
(C k)))))))    '((#J #d #D #v #s) (#e #space #a #i
#newline)))))

  
Re: Locking in xenevt.c
user name
2007-06-08 13:25:19
On Wed, Jun 06, 2007 at 11:58:20PM -0400, Jed Davis wrote:
> It seems to me that xenevt_fpoll needs to take the same
spl and lock
> as xenevt_donotify and xenevt_fread to avoid races.  In
particular, it
> seems that the event handler could be called after the
check for an
> empty ring but before the corresponding selrecord, thus
causing the
> calling thread to miss the selnotify() and sleep when
it shouldn't.
> 
> I'd had a problem with domUs hanging while probing
devices, which I
> eventually traced down to xenstored not returning from
select even
> though the xenevt instance it was waiting on had an
unread event.  I
> could unstick it by doing "call
wakeup(selwait)" in ddb a few times,
> or by doing other things that use select, like
disconnecting and
> reconnecting to the domU console.  (This also means
that enough
> unrelated select()ing in the dom0 could mask the
problem.)
> 
> The change, which appears to fix this problem for me,
seems fairly
> obvious, but as I'm not very familiar with the code
involved, I'm
> posting it here.

I think you're right; this part of the code needs to be
protected. You patch
matches what other device drivers do in this area.
Please commit !

-- 
Manuel Bouyer <bouyerantioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la
difference
--

[1-2]

about | contact  Other archives ( Real Estate discussion Medical topics )