Thread: Multiprocessing question.




Multiprocessing question.
user name
2006-10-06 16:01:59
Hello Folks,

I am currently using CMUCL to build a search system for an airline
database.

To improve performance, batches of requests are run in parallel. This is
my first time using CMUCL's MP (multiprocessing) support. My starting
point was
<URL:http://www.trakt7.net/cmucl%20and%20multiprocessing>

I have found CMUCL to be extremely stable and very fast to work with.
Thank you guys for this wonderful software.

But I have been having several hard-to-trace problems in two areas,
namely MP and sockets. This is most likely due to my lack of know-how.
I would be very grateful if you could help.

Firstly, I am using trivial-sockets to make HTTP connections. We have
seen that whenever we try to connect to a system which is down, or when
my network interface has gone down, CMUCL goes into continuous GC,
taking up 99% CPU. This happens when MP is initialised.

After we abort from such errors and try to run anything, I get the
following message:


Error in function LISP::ASSERT-ERROR:
   The assertion (NOT MULTIPROCESSING:*INHIBIT-SCHEDULING*) failed.
   [Condition of type SIMPLE-ERROR]

Restarts:
  0: [CONTINUE] Retry assertion.
  1: [ABORT-REQUEST] Abort handling SLIME request.
  2: [CONTINUE] Return NIL from load of "home:.cmucl-init".
  3: [ABORT] Skip remaining initializations.

Backtrace:
  0: (LISP::ASSERT-ERROR (NOT MULTIPROCESSING:*INHIBIT-SCHEDULING*) NIL NIL)
  1: (MULTIPROCESSING:PROCESS-WAIT-WITH-TIMEOUT "Waiting for Final Results" 55 #<Closure Over Function "DEFUN AIR-SEARCH" >)
      Locals:
        MULTIPROCESSING::ARGS = NIL
        MULTIPROCESSING::PREDICATE = #<Closure Over Function "DEFUN AIR-SEARCH" >
        MULTIPROCESSING::TIMEOUT = 55
        MULTIPROCESSING::WHOSTATE = "Waiting for Final Results"



To manage the data shared between the parallel requests I am using a
global hash table protected by a lock. To wait for all the processes to
complete I have a counter locally and also one in the global hash
table, which the processes increment on completion.

I use process-wait-with-timeout to keep checking the value of the
counter in the global hash table. Is this a good way for a process to
wait?
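
As an illustration of the waiting scheme described above, here is a
minimal sketch. The names *results*, *results-lock*, note-completion,
run-batch and the :done key are made up for the example and are not
taken from the real application; only the MP calls themselves
(mp:make-process, mp:make-lock, mp:with-lock-held and
mp:process-wait-with-timeout) are CMUCL's. As far as I know, the
predicate passed to process-wait-with-timeout may be evaluated many
times by the scheduler and must not itself wait, so the sketch keeps it
to a plain read of the counter.

```lisp
;; Sketch only: *RESULTS*, *RESULTS-LOCK*, NOTE-COMPLETION, RUN-BATCH and
;; the :DONE key are hypothetical names, not the real application's code.
(defvar *results* (make-hash-table :test #'equal))
(defvar *results-lock* (mp:make-lock "results lock"))

(defun note-completion ()
  "Called by each worker when it finishes: bump the shared counter."
  (mp:with-lock-held (*results-lock* "Updating results")
    (incf (gethash :done *results* 0))))

(defun run-batch (thunks &key (timeout 55))
  "Run each function in THUNKS in its own process and wait for them all.
Returns true if every worker finished before TIMEOUT seconds elapsed."
  (let ((expected (length thunks)))
    (mp:with-lock-held (*results-lock* "Resetting results")
      (setf (gethash :done *results*) 0))
    (dolist (thunk thunks)
      (let ((thunk thunk))              ; fresh binding for each closure
        (mp:make-process
         (lambda ()
           ;; A worker that signals an error is still counted as done,
           ;; so the waiter is not left short by one.
           (unwind-protect
                (handler-case (funcall thunk)
                  (error () nil))
             (note-completion)))
         :name "batch worker")))
    ;; The predicate only reads the counter. It takes no lock and never
    ;; waits, because it is run repeatedly by the scheduler.
    (mp:process-wait-with-timeout
     "Waiting for Final Results" timeout
     (lambda () (>= (gethash :done *results* 0) expected)))))
```

Something like (run-batch (list #'request-a #'request-b)), with
request-a and request-b standing in for the real request functions,
would then return NIL if the 55 second timeout expired first.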

I will be grateful for any insights into MP.

thanks,
quasi

-- 
quasi

Utopia Unlimited!

Multiprocessing question.
user name
2006-10-16 18:35:04
>>>>> "qu" == quasi <quasilistsgmail.com> writes:

  qu> Firstly, I am using trivial-sockets to make HTTP connections. We have
  qu> seen that whenever we try to connect to a system which is down, or
  qu> when my interface has gone down, CMUCL goes into continuous GC taking
  qu> up 99% CPU.  This is when MP is initialised.

  this sounds like a bug in the interaction between CMUCL's network
  code and the SERVE-EVENT facility. Probably the file descriptor
  corresponding to the failed network connection has not been marked
  as being inactive, and select() is being called continually.

  Are you able to provide a simple test case, preferably using CMUCL
  builtin functions rather than trivial-socket calls?
  
-- 
Eric Marsden


Multiprocessing question.
user name
2006-10-18 10:03:36
On 17 Oct 2006, Eric Marsden spake thusly:

>>>>>> "qu" == quasi <quasilistsgmail.com> writes:
>
> qu> Firstly, I am using trivial-sockets to make HTTP connections. We
> qu> have seen that whenever we try to connect to a system which is
> qu> down, or when my interface has gone down, CMUCL goes into
> qu> continuous GC taking up 99% CPU.  This is when MP is
> qu> initialised.
>
> this sounds like a bug in the interaction between CMUCL's network
> code and the SERVE-EVENT facility. Probably the file descriptor
> corresponding to the failed network connection has not been marked
> as being inactive, and select() is being called continually.
>
> Are you able to provide a simple test case, preferably using CMUCL
> builtin functions rather than trivial-socket calls?

I am trying to build one, but it is a bit complicated for me. I have
observed another case where the network connection is OK but the
function which parses the data from that connection throws an error
(parse-integer, specifically), and the same thing happens. But it does
not seem to happen when I try to replicate it in a single process. I
also tried spawning about 50 processes which sleep for a random time and
then throw an error (parse-integer nil), to see if it had something to
do with too many processes throwing errors. All I found out was that
SLIME cannot handle it.
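
The 50-process experiment described above might be sketched as follows.
This is an illustration, not the code that was actually run:
spawn-failing-workers and the max-sleep parameter are hypothetical, and
it assumes MP has already been initialised.

```lisp
;; Sketch only: SPAWN-FAILING-WORKERS and MAX-SLEEP are hypothetical.
(defun spawn-failing-workers (&optional (n 50) (max-sleep 5))
  "Start N processes that each sleep for a random time, then signal an error."
  (dotimes (i n)
    (mp:make-process
     (lambda ()
       ;; Random whole number of seconds below MAX-SLEEP.
       (sleep (random max-sleep))
       ;; NIL is not a string, so this call signals an error.
       (parse-integer nil))
     :name (format nil "failing worker ~D" i))))
```

Since each worker's error is unhandled, every process ends up asking
for a debugger, which may be what overwhelms SLIME.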

I will try with the CMUCL built-in functions for the socket example, as
you suggest.
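
A test case along those lines could start from something like the
sketch below, which uses ext:connect-to-inet-socket instead of
trivial-sockets. The helper names try-connect and spawn-connects are
hypothetical, the host and port are whatever unreachable machine is
being tested against, and the sketch is not known to reproduce the
problem.

```lisp
;; Sketch only: TRY-CONNECT and SPAWN-CONNECTS are hypothetical helpers.
(defun try-connect (host port)
  "Try one TCP connection using CMUCL's own socket call.
Returns :OK on success, or the condition object if the attempt failed."
  (handler-case
      (let ((fd (ext:connect-to-inet-socket host port)))
        ;; Close the descriptor right away; only the connect matters here.
        (unix:unix-close fd)
        :ok)
    (error (c) c)))

(defun spawn-connects (host port &optional (n 10))
  "Start N processes that each attempt one connection and print the outcome."
  (dotimes (i n)
    (let ((i i))                        ; fresh binding for each closure
      (mp:make-process
       (lambda ()
         (format t "~&worker ~D: ~A~%" i (try-connect host port)))
       :name (format nil "connect test ~D" i)))))
```

Calling (spawn-connects "some-down-host" 80), with the host name
replaced by a machine known to be down, and then watching CPU and GC
activity would show whether the plain CMUCL calls behave the same way.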


Thanks for responding.

-- 
quasi

Utopia Unlimited!

Multiprocessing question.
user name
2006-10-23 13:46:24
On 17 Oct 2006, Eric Marsden spake thusly:

>>>>>> "qu" == quasi <quasilistsgmail.com> writes:
>
> this sounds like a bug in the interaction between CMUCL's network
> code and the SERVE-EVENT facility. Probably the file descriptor
> corresponding to the failed network connection has not been marked
> as being inactive, and select() is being called continually.
>
> Are you able to provide a simple test case, preferably using CMUCL
> builtin functions rather than trivial-socket calls?


Rob Warnock on c.l.l seems to have had a similar problem a few years
back. Following is a link to his original post. Maybe it will give you
some idea; I don't understand all of it completely.

http://groups.google.com/group/comp.lang.lisp/msg/ee284119a99c0d68

thanks
-- 
quasi

Utopia Unlimited!
