Thread: Parallel Controller ?

user name
2006-02-20 07:47:49
Thank you for your reply.

> The simplest answer to your situation would be to split your
> application up into separate controllers.  If your
> application looks like:
> 
> Controller: PR1, PR2, PR3
> 
> and PR3 needs the corpus-wide results from 1 and 2, then you
> create two controllers:
> 
> ControllerA: PR1, PR2
> ControllerB: PR3
> 
> and run these two controllers over the corpus one after the other.

Yes, but some basic calculations need corpus-wide accumulation.
For example, a TF/IDF PR must be separated into PR2 (DF calculation)
and PR3 (TF/IDF calculation).
I think it would be better if a parallel controller or mode were
integrated into the GATE classes.
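
To make the two-phase point concrete, here is a minimal sketch of the
DF pass followed by the TF/IDF pass. It uses plain Java collections
only, not GATE classes: the class name TwoPhaseTfIdf, both method
names, and the token lists standing in for documents are assumptions
made for illustration, and the weight used is the common
tf * log(N / df) form rather than anything GATE prescribes.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; none of these names come from GATE.
class TwoPhaseTfIdf {

    // Phase 1 (the "PR2" role): for every term, count the documents containing it.
    static Map<String, Integer> documentFrequencies(List<List<String>> corpus) {
        Map<String, Integer> df = new HashMap<>();
        for (List<String> doc : corpus) {
            // De-duplicate so each document adds at most 1 per term.
            for (String term : new HashSet<>(doc)) {
                df.merge(term, 1, Integer::sum);
            }
        }
        return df;
    }

    // Phase 2 (the "PR3" role): weight one document using the finished DF table.
    // Assumes the document was part of the corpus seen in phase 1.
    static Map<String, Double> tfIdf(List<String> doc, Map<String, Integer> df, int corpusSize) {
        Map<String, Integer> tf = new HashMap<>();
        for (String term : doc) {
            tf.merge(term, 1, Integer::sum);
        }
        Map<String, Double> weights = new HashMap<>();
        for (Map.Entry<String, Integer> e : tf.entrySet()) {
            double idf = Math.log((double) corpusSize / df.get(e.getKey()));
            weights.put(e.getKey(), e.getValue() * idf);
        }
        return weights;
    }

    public static void main(String[] args) {
        List<List<String>> corpus = Arrays.asList(
                Arrays.asList("gate", "corpus", "gate"),
                Arrays.asList("corpus", "pipeline"));
        // Pass 1 must finish over the whole corpus before pass 2 can start.
        Map<String, Integer> df = documentFrequencies(corpus);
        for (List<String> doc : corpus) {
            System.out.println(tfIdf(doc, df, corpus.size()));
        }
    }
}
```

The second pass cannot start until the first has seen every document,
which is why the two steps cannot share a single document-by-document
run.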

> I suppose a serial controller that runs PR1 on every
> document, then PR2 on every document, etc. could be useful in
> some scenarios, but when you're processing a corpus in a
> datastore this would generate huge overhead in having to load
> and save each document once per PR rather than just once per
> application.

Yes, document-by-document processing is better for high-speed
processing.
But in the above case we can't avoid the two-phase calculation in
any configuration.

> Truly parallel processing - running the same PR instance over
> two documents at the same time - isn't possible in the GATE
> architecture, as processing resources are (by design) not
> thread safe.  You could get a limited amount of parallelism
> by creating two copies of ControllerA and passing half the
> corpus to each one, for example.  "Save application state" is
> very handy for this purpose.

I didn't mean truly parallel processing.
Of course, I do hope for a multi-thread manager in GATE, because
we are about to have truly multi-core CPUs.
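
The limited parallelism described in the quoted paragraph (two copies
of the pipeline, half the corpus each) could look roughly like the
sketch below. It uses only java.util.concurrent; the Pipeline class is
a hypothetical stand-in for a controller that is not thread safe, and
SplitCorpusSketch and runInTwoHalves are names invented for this
example, not GATE API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration; none of these names come from GATE.
class SplitCorpusSketch {

    // Stand-in for a pipeline with mutable state, so it must not be shared by threads.
    static class Pipeline {
        private int processed = 0;

        String process(String doc) {
            processed++;                 // unsynchronized state: one instance per thread only
            return doc.toUpperCase();
        }
    }

    static List<String> runInTwoHalves(List<String> corpus) {
        int mid = corpus.size() / 2;
        List<List<String>> halves = Arrays.asList(
                corpus.subList(0, mid), corpus.subList(mid, corpus.size()));
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            List<Future<List<String>>> futures = new ArrayList<>();
            for (List<String> half : halves) {
                Callable<List<String>> task = () -> {
                    Pipeline own = new Pipeline();   // each worker gets its own copy
                    List<String> out = new ArrayList<>();
                    for (String doc : half) {
                        out.add(own.process(doc));
                    }
                    return out;
                };
                futures.add(pool.submit(task));
            }
            // Collect results in corpus order: first half, then second half.
            List<String> all = new ArrayList<>();
            for (Future<List<String>> f : futures) {
                all.addAll(f.get());
            }
            return all;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("worker failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runInTwoHalves(Arrays.asList("doc one", "doc two", "doc three")));
    }
}
```

No pipeline instance is ever touched by more than one thread, so this
stays within the "not thread safe" constraint. Any corpus-wide
accumulation would still have to be merged after both halves finish.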

> Ian
> 
> -- 
> Ian Roberts               | Department of Computer Science
> i.robertsdcs.shef.ac.uk  | University of Sheffield, UK
> 