FWIW: Measuring CPU time in a parallel program is fairly meaningless;
the only reasonable measurement is wall clock time. Too much else
happens in a parallel run that user CPU time does not account for
(e.g., communications and time spent in the kernel).
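
As a rough illustration of the difference (a Python sketch, not MPI code;
the names measure and blocked_wait are made up for this example, and the
sleep is only a stand-in for a process blocked waiting on a message):

```python
import time


def measure(work):
    """Return (wall_seconds, cpu_seconds) spent inside work()."""
    wall_start = time.perf_counter()   # wall-clock (elapsed) time
    cpu_start = time.process_time()    # CPU time charged to this process
    work()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


def blocked_wait():
    # Hypothetical stand-in for waiting on communication:
    # the process is idle, so almost no CPU time is consumed.
    time.sleep(0.2)


if __name__ == "__main__":
    wall, cpu = measure(blocked_wait)
    print(f"wall = {wall:.3f} s, cpu = {cpu:.3f} s")
```

The CPU timer barely moves while the wall clock advances by the full
wait. In the Fortran code itself, the analogous change would presumably
be to take the timestamps with MPI_Wtime (which returns wall-clock
seconds) instead of cpu_time.
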
On Dec 15, 2007, at 4:30 AM, Scientific Computing, ISSC wrote:
> We are implementing sparse matrix (n x n) by vector (n x 1)
> multiplication in parallel, where the matrix is divided column-wise
> (into equal blocks of columns) and the vector is divided accordingly.
> Each processor performs a local matrix-vector multiplication, which
> generates a vector of size (n x 1) on each processor. To get the
> final result vector, all the local vectors are summed.
> We use MPI_Allreduce to collect the final result.
> The matrix that we are processing is 39601 x 39601.
> The vector of size 39601 is summed using MPI_Allreduce.
> We ran this code for different numbers of processors.
> We get very strange results.
> The time required keeps increasing from 1 to 19 processors and then
> drops suddenly at 20 processors. After that, the time remains
> constant.
>
> # Processors   Time for MV (s)
>  1             0.145209
>  2             0.123415
>  3             0.142032
>  4             0.153569
>  5             0.154709
>  6             0.167946
>  7             0.177953
>  8             0.195782
>  9             0.190688
> 10             0.203196
> 12             0.224951
> 14             0.252618
> 16             0.262348
> 17             0.271355
> 18             0.286621
> 19             0.298761
> 20             0.105971
> 22             0.102517
> 24             0.105836
> 26             0.102807
> 28             0.104299
> 30             0.105835
> 32             0.10602
>
>
> We are calculating time using cpu_time (Fortran).
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
--
Jeff Squyres
Cisco Systems