
Thread: prod() performance question




prod() performance question
United States
2007-03-22 10:13:10

I’ve been testing the performance of ublas’s matrix multiplication. I encountered a behavior that I don’t quite understand. Consider the following code snippet:

 

ublas::matrix<double> m1(100, 100), c(100, 100);
ublas::prod(m1, m1);
c = ublas::prod(m1, m1);

 

The second line and the third line are identical, except that in the third line I assign the result of the product operation to a pre-sized matrix. The second line appears to take almost no time to execute (0 millisecs, that is). The third line takes a non-zero amount of time to execute, however.

 

What’s going on here? Is the product operation super crazy fast, but the matrix assignment part slow?

 

--Steve

Re: prod() performance question
United States
2007-03-22 10:38:37
I'm thinking that your compiler is smart enough not to do anything on the second line, because the result isn't used. Makes sense, no?

--nico



-- 
Nico Galoppo        UNC-CH PhD. student        http://www.ngaloppo.org
                       +1-919-942-4388

_______________________________________________
ublas mailing list
ublas@lists.boost.org
http://lists.boost.org/mailman/listinfo.cgi/ublas

Re: prod() performance question
United States
2007-03-22 10:39:42
BTW, it really pays off to use the ATLAS bindings for this kind of dense product. It is orders of magnitude faster than the default ublas implementation.

--nico

Gross, Steve wrote:
> ublas::matrix<double> m1(100, 100), c(100, 100);
> ublas::prod(m1, m1);
> c = ublas::prod(m1, m1);


-- 
Nico Galoppo        UNC-CH PhD. student        http://www.ngaloppo.org
                       +1-919-942-4388


Re: prod() performance question
United States
2007-03-22 10:44:13
> BTW, it really pays off to use the ATLAS bindings for this kind of dense product. It is orders of magnitude faster than the default ublas implementation.

Can you show me a simple example of how that would work?


Re: prod() performance question
United States
2007-03-22 10:46:35
> I'm thinking that your compiler is smart enough not to do anything on the second line, because the result isn't used. Makes sense, no?

That's a good point--I keep forgetting that compilers are smarter than me.

--Steve


Re: prod() performance question
2007-03-22 14:10:33
> I've been testing the performance of ublas's matrix multiplication. I encountered a behavior that I don't quite understand. Consider the following code snippet:
>
> ublas::matrix<double> m1(100, 100), c(100, 100);
> ublas::prod(m1, m1);
> c = ublas::prod(m1, m1);
>
> The second line and the third line are identical, except that in the third line I assign the result of the product operation to a pre-sized matrix. The second line appears to take almost no time to execute (0 millisecs, that is). The third line takes a non-zero amount of time to execute, however.

The second and third lines are not identical. The second line just creates an expression template which is never really evaluated. The third line forces the evaluation because of the assignment. Thus the difference in execution times.

-Vardan

Re: prod() performance question
Germany
2007-03-22 15:37:23
Gross, Steve wrote:
> ublas::matrix<double> m1(100, 100), c(100, 100);
>
> ublas::prod(m1, m1);

This only creates a small object that contains information about the parameters and the requested operation. It does not compute anything.

> c = ublas::prod(m1, m1);

The assignment executes the stored operation and thus takes some time. This behavior is the key idea of expression templates: collect all information about the expression to be computed and defer the execution as much as possible (here until assignment). So the assignment can (automatically) choose the best algorithm, and the computation is only done for the part of the result which is actually used.

Best regards,
Gunter
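
To make the deferred evaluation described in the replies above concrete, here is a minimal, self-contained sketch. It is a toy and not uBLAS code: the names Matrix, ProdExpr, prod and g_multiplications are all made up for illustration, and the real library is considerably more general. The point is only that prod() returns a small proxy object, and the arithmetic happens inside the assignment.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Matrix;  // toy dense matrix, defined below (hypothetical, not ublas::matrix)

// Lightweight proxy returned by prod(): it only remembers its operands.
struct ProdExpr {
    const Matrix& lhs;
    const Matrix& rhs;
};

// Counts scalar multiplications so that the deferred work is observable.
static std::size_t g_multiplications = 0;

struct Matrix {
    std::size_t rows, cols;
    std::vector<double> data;

    Matrix(std::size_t r, std::size_t c, double init = 0.0)
        : rows(r), cols(c), data(r * c, init) {}

    double& operator()(std::size_t i, std::size_t j) { return data[i * cols + j]; }
    double operator()(std::size_t i, std::size_t j) const { return data[i * cols + j]; }

    // Assigning the proxy is the step that actually computes the product.
    Matrix& operator=(const ProdExpr& e) {
        // A temporary keeps the result correct if *this is also an operand.
        Matrix tmp(e.lhs.rows, e.rhs.cols);
        for (std::size_t i = 0; i < e.lhs.rows; ++i) {
            for (std::size_t j = 0; j < e.rhs.cols; ++j) {
                double sum = 0.0;
                for (std::size_t k = 0; k < e.lhs.cols; ++k) {
                    sum += e.lhs(i, k) * e.rhs(k, j);
                    ++g_multiplications;
                }
                tmp(i, j) = sum;
            }
        }
        rows = tmp.rows;
        cols = tmp.cols;
        data.swap(tmp.data);
        return *this;
    }
};

// Builds the proxy; no floating-point work is done here.
inline ProdExpr prod(const Matrix& a, const Matrix& b) { return ProdExpr{a, b}; }
```

With two 100 x 100 operands, calling prod(m1, m1) on its own leaves the counter at zero, while c = prod(m1, m1) performs all 100 * 100 * 100 multiplications. That matches the timing difference reported in the original question.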


Re: prod() performance question
Germany
2007-03-23 11:28:24
On Friday, 23 March 2007 17:01, dan elliott wrote:

> Is it possible and, if so, how far away is ublas from being able to seamlessly switch between ublas/ATLAS/ESSL/etc implementations like we can do with BLAS?

This is a good question, but hard to answer. IMO uBLAS is still far away from this point. This was one of the reasons for starting the glas project, which is able to switch between different implementations. In order to enable ublas to automatically map some BLAS2/3 expressions to the corresponding routine, one could overload the various assign() member functions of the matrix and vector types. I am thinking of something like

matrix<T>::plus_assign( matrix_matrix_prod<M1, M2, T>& e ) {
  #ifndef USE_ATLAS
    matrix_assign< ... >( ... ); 
  #else
    atlas::gemm( 1.0, e().op1(), e.op2(), 1.0, *this );
  #endif
}

However, this would affect a lot of (internal) uBLAS classes and cause a lot of headaches, because the partial template ordering has to be convinced to choose the correct specialization ... Maybe this is now easier with enable_if<> ...

Best regards,
Gunter
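
As a rough illustration of the enable_if<> idea mentioned above, the following self-contained sketch selects between two plus_assign overloads at compile time. Everything in it is hypothetical: GenericExpr, MatrixMatrixProd, maps_to_gemm and the HaveBackend flag are stand-ins invented for this example, not uBLAS or ATLAS binding names, and the overloads return a label instead of doing any arithmetic. The HaveBackend template parameter plays the role of the USE_ATLAS macro.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical expression tags standing in for real expression types.
struct GenericExpr {};
struct MatrixMatrixProd {};

// Trait: does expression type E correspond to a BLAS level-3 gemm call?
template <class E> struct maps_to_gemm : std::false_type {};
template <> struct maps_to_gemm<MatrixMatrixProd> : std::true_type {};

// Chosen only when a backend is available AND the expression maps to gemm.
template <bool HaveBackend, class E>
typename std::enable_if<HaveBackend && maps_to_gemm<E>::value, std::string>::type
plus_assign(const E&) {
    return "gemm";  // a real version would forward to the external routine
}

// Fallback for every other combination: the library's own generic loops.
template <bool HaveBackend, class E>
typename std::enable_if<!(HaveBackend && maps_to_gemm<E>::value), std::string>::type
plus_assign(const E&) {
    return "generic loop";
}
```

Because the two conditions are exact complements, exactly one overload is viable for any pair of arguments, so no partial ordering between specializations is needed. For instance, plus_assign<true>(MatrixMatrixProd{}) picks the gemm path, while plus_assign<false>(MatrixMatrixProd{}) and plus_assign<true>(GenericExpr{}) both fall back to the generic one.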

