Thread: odf4J performances




odf4J performances
user name
2007-03-12 15:24:48
Hi all,

I've tried to use odf4J to generate Calc files from a CSV file. With
8700 lines of 8 columns, the program needs more memory than the default
Java heap size provides.


I think a few things could be improved here:
  + Using the TableModel forces at least twice the data size to be
    held in memory: once in the TableModel and once in the DOM tree.
  + A DOM tree is fine for small files, but it would be interesting to
    allow the use of the SAX or StAX APIs to write the content.
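
As a rough illustration of the second point, a streaming writer could
emit each row as soon as it is read, so that only one CSV line is held
in memory at a time. The sketch below uses the standard StAX
XMLStreamWriter and no odf4J API. The class name CsvToTableStax, the
write method and the naive comma splitting (no quoted fields) are
assumptions made for the example, and it produces only a table:table
fragment, not a complete content.xml.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper, not part of odf4J.
final class CsvToTableStax {
    static final String TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0";
    static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";
    static final String OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0";

    /** Streams CSV lines into a table:table fragment and returns the row count. */
    static int write(Reader csv, Writer out, String tableName)
            throws IOException, XMLStreamException {
        XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        xml.writeStartElement("table", "table", TABLE_NS);
        xml.writeNamespace("table", TABLE_NS);
        xml.writeNamespace("text", TEXT_NS);
        xml.writeNamespace("office", OFFICE_NS);
        xml.writeAttribute("table", TABLE_NS, "name", tableName);

        BufferedReader reader = new BufferedReader(csv);
        int rows = 0;
        for (String line; (line = reader.readLine()) != null; rows++) {
            xml.writeStartElement("table", "table-row", TABLE_NS);
            // Naive split: assumes no quoted fields or embedded commas.
            for (String value : line.split(",", -1)) {
                xml.writeStartElement("table", "table-cell", TABLE_NS);
                xml.writeAttribute("office", OFFICE_NS, "value-type", "string");
                xml.writeStartElement("text", "p", TEXT_NS);
                xml.writeCharacters(value); // escapes <, > and & as needed
                xml.writeEndElement(); // text:p
                xml.writeEndElement(); // table:table-cell
            }
            xml.writeEndElement(); // table:table-row
        }
        xml.writeEndElement(); // table:table
        xml.flush();
        xml.close();
        return rows;
    }
}
```

Memory use here is bounded by the longest CSV line rather than by the
size of the whole file, which is the property a DOM-based writer
cannot offer.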

I'll probably have no time to make the changes, but I hope this can
help further development.

Cedric

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribeodftoolkit.openoffice.org
For additional commands, e-mail: dev-helpodftoolkit.openoffice.org


Re: odf4J performances
user name
2007-03-13 05:12:07
Hi Cedric,

Thanks for the message. An in-memory DOM is indeed not optimal for
large spreadsheets. One thing you could do is use a TableModel
implementation that does not store all the data, but reads the values
from your CSV file as needed. This would get rid of the duplication,
but we'd still have the in-memory DOM representation of the document.
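
A minimal sketch of such a model, assuming the TableModel in question
is javax.swing.table.TableModel: it keeps only the byte offset of each
line in memory and re-reads a row from the file when a cell is
requested. The class name LazyCsvTableModel, the UTF-8 encoding and the
naive comma splitting (no quoted fields) are assumptions made for the
example.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.swing.table.AbstractTableModel;

// Hypothetical class, not part of odf4J.
final class LazyCsvTableModel extends AbstractTableModel implements AutoCloseable {
    private final RandomAccessFile file;
    private final long[] rowOffsets; // byte offset of each non-empty line
    private final int columnCount;
    private int cachedRow = -1;      // last row read, so a row scan hits the file once
    private String[] cachedValues;

    LazyCsvTableModel(String path) throws IOException {
        file = new RandomAccessFile(path, "r");
        List<Long> offsets = new ArrayList<>();
        int columns = 0;
        long position = 0;
        String line;
        // One indexing pass: remember where each line starts, keep no cell data.
        while ((line = file.readLine()) != null) {
            if (!line.isEmpty()) {
                offsets.add(position);
                columns = Math.max(columns, line.split(",", -1).length);
            }
            position = file.getFilePointer();
        }
        rowOffsets = new long[offsets.size()];
        for (int i = 0; i < rowOffsets.length; i++) {
            rowOffsets[i] = offsets.get(i);
        }
        columnCount = columns;
    }

    @Override
    public int getRowCount() {
        return rowOffsets.length;
    }

    @Override
    public int getColumnCount() {
        return columnCount;
    }

    @Override
    public Object getValueAt(int row, int column) {
        if (row != cachedRow) {
            try {
                file.seek(rowOffsets[row]);
                // readLine() maps each byte to one char, so re-decode as UTF-8.
                String raw = file.readLine();
                String line = new String(raw.getBytes(StandardCharsets.ISO_8859_1),
                        StandardCharsets.UTF_8);
                cachedValues = line.split(",", -1);
                cachedRow = row;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return column < cachedValues.length ? cachedValues[column] : "";
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```

The index costs one long per row, so the model stays small even for
large files. The trade-off is a file seek per row, which the one-row
cache keeps cheap when the consumer walks the table row by row.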

A better solution might be to use a more efficient storage model for
the SpreadsheetDocument class. This can be done once higher-level
abstractions for the various document types are introduced into the
library. Right now, the abstraction is mostly the DOM representation of
the document, but this is only intended as a first step.

Cheers,
Lars

-- 
Lars Oppermann <lars.oppermannsun.com>
Software Engineer
Sun Microsystems
Nagelsweg 55, 20097 Hamburg, Germany
Phone: +49 40 23646 959
Fax:   +49 40 23646 550
http://www.sun.com/staroffice



Re: odf4J performances
user name
2007-03-13 08:54:46
Cedric Bosdonnat wrote:
> Hi all,
>
> I've tried to use odf4J to generate calc files from a CSV. I've seen
> that the program needs more memory than the default Java Heap memory
> available for 8700 lines / 8 columns.
>
> I think that there are a few things that could be improved here:
>   + The use of the TableModel forces to have at least 2 times the data
> size in memory: once in the TableModel and once in the DOM tree

Agreed. CSV is so common that there should be some functionality to
import it without having to create a TableModel first.

>   + The use of a DOM tree could be good for small files, but it would be
> interesting to allow the use of SAX or StAX APIs to write the content.

That's pretty much already there.

Look at the parse methods of ODFXMLHelper to see how to generate SAX
events from content, and at OdtToText.java for an example that does
this.

To store new content back into the package from a SAX-based
application, just call getOutputStream() on the package.
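
For readers without the sources at hand, here is a rough sketch of the
kind of SAX handler a text extractor might use. It is not the actual
OdtToText.java and does not call ODFXMLHelper. It relies only on the
standard JAXP SAX parser, and the class name SaxTextExtractor is made
up for the example. It collects the character data found inside text:p
and text:h elements of a content.xml stream.

```java
import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class, not part of odf4J.
final class SaxTextExtractor extends DefaultHandler {
    static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

    private final StringBuilder text = new StringBuilder();
    private int paragraphDepth = 0; // > 0 while inside a paragraph or heading

    private static boolean isParagraph(String uri, String localName) {
        return TEXT_NS.equals(uri) && ("p".equals(localName) || "h".equals(localName));
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (isParagraph(uri, localName)) {
            paragraphDepth++;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (paragraphDepth > 0) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Emit a line break once the outermost paragraph closes.
        if (isParagraph(uri, localName) && --paragraphDepth == 0) {
            text.append('\n');
        }
    }

    /** Parses a content.xml stream and returns its paragraph text, one line each. */
    static String extract(InputStream contentXml)
            throws IOException, SAXException, ParserConfigurationException {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // needed to match on the text: namespace URI
        SaxTextExtractor handler = new SaxTextExtractor();
        factory.newSAXParser().parse(contentXml, handler);
        return handler.text.toString();
    }
}
```

Since an ODF package is a ZIP archive, the input stream can also be
obtained without the toolkit by opening the content.xml entry with
java.util.zip.ZipFile. The document is never held in memory as a tree,
only the extracted text is.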




