Hi Cedric,
Thanks for the message. Using an in-memory DOM is indeed not optimal for
large datasets, and large spreadsheets in particular. One thing you could
do is use a TableModel implementation that does not store all the data but
reads the values from your CSV file as needed. This would get rid of the
duplication, but we would still have the in-memory DOM representation of
the document.
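To make the TableModel idea a bit more concrete, here is a rough sketch of what such a model could look like. It is not part of odf4j: the class name LazyCsvTableModel is made up for illustration, the CSV parsing is a naive split on commas (no quoting or escaping), the file is assumed to be UTF-8, and every line is treated as a data row. Only the byte offset of each row is kept in memory; cell values are read from the file when they are requested.

```java
import java.io.BufferedInputStream;
import java.io.Closeable;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.swing.table.AbstractTableModel;

/** Hypothetical TableModel that stores row offsets only, never the cell values. */
final class LazyCsvTableModel extends AbstractTableModel implements Closeable {
    private final List<Long> rowOffsets = new ArrayList<>();
    private final RandomAccessFile file;
    private final int columnCount;
    private int cachedRow = -1;                   // last row read from disk
    private String[] cachedCells = new String[0]; // cells of that row

    LazyCsvTableModel(String path) throws IOException {
        // Single buffered pass that records the byte offset where each line starts.
        try (InputStream in = new BufferedInputStream(new FileInputStream(path))) {
            long position = 0;
            boolean atLineStart = true;
            int b;
            while ((b = in.read()) != -1) {
                if (atLineStart) {
                    rowOffsets.add(position);
                    atLineStart = false;
                }
                if (b == '\n') {
                    atLineStart = true;
                }
                position++;
            }
        }
        file = new RandomAccessFile(path, "r");
        // Assumption: the first row determines the number of columns.
        columnCount = rowOffsets.isEmpty() ? 0 : readRow(0).length;
    }

    /** Seeks to the row and parses it; the most recent row is cached. */
    private String[] readRow(int row) throws IOException {
        if (row != cachedRow) {
            file.seek(rowOffsets.get(row));
            // readLine() maps each byte to one char, so re-decode the bytes as UTF-8.
            String raw = file.readLine();
            String line = new String(raw.getBytes(StandardCharsets.ISO_8859_1),
                                     StandardCharsets.UTF_8);
            cachedCells = line.split(",", -1); // naive: no quoted fields
            cachedRow = row;
        }
        return cachedCells;
    }

    @Override
    public int getRowCount() {
        return rowOffsets.size();
    }

    @Override
    public int getColumnCount() {
        return columnCount;
    }

    @Override
    public Object getValueAt(int row, int column) {
        try {
            String[] cells = readRow(row);
            return column < cells.length ? cells[column] : "";
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```

A model like this is not thread-safe and re-reads a row from disk whenever a different row is requested, so it trades speed for memory. It works best when the consumer walks the table row by row, which is what a document generator would normally do.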
A better solution might be to use a more efficient storage model for the
SpreadsheetDocument class. This can be done once higher-level abstractions
for the various document types are introduced into the library. Right now,
the abstraction is mostly the DOM representation of the document, but this
is only intended as a first step.
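Regarding the SAX/StAX suggestion in the quoted message below, a streaming writer could look roughly like the following. This is only a sketch using the standard javax.xml.stream API, not anything that exists in odf4j: the class StreamingTableWriter and its writeTable method are hypothetical, every cell is written as a string, the CSV is split naively on commas, and the output is just the table:table fragment rather than a complete content.xml or package.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

/** Hypothetical helper that streams CSV rows into an ODF table fragment. */
final class StreamingTableWriter {
    static final String TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0";
    static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";
    static final String OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0";

    static void writeTable(Reader csv, Writer out, String tableName)
            throws IOException, XMLStreamException {
        XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        BufferedReader lines = new BufferedReader(csv);

        xml.writeStartElement("table", "table", TABLE_NS);
        xml.writeNamespace("table", TABLE_NS);
        xml.writeNamespace("text", TEXT_NS);
        xml.writeNamespace("office", OFFICE_NS);
        xml.writeAttribute("table", TABLE_NS, "name", tableName);

        // Only one CSV line is held in memory at a time.
        String line;
        while ((line = lines.readLine()) != null) {
            xml.writeStartElement("table", "table-row", TABLE_NS);
            for (String cell : line.split(",", -1)) { // naive: no quoted fields
                xml.writeStartElement("table", "table-cell", TABLE_NS);
                xml.writeAttribute("office", OFFICE_NS, "value-type", "string");
                xml.writeStartElement("text", "p", TEXT_NS);
                xml.writeCharacters(cell); // escapes &, < and >
                xml.writeEndElement(); // text:p
                xml.writeEndElement(); // table:table-cell
            }
            xml.writeEndElement(); // table:table-row
        }

        xml.writeEndElement(); // table:table
        xml.flush();
        xml.close(); // does not close the underlying Writer
    }
}
```

Because nothing is accumulated between rows, memory use stays roughly constant regardless of the number of lines. The price is that the document cannot be modified after it has been written, which is why this would complement the DOM rather than replace it.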
Cheers,
Lars
Cedric Bosdonnat wrote:
> Hi all,
>
> I've tried to use odf4J to generate calc files from a CSV. I've seen
> that the program needs more memory than the default Java Heap memory
> available for 8700 lines / 8 columns.
>
>
> I think that there are a few things that could be improved here:
> + The use of the TableModel forces us to hold at least 2 times the data
>   size in memory: once in the TableModel and once in the DOM tree
> + The use of a DOM tree could be good for small files, but it would be
>   interesting to allow the use of SAX or StAX APIs to write the content.
>
> I'll probably have no time to make the changes, but I hope this could
> help further developments.
>
> Cedric
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe odftoolkit.openoffice.org
> For additional commands, e-mail: dev-help odftoolkit.openoffice.org
>
--
Lars Oppermann <lars.oppermann sun.com>
Sun Microsystems
Software Engineer
Nagelsweg 55
20097 Hamburg, Germany
Phone: +49 40 23646 959
Fax: +49 40 23646 550
http://www.sun.com/staroffice