Thread: Is the flat XML format the best approach for XSLT? (earlier: office:document-content vs of




Is the flat XML format the best approach for XSLT? (earlier: office:document-content vs of
user name
2007-10-18 05:22:11
The reason the XSLT processor requires office:document and not
office:document-content is that it expects to receive a single
unzipped XML stream/file, in which at least content.xml, meta.xml and
styles.xml are merged into one, often called the "flat XML file".
Even embedded pictures of the ODF document are merged into this XML
file, Base64-encoded.


Is this the best way to do it?

It was very tempting for an XSLT developer to have all sources in one
input file, but now, after gaining some experience, I think the
drawbacks weigh more.

Most annoying to me: the stream contains resources that are not
required by most transformations.

For instance, I have never written a stylesheet in which the XSLT
processor actually used the encoded images in the flat stream.

Images are encoded to Base64, which increases their size by about 33%,
then processed by the XML parser and the XSLT processor, only to be
ignored.
It seems we are wasting resources here.
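
The 33% figure follows from Base64 mapping every 3 input bytes to 4
output characters. A minimal Python sketch to check it, assuming random
bytes as a stand-in for an embedded picture (the function name is
hypothetical, and line breaks inside the encoded data, if a writer adds
them, would push the overhead slightly higher):

```python
import base64
import os


def base64_overhead(raw: bytes) -> float:
    """Return the relative size increase caused by Base64-encoding `raw`."""
    encoded = base64.b64encode(raw)
    return (len(encoded) - len(raw)) / len(raw)


# Stand-in for an embedded picture: 300,000 random bytes (an assumption,
# not data taken from a real ODF document).
image_bytes = os.urandom(300_000)
overhead = base64_overhead(image_bytes)
print(f"{len(image_bytes)} raw bytes grow by {overhead:.1%} when Base64-encoded")
```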

An optimized transformation would choose only the streams of the
package that are important to process.

A much better approach, to me, would be for the transformation to
process the manifest via XSLT and choose the desired streams among all
possible streams.
These package streams could be accessed via the XSLT document()
function and an Office handler resolving these calls.
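
To illustrate the idea of selecting streams through the manifest, here
is a rough sketch in Python rather than XSLT: it does not use
document() or any Office handler, it only shows the selection step on
the zipped package. It assumes the manifest lives at
META-INF/manifest.xml and that XML streams are listed with the media
type "text/xml"; the function names and the tiny demo package are
hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the ODF package manifest.
MANIFEST_NS = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"


def list_xml_streams(package):
    """Return the paths of all manifest entries with media type text/xml."""
    with zipfile.ZipFile(package) as zf:
        root = ET.fromstring(zf.read("META-INF/manifest.xml"))
    paths = []
    for entry in root.iter(f"{{{MANIFEST_NS}}}file-entry"):
        path = entry.get(f"{{{MANIFEST_NS}}}full-path")
        media_type = entry.get(f"{{{MANIFEST_NS}}}media-type")
        if media_type == "text/xml" and not path.endswith("/"):
            paths.append(path)
    return paths


def read_stream(package, path):
    """Read one stream from the package without touching the others."""
    with zipfile.ZipFile(package) as zf:
        return zf.read(path)


def build_demo_package():
    """Build a tiny stand-in package in memory (not a valid ODF document)."""
    manifest = (
        f'<manifest:manifest xmlns:manifest="{MANIFEST_NS}">'
        '<manifest:file-entry manifest:full-path="/"'
        ' manifest:media-type="application/vnd.oasis.opendocument.text"/>'
        '<manifest:file-entry manifest:full-path="content.xml"'
        ' manifest:media-type="text/xml"/>'
        '<manifest:file-entry manifest:full-path="Pictures/a.png"'
        ' manifest:media-type="image/png"/>'
        "</manifest:manifest>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("META-INF/manifest.xml", manifest)
        zf.writestr("content.xml", "<doc/>")
        zf.writestr("Pictures/a.png", b"\x89PNG")
    return buf


demo = build_demo_package()
wanted = list_xml_streams(demo)  # the picture entry is never read or encoded
print(wanted)
```

A transformation driver could then hand only the selected streams to
the XSLT processor, so embedded pictures never need to be encoded or
parsed.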

An additional problem that this approach would solve is the processing
of user XML in the package. Remember that anyone can add streams to
the package as long as the stream is listed in the manifest.
Currently it is neither specified nor implemented how to merge all
user streams into the flat XML in a generic way.

Considering the earlier-mentioned waste of resources due to
unnecessary encoding and parsing, this seems to me the wrong approach
anyway.

Finally, if an XSL transformation is based not on the flat XML format
but on the package format, the same transformation can easily be used
outside of the office, for instance as part of a browser extension.

Is anyone here able and interested in making such an improvement come
alive?

Svante

ashok _ wrote:
> I am transforming an ODT file using two different mechanisms, one,
> using the UNO XSLT mechanism, and the other using a standalone XSLT
> processor.
>
> Why does the UNO XSLT processor require the content root to be
> <office:document>,
> while if I view content.xml it is actually <office:document-content> ?
>
> ashok
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribexml.openoffice.org
> For additional commands, e-mail: dev-helpxml.openoffice.org
>
> 



Re: Is the flat XML format the best approach for XSLT? (earlier: office:document-content v
user name
2007-10-18 05:44:38
On 18/10/2007, Svante Schubert <Svante.Schubertsun.com> wrote:
> The reason why the XSLT processor requires office:document and not
> office:document-content is that it expects to receive one single
> unzipped XML stream/file, where at least content.xml, meta.xml and
> styles.xml are being merged to one, often called the "flat XML file".
> Even embedded pictures of the ODF document are being merged into this
> XML file Base64 encoded.

Which does sound very wasteful. Especially for large documents
with many images?


>
>
> Is this the best way to do it?

Best for what?



> Finally if an XSL transformation is based not on a flat xml format, but
> on the package format, the similar transformation can be easily be used
> outside of the office, for instance as part of a browser extension.
>
> Anyone here who would be able and interested to make such an improvement
> come alive?

Not without some agreement that this approach is supported by ODF.
It's hardly a lot of work though?

I have code that deals with an unzipped ODF document, which I guess
is just the sort of thing that you're referring to?

Where is the manifest defined please? I'm assuming it's an XML format.

regards




-- 
Dave Pawson
XSLT XSL-FO FAQ.
http://www.dpawson.co.uk



