Proposal of XML related tasks on OpenOffice.org to the Chinese RedOffice team
user name
2006-08-17 11:27:48
Hi Xiuzhi,

I am very happy that you and your team from 'RedOffice' have decided to
join the OpenOffice.org community. I am even more excited that you wish
to contribute work in the XML-based filter area.
As we discussed earlier, I am going to share all the information
necessary to work in this area publicly on our xml-dev list, so that
anyone in the community can jump in.

In case you have not yet decided which exact field or problem you and
your team are going to work on, I am going to point out some areas for
improvement where your help would be most welcome, as I am running out
of time:

  1. Filter detection:
     Every time a document is imported into StarOffice /
     OpenOffice.org, the filter detection chooses the most appropriate
     filter among the existing ones.
     Currently the correct XML-based filter is chosen by the DocType
     string provided in the XML Filter Settings dialog: the detection
     tries to find that string in the first 1000 characters.
     A filter detection based on the XML root node and XML namespace
     would be much better. Other possible document loading scenarios
     have to be evaluated.
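
A minimal sketch of what a detection by root node and namespace could
look like, written in Python for brevity. The FILTERS table, its filter
names and both function names are assumptions made up for this
illustration; they are not part of the existing filter framework.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical registry: (namespace URI, root local name) -> filter name.
# The filter names are invented for this sketch.
FILTERS = {
    ("urn:oasis:names:tc:opendocument:xmlns:office:1.0", "document"):
        "OpenDocument (flat XML)",
    ("http://www.w3.org/1999/xhtml", "html"): "XHTML",
}


def root_signature(data: bytes):
    """Return (namespace URI, local name) of the root element."""
    # We stop at the first "start" event, which is the root element.
    for _event, elem in ET.iterparse(io.BytesIO(data), events=("start",)):
        tag = elem.tag  # ElementTree spells qualified names "{uri}local"
        if tag.startswith("{"):
            uri, local = tag[1:].split("}", 1)
            return uri, local
        return "", tag
    return None


def detect_filter(data: bytes):
    """Look up a filter by root signature; None if nothing matches."""
    try:
        return FILTERS.get(root_signature(data))
    except ET.ParseError:
        return None  # not well-formed XML, leave it to other detectors
```

Unlike a search for a DocType string in the first 1000 characters, this
lookup does not depend on where comments or processing instructions
happen to push the root element.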

  2. Storage of embedded document content:
     Saving the embedded content of an Office document:
     e.g. graphics might be unpacked into a folder (similar to browser
     behavior, e.g. as in Firefox).
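
As a rough illustration of the unpacking idea, here is a Python sketch
that copies the graphics of a zipped document package into a folder. It
assumes that the graphics are stored under a "Pictures/" prefix inside
the package; the function name and the target folder are hypothetical
and chosen only for this example.

```python
import zipfile
from pathlib import Path


def unpack_pictures(package_path, target_dir):
    """Copy every entry below Pictures/ of a zipped package into target_dir.

    Returns the list of written file paths.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(package_path) as package:
        for info in package.infolist():
            if info.is_dir() or not info.filename.startswith("Pictures/"):
                continue
            # Keep only the base name so an entry cannot escape target_dir.
            out = target / Path(info.filename).name
            out.write_bytes(package.read(info))
            written.append(out)
    return written
```

A filter exporting "report.odt" could, for example, place the result
next to a "report_files" folder, much as a browser does when saving a
complete web page.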

  3. Logging:
     Currently logging is comparatively weak for the XML-based / XSLT
     filters. The only way to enable logging is to set a Java system
     property (e.g.
     -DXSLTransformer.statsfile=/usr/local/offices/xslt_debug.txt)
     in the Office options for Java.
     A few new features are imaginable, such as:
         * Customizing the filter logging via the GUI
         * Use of defined log levels (analogous to Java Logging)
         * A GUI flag for a cumulative log file (instead of replacing
           the log for every transformation). Required for logging
           test scenarios that transform multiple files.
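
The log level and cumulative log file ideas can be sketched as follows,
using Python's standard logging module as a stand-in for Java Logging.
The logger name, the function names and the message texts are
assumptions for this illustration only; in the Office the two
parameters would come from the GUI.

```python
import logging


def make_filter_logger(log_path, level=logging.INFO, cumulative=False):
    """Create a file logger for one filter run.

    cumulative=True appends to log_path; otherwise the file is replaced,
    which mirrors one log per transformation.
    """
    logger = logging.getLogger("xslt_filter_sketch")  # hypothetical name
    logger.setLevel(level)
    # Drop handlers from a previous run so messages are not duplicated.
    for old in list(logger.handlers):
        logger.removeHandler(old)
        old.close()
    mode = "a" if cumulative else "w"
    handler = logging.FileHandler(log_path, mode=mode, encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False
    return logger


def log_transformation(log_path, document, cumulative, level=logging.INFO):
    """Log one (pretend) transformation of document."""
    logger = make_filter_logger(log_path, level, cumulative)
    logger.info("transformed %s", document)
    logger.debug("details for %s", document)  # only written at DEBUG level
    for handler in logger.handlers:
        handler.close()
```

With cumulative=True, a test run over many files ends up in a single
log; with cumulative=False, only the last transformation is kept.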

  4. Validation:
     Validation should be used during import/export.
     Currently validation is only possible from a test dialog. In the
     case of an export filter, the validation runs against a
     user-provided DTD; in the case of an import filter, against the
     already bundled OpenOffice.org XML DTD [1].
     This existing test scenario might be dropped in favor of external
     development tools. Instead, an (optional) validation at runtime
     should be possible to allow a variety of user scenarios:
         * Turning on schema validation for larger customer/field
           tests, such as a StarOffice Beta release
         * Schema validation against a subset of an existing schema
           (e.g. a more restricted Open Document format), used to
           check the validity of the filter's input document by
           verifying that certain data or structure exists. For
           example, a restriction on the styles being used, or a check
           whether the document being processed satisfies a required
           structure, as for a certain legal document.

     To establish this, it is not sufficient to reuse the existing DTD
     functionality; it has to be expanded at least to the schema of
     the OD default format (Relax NG). To be more flexible for the
     market, arbitrary schemas such as DTD, XML Schema and Relax NG
     should be usable by means of a conversion tool that makes them
     compatible with one another.
     The MSV (Multi Schema Validator) would be an option,
     https://msv.dev.java.net/
     As even the most powerful schema language (Relax NG) has its
     limitations, it might be desirable to use the ISO standard XML
     Schematron, too.
     It is essentially based on assertions about certain document
     content (pointed to by XPath).
     This gives the user validation against arbitrary, even highly
     complex business logic that no other schema language would be
     able to handle.
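
To illustrate the assertion idea only: the following Python sketch is
not Schematron, just a simplified imitation using the limited XPath
support of the standard library. The allowed style names, the function
name and the messages are hypothetical; the two rules mirror the
examples above (a required structure and a restriction on styles).

```python
import xml.etree.ElementTree as ET

TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
NS = {"text": TEXT}

# Hypothetical restriction: only these paragraph styles may be used.
ALLOWED_STYLES = {"Standard", "Heading"}


def failed_assertions(xml_text):
    """Return one message per failed assertion (empty list = valid)."""
    root = ET.fromstring(xml_text)
    failures = []
    # Assertion 1: required structure, at least one heading must exist.
    if root.find(".//text:h", NS) is None:
        failures.append("document must contain at least one heading (text:h)")
    # Assertion 2: style restriction on every paragraph.
    for paragraph in root.iterfind(".//text:p", NS):
        style = paragraph.get("{%s}style-name" % TEXT)
        if style not in ALLOWED_STYLES:
            failures.append("paragraph style %r is not allowed" % style)
    return failures
```

A grammar-based schema states which elements may appear where; rules of
this kind instead state what a particular document has to contain.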

Are you interested in one of these areas, or have you perhaps found
another one you would like to work on?

Maybe you ran into limitations during your work on the XSLT filter
transforming the UOF format to OpenDocument and now want to solve them?

Or possibly you are seeking a real challenge? For example, the most
advanced task would be the redesign of the XML Filter Settings dialog,
as GUI implementations are involved.

I am looking forward to your answer.

Kind Regards,
Svante


[1] DTD was the schema language used earlier for the StarOffice 7
format (OpenOffice.org XML); the new XML format for StarOffice 8
(Open Document format) is based instead on the more powerful RELAX NG
schema.


-- 
Svante Schubert <svante.schubertsun.com>              Sun Microsystems
Software Engineer - StarOffice                            Nagelsweg 55
Phone:  +49 40 23646 965                               D-20097 Hamburg
Fax:    +49 40 23646 550                 http://www.sun.com/staroffice
