Hi Dave

> I'll help if you want some? Might be an idea to use it with
> http://mercury.ccil.org/~cowan/XML/tagsoup/ tagsoup since most
> html isn't all that clever?

Hmmm... Maybe. But I don't see that this is ever going to be a
generalized converter from HTML to OO. I see it as a step in a
specific pipeline requiring quite good HTML. A very adaptable
generalized converter will need mapping support between HTML and OO
and that will be complicated. More complicated than I want anyway.

For example, I am publishing my CV like this. I maintain the CV in
Emacs org-mode. From there I generate an XOXO microformat file which I
then XSLT into well marked up HTML (with DIVs and things).
I can then use html2oo.xslt to transfer that into OO and from there get
Word or anything else that OO can spit out.

Another example of an application I had in mind is something I built
for Thompson: it built websites out of legal content by converting
their SGML content to XML and then HTML via XSLT. I also had to
convert the XML to Word by using an XSL-FO processor.
But now I would just have a single HTML design with a CSS providing
the look for the web pages and html2oo.xslt producing Word (via
OpenOffice).

Anyway... I've inlined the stylesheet at the bottom. As I said, it's
not comprehensive yet but as I need more elements I will add them.

Right now, I'm controlling the resulting OO file with a Makefile that
looks like this:

doc.odt: doc/content.xml
	bash -c 'cd doc ; zip -r ../doc.odt *'

doc/content.xml: html2oo.xslt doc.html doc
	xsltproc --html html2oo.xslt doc.html > doc/content.xml

doc:
	[ -d doc ] || ( mkdir doc ; unzip -d doc doc.odt )

There are options for making this better but it kinda depends on what
tools you want to use for the XSLT.
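
One possible improvement to the packaging step of the Makefile above,
offered only as a sketch: the OpenDocument package format asks for the
`mimetype` entry to be the first entry in the zip archive and to be
stored uncompressed, which `zip -r ../doc.odt *` does not guarantee.
The Python below illustrates that ordering with the standard `zipfile`
module. The names `pack_odt` and `ODT_MIMETYPE` are hypothetical and do
not come from the Makefile or the stylesheet, and the sketch assumes the
unpacked document tree lives in a directory such as `doc`.

```python
import zipfile
from pathlib import Path

# MIME type for OpenDocument text documents (.odt).
ODT_MIMETYPE = "application/vnd.oasis.opendocument.text"


def pack_odt(src_dir, out_path):
    """Zip the unpacked document tree in src_dir into out_path.

    The 'mimetype' entry is written first and stored uncompressed;
    every other file is deflated. (Hypothetical helper, not part of
    the original Makefile.)
    """
    src = Path(src_dir)
    with zipfile.ZipFile(out_path, "w") as odt:
        # Write the mimetype entry ourselves so it is always first and stored.
        odt.writestr(zipfile.ZipInfo("mimetype"), ODT_MIMETYPE,
                     compress_type=zipfile.ZIP_STORED)
        for path in sorted(src.rglob("*")):
            arcname = path.relative_to(src).as_posix()
            # Skip directories and any mimetype file already in the tree.
            if path.is_file() and arcname != "mimetype":
                odt.write(path, arcname, compress_type=zipfile.ZIP_DEFLATED)
```

Calling `pack_odt("doc", "doc.odt")` would stand in for the
`bash -c 'cd doc ; zip -r ../doc.odt *'` rule; whether that is worth an
extra dependency on Python is a judgment call.
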

If I set up a darcs (http://abridgegame.org/darcs/) repository for this
would anyone contribute, do you think? Would you?

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
  xmlns:style="urn:oasis:names:tc:opendocument:xmlns:style:1.0"
  xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
  xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0"
  xmlns:draw="urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
  xmlns:fo="urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0"
  xmlns:xlink="http://www.w3.org/1999/xlink"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:meta="urn:oasis:names:tc:opendocument:xmlns:meta:1.0"
  xmlns:number="urn:oasis:names:tc:opendocument:xmlns:datastyle:1.0"
  xmlns:svg="urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0"
  xmlns:chart="urn:oasis:names:tc:opendocument:xmlns:chart:1.0"
  xmlns:dr3d="urn:oasis:names:tc:opendocument:xmlns:dr3d:1.0"
  xmlns:math="http://www.w3.org/1998/Math/MathML"
  xmlns:form="urn:oasis:names:tc:opendocument:xmlns:form:1.0"
  xmlns:script="urn:oasis:names:tc:opendocument:xmlns:script:1.0"
  xmlns:ooo="http://openoffice.org/2004/office"
  xmlns:ooow="http://openoffice.org/2004/writer"
  xmlns:oooc="http://openoffice.org/2004/calc"
  xmlns:dom="http://www.w3.org/2001/xml-events"
  xmlns:xforms="http://www.w3.org/2002/xforms"
  xmlns:xsd="http://www.w3.org/2001/XMLSchema"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<!--
Copyright (C) 2006 by Tapsell-Ferrier Limited

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2, or (at your option)
any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; see the file COPYING. If not, write to the
Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
Boston, MA 02110-1301 USA
-->
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/html">
<office:document-content office:version="1.0">
<office:scripts/>
<office:font-face-decls>
<style:font-face
style:name="StarSymbol"
svg:font-family="StarSymbol"
style:font-charset="x-symbol"/>
<style:font-face style:name="DejaVu Sans1" svg:font-family="'DejaVu Sans'"
style:font-pitch="variable"/>
<style:font-face style:name="DejaVu Serif" svg:font-family="'DejaVu Serif'"
style:font-family-generic="roman"
style:font-pitch="variable"/>
<style:font-face style:name="DejaVu Sans" svg:font-family="'DejaVu Sans'"
style:font-family-generic="swiss"
style:font-pitch="variable"/>
</office:font-face-decls>
<office:automatic-styles>
<style:style
style:name="Table1"
style:family="table">
<style:table-properties
style:width="6.925in"
table:align="margins"/>
</style:style>
<style:style
style:name="Table1.A"
style:family="table-column">
<style:table-column-properties
style:column-width="3.4625in"
style:rel-column-width="32767*"/>
</style:style>
<style:style
style:name="Table1.A1"
style:family="table-cell">
<style:table-cell-properties
fo:padding="0.0382in"
fo:border-left="0.0007in solid #000000"
fo:border-right="none"
fo:border-top="0.0007in solid #000000"
fo:border-bottom="0.0007in solid #000000"/>
</style:style>
<style:style
style:name="Table1.B1"
style:family="table-cell">
<style:table-cell-properties
fo:padding="0.0382in" fo:border="0.0007in solid #000000"/>
</style:style>
<style:style
style:name="Table1.A2"
style:family="table-cell">
<style:table-cell-properties
fo:padding="0.0382in"
fo:border-left="0.0007in solid #000000"
fo:border-right="none"
fo:border-top="none"
fo:border-bottom="0.0007in solid #000000"/>
</style:style>
<style:style
style:name="Table1.B2"
style:family="table-cell">
<style:table-cell-properties
fo:padding="0.0382in"
fo:border-left="0.0007in solid #000000"
fo:border-right="0.0007in solid #000000"
fo:border-top="none"
fo:border-bottom="0.0007in solid #000000"/>
</style:style>
<style:style
style:name="Table2"
style:family="table">
<style:table-properties
style:width="6.925in"
table:align="margins"/>
</style:style>
<style:style
style:name="Table2.A"
style:family="table-column">
<style:table-column-properties
style:column-width="3.4625in"
style:rel-column-width="32767*"/>
</style:style>
<style:style
style:name="Table2.A1"
style:family="table-cell">
<style:table-cell-properties
fo:padding="0.0382in"
fo:border-left="0.0007in solid #000000"
fo:border-right="none"
fo:border-top="0.0007in solid #000000"
fo:border-bottom="0.0007in solid #000000"/>
</style:style>
<style:style
style:name="Table2.B1"
style:family="table-cell">
<style:table-cell-properties
fo:padding="0.0382in" fo:border="0.0007in solid #000000"/>
</style:style>
<style:style
style:name="Table2.A2"
style:family="table-cell">
<style:table-cell-properties
fo:padding="0.0382in"
fo:border-left="0.0007in solid #000000"
fo:border-right="none"
fo:border-top="none"
fo:border-bottom="0.0007in solid #000000"/>
</style:style>
<style:style
style:name="Table2.B2"
style:family="table-cell">
<style:table-cell-properties
fo:padding="0.0382in"
fo:border-left="0.0007in solid #000000"
fo:border-right="0.0007in solid #000000"
fo:border-top="none"
fo:border-bottom="0.0007in solid #000000"/>
</style:style>
<style:style style:name="P1"
style:family="paragraph"
style:parent-style-name="Table_20_Heading">
<style:paragraph-properties
fo:text-align="start"
style:justify-single-word="false"/>
<style:text-properties
fo:font-style="normal"
fo:font-weight="normal"
style:font-style-asian="normal"
style:font-weight-asian="normal"
style:font-style-complex="normal"
style:font-weight-complex="normal"/>
</style:style>
<style:style style:name="P2"
style:family="paragraph"
style:parent-style-name="Standard"
style:list-style-name="L1"/>
<style:style style:name="P3"
style:family="paragraph"
style:parent-style-name="Standard"
style:list-style-name="L2"/>
<text:list-style
style:name="L1">
<text:list-level-style-bullet
text:level="1"
text:style-name="Bullet_20_Symbols"
style:num-suffix="."
text:bullet-char="●">
<style:list-level-properties
text:space-before="0.25in"
text:min-label-width="0.25in"/>
<style:text-properties
style:font-name="StarSymbol"/>
</text:list-level-style-bullet>
<text:list-level-style-bullet
text:level="2"
text:style-name="Bullet_20_Symbols"
style:num-suffix="."
text:bullet-char="○">
<style:list-level-properties
text:space-before="0.5in"
text:min-label-width="0.25in"/>
<style:text-properties
style:font-name="StarSymbol"/>
</text:list-level-style-bullet>
<text:list-level-style-bullet
text:level="3"
text:style-name="Bullet_20_Symbols"
style:num-suffix="."
text:bullet-char="■">
<style:list-level-properties
text:space-before="0.75in"
text:min-label-width="0.25in"/>
<style:text-properties
style:font-name="StarSymbol"/>
</text:list-level-style-bullet>
<text:list-level-style-bullet
text:level="4"
text:style-name="Bullet_20_Symbols"
style:num-suffix="."
text:bullet-char="●">
<style:list-level-properties
text:space-before="1in"
text:min-label-width="0.25in"/>
<style:text-properties
style:font-name="StarSymbol"/>
</text:list-level-style-bullet>
<text:list-level-style-bullet
text:level="5"
text:style-name="Bullet_20_Symbols"
style:num-suffix="."
text:bullet-char="○">
<style:list-level-properties
text:space-before="1.25in"
text:min-label-width="0.25in"/>
<style:text-properties
style:font-name="StarSymbol"/>
</text:list-level-style-bullet>
<text:list-level-style-bullet
text:level="6"
text:style-name="Bullet_20_Symbols"
style:num-suffix="."
text:bullet-char="■">
<style:list-level-properties
text:space-before="1.5in"
text:min-label-width="0.25in"/>
<style:text-properties
style:font-name="StarSymbol"/>
</text:list-level-style-bullet>
<text:list-level-style-bullet
text:level="7"
text:style-name="Bullet_20_Symbols"
style:num-suffix="."
text:bullet-char="●">
<style:list-level-properties
text:space-before="1.75in"
text:min-label-width="0.25in"/>
<style:text-properties
style:font-name="StarSymbol"/>
</text:list-level-style-bullet>
<text:list-level-style-bullet
text:level="8"
text:style-name="Bullet_20_Symbols"
style:num-suffix="."
text:bullet-char="○">
<style:list-level-properties
text:space-before="2in"
text:min-label-width="0.25in"/>
<style:text-properties
style:font-name="StarSymbol"/>
</text:list-level-style-bullet>
<text:list-level-style-bullet
text:level="9"
text:style-name="Bullet_20_Symbols"
style:num-suffix="."
text:bullet-char="■">
<style:list-level-properties
text:space-before="2.25in"
text:min-label-width="0.25in"/>
<style:text-properties
style:font-name="StarSymbol"/>
</text:list-level-style-bullet>
<text:list-level-style-bullet
text:level="10"
text:style-name="Bullet_20_Symbols"
style:num-suffix="."
text:bullet-char="●">
<style:list-level-properties
text:space-before="2.5in"
text:min-label-width="0.25in"/>
<style:text-properties
style:font-name="StarSymbol"/>
</text:list-level-style-bullet>
</text:list-style>
<text:list-style
style:name="L2">
<text:list-level-style-bullet
text:level="1"
text:style-name="Bullet_20_Symbols"
style:num-suffix="."
text:bullet-char="●">
<style:list-level-properties
text:space-before="0.25in"
text:min-label-width="0.25in"/>
<style:text-properties
style:font-name="StarSymbol"/>
</text:list-level-style-bullet>
<text:list-level-style-bullet
text:level="2"
text:style-name="Bullet_20_Symbols"
style:num-suffix="."
text:bullet-char="○">
<style:list-level-properties
text:space-before="0.5in"
text:min-label-width="0.25in"/>
<style:text-properties
style:font-name="StarSymbol"/>
</text:list-level-style-bullet>
<text:list-level-style-bullet
text:level="3"
text:style-name="Bullet_20_Symbols"
style:num-suffix="."
text:bullet-char="■">
<style:list-level-properties
text:space-before="0.75in"
text:min-label-width="0.25in"/>
<style:text-properties
style:font-name="StarSymbol"/>
</text:list-level-style-bullet>
<text:list-level-style-bullet
text:level="4"
text:style-name="Bullet_20_Symbols"
style:num-suffix="."
text:bullet-char="●">
<style:list-level-properties
text:space-before="1in"
text:min-label-width="0.25in"/>
<style:text-properties
style:font-name="StarSymbol"/>
</text:list-level-style-bullet>
<text:list-level-style-bullet
text:level="5"
text:style-name="Bullet_20_Symbols"
style:num-suffix="."
text:bullet-char="○">
<style:list-level-properties
text:space-before="1.25in"
text:min-label-width="0.25in"/>
<style:text-properties
style:font-name="StarSymbol"/>
</text:list-level-style-bullet>
<text:list-level-style-bullet
text:level="6"
text:style-name="Bullet_20_Symbols"
style:num-suffix="."
text:bullet-char="■">
<style:list-level-properties
text:space-before="1.5in"
text:min-label-width="0.25in"/>
<style:text-properties
style:font-name="StarSymbol"/>
</text:list-level-style-bullet>
<text:list-level-style-bullet
text:level="7"
text:style-name="Bullet_20_Symbols"
style:num-suffix="."
text:bullet-char="●">
<style:list-level-properties
text:space-before="1.75in"
text:min-label-width="0.25in"/>
<style:text-properties
style:font-name="StarSymbol"/>
</text:list-level-style-bullet>
<text:list-level-style-bullet
text:level="8"
text:style-name="Bullet_20_Symbols"
style:num-suffix="."
text:bullet-char="○">
<style:list-level-properties
text:space-before="2in"
text:min-label-width="0.25in"/>
<style:text-properties
style:font-name="StarSymbol"/>
</text:list-level-style-bullet>
<text:list-level-style-bullet
text:level="9"
text:style-name="Bullet_20_Symbols"
style:num-suffix="."
text:bullet-char="■">
<style:list-level-properties
text:space-before="2.25in"
text:min-label-width="0.25in"/>
<style:text-properties
style:font-name="StarSymbol"/>
</text:list-level-style-bullet>
<text:list-level-style-bullet
text:level="10"
text:style-name="Bullet_20_Symbols"
style:num-suffix="."
text:bullet-char="●">
<style:list-level-properties
text:space-before="2.5in"
text:min-label-width="0.25in"/>
<style:text-properties
style:font-name="StarSymbol"/>
</text:list-level-style-bullet>
</text:list-style>
</office:automatic-styles>
<xsl:apply-templates
select="body"/>
</office:document-content>
</xsl:template>
<xsl:template match="body">
  <office:body>
    <office:text>
      <office:forms form:automatic-focus="false" form:apply-design-mode="false"/>
      <text:sequence-decls>
        <text:sequence-decl text:display-outline-level="0" text:name="Illustration"/>
        <text:sequence-decl text:display-outline-level="0" text:name="Table"/>
        <text:sequence-decl text:display-outline-level="0" text:name="Text"/>
        <text:sequence-decl text:display-outline-level="0" text:name="Drawing"/>
      </text:sequence-decls>
      <xsl:apply-templates select="node()"/>
    </office:text>
  </office:body>
</xsl:template>
<xsl:template match="h1">
  <text:h text:style-name="Heading_20_1"><xsl:apply-templates select="node()"/></text:h>
</xsl:template>
<xsl:template match="h2">
  <text:h text:style-name="Heading_20_2"><xsl:apply-templates select="node()"/></text:h>
</xsl:template>
<xsl:template match="h3">
  <text:h text:style-name="Heading_20_3"><xsl:apply-templates select="node()"/></text:h>
</xsl:template>
<xsl:template match="h4">
  <text:h text:style-name="Heading_20_4"><xsl:apply-templates select="node()"/></text:h>
</xsl:template>
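<!-- NOTE: a second match="p" template appears at the end of this
     stylesheet. XSLT 1.0 treats two rules with the same match and
     priority as an error; processors that recover use the later one. -->
<xsl:template match="p">
  <text:p text:style-name="Standard"><xsl:apply-templates select="node()"/></text:p>
</xsl:template>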
<xsl:template match="table">
  <table:table table:name="Table1" table:style-name="Table1">
    <table:table-column table:style-name="Table1.A" table:number-columns-repeated="2"/>
    <!-- FIXME: should not do this... instead simply apply on node() and have
         template matches for tr[th] -->
    <xsl:for-each select="tr[th]">
      <table:table-header-rows>
        <table:table-row>
          <xsl:apply-templates select="th|td"/>
        </table:table-row>
      </table:table-header-rows>
    </xsl:for-each>
    <xsl:for-each select="tr[td]">
      <table:table-row>
        <xsl:apply-templates select="td"/>
      </table:table-row>
    </xsl:for-each>
  </table:table>
</xsl:template>
<xsl:template match="th|td">
  <table:table-cell table:style-name="Table1.A1" office:value-type="string">
    <xsl:call-template name="text_applyer"/>
  </table:table-cell>
</xsl:template>
<xsl:template match="ul">
  <text:list text:style-name="L1">
    <!-- FIXME: should not do this... instead simply apply on node() and have
         template matches for li -->
    <xsl:for-each select="li">
      <text:list-item><xsl:call-template name="text_applyer"/></text:list-item>
    </xsl:for-each>
  </text:list>
</xsl:template>
<xsl:template name="text_applyer">
  <xsl:choose>
    <xsl:when test="text()">
      <text:p text:style-name="Standard"><xsl:value-of select="."/></text:p>
    </xsl:when>
    <xsl:otherwise><xsl:apply-templates select="node()"/></xsl:otherwise>
  </xsl:choose>
</xsl:template>
<xsl:template match="p">
  <text:p text:style-name="Standard"><xsl:apply-templates select="node()"/></text:p>
  <text:p text:style-name="Standard"></text:p>
</xsl:template>
</xsl:stylesheet>
--
Nic Ferrier
http://www.tapsellferrier.co.uk for all your tapsell ferrier needs
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe xml.openoffice.org
For additional commands, e-mail: dev-help xml.openoffice.org