Thread: Problems using XML::LibXML and XML::LibXSLT; getting corrupted output.
|
|
| Problems using XML::LibXML and
XML::LibXSLT; getting corrupted output. |

|
2007-10-12 17:33:54 |
I’m trying to write a script using XML::LibXML and XML::LibXSLT and I’m
getting corrupted data output. I’ve stripped my script down to the bare
minimum to reproduce this problem. Basically, the stripped-down script
will take 2 XML files, use XSLT to remove the root (and some other stuff
in the final script), then merge them into a single document under a
different root. The parsing and transformation will ultimately be done
in a separate module, but not until I figure out what I’m doing wrong.
So if document 1 looks like:
<?xml version="1.0" encoding="UTF-8"?>
<root>
  <element1>
    <element1a>text1a</element1a>
    <element1b>text1b</element1b>
  </element1>
</root>
And document 2 looks like:
<?xml version="1.0" encoding="UTF-8"?>
<root>
  <element2>
    <element2a>text2a</element2a>
  </element2>
</root>
The result should be:
<?xml version="1.0" encoding="UTF-8"?>
<newroot>
  <element1>
    <element1a>text1a</element1a>
    <element1b>text1b</element1b>
  </element1>
  <element2>
    <element2a>text2a</element2a>
  </element2>
</newroot>
Here’s the script.
#!/usr/bin/perl -w
use strict;
use XML::LibXML;
use XML::LibXSLT;

my $parser = XML::LibXML->new();
my $xslt   = XML::LibXSLT->new();
my $doc    = XML::LibXML::Document->new();
my $root   = $doc->createElement('newroot');
$doc->setDocumentElement($root);

foreach (qw(file1.xml file2.xml)) {
    $root->appendChild(get_doc($_));
}
print "** final result **\n", $doc->toString, "\n";

sub get_doc {
    my $filename = shift;
    my $template = $xslt->parse_stylesheet_file('template.xsl');
    my $doc      = $template->transform_file($filename);
    my $element  = $doc->documentElement();
    print "** transform results **\n", $element->toString(), "\n";
    return $element;
}
The result of transform_file is what I expect, but the final
document is often not even well-formed XML, and sometimes even includes
little smiley faces and other non-text characters. In some cases I’ve
seen bits of strings from the template file, which makes me think I’m
doing something wrong with LibXSLT.
If, instead of transforming the file with XSLT, I just parse it, get the
root element and return that, then everything seems to work fine, which
again points to LibXSLT.
If anybody could tell me what I’m doing wrong, I would really appreciate
it. Also, since I’m relatively new to the XML modules, any specific
suggestions for fixes would also be appreciated.
TIA
-Tim Fletcher
Just in case it matters, here is the template.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml"/>
  <xsl:template match="root">
    <xsl:apply-templates select="*"/>
  </xsl:template>
  <xsl:template match="*">
    <xsl:copy-of select="."/>
  </xsl:template>
</xsl:stylesheet>
_______________________________________________
Perl-XML mailing list
Perl-XML listserv.ActiveState.com
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs
|
|
| Re: Problems using XML::LibXML and
XML::LibXSLT; getting corrupted output. |

|
2007-10-12 19:03:18 |
|
Tim,
There are two issues with your current code:
- When taking XML nodes from one XML document and placing them in
another, the target document must explicitly either "import" or "adopt"
each node: importNode copies the node and leaves the original doc
untouched, while adoptNode moves the node, removing it from the original
doc (see the documentation of importNode and adoptNode for the
XML::LibXML::Document object).
- If a source file has more than one node under the root, it will
cause problems when you try to use XSLT to remove the root element and
append the results to the new document, because the result of the
transform will not be a well-formed XML document. Therefore, it would
be better to avoid XSLT altogether and stick with DOM+XPath.
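To make the import/adopt distinction in the first point concrete, here is a minimal sketch. It assumes small inline documents rather than the files from the question, and the variable names ($source, $target, $first, $second) are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

my $parser = XML::LibXML->new();

# Illustrative source document with two top-level elements under <root>.
my $source = $parser->parse_string('<root><a/><b/></root>');

# Target document with its own root element.
my $target   = XML::LibXML::Document->new('1.0', 'UTF-8');
my $new_root = $target->createElement('newroot');
$target->setDocumentElement($new_root);

my ($first, $second) = $source->documentElement->findnodes('*');

# importNode returns a copy that belongs to $target;
# the original <a/> stays where it was in $source.
my $copy = $target->importNode($first);
$new_root->appendChild($copy);

# adoptNode moves the node itself: <b/> is unlinked from $source
# and now belongs to $target.
$target->adoptNode($second);
$new_root->appendChild($second);

print "source: ", $source->documentElement->toString, "\n";
print "target: ", $new_root->toString, "\n";
```

Note that with importNode it is the returned copy that must be appended, not the node that was passed in.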
Revised code:
#!/usr/bin/perl -w
use strict;
use XML::LibXML;
use XML::LibXSLT;

my $parser = XML::LibXML->new();
my $xslt   = XML::LibXSLT->new();
my $doc    = XML::LibXML::Document->new();
my $root   = $doc->createElement('newroot');
$doc->setDocumentElement($root);

foreach my $filename (qw(file1.xml file2.xml)) {
    foreach my $node (get_toplevel_elems($filename)) {
        # need to associate node w/ new doc: importNode returns a copy
        # owned by $doc, and it is that copy that gets appended
        my $imported = $doc->importNode($node);
        $root->appendChild($imported);
    }
}
print "** final result **\n", $doc->toString, "\n\n";

sub get_toplevel_elems {
    my $filename = shift;
    my $doc = $parser->parse_file($filename);
    # gets all child nodes, including comments, whitespace, etc.
    my @elements = $doc->documentElement()->childNodes;
    # gets just the elements and converts the resulting
    # XML::LibXML::NodeList to a perl list
    #my @elements = $doc->documentElement()->find("*")->get_nodelist();
    return @elements;
}
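
If the XSLT step has to stay (for the "other stuff" mentioned in the original question), one possible workaround for the second issue is to have the stylesheet replace the root with a wrapper element instead of removing it, so the transform result always has exactly one document element. The sketch below is only an illustration under that assumption: the <wrapper> element name, the inline stylesheet, and the inline source strings standing in for file1.xml and file2.xml are all hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXSLT;

# Hypothetical stylesheet: swaps <root> for a <wrapper> element so the
# transform result is a well-formed document with a single root.
my $style_xml = <<'XSL';
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml"/>
  <xsl:template match="/root">
    <wrapper>
      <xsl:copy-of select="*"/>
    </wrapper>
  </xsl:template>
</xsl:stylesheet>
XSL

my $parser     = XML::LibXML->new();
my $xslt       = XML::LibXSLT->new();
my $stylesheet = $xslt->parse_stylesheet($parser->parse_string($style_xml));

my $doc  = XML::LibXML::Document->new('1.0', 'UTF-8');
my $root = $doc->createElement('newroot');
$doc->setDocumentElement($root);

# Inline stand-ins for file1.xml and file2.xml (illustrative only).
my @sources = (
    '<root><element1><element1a>text1a</element1a></element1></root>',
    '<root><element2/><element3/></root>',
);

foreach my $xml (@sources) {
    my $result = $stylesheet->transform($parser->parse_string($xml));
    # Copy each element under <wrapper> into the new document.
    foreach my $elem ($result->documentElement->findnodes('*')) {
        $root->appendChild($doc->importNode($elem));
    }
}

print $doc->toString(1), "\n";
```

The second source string has two elements under its root on purpose, since that is the case where removing the root outright stops producing a well-formed result.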
HTH,
Richard