[Petal] <![CDATA[ ... ]]> and HTML Elements
Fergal Daly
fergal at esatclear.ie
Mon Nov 8 17:00:34 GMT 2004
On Mon, Nov 08, 2004 at 11:16:08AM -0500, William McKee wrote:
> On Mon, Nov 08, 2004 at 12:50:59PM +0000, Chris Croome wrote:
> > This has come up before with respect to javascript, what should
> > petal do, not touch things in CDATA sections?
>
> I think this would be a good solution.
Petal's current behaviour is correct.
Text that comes from CDATA should not be treated any differently than normal
text. This XML doc
<tag><![CDATA[a & b]]></tag>
is exactly the same as this one
<tag>a &amp; b</tag>
They have different representations but they have the same meaning. If you
put the first one into this tool
http://soapclient.com/XMLCanon.html
it will output the second one.
Any application that treats one differently from the other is broken.
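
For anyone who would rather check the canonical form locally than through
the web tool, here is a rough sketch. It assumes XML::LibXML is installed
(the rest of this mail only uses XML::Parser), and canonical_form is just a
helper name made up for the example. Canonical XML replaces CDATA sections
with their character content, so both documents should come out as the same
string.

```perl
use strict;
use warnings;
use XML::LibXML;

# canonical_form is a hypothetical helper for this example, not a library call.
# It parses a string and returns its Canonical XML (C14N) serialisation.
sub canonical_form {
    my ($xml) = @_;
    my $doc = XML::LibXML->load_xml(string => $xml);
    return $doc->toStringC14N();
}

# The same text, once inside a CDATA section and once as escaped text.
for my $xml ('<tag><![CDATA[a & b]]></tag>', '<tag>a &amp; b</tag>') {
    print canonical_form($xml), "\n";
}
```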
Here's a script that parses those two documents using XML::Parser and outputs
the parsed tree. As you can see, the parsed tree for each one is identical.
########
use strict;
use warnings;
use XML::Parser;
use Data::Dumper;
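# The Tree style returns the whole document as a nested array structure.
my $parser = XML::Parser->new(Style => 'Tree');
my $cdata = '<tag><![CDATA[a & b]]></tag>';
# Outside a CDATA section a literal & has to be written as &amp;
my $nocdata = '<tag>a &amp; b</tag>';
for my $doc ($cdata, $nocdata)
{
    print Dumper($parser->parse($doc))."\n\n";
}
########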
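
The same point can be made at the event level. This is only a sketch:
text_of is a helper name invented for the example, while the Char and
CdataStart handlers are the ones XML::Parser provides. The parser can tell
you that a CDATA section was there, but the character data it hands over is
the same either way.

```perl
use strict;
use warnings;
use XML::Parser;

# text_of is a hypothetical helper for this example.
# It returns the character data of a document and a flag saying
# whether the parser came across a CDATA section while reading it.
sub text_of {
    my ($xml) = @_;
    my ($text, $saw_cdata) = ('', 0);
    my $parser = XML::Parser->new(
        Handlers => {
            # Char is called with (expat object, string), possibly in pieces.
            Char       => sub { $text .= $_[1] },
            # CdataStart fires at the start of a CDATA section.
            CdataStart => sub { $saw_cdata = 1 },
        },
    );
    $parser->parse($xml);
    return ($text, $saw_cdata);
}

for my $xml ('<tag><![CDATA[a & b]]></tag>', '<tag>a &amp; b</tag>') {
    my ($text, $saw_cdata) = text_of($xml);
    printf "text=[%s] cdata=%s\n", $text, $saw_cdata ? 'yes' : 'no';
}
```

The text is "a & b" for both documents; only the cdata flag differs, and
that flag describes how the document was written, not what it means.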