[Petal] <![CDATA[ ... ]]> and HTML Elements

Fergal Daly fergal at esatclear.ie
Mon Nov 8 17:00:34 GMT 2004


On Mon, Nov 08, 2004 at 11:16:08AM -0500, William McKee wrote:
> On Mon, Nov 08, 2004 at 12:50:59PM +0000, Chris Croome wrote:
> > This has come up before with respect to javascript, what should
> > petal do, not touch things in CDATA sections?
> 
> I think this would be a good solution.

Petal's current behaviour is correct.

Text that comes from CDATA should not be treated any differently than normal
text. This XML doc

<tag><![CDATA[a & b]]></tag>

is exactly the same as this one

<tag>a &amp; b</tag>

They have different representations but they have the same meaning. If you
put the first one into this tool

http://soapclient.com/XMLCanon.html

it will output the second one.
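
The same check can be run locally instead of through the web tool. This is
only a sketch, and it assumes XML::LibXML is installed (a version recent
enough to have load_xml), which is not something Petal itself requires. It
canonicalises both documents with toStringC14N and compares the results;
canonical XML replaces CDATA sections with escaped character data.

```perl
use strict;
use warnings;

use XML::LibXML;

my @docs = (
  '<tag><![CDATA[a & b]]></tag>',
  '<tag>a &amp; b</tag>',
);

# Canonicalise each document. In canonical form a CDATA section is
# written out as ordinary escaped character data.
my @canon = map { XML::LibXML->load_xml(string => $_)->toStringC14N() } @docs;

print "$_\n" for @canon;
print $canon[0] eq $canon[1] ? "same canonical form\n" : "different\n";
```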

Any application that treats one differently from the other is broken.

Here's a script that parses those two documents using XML::Parser and outputs
the parsed tree. As you can see, the parsed tree for each one is identical.

########
use strict;
use warnings;

use XML::Parser;
use Data::Dumper;

my $parser = XML::Parser->new(Style => 'Tree');

my $cdata = '<tag><![CDATA[a & b]]></tag>';
my $nocdata = '<tag>a &amp; b</tag>';

for my $doc ($cdata, $nocdata)
{
  print Dumper($parser->parse($doc))."\n\n";
}
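
To check this without comparing the two dumps by eye, the trees can be
compared as strings. This is a minimal sketch along the same lines as the
script above; the helper name tree_as_string is made up for illustration, and
it assumes that Data::Dumper output with Sortkeys set is stable enough to
compare directly.

```perl
use strict;
use warnings;

use XML::Parser;
use Data::Dumper;

# Sort hash keys so that the dumped form is deterministic.
$Data::Dumper::Sortkeys = 1;

# Hypothetical helper: parse a document and return its tree as a string.
sub tree_as_string {
  my ($doc) = @_;
  my $parser = XML::Parser->new(Style => 'Tree');
  return Dumper($parser->parse($doc));
}

my $cdata   = '<tag><![CDATA[a & b]]></tag>';
my $nocdata = '<tag>a &amp; b</tag>';

# The CDATA and the escaped form should produce the same tree.
my $same = tree_as_string($cdata) eq tree_as_string($nocdata);
print $same ? "identical\n" : "different\n";
```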
  

