[Petal] Problems with HTML Parser

William McKee william@knowmad.com
Wed, 7 Aug 2002 23:10:30 -0400


Jean-Michel,

I'm seeing more problems with HTML Tidy and the HTML parser in Petal. I
tidied a file with the -clean option, which generated a batch of CSS
style rules in place of font tags. That's a nice feature. However, when I
run the resulting template through Petal, the HTML parser changes the
quotes to entity-encoded values, which are not valid CSS.

For example, if my CSS definition is as follows:

            .head {
                font-size: 1.1em;
                font-weight: bold;
                font-family: "arial";
            }


When Petal interprets it, I get:

            .head {
                font-size: 1.1em;
                font-weight: bold;
                font-family: &quot;arial&quot;;
            }
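
The difference between the two listings above can be reproduced outside
Petal. The following is a minimal Python sketch, not Petal or
HTML::TreeBuilder code: it assumes the serializer applies ordinary HTML
text escaping to the contents of the style element, and it uses Python's
standard html and html.parser modules purely as stand-ins. It shows that
generic escaping turns each double quote into &quot;, and that an HTML
parser hands back style content without decoding entities, which is why
the encoded form ends up as invalid CSS.

```python
from html import escape
from html.parser import HTMLParser

css_rule = 'font-family: "arial";'

# Generic HTML text escaping encodes double quotes as &quot;.
escaped_rule = escape(css_rule, quote=True)


class TextCollector(HTMLParser):
    """Collect the text found inside each element, keyed by tag name."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.open_tags = []
        self.text = {}

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()

    def handle_data(self, data):
        # Append the text to whatever element is currently open.
        if self.open_tags:
            tag = self.open_tags[-1]
            self.text[tag] = self.text.get(tag, "") + data


collector = TextCollector()
# Same escaped rule, once inside <style> and once inside ordinary text.
collector.feed("<style>.head { " + escaped_rule + " }</style>")
collector.feed("<p>" + escaped_rule + "</p>")
collector.close()

print(escaped_rule)
print(collector.text["style"])  # entities left as-is in style content
print(collector.text["p"])      # entities decoded in ordinary text
```

In this sketch the style text still contains &quot; while the paragraph
text comes back with real quotes, so escaping that is harmless in body
text is not harmless inside a style block parsed as HTML.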

Is this a problem with HTML::TreeBuilder or Petal? Also, my link tag to 
include an external CSS stylesheet is being transformed from:

    <link href="../qns.css" type="text/css" rel="stylesheet" />

to:

    <link href="../qns.css" type="text/css" rel="stylesheet"></link>

Somehow this upsets my browser (Opera 6.04/Win): when I remove the line
from the template, the page is displayed immediately (using mod_perl).
When the line is there, it takes 15 seconds for the page to finally be
displayed. This odd behavior seems to stem from the unexpected </link>
tag that is being added when Petal runs.

I do not experience this behavior in IE6, so perhaps it's an Opera bug,
if that markup is valid. However, I've never seen any docs that recommend
using a closing tag when including an external stylesheet. It'd be nice
if Petal didn't completely reformat my CSS rules and HTML tags. I'm
really curious to find out what is causing this behavior.
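
To an XML tool, the two forms of the link tag above are the same empty
element; which one gets written is a serializer choice. In HTML 4, by
contrast, link has no end tag at all, so a browser parsing the page as
HTML may treat </link> as a stray tag. A small Python sketch of the XML
side, using the standard xml.etree.ElementTree module as a stand-in
(whether Petal's output path makes the same choice is an assumption, not
something confirmed here):

```python
import xml.etree.ElementTree as ET

# Parse the self-closed form as it appears in the template.
link = ET.fromstring(
    '<link href="../qns.css" type="text/css" rel="stylesheet" />'
)

# The same element written two ways; only the serializer option differs.
self_closed = ET.tostring(link, encoding="unicode")
explicit_close = ET.tostring(
    link, encoding="unicode", short_empty_elements=False
)

print(self_closed)
print(explicit_close)

# Reparsing the explicit form gives back an element with no children.
reparsed = ET.fromstring(explicit_close)
print(len(reparsed), reparsed.attrib == link.attrib)
```

The second string ends in ></link>, matching what shows up in the
rendered page, while an XML parser sees no difference between the two.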

After writing this email, I realized that I had set tidy to generate
XHTML documents. I've got Petal set up to use only the HTML parser in
order to avoid extra calls to the XML parser while I was running under
mod_cgi. Of course, I don't think this should matter, since valid XHTML
should be valid HTML as well as valid XML.

Changing the parser to ANY did not help and in fact made the load time
longer under Opera. IE continued to be unaffected. Changing the
$Petal::PARSER value to XML caused the page to fail to load due to an XML
error (from which I deduced that under ANY, Petal was still using the
HTML parser). I don't know nearly enough about XML to start debugging the
cause, so I'm hoping to figure out the problem with the HTML parser.
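
One way to narrow down an XML error like the one mentioned above is to
feed the template to a strict XML parser and look at the reported
position. A minimal sketch in Python with the standard library's
expat-based parser; the helper name first_xml_error and the sample
strings are hypothetical, and Petal's own XML parser may word its errors
differently:

```python
import xml.etree.ElementTree as ET


def first_xml_error(text):
    """Return (line, column, message) for the first XML error, or None."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        line, column = err.position
        return line, column, str(err)
    return None


# Hypothetical samples: one well-formed, one with an unclosed <p>.
well_formed = "<html>\n<p>closed</p>\n</html>"
broken = "<html>\n<p>unclosed\n</html>"

print(first_xml_error(well_formed))
print(first_xml_error(broken))
```

The line number in the result points at the place in the template where
the parser gave up, which is usually close to the offending markup.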

Thanks,
William

-- 
 Lead Developer
 Knowmad Services Inc. || Internet Applications & Database Integration
 http://www.knowmad.com