[MKDoc-modules] Re: Encoding issues

Paul Arzul patricka at mkdoc.com
Fri Oct 10 17:59:46 BST 2003


On Mon 06-Oct-2003 at 01:39:54PM -0400, William McKee wrote:
> 
> Like I said, tidy is to blame for this format. Interestingly, if I
> remove those CDATA sections, I don't get encoded double-quotes. I guess
> that means that MKDoc::XML is encoding what it thinks to be a string.
> I'm not sure about how CDATA sections are used but wonder if this is the
> proper behavior.

yes, tidy is guilty, and the whole thing is a mess. i don't know how the
test cases i've crafted should behave...

read these[1,2] for more gory details. the fix that tidy doesn't do (and i
think should) is[2]:

---8<---
Script tags:
Due to the way comments are dealt with in XHTML, this can cause problems for
script tags which use <!-- *stuff* --> to hide scripts. As outlined in the
XHTML spec:
<http://www.w3.org/TR/xhtml1/#diffs>

A script tag might do better in the form:
<script>
<![CDATA[
... unescaped script content (except > becomes &gt; ) ...
]]>
</script>

Hence when tidying HTML to XHTML, tidy should probably 
replace:
<script><!--
... unescaped script ...
-->
</script>

with:
<script><![CDATA[
... unescaped script content (except > becomes &gt; ) ...
]]>
</script>
--->8---
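
a rough python sketch of that replacement, in case it helps to see it
spelled out. this is not tidy's code and not MKDoc::XML's; the name
comment_to_cdata and the regex are made up for illustration, and a regex
is only good enough for simple, well-behaved script blocks. one deliberate
difference from the quoted text: the sketch leaves ">" alone, because
CDATA content is taken literally by an XML parser, and only guards against
a stray "]]>" in the script body.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical helper for illustration; not part of tidy or MKDoc::XML.
# Matches a script element whose whole body is wrapped in <!-- ... -->.
SCRIPT_COMMENT = re.compile(
    r"(<script\b[^>]*>)\s*<!--(.*?)-->\s*(</script>)",
    re.DOTALL | re.IGNORECASE,
)

def comment_to_cdata(html: str) -> str:
    """Rewrite comment-hidden script bodies as CDATA sections."""
    def swap(match):
        open_tag, body, close_tag = match.groups()
        # "]]>" would end the CDATA section early, so split it in two.
        body = body.replace("]]>", "]]]]><![CDATA[>")
        return f"{open_tag}<![CDATA[{body}]]>{close_tag}"
    return SCRIPT_COMMENT.sub(swap, html)

before = '<script><!--\nif (a < b) alert("hi");\n-->\n</script>'
after = comment_to_cdata(before)

# An XML parser treats the comment as a real comment, so with
# ElementTree's default settings the script body is dropped...
hidden = ET.fromstring(before).text
# ...while CDATA content comes back as ordinary character data.
kept = ET.fromstring(after).text
```

parsing both forms with an xml parser shows the problem from the xhtml
spec: in the comment form the script body never reaches the application,
in the CDATA form it arrives as plain text, quotes and "<" included.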

- p

1.
http://lists.w3.org/Archives/Public/html-tidy/2000JulSep/0323.html

2.
http://sourceforge.net/tracker/?group_id=27659&atid=390963&func=detail&aid=427826


