[Petal] More on entities and Â

Grant McLean grant at mclean.net.nz
Mon May 3 21:35:13 BST 2004


William McKee wrote:

 > On Tue, May 04, 2004 at 07:09:05AM +1200, Grant McLean wrote:
 >
 >>If your output encoding is UTF8 then every character beyond
 >>0x7F will be two or more bytes.  The non-breaking space
 >>character should be A2 A0 (I think).  So as long as you give
 >>the browser the correct charset setting in your headers, it
 >>should do exactly the right thing.
 >
 >
 > Hi Grant,
 >
 > Thanks for the quick response. How do I know what my output
 > encoding is?

It will be UTF-8 unless you do something to change it.
For example, to write ISO-8859-1 instead (assuming Perl 5.8):

   my $html = $template->process(%args);
   open(my $fh, '>:encoding(iso-8859-1)', $path) or die "open($path): $!";
   print $fh $html;
   close($fh) or die "close($path): $!";
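
To see what an encoding layer actually does to the output, here is a minimal sketch that writes the same character through two different layers into an in-memory file and dumps the resulting bytes. The helper name bytes_written is made up for illustration; it is not part of Petal or of any module.

```perl
use strict;
use warnings;

# Hypothetical helper: write $text through the given encoding layer into
# an in-memory file and return the resulting bytes as upper-case hex.
sub bytes_written {
    my ($encoding, $text) = @_;
    my $buffer = '';
    open(my $fh, ">:encoding($encoding)", \$buffer)
        or die "open(in-memory, $encoding): $!";
    print $fh $text;
    close($fh) or die "close: $!";
    return uc unpack('H*', $buffer);
}

my $nbsp = "\x{A0}";    # U+00A0, the non-breaking space
printf "%-10s %s\n", $_, bytes_written($_, $nbsp) for qw(UTF-8 iso-8859-1);
```

The UTF-8 layer should produce two bytes for the non-breaking space and the ISO-8859-1 layer a single byte, which is why the charset you declare has to match the layer you wrote the file with.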

 > I can set the encoding of the file and the meta tag.

Yes, this tells the browser how it should interpret the document:

   <meta http-equiv="Content-type" content="text/html; charset=utf-8">

Obviously this needs to match the encoding used to create the file.

 > Should I be modifying the configuration of my Apache server?

The 'meta http-equiv' tag above is equivalent to sending this header:

   Content-type: text/html; charset=utf-8

I've heard that not all browsers honour the charset parameter on the
Content-type header, so it might not be worth the effort.  The meta
tag has the advantage of staying with the document if the user does a
'Save As', whereas the HTTP header would be lost.  If you send both,
make sure they agree: a charset in the real HTTP header takes
precedence over the meta tag.
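
If you do want to send the header yourself, for instance from a CGI script, a rough sketch might look like the following. The helper build_response is hypothetical and assumes the page is held as a Perl character string; the point is only that the charset named in the header is the same one used to encode the body.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: return a complete CGI-style response as bytes.
# The charset named in the header is also the one used to encode the body.
sub build_response {
    my ($html, $charset) = @_;
    my $header = "Content-type: text/html; charset=$charset\r\n\r\n";
    return $header . encode($charset, $html);
}

my $page = build_response("<p>a\x{A0}b</p>", 'utf-8');
binmode(STDOUT);    # raw bytes: the body is already encoded
print $page;
```

Deriving both the header and the encoding from one variable makes it harder for the two to drift apart.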

 > The output I'm getting right now is C2A0. According to this table[1],
 > nbsp is 00A0 and A2A0 is not defined.

Oops, my bad, I meant C2A0 but inexplicably typed A2A0.
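
You can check the byte values with the Encode module. The sketch below also decodes the same two bytes as ISO-8859-1, which is my guess at what a browser does when it is not told the page is UTF-8: the C2 byte turns into a separate character, U+00C2, which is the Â.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $nbsp  = "\x{A0}";                  # U+00A0 NO-BREAK SPACE
my $bytes = encode('UTF-8', $nbsp);    # what a UTF-8 output layer writes

printf "UTF-8 bytes: %s\n", uc unpack('H*', $bytes);

# Decode the same bytes with the wrong charset, as a browser would if it
# assumed ISO-8859-1: each byte becomes a separate character.
my $misread = decode('iso-8859-1', $bytes);
printf "misread as:  %s\n",
    join ' ', map { sprintf 'U+%04X', ord } split //, $misread;
```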

Cheers
Grant



