Thread: RE: Mixing character sets




RE: Mixing character sets
user name
2008-03-06 22:43:08
> I think the issue is with characters, such as smart quotes, derived
> from Microsoft Word

Yes, actually, I've run into this before with another application, come
to think of it. The Win-1252 code page is a real pain. I ended up
downgrading all characters within a certain byte range of 1252 to
alternative low-byte ISO-8859-1 characters instead of allowing a
high-byte character ever to be persisted to the database. If you're
using a WYSIWYG editor, this might be something it can be configured to
clean up / strip / convert for you -- otherwise, you can always paste
into Notepad, then copy and paste into Bricolage as a workaround.
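
A minimal sketch of the downgrade approach described above, written in
Python rather than Bricolage's Perl. It assumes the "certain byte range"
is 0x80-0x9F, which is where Windows-1252 differs from ISO-8859-1; the
mapping table and the function name downgrade_cp1252 are hypothetical,
not the poster's actual code.

```python
# Hypothetical table: Windows-1252 bytes in 0x80-0x9F mapped to plain
# ASCII stand-ins. Extend or adjust it to suit your own content.
CP1252_DOWNGRADE = {
    0x80: b"EUR",   # euro sign
    0x85: b"...",   # horizontal ellipsis
    0x91: b"'",     # left single quotation mark
    0x92: b"'",     # right single quotation mark
    0x93: b'"',     # left double quotation mark
    0x94: b'"',     # right double quotation mark
    0x95: b"*",     # bullet
    0x96: b"-",     # en dash
    0x97: b"--",    # em dash
    0x99: b"(TM)",  # trade mark sign
}


def downgrade_cp1252(raw: bytes, fallback: bytes = b"?") -> bytes:
    """Replace every byte in 0x80-0x9F with an ASCII stand-in.

    Bytes outside that range (including ISO-8859-1 letters such as
    0xE9) pass through untouched, so the result is safe to store and
    serve as ISO-8859-1.
    """
    out = bytearray()
    for byte in raw:
        if 0x80 <= byte <= 0x9F:
            out += CP1252_DOWNGRADE.get(byte, fallback)
        else:
            out.append(byte)
    return bytes(out)
```

Run on text pasted from Word, downgrade_cp1252(b"\x93Hi\x94") gives
b'"Hi"'. Anything in the range that the table does not list becomes the
fallback "?" rather than reaching the database.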


RE: Mixing character sets
user name
2008-03-07 12:36:49
> Yes, actually, I've run into this before with another application come
> to think of it.  Win-1252 code page is a real pain. I ended up

And this guy has experienced this pain as well:

http://linuxplanet.com/linuxplanet/opinions/3749/1/

He quotes "Perl super-hacker" Tom Christiansen referring to these
characters as "intentional errors designed to destroy the web by
subverting open standards and thus secure Microsoft's hegemony." Heh.

> to think of it.  Win-1252 code page is a real pain. I ended up
> downgrading all characters within a certain byte range of 1252 to
> alternative low-byte ISO-8859-1 characters instead of allowing a high
> byte character to ever be persisted to the database.  If you're using a
> WYSIWYG editor, this might be something that it can be configured to
> cleanup / strip / convert for you -- otherwise, you can always try the
> paste into notepad, then copy and paste into Bricolage as a
> work-around.

I instruct people to save to plain text from Word and replace special
characters. But this is not guaranteed to catch everything, so I guess
it would be safest to replace these characters with the proper entity.

However, I tested $burner->set_encoding('encoding(windows-1252)'); with
the server default charset as iso-8859-1, and it works (at least for me
on Firefox 2.0 and IE7). Though I realize this might not work for all
browsers, so I'll probably go with the substitution option.

Chris
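
The substitution option mentioned above could look roughly like this,
sketched in Python rather than Bricolage's Perl. It assumes the input
arrives as Windows-1252 bytes and that numeric character references are
acceptable as "the proper entity"; the function name
cp1252_to_latin1_entities is hypothetical.

```python
def cp1252_to_latin1_entities(raw: bytes) -> bytes:
    """Convert Windows-1252 bytes to ISO-8859-1 bytes plus entities.

    Characters that ISO-8859-1 can represent are kept as single bytes.
    Everything else (smart quotes, dashes, the euro sign, ...) becomes
    a numeric character reference such as &#8220;.
    """
    # Bytes that Windows-1252 leaves undefined decode to U+FFFD.
    text = raw.decode("cp1252", errors="replace")
    # The standard "xmlcharrefreplace" error handler emits &#NNNN; for
    # any character the target encoding cannot represent.
    return text.encode("iso-8859-1", errors="xmlcharrefreplace")
```

With this, cp1252_to_latin1_entities(b"\x93Hi\x94") returns
b"&#8220;Hi&#8221;", so the page can stay in iso-8859-1 without relying
on how a given browser treats bytes 0x80-0x9F.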
