Hi Mike,

> Depending on the platform, different transformation formats are used.
> On Windows QB uses UTF-16 for the UI and certain parts of the library.
> Communication with the server and shared library code is done using
> UTF-8.

Does that mean that when typing queries into the edit pane, the
characters are stored somewhere in memory in UTF-16?

This is what I understand of the parameters pertaining to character
sets:
character_set_database: the encoding that is used to store the data in
    the database
character_set_results: the encoding used to send the results back to
    the client
character_set_connection: the encoding used to send the client's
    commands over the connection to the server
character_set_client: the encoding the client is using.

The last one is the one I'm confused about. The way I understand this
variable's role is that if a file saved in UCS-2, for example, was piped
into the mysql command line client, this variable would have to be set
to indicate that encoding to mysql. If the connection variable was set
to UTF-8, then a conversion from UCS-2 to UTF-8 would have to take place
before sending the command to mysql.
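
To make that last step concrete, here is a rough Python sketch of the
conversion I have in mind. The statement and its contents are made up
for illustration, and I'm treating UCS-2 as UTF-16 little-endian (the
two agree for characters in the Basic Multilingual Plane). It only
shows the transcoding idea, not how the mysql client actually works.

```python
# Hypothetical statement as it might sit in a file saved as UCS-2.
statement = "SELECT '中文'"
ucs2_bytes = statement.encode("utf-16-le")

# Step 1: interpret the incoming bytes using the declared client encoding.
text = ucs2_bytes.decode("utf-16-le")

# Step 2: re-encode the text into the connection encoding (UTF-8 here).
utf8_bytes = text.encode("utf-8")

# The text is unchanged; only its byte representation differs.
print(len(ucs2_bytes), len(utf8_bytes))
```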

So the way I view QB (maybe incorrectly) is that the data displayed in
the edit pane has an encoding, and this encoding is indicated by
character_set_client. The client here is Query Browser. When I press
Ctrl-Enter this data is transcoded to character_set_connection (if
required), and sent to the DB.

So if character_set_client is set to UTF-8 (which it is in QB), this
implies to me that the client, i.e. the edit pane, is sending data to
the connection in UTF-8 encoding. From what you've said it seems like
this is UTF-16, which would not agree with anything that I've thought so
far ;). Is my interpretation correct, or am I going wrong somewhere?
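
One way both could be true, and this is only my guess at what QB does,
is that the edit pane keeps its text in UTF-16 internally and converts
it to UTF-8 just before handing the statement to the connection. A
small Python sketch of that idea (the buffer and the sample character
are hypothetical):

```python
# Hypothetical in-memory buffer of the edit pane, held as UTF-16 (LE).
pane_buffer = "中".encode("utf-16-le")

# Assumed conversion step before the statement leaves the client.
wire_bytes = pane_buffer.decode("utf-16-le").encode("utf-8")

# Same character, two different byte sequences.
print(pane_buffer.hex().upper())  # 2D4E
print(wire_bytes.hex().upper())   # E4B8AD
```

If that is what happens, character_set_client = utf8 would describe the
bytes on the wire rather than the in-memory form, and HEX() on a utf8
column would show E4B8AD for this character.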

> > Having said that, why can I type in characters using MS pinyin
> > editor into query editing area at the top, and even though the
> > characters in the editing pane are stuffed, they get inserted
> > correctly (checked by using HEX())?
>
> I don't understand your question. Why can you do that? Because we
> have it implemented so. That's why you can do that. If your characters
> are too close together (which is likely with scripts that have wide
> [so-called full width] characters, like Chinese) you can set a bigger
> spacing between the letters using the application's option. However
> note that half width characters as well as other scripts like Latin,
> Russian, Greek etc. will have a bit too wide spacing then. The editors
> in QB use fixed width fonts, hence all characters have the same width,
> regardless of their actual glyph.

It was a problem with the font: I must not have been using a Unicode
font.

> > Although they get inserted correctly, when I execute select *, the
> > Chinese characters display correctly but they are rotated to the
> > left 90 degrees (I'm using Arial Unicode MS font) - does anyone else
> > have this problem?
>
> That is strange and I can only assume that you've picked the wrong
> font. Note that Chinese originally is a vertical writing system so
> there were attempts to allow vertical writing also on computers. For
> this to work special fonts are installed which usually appear with a
> leading @ sign. These fonts have rotated glyphs.

You were correct. Any font that I chose with a leading @ sign appeared
rotated in both the code and the results sections. Picking a font
without an @ sign resulted in text that was not rotated.
Taras