Thread: problem with HTMLHelp and Turkish content




problem with HTMLHelp and Turkish content
user name
2007-09-24 17:54:44
I'm having a problem building HTML Help output with Turkish content
(lang="tr").  I'm hoping someone has more experience with this issue.

Turkish requires using windows-1254 encoding instead of windows-1252.  The
xsltproc processor handles 1254, so I am able to customize the XSL to
output:

<meta http-equiv="Content-Type" content="text/html; charset=windows-1254">

into the HTML files, and compile them using HTML Help Workshop.  The text
displays correctly in the main window and in the TOC.

The one problem remaining is that the Index window contains some incorrect
characters.  The index includes indexterm elements and the document
titles.  It is clear when comparing the 1254 and 1252 encodings that the
incorrect characters are coming from the codepoints in 1252 instead of
1254.  For example, "small dotless i" (0xFD in Windows 1254) is replaced
with "small y acute" (0xFD in Windows 1252).
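
To make the code page difference concrete, here is a minimal Python sketch
(not part of the original message) that decodes the same byte, 0xFD, under
both encodings.  It uses only Python's built-in cp1254 and cp1252 codecs;
the sample word is an illustrative assumption, not taken from the actual
help content.

```python
# The same byte maps to different characters in the two Windows code pages.
raw = b"\xfd"

as_1254 = raw.decode("cp1254")  # Turkish code page
as_1252 = raw.decode("cp1252")  # Western European code page

print(f"cp1254: U+{ord(as_1254):04X}")  # U+0131, LATIN SMALL LETTER DOTLESS I
print(f"cp1252: U+{ord(as_1252):04X}")  # U+00FD, LATIN SMALL LETTER Y WITH ACUTE

# Hypothetical sample word: encode as 1254, then misread the bytes as 1252,
# which is the kind of substitution seen in the Index window.
word = "\u0131\u015f\u0131k"  # Turkish word written with dotless i and s-cedilla
misread = word.encode("cp1254").decode("cp1252")
print(misread)
```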

The help index comes from param elements like this contained in <object>
elements in the HTML output:

<param name="Keyword" value="My turkish title">

It seems when the help compiler collects this data, it loses the connection
with the windows-1254 meta information that was at the top of the HTML file
it came from.  I have not found a way to specify that the keyword index
should be handled in the 1254 encoding.
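
One way to see which index entries would be affected is to pull the Keyword
values out of the generated HTML and simulate the misreading.  The following
is only a sketch under assumptions: the class and function names are
hypothetical, the sample markup is made up, and it models the symptom
(1254 bytes read as 1252), not what the help compiler does internally.

```python
from html.parser import HTMLParser


class KeywordCollector(HTMLParser):
    """Collect the value of every <param name="Keyword" ...> element."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attr_map = dict(attrs)
        if tag == "param" and attr_map.get("name") == "Keyword":
            self.keywords.append(attr_map.get("value") or "")


def misread_as_1252(text):
    """Return text as it would look if its cp1254 bytes were read as cp1252."""
    data = text.encode("cp1254", errors="replace")
    return data.decode("cp1252", errors="replace")


def affected_keywords(html_text):
    """Return (keyword, misread) pairs for keywords that would change."""
    collector = KeywordCollector()
    collector.feed(html_text)
    pairs = []
    for keyword in collector.keywords:
        misread = misread_as_1252(keyword)
        if misread != keyword:
            pairs.append((keyword, misread))
    return pairs


# Hypothetical sample input; real output has more attributes on <object>.
sample = (
    '<object><param name="Keyword" value="K\u0131lavuz"></object>'
    '<object><param name="Keyword" value="Index"></object>'
)
for original, shown in affected_keywords(sample):
    print(original, "->", shown)
```

Keywords made up only of characters that the two code pages share (plain
ASCII, for instance) are not reported, which matches the observation that
only some index characters come out wrong.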

Has anyone else seen this problem, and found a solution?

Bob Stayton
Sagehill Enterprises
DocBook Consulting
bobssagehill.net




---------------------------------------------------------------------
To unsubscribe, e-mail: docbook-apps-unsubscribelists.oasis-open.org
For additional commands, e-mail: docbook-apps-helplists.oasis-open.org


Re: problem with HTMLHelp and Turkish content
user name
2007-09-24 20:01:33
The HTML Help compiler has some real limitations when it comes to character sets and the toc/index; it's amazing that the compiler hasn't been updated in such a long time.

I haven't yet found a good way to configure the character set for the index but you can change the font -- is it possible the default index font doesn't support Turkish? You can change the font with something like the following near the top of the hhk file (in the body section):

<OBJECT type="text/site properties">
  <param name="Font" value="Tahoma,8,0">
</OBJECT>

The compiler is apparently not Unicode enabled -- I found this link with issues other people had with Japanese (obviously different than Turkish but maybe there are some clues there...):
http://www.helpware.net/FAR/far_faq.htm

Hope that helps,
Ken




Re: problem with HTMLHelp and Turkish content
user name
2007-10-16 15:27:10
Rob Cavicchio (rcavicchiomvps.org) kindly volunteered to test my Turkish
HTML Help files on a Windows system set up as a Turkish system.  It turns
out that when the HTML Help files are compiled on this Turkish system, the
Index and Search window panes display the correct windows-1254 characters,
whereas when the same files were compiled on an English Windows system,
they displayed some incorrect characters.

Fortunately, the CHM file that was compiled on the Turkish system displays
correctly on an English system.  So if you want a completely correct
Turkish HTML Help file, you need to output with encoding windows-1254 and
compile it on a Turkish Windows system.  Then you can distribute it to any
system.
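
Before handing the files to the compiler, it may be worth confirming that
the content really fits in windows-1254.  This is a small Python sketch of
such a check; the function name is hypothetical and it is not part of the
DocBook stylesheets or HTML Help Workshop.

```python
def unencodable_chars(text, encoding="cp1254"):
    """Return the sorted list of characters in text that encoding cannot represent."""
    bad = set()
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            bad.add(ch)
    return sorted(bad)


# Turkish letters fit in cp1254, but dotless i does not exist in cp1252.
turkish = "T\u00fcrk\u00e7e \u0131\u011f\u015f \u0130\u011e\u015e"
print(unencodable_chars(turkish))            # checked against cp1254
print(unencodable_chars(turkish, "cp1252"))  # checked against cp1252
```

An empty list for cp1254 means every character can be written in that
encoding; anything listed would need an alternative before compiling.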

Thanks, Rob, for your help on this.

Bob Stayton
Sagehill Enterprises
DocBook Consulting
bobssagehill.net



