Thread: Advanced internationalization
|
|
| Advanced internationalization |
  Germany |
2007-06-27 07:33:38 |
Hey all,

Genshi now has basic support for internationalization [2], which in
combination with Babel [1] works rather nicely AFAICT.

However there's one problem that isn't addressed yet, namely that of
messages that may contain tags. This is a complicated issue,
compounded by Genshi's striving to do correct escaping of strings in
templates. That means you can't just have messages like the following:

  msgid "Here's a <a href='#foobar'>link</a>."

The <a> tag would be escaped, and I think that's the right thing to
do, because translations may very well contain things that *do* need
to be escaped, and the translators shouldn't have to worry about
escaping -- they may not even know what escaping is.
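
To make the escaping behaviour described above concrete, here is a minimal sketch. It does not use Genshi; html.escape from the Python standard library stands in for the template engine's text escaping, and the catalog entries (including the German string) are invented for illustration.

```python
from html import escape

# Made-up catalog for illustration; the translations are not from the thread.
CATALOG = {
    "Here's a <a href='#foobar'>link</a>.":
        "Hier ist ein <a href='#foobar'>Link</a>.",
    "Fish & chips": "Fish & Chips",
}


def render_message(msgid):
    """Look up a translation and escape it as plain text.

    html.escape is only a stand-in for the template engine's escaping.
    """
    translated = CATALOG.get(msgid, msgid)
    return escape(translated, quote=False)


link = render_message("Here's a <a href='#foobar'>link</a>.")
plain = render_message("Fish & chips")
print(link)   # the <a> tag comes out as visible text, not as a link
print(plain)  # the ampersand is escaped without the translator's help
```

The second message shows why escaping translated text is desirable, and the first shows why markup inside a msgid cannot survive it.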
So we need a proper solution for this issue. I've outlined a possible
approach in:

  <http://genshi.edgewall.org/ticket/129#comment:2>

To summarize, I propose adding an i18n namespace, which would be
processed exclusively by the Translator filter. That namespace
provides tags to define exactly how a message is composed from mixed
content. Please see the ticket linked above for details.

I'd love to hear your thoughts on this, and maybe alternative proposals.

[1] http://babel.edgewall.org/
[2] http://genshi.edgewall.org/wiki/Documentation/i18n.html
Thanks,
Chris
--
Christopher Lenz
cmlenz at gmx.de
http://www.cmlenz.net/
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the
Google Groups "Genshi" group.
To post to this group, send email to genshi googlegroups.com
To unsubscribe from this group, send email to
genshi-unsubscribe googlegroups.com
For more options, visit this group at
http://groups.google.com/group/genshi?hl=en
-~----------~----~----~----~------~----~------~--~---
|
|
| Re: Advanced internationalization |
  Germany |
2007-06-27 10:45:14 |
Christopher Lenz wrote:
> Hey all,
>
> Genshi now has basic support for internationalization [2], which in
> combination with Babel [1] works rather nicely AFAICT.
>
> However there's one problem that isn't addressed yet, namely that of
> messages that may contain tags. This is a complicated issue,
> compounded by Genshi's striving to do correct escaping of strings in
> templates. That means you can't just have messages like the following:
>
>   msgid "Here's a <a href='#foobar'>link</a>."
>
> The <a> tag would be escaped, and I think that's the right thing to
> do, because translations may very well contain things that *do* need
> to be escaped, and the translators shouldn't have to worry about
> escaping -- they may not even know what escaping is.
>
There's a closely related issue, which is how we will deal with similar
messages built from within the Python code using genshi.builder.

Example from Trac:

  tag.p("You can ",
        tag.a("search", href=req.href.log(path, rev=rev,
                                          mode='path_history')),
        " in the repository history to see if that path existed but"
        " was later removed")

There are actually 2 distinct problems here:
 1. how to collect the msgid from the Python source?
 2. how to compose the msgid in a non-fragmented way?

> So we need a proper solution for this issue. I've outlined a possible
> approach in:
>
>   <http://genshi.edgewall.org/ticket/129#comment:2>
>
> To summarize, I propose adding an i18n namespace, which would be
> processed exclusively by the Translator filter. That namespace
> provides tags to define exactly how a message is composed from mixed
> content. Please see the ticket linked above for details.
>
> I'd love to hear your thoughts on this, and maybe alternative proposals.
>

This approach looks very promising and could perhaps be extended to the
genshi.builder situation.

In particular, for point 2, we could imagine using a few helper
functions that would inject the appropriate attribute from the i18n
namespace into the Element argument.

The above example becomes:

  i18n_message(tag.p("You can ",
      i18n_tag('search', tag.a("search",
          href=req.href.log(path, rev=rev, mode='path_history'))),
      " in the repository history to see if that path existed but"
      " was later removed"))

i18n_message would also build the msgid by including the plain text from
static strings (dynamic strings should be wrapped in i18n_param() calls)
and return the translation.
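
One way the msgid composition for point 2 might look is sketched below. Everything here is hypothetical: the Element class is a tiny stand-in rather than the real genshi.builder element, the helpers only borrow the names i18n_tag and i18n_param from the idea above, and the "[label:text]" notation for tagged spans is an assumption made for the example.

```python
class Element:
    """Minimal stand-in for a builder element (not the genshi.builder class)."""

    def __init__(self, name, *children, **attrs):
        self.name = name
        self.children = list(children)
        self.attrs = attrs


class Param:
    """Marks a dynamic value that must not become part of the msgid."""

    def __init__(self, name, value):
        self.name = name
        self.value = value


def i18n_tag(label, element):
    # Hypothetical helper: remember the label translators will see.
    element.i18n_label = label
    return element


def i18n_param(name, value):
    # Hypothetical helper: wrap a dynamic string as a named parameter.
    return Param(name, value)


def build_msgid(element):
    """Join static text into one msgid; tags and params become placeholders."""
    parts = []
    for child in element.children:
        if isinstance(child, str):
            parts.append(child)
        elif isinstance(child, Param):
            parts.append("%%(%s)s" % child.name)
        elif isinstance(child, Element):
            label = getattr(child, "i18n_label", child.name)
            parts.append("[%s:%s]" % (label, build_msgid(child)))
    return "".join(parts)


message = Element(
    "p",
    "You can ",
    i18n_tag("search", Element("a", "search", href="/log")),
    " in the repository history to see if that path existed but",
    " was later removed",
)
msgid = build_msgid(message)
print(msgid)
```

The sketch only covers building the msgid; looking up the translation and rebuilding the element tree from it is left out.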
_But_ there's still the problematic point 1, and I'm not sure how the
current extract_python() could be extended to handle that... One idea
could be to track nested calls and have the possibility to register
callbacks for each keyword, so the callback for i18n_message could
rebuild the tag expression. Well, this looks tedious, so I hope there's
a simpler way.
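
As a rough illustration of the "track nested calls" idea for point 1, the sketch below walks the syntax tree of a source string and rebuilds a msgid from the literal strings found inside an i18n_message(...) call. It uses the standard library ast module and is not based on Babel's extract_python(); the "[label:text]" notation and the function names rebuild and extract are assumptions for the example.

```python
import ast

SOURCE = '''
i18n_message(tag.p("You can ",
    i18n_tag('search', tag.a("search", href=href)),
    " in the repository history to see if that path existed but"
    " was later removed"))
'''


def call_name(node):
    """Return the plain name of a call target, e.g. 'p' for tag.p(...)."""
    func = node.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute):
        return func.attr
    return None


def rebuild(node):
    """Rebuild msgid text from the positional arguments of an expression."""
    if isinstance(node, ast.Constant) and isinstance(node.value, str):
        return node.value
    if isinstance(node, ast.Call):
        name = call_name(node)
        if name == "i18n_tag":
            # Assumes the label is a string literal.
            return "[%s:%s]" % (node.args[0].value, rebuild(node.args[1]))
        if name == "i18n_param":
            return "%%(%s)s" % node.args[0].value
        # Any other call (tag.p, tag.a, ...): keyword arguments are ignored.
        return "".join(rebuild(arg) for arg in node.args)
    return ""  # dynamic expressions contribute nothing


def extract(source):
    """Yield (lineno, msgid) for every i18n_message(...) call in source."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and call_name(node) == "i18n_message":
            found.append((node.lineno, rebuild(node.args[0])))
    return found


messages = extract(SOURCE)
print(messages)
```

Working on the parsed tree avoids registering per-keyword callbacks, at the cost of only handling source that parses with the running Python version.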
-- Christian
|
|
| Re: Advanced internationalization |
  Germany |
2007-06-27 11:10:39 |
On 27.06.2007 at 17:45, Christian Boos wrote:
> Christopher Lenz wrote:
>> Hey all,
>>
>> Genshi now has basic support for internationalization [2], which in
>> combination with Babel [1] works rather nicely AFAICT.
>>
>> However there's one problem that isn't addressed yet, namely that of
>> messages that may contain tags. This is a complicated issue,
>> compounded by Genshi's striving to do correct escaping of strings in
>> templates. That means you can't just have messages like the
>> following:
>>
>>   msgid "Here's a <a href='#foobar'>link</a>."
>>
>> The <a> tag would be escaped, and I think that's the right thing to
>> do, because translations may very well contain things that *do* need
>> to be escaped, and the translators shouldn't have to worry about
>> escaping -- they may not even know what escaping is.
>
> There's a closely related issue, which is how we will deal with similar
> messages built from within the Python code using genshi.builder.
>
> Example from Trac:
>
>   tag.p("You can ",
>         tag.a("search", href=req.href.log(path, rev=rev,
>                                           mode='path_history')),
>         " in the repository history to see if that path existed but"
>         " was later removed")
>
> There are actually 2 distinct problems here:
>  1. how to collect the msgid from the Python source?
>  2. how to compose the msgid in a non-fragmented way?

You're absolutely right, that's a problem the proposal doesn't
address, and I also don't have a good idea so far how to solve it :-/

Well, one approach would be to move more of that kind of stuff into
actual templates, but of course that's not always appropriate. On the
other hand, Trac *does* too often put markup into exception messages,
I think.

Cheers,
Chris
--
Christopher Lenz
cmlenz at gmx.de
http://www.cmlenz.net/
|
|
| Re: Advanced internationalization |

|
2007-06-27 11:13:06 |
Hi,

I'm trying to translate the Trac i18n branch to Japanese, but I'm not
yet familiar with Genshi and Babel.

I've read the proposal and have some questions.
(These are not Japanese-specific issues.)

2007/6/27, Christopher Lenz <cmlenz gmx.de>:
> However there's one problem that isn't addressed yet, namely that of
> messages that may contain tags. This is a complicated issue,
> compounded by Genshi's striving to do correct escaping of strings in
> templates. That means you can't just have messages like the following:
>
>   msgid "Here's a <a href='#foobar'>link</a>."
>
> The <a> tag would be escaped, and I think that's the right thing to
> do, because translations may very well contain things that *do* need
> to be escaped, and the translators shouldn't have to worry about
> escaping -- they may not even know what escaping is.

1. Translating attribute values
-------------------------------

Escaping may be good for text content, but we should translate
button text as well. The proposal does not mention attribute text,
though. I think the proposal expects an explicit directive for
extraction. Is attribute text extracted automatically, without a
directive?

I think we may need one more i18n:xx attribute to specify the
attribute names to be extracted.
For example (with Japanese):

  <input type="submit" value="Reply"
         title="Reply to comment ${change.cnum}"
         i18n:attributes="title value" />

=>

  msgid="Reply"
  msgstr="返信"

  msgid="Reply to comment ${change.cnum}"
  msgstr="${change.cnum}へのコメント"

2. How to deal with parameters in attributes?
---------------------------------------------

In the example above, i18n:param cannot be used for the attribute value.
How about using the parameter name as-is in msgid/msgstr?

3. i18n:tag might be a required feature
---------------------------------------

I think i18n:tag should be REQUIRED (at least when there are multiple
tags in the msgstr) because the order of tags changes all the time.
And nested tags may be separated in the translated text, and vice versa.

How about assigning an automatic index number instead (no need to give
i18n:tag)? It would always appear in the msgid and could be used in
the msgstr.
ex:
  msgid="Please see [1:Help] for [2:details]."
  msgstr="[2:Details] finden Sie unter [1:Hilfe]."
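
A small sketch of the automatic index idea, under stated assumptions: the input is already split into plain text and (tag name, text) pairs, tags are not nested, and the function names build_msgid and used_indexes are invented for this example. It numbers tags in order of appearance and checks that a translation refers to the same set of indexes, in whatever order.

```python
import re

# Matches one non-nested marker such as "[1:Help]".
MARKER = re.compile(r"\[(\d+):([^\[\]]*)\]")


def build_msgid(parts):
    """Build a msgid from plain strings and (tag_name, text) tuples.

    Tags are numbered 1, 2, ... in order of appearance.
    """
    pieces = []
    tags = {}
    for part in parts:
        if isinstance(part, str):
            pieces.append(part)
        else:
            tag_name, text = part
            index = len(tags) + 1
            tags[index] = tag_name
            pieces.append("[%d:%s]" % (index, text))
    return "".join(pieces), tags


def used_indexes(message):
    """Return the sorted tag indexes referenced by a message."""
    return sorted(int(match.group(1)) for match in MARKER.finditer(message))


msgid, tags = build_msgid(
    ["Please see ", ("a", "Help"), " for ", ("a", "details"), "."]
)
msgstr = "[2:Details] finden Sie unter [1:Hilfe]."

print(msgid)
# The translation may reorder the markers but should use the same indexes.
print(used_indexes(msgstr) == used_indexes(msgid))
```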
--
Shun-ichi GOTO
|
|
| Re: Advanced internationalization |
  Germany |
2007-06-27 11:29:28 |
On 27.06.2007 at 18:13, Shun-ichi GOTO wrote:
> Hi,
>
> I'm trying to translate the Trac i18n branch to Japanese, but I'm not
> yet familiar with Genshi and Babel.
>
> I've read the proposal and have some questions.
> (These are not Japanese-specific issues.)
>
> 2007/6/27, Christopher Lenz <cmlenz gmx.de>:
>> However there's one problem that isn't addressed yet, namely that of
>> messages that may contain tags. This is a complicated issue,
>> compounded by Genshi's striving to do correct escaping of strings in
>> templates. That means you can't just have messages like the
>> following:
>>
>>   msgid "Here's a <a href='#foobar'>link</a>."
>>
>> The <a> tag would be escaped, and I think that's the right thing to
>> do, because translations may very well contain things that *do* need
>> to be escaped, and the translators shouldn't have to worry about
>> escaping -- they may not even know what escaping is.
>
> 1. Translating attribute values
> -------------------------------
>
> Escaping may be good for text content, but we should translate
> button text as well. The proposal does not mention attribute text,
> though. I think the proposal expects an explicit directive for
> extraction. Is attribute text extracted automatically, without a
> directive?

In general, there are a couple of attribute values that are extracted
by default, such as "title" and "alt". Actually, these should only be
extracted/translated automatically if they contain literal strings,
but I'll have to check (and probably fix) the code in that respect.

> I think we may need one more i18n:xx attribute to specify the
> attribute names to be extracted.
> For example (with Japanese):
>
>   <input type="submit" value="Reply"
>          title="Reply to comment ${change.cnum}"
>          i18n:attributes="title value" />
>
> =>

In this case what you really should do is use gettext explicitly:

  <input type="submit" value="${_('Reply')}"
         title="${_('Reply to comment %(num)s') % {'num': change.cnum}}" />

I don't see the need to add anything in the proposed i18n namespace
to handle this situation.

> 2. How to deal with parameters in attributes?
> ---------------------------------------------
>
> In the example above, i18n:param cannot be used for the attribute value.
> How about using the parameter name as-is in msgid/msgstr?

I'm not sure I understand this one. Does the above answer it maybe?

> 3. i18n:tag might be a required feature
> ---------------------------------------
>
> I think i18n:tag should be REQUIRED (at least when there are multiple
> tags in the msgstr) because the order of tags changes all the time.

You mean when the original string in the template is updated?

> And nested tags may be separated in the translated text, and vice versa.

Hm, really? Do you have an example for that? Translations changing
the order I can understand, but the nesting?

> How about assigning an automatic index number instead (no need to give
> i18n:tag)? It would always appear in the msgid and could be used in
> the msgstr.
> ex:
>   msgid="Please see [1:Help] for [2:details]."
>   msgstr="[2:Details] finden Sie unter [1:Hilfe]."

Yeah, that's actually more convenient and consistent. If we do it
this way, we actually won't need i18n:tag at all, AFAICT.

Thanks,
Chris
--
Christopher Lenz
cmlenz at gmx.de
http://www.cmlenz.net/
|
|
| Re: Advanced internationalization |

|
2007-06-27 13:18:14 |
2007/6/28, Christopher Lenz <cmlenz gmx.de>:
>
> > 1. Translating attribute values
> > -------------------------------
> >
> > Escaping may be good for text content, but we should translate
> > button text as well. The proposal does not mention attribute text,
> > though. I think the proposal expects an explicit directive for
> > extraction. Is attribute text extracted automatically, without a
> > directive?
>
> In general, there are a couple of attribute values that are extracted
> by default, such as "title" and "alt". Actually, these should only be
> extracted/translated automatically if they contain literal strings,
> but I'll have to check (and probably fix) the code in that respect.

OK. That's helpful.

> > I think we may need one more i18n:xx attribute to specify the
> > attribute names to be extracted.
> > For example (with Japanese):
> >
> >   <input type="submit" value="Reply"
> >          title="Reply to comment ${change.cnum}"
> >          i18n:attributes="title value" />
> >
> > =>
>
> In this case what you really should do is use gettext explicitly:
>
>   <input type="submit" value="${_('Reply')}"
>          title="${_('Reply to comment %(num)s') % {'num': change.cnum}}" />
>
> I don't see the need to add anything in the proposed i18n namespace
> to handle this situation.

OK, I see.

> > 2. How to deal with parameters in attributes?
> > ---------------------------------------------
> >
> > In the example above, i18n:param cannot be used for the attribute
> > value. How about using the parameter name as-is in msgid/msgstr?
>
> I'm not sure I understand this one. Does the above answer it maybe?

Yes, that's enough.

> > 3. i18n:tag might be a required feature
> > ---------------------------------------
> >
> > I think i18n:tag should be REQUIRED (at least when there are
> > multiple tags in the msgstr) because the order of tags changes all
> > the time.
>
> You mean when the original string in the template is updated?
>
> > And nested tags may be separated in the translated text, and vice
> > versa.
>
> Hm, really? Do you have an example for that? Translations changing
> the order I can understand, but the nesting?

As the simplest example, a sentence in S+V+O order in English will
generally be translated as S+O+V in Japanese.

So, as an example:
  <em>S <a href="xxx">V</a></em> O
would be translated into
  <em>S</em> O <a href="xxx">V</a>
or
  <em>S</em> O <em><a href="xxx">V</a></em>

Of course the translator can make an effort to keep the original
nesting structure, but that does not always make a good sentence in
the target language. To produce a better translation, the translator
might want to change the structure, I guess.
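
To see how indexed markers could cope with a changed nesting, here is a hedged sketch of a renderer that accepts nested markers such as "[1:S [2:V]] O". The bracket notation extends the indexed form suggested earlier in the thread; the render function, the tags mapping, and the decision to allow an index to be used more than once are all assumptions made for this example.

```python
import re
from html import escape

# Either an opening marker "[n:" or a closing bracket "]".
TOKEN = re.compile(r"\[(\d+):|\]")


def render(message, tags):
    """Render bracket markup to HTML; text is escaped, tags come from `tags`.

    `tags` maps an index to a (tag_name, attributes) pair.
    """
    out = []
    stack = []  # names of the tags that are currently open
    pos = 0
    for match in TOKEN.finditer(message):
        out.append(escape(message[pos:match.start()]))
        pos = match.end()
        if match.group(1) is not None:
            name, attrs = tags[int(match.group(1))]
            attr_text = "".join(
                ' %s="%s"' % (key, escape(value)) for key, value in attrs.items()
            )
            out.append("<%s%s>" % (name, attr_text))
            stack.append(name)
        else:
            if not stack:
                raise ValueError("unbalanced ']' in message")
            out.append("</%s>" % stack.pop())
    if stack:
        raise ValueError("unclosed marker in message")
    out.append(escape(message[pos:]))
    return "".join(out)


TAGS = {1: ("em", {}), 2: ("a", {"href": "xxx"})}

original = render("[1:S [2:V]] O", TAGS)
flattened = render("[1:S] O [2:V]", TAGS)
rewrapped = render("[1:S] O [1:[2:V]]", TAGS)
print(original)
print(flattened)
print(rewrapped)
```

The three calls reproduce the three structures from the example above, so a translator could change the nesting without writing any markup.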
--
Shun-ichi GOTO
|
|
| Re: Advanced internationalization |
  Germany |
2007-06-28 01:51:01 |
Shun-ichi GOTO wrote:
> ...
> As the simplest example, a sentence in S+V+O order in English will
> generally be translated as S+O+V in Japanese.
>
> So, as an example:
>   <em>S <a href="xxx">V</a></em> O
> would be translated into
>   <em>S</em> O <a href="xxx">V</a>
> or
>   <em>S</em> O <em><a href="xxx">V</a></em>
>
> Of course the translator can make an effort to keep the original
> nesting structure, but that does not always make a good sentence in
> the target language. To produce a better translation, the translator
> might want to change the structure, I guess.
>

What about the following?

  ''S [xxx V]'' O

translated to:

  ''S'' O [xxx V]

or

  ''S'' O ''[xxx V]''

Oh I forgot, we're not talking /only/ about Trac
-- Christian