hi samizdat-devel,

i think the bug + fix below could solve a practical problem for many
non-English-speaking indymedia collectives or other independent media
groups: "activist spam", which someone posts as identical articles, in
English, on several dozen different local indymedia sites. This sort
of article is sometimes serious and sometimes more like conspiracy
theory, but AFAIK the people doing it usually have "en-US" in their
browser's HTTP Accept-Language header. If the mono option is enabled
by the sysadmin and the user chooses this option:
https://savannah.nongnu.org/patch/?6167
and if his/her preferred language is non-English, then s/he will
not even notice the presence of the "activist spam" article.

This could mean that less intervention, or less urgent intervention,
is needed by moderators (depending on the editorial policy, of course):
the decision and filtering of what to read (ignoring non-preferred
languages rather than just not preferring them) is made by the reader,
not by an editorial collective de facto deciding on behalf of the whole
local activist community. (Of course, ignoring real spam is not a good
idea. For that we have the Antispam class in antispam.rb.)

Anyway, read on if you're interested.

cheers
boud

[bug #20932] locale2lang-0.1 - BUG + fix: fallback to language_only
extracted from Accept-Language http header is needed

URL:
<http://savannah.nongnu.org/bugs/?20932>

Summary: locale2lang-0.1 - BUG + fix: fallback to language_only
extracted from Accept-Language http header is needed
Project: Samizdat
Submitted by: boud
Submitted on: Wednesday 08/29/2007 at 22:04
Category: None
Severity: 3 - Normal
Status: Works For Me
Privacy: Public
Assigned to: None
Open/Closed: Open
Discussion Lock: Any
_______________________________________________________
Details:
PROBLEM:

Even though RFC 2616 suggests that user agents (e.g. firefox)
advise their users to add a generic fallback language
without a country code (e.g. "en" in addition to "en-US"),
in practice most users do not do this.
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.4

In particular, for non-English-language samizdat sites, this
means that people whose browser sends only "en-US"
end up getting the default local language of the site. Their
article then gets published with message.language = the local
language, not "en", since formally speaking, they state that
they prefer "en-US" to the local language, but they are not
interested in "en".

This implies that if people want to add a local translation
rather than hiding an article, then moderator intervention
is required to change the language (unless the user chose
open editing).

Moreover, the monolanguage patch
https://savannah.nongnu.org/patch/?6167 (still under
development) will fail to exclude this type of article under
the mono option, since its language is wrongly tagged (except
under a pedantic interpretation of the request).

For these reasons, i'm putting this as a bug (with a proposed
fix) rather than a patch.

PROPOSED SOLUTION:

This requires a reasonably modern version of ruby gettext, e.g.
debian 1.7.0-1 or later. Copying gettext/locale_object.rb
into an older installation and using an appropriate require
statement is a hack to avoid a full installation of a recent
gettext.

The idea is that if a requested accept-language in the list
is not found, then parse off the language part of it and try
that instead. This could potentially create multiple entries
of the same language, but i suspect that shouldn't be a problem.

--- s070818/samizdat/lib/samizdat/engine/request.rb	2007-08-14 01:16:53.000000000 +0200
+++ /usr/lib/ruby/1.8/samizdat/engine/request.rb	2007-08-29 23:02:06.869866760 +0200
@@ -165,8 +173,17 @@
       accept.scan(/([^ ,;]+)(?:;q=([^ ,;]+))?/).collect {|l, q|
         [l, (q ? q.to_f : 1.0)]
       }.sort_by {|l, q| -q }.each {|l, q|
-        accept_language.push l if config_lang.include? l
+#        accept_language.push l if config_lang.include? l
+        if config_lang.include? l
+          accept_language.push l
+        else
+          # try converting full locale (language tag) to ISO-639 language only
+          # http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.4
+          lang_only = Locale::Object.new(l).language
+          accept_language.push lang_only if config_lang.include? lang_only
+        end
       }
+
       # lang cookie overrides Accept-Language
       lang = cookie('lang') and config_lang.include? lang and
         accept_language.unshift lang
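
For anyone who wants to try the fallback outside of Samizdat, here is a rough standalone sketch of the same logic. It is not the patch itself: the method name accepted_languages is made up for illustration, the language part is extracted with a plain String#split instead of Locale::Object (so it needs no gettext), and the final uniq is an extra step that the patch does not have.

```ruby
# Parse an Accept-Language header into tags ordered by descending q-value,
# keep the tags the site supports, and fall back to the bare language part
# ("en" for "en-US") when the full tag is not supported.
# NOTE: accepted_languages is a hypothetical helper, not part of Samizdat.
def accepted_languages(accept, config_lang)
  result = []
  accept.scan(/([^ ,;]+)(?:;q=([^ ,;]+))?/).collect { |tag, q|
    [tag, (q ? q.to_f : 1.0)]            # a missing q-value counts as 1.0
  }.sort_by { |_tag, q| -q }.each do |tag, _q|
    if config_lang.include?(tag)
      result.push(tag)
    else
      # Assumption: the language part is everything before the first "-".
      # The patch uses Locale::Object#language from ruby-gettext instead.
      lang_only = tag.split('-').first
      result.push(lang_only) if config_lang.include?(lang_only)
    end
  end
  result.uniq  # extra step: collapse "en-US, en" into a single "en"
end

site_languages = ['pl', 'en']
puts accepted_languages('en-US,pl;q=0.8', site_languages).inspect
```

With a site configured for "pl" and "en", a browser sending only "en-US" plus a lower-ranked "pl" ends up with "en" first, which is the behaviour the patch is after.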

FUTURE EXTENSIONS:

The relations between human languages, and how close or distant
they are, are well studied. A measure of the distance between
different languages could potentially be used as a backup to
find the likely closest language that a user would prefer,
rather than just taking what is considered the "language"
part of the locale/Accept-Language string.

Since the "narratives" which claim different national identities
often try to claim sharp distinctions between closely related
languages, this could potentially be a quite politically
sensitive issue. This is not surprising, and is not IMHO an
argument against doing this: an RDF engine specifically aimed
at grassroots, non-authoritarian media is necessarily going
to challenge artificial linguistic barriers if it's to get
somewhere near doing its task.

In any case, users with their own notions of language preferences
would still be able to state this by all the presently available
methods; a language metric would only be used as a fallback.
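
To make the idea a bit more concrete, here is a minimal sketch of such a fallback. Everything in it is hypothetical: the helper names, the shape of the distance table, and above all the numbers, which are invented purely for illustration and make no linguistic claim. A real table would have to come from somewhere defensible.

```ruby
# Hypothetical distance table: smaller numbers mean "closer".
# The values are invented placeholders, not linguistic data.
EXAMPLE_DISTANCE = {
  ['cs', 'sk'] => 1,
  ['cs', 'pl'] => 3,
  ['cs', 'en'] => 8
}

# Look up the distance between two languages in either order.
# Returns nil when the pair is not in the table.
def language_distance(table, a, b)
  return 0 if a == b
  table[[a, b]] || table[[b, a]]
end

# Pick the configured language closest to the requested one,
# or nil if no configured language has a known distance.
def closest_language(requested, config_lang, table)
  known = config_lang.map { |lang|
    [lang, language_distance(table, requested, lang)]
  }.reject { |_lang, d| d.nil? }
  best = known.min_by { |_lang, d| d }
  best && best.first
end

puts closest_language('cs', ['pl', 'en'], EXAMPLE_DISTANCE).inspect
```

This would only be consulted after the exact tag and the language-only fallback have both failed, so explicit user preferences always win.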

COMMENT:

The Locale:: module could probably also be used to check the
config files for valid languages and warn about invalid
languages/locales.
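
As a rough idea of what such a check might look like, here is a sketch that does not use the Locale:: module at all, since i have not checked which of its calls would be suitable. It only tests the shape of each entry with a regular expression, under the assumption that a plausible entry is a 2-3 letter language code optionally followed by a 2-letter region. It cannot tell whether a code is actually registered. The names LANG_SHAPE and invalid_languages are made up.

```ruby
# Assumption: a plausible entry is a 2-3 letter lowercase language code,
# optionally followed by "-" or "_" and a 2-letter region code.
# This checks shape only; it does not know which codes really exist.
LANG_SHAPE = /\A[a-z]{2,3}(?:[-_][A-Za-z]{2})?\z/

# Return the configured entries that do not look like language codes.
def invalid_languages(config_lang)
  config_lang.reject { |lang| lang =~ LANG_SHAPE }
end

suspicious = invalid_languages(['en', 'pl', 'zh_CN', 'english', 'e'])
warn "suspicious language entries: #{suspicious.join(', ')}" unless suspicious.empty?
```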
_______________________________________________________
File Attachments:
-------------------------------------------------------
Date: Wednesday 08/29/2007 at 22:04  Name: 070829_locale2lang-0.1  Size: 997B  By: boud
<http://savannah.nongnu.org/bugs/download.php?file_id=13832>
_______________________________________________________
Reply to this item at:
<http://savannah.nongnu.org/bugs/?20932>
_______________________________________________
Message sent via/by Savannah
http://savannah.nongnu.org/
_______________________________________________
samizdat-devel mailing list
samizdat-devel nongnu.org
http://lists.nongnu.org/mailman/listinfo/samizdat-devel