|
|
| forrest output charset mismatch with httpd |

|
2006-01-31 04:55:58 |
Forrest is creating UTF-8 documents, but the apache httpd servers
don't seem configured to recognize the "meta http-equiv" charset
declarations in the html docs, and default to latin-1.
It looks like all users of forrest at apache (save forrest.apache.org)
are experiencing this problem.
Example: http://incubator.apache.org/solr/who.html which I just set up.
Notice the mangled accented "c" in Gospodnetic.
I tried to set the default charset in
http://incubator.apache.org/solr/.htaccess, but it either didn't work,
or the server would need to be rebooted.
The easiest solution would seem to be for Forrest to generate latin-1
(and produce entities for anything out of that range). Is there a way
to tell it to do this?
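To make the "latin-1 plus entities" idea concrete, here is a minimal Python sketch of the encoding step only. It is not how Forrest or Cocoon does it; the function name is hypothetical, and the accented "c" is assumed to be U+0107.

```python
def to_latin1_with_entities(text: str) -> bytes:
    """Encode text as ISO-8859-1 (latin-1).

    Characters outside the latin-1 range are replaced by numeric
    character references, so the bytes stay valid whatever charset
    the server announces in its headers.
    """
    return text.encode("iso-8859-1", errors="xmlcharrefreplace")


# Assumption: the mangled character is U+0107 (c with acute).
name = "Gospodneti\u0107"
print(to_latin1_with_entities(name))
```

Characters that do exist in latin-1 (for example U+00E9) are written as single bytes and need no entity.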
-Yonik
|
|
| forrest output charset mismatch with httpd |

|
2006-01-31 06:15:45 |
On 1/30/06, Yonik Seeley <yseeley gmail.com> wrote:
> Forrest is creating UTF-8 documents, but the apache httpd servers
> don't seem configured to recognize the "meta http-equiv" charset
> declarations in the html docs, and default to latin-1.
>
> It looks like all users of forrest at apache (save forrest.apache.org)
> are experiencing this problem.
>
> Example: http://incubator.apache.org/solr/who.html which I just set up.
> Notice the mangled accented "c" in Gospodnetic.
It renders as UTF-8 for me. I believe the charset declaration informs
the browser, not the server, what to expect.
Brian
|
|
| forrest output charset mismatch with httpd |

|
2006-01-31 08:12:08 |
On Tue, 31-01-2006 at 00:15 -0600, Brian M Dube wrote:
> On 1/30/06, Yonik Seeley <yseeley gmail.com> wrote:
> > Forrest is creating UTF-8 documents, but the apache httpd servers
> > don't seem configured to recognize the "meta http-equiv" charset
> > declarations in the html docs, and default to latin-1.
> >
> > It looks like all users of forrest at apache (save forrest.apache.org)
> > are experiencing this problem.
> >
> > Example: http://incubator.apache.org/solr/who.html which I just set up.
> > Notice the mangled accented "c" in Gospodnetic.
>
> It renders as UTF-8 for me. I believe the charset declaration informs
> the browser, not the server, what to expect.
>
Actually the httpd server "renders" the html pages in "c"; that is the
default configuration:
http://marc.theaimsgroup.com/?t=113785471000001&r=1&w=2
I actually ran into this problem on the zone server:
http://marc.theaimsgroup.com/?l=forrest-dev&m=113793333805926&w=2
Then I started to apply the UTF-8 settings on httpd and also ran into
the problem of having to restart the httpd:
http://marc.theaimsgroup.com/?t=113813801500002&r=1&w=2
It sounds like you have actually tried the .htaccess entry:
AddDefaultCharset UTF-8
I am not sure whether the httpd for the incubator is configured the same
as the one for forrest, but David pointed out that you sometimes "Need
to enable the use of .htaccess files".
Can you try to set "AddDefaultCharset UTF-8" in the .htaccess file,
upload it and see what it gives? If nothing happens, you need to ask on
infra whether .htaccess files are enabled.
HTH
--
thorsten
"Together we stand, divided we fall!"
Hey you (Pink Floyd)
|
|
| forrest output charset mismatch with httpd |

|
2006-01-31 14:09:46 |
On 1/31/06, Brian M Dube <brian.dube gmail.com> wrote:
> It renders as UTF-8 for me. I believe the charset declaration informs
> the browser, not the server, what to expect.
AFAIK, the browser uses the charset from the headers the server sends
(and that's currently latin-1 for the incubator and many other apache
sites).
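A small Python sketch of why that matters: the same UTF-8 bytes are decoded once with the right charset and once with the charset from the header. The helper name is hypothetical, and the accented "c" is assumed to be U+0107.

```python
def as_seen_by_browser(text: str, file_charset: str, header_charset: str) -> str:
    """Encode text the way the generator wrote the file, then decode it
    with the charset named in the HTTP header, as a browser would."""
    return text.encode(file_charset).decode(header_charset)


# Assumption: the mangled character is U+0107 (c with acute).
name = "Gospodneti\u0107"

# Header agrees with the file: the name survives intact.
print(as_seen_by_browser(name, "utf-8", "utf-8"))

# Header says latin-1: the two UTF-8 bytes of the accented c are
# decoded as two separate, wrong characters.
print(repr(as_seen_by_browser(name, "utf-8", "iso-8859-1")))
```

Real browsers usually treat a latin-1 label as windows-1252, so the exact wrong glyphs on screen may differ from what Python shows, but the effect is the same.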
This isn't a forrest bug... it's correctly rendering UTF-8. I am
looking to forrest for the easiest workaround though.
-Yonik
|
|
| forrest output charset mismatch with httpd |

|
2006-01-31 14:37:37 |
Thanks for the email links Thorsten.
On 1/31/06, Thorsten Scherler <thorsten apache.org> wrote:
> It sounds that you have actually tried the .htaccess entry:
> AddDefaultCharset UTF-8
Yes, it had no effect. It's not my server though, so I can't just go
and reboot it (or enable .htaccess files if they aren't enabled).
The thing is *every* other forrest use at apache (except
forrest.apache.org) that I have run across has this problem. It seems
like this should be fixed for everyone, not on a case-by-case basis.
Having forrest generate latin-1 encoded html files by default would be nice.
Is there any way to instruct forrest to do this?
-Yonik
|
|
| forrest output charset mismatch with httpd |

|
2006-01-31 15:07:56 |
On 1/31/06, Yonik Seeley <yseeley gmail.com> wrote:
> It seems like this should be fixed for everyone, not on a case-by-case basis.
I ran down the list of apache sites I could find that use forrest.
All of the following have the charset in the HTTP headers mismatching
the declared charset in the document.
http://xml.apache.org/
http://xml.apache.org/security/
http://cocoon.apache.org/
http://lenya.apache.org/
http://xmlgraphics.apache.org/
http://xmlgraphics.apache.org/fop/
http://people.apache.org/~vgritsenko/stats/
http://jakarta.apache.org/poi/
http://ws.apache.org/
http://xmlbeans.apache.org/
http://ws.apache.org/axis/
http://ws.apache.org/soap/
http://ws.apache.org/wsrf/
http://ws.apache.org/pubscribe/
http://jakarta.apache.org/tapestry/
http://gump.apache.org/
http://jakarta.apache.org/hivemind/
http://myfaces.apache.org/
http://db.apache.org/derby/
http://lucene.apache.org/
http://lucene.apache.org/nutch/
http://incubator.apache.org/solr/
http://incubator.apache.org/woden/
http://incubator.apache.org/stdcxx/
-Yonik
|
|
| forrest output charset mismatch with httpd |

|
2006-01-31 20:42:00 |
Yonik Seeley wrote:
> On 1/31/06, Yonik Seeley <yseeley gmail.com> wrote:
>
>> It seems like this should be fixed for everyone, not on a case-by-case basis.
>
> I ran down the list of apache sites I could find that use forrest.
Thanks for your feedback on this; we are examining it on the dev list.
Since you have done some very useful investigation into this, can you
please open an issue and link to this mail thread in the archives? That
way you will be notified when we work out the fix, and the issue will
stop us forgetting it.
Ross
>
> All of the following have the charset in the HTTP headers mismatching
> the declared charset in the document.
>
> http://xml.apache.org/
> http://xml.apache.org/security/
> http://cocoon.apache.org/
> http://lenya.apache.org/
> http://xmlgraphics.apache.org/
> http://xmlgraphics.apache.org/fop/
> http://people.apache.org/~vgritsenko/stats/
> http://jakarta.apache.org/poi/
> http://ws.apache.org/
> http://xmlbeans.apache.org/
> http://ws.apache.org/axis/
> http://ws.apache.org/soap/
> http://ws.apache.org/wsrf/
> http://ws.apache.org/pubscribe/
> http://jakarta.apache.org/tapestry/
> http://gump.apache.org/
> http://jakarta.apache.org/hivemind/
> http://myfaces.apache.org/
> http://db.apache.org/derby/
> http://lucene.apache.org/
> http://lucene.apache.org/nutch/
> http://incubator.apache.org/solr/
> http://incubator.apache.org/woden/
> http://incubator.apache.org/stdcxx/
>
> -Yonik
|
|
| forrest output charset mismatch with httpd |

|
2006-02-01 00:49:44 |
On Tue, 31-01-2006 at 09:37 -0500, Yonik Seeley wrote:
> Thanks for the email links Thorsten.
>
> On 1/31/06, Thorsten Scherler <thorsten apache.org> wrote:
> > It sounds that you have actually tried the .htaccess entry:
> > AddDefaultCharset UTF-8
>
> Yes, it had no effect. It's not my server though, so I can't just go
> and reboot it (or enable .htaccess files if they aren't enabled).
>
Yeah, I just tried with the lenya site and you are right, the .htaccess
has no effect.
> The thing is *every* other forrest use at apache (except
> forrest.apache.org) that I have run across has this problem. It seems
> like this should be fixed for everyone, not on a case-by-case basis.
>
Yeah, we should post your list to infra. I would strongly recommend to
set it in .htconf then because IMO UTF-8 should be used by all apache
sites.
> Having forrest generate latin-1 encoded html files by default would be nice.
> Is there any way to instruct forrest to do this?
>
Yeah, I do not recommend this because ISO-8859-1 excludes too many
extra characters, and I am not sure whether this solves your problem.
Anyway, here it goes. It is taken directly from the forrest main
sitemap.xmap and one can shorten this.
One way is to add the following to your sitemap.xmap (assuming you use
"old fashion" skins and have basic cocoon knowledge):
into <map:serializers>
  <map:serializer name="html" mime-type="text/html"
      src="org.apache.cocoon.serialization.HTMLSerializer">
    <doctype-public>-//W3C//DTD HTML 4.01 Transitional//EN</doctype-public>
    <doctype-system>http://www.w3.org/TR/html4/loose.dtd</doctype-system>
    <encoding>ISO-8859-1</encoding>
  </map:serializer>
then into <map:resources>
  <map:resource name="skinit">
    <map:transform src="{lm:}">
      <map:parameter name="notoc" value=""/>
      <!-- For backwards-compat with 0.2 - 0.4 skins -->
      <map:parameter name="isfaq" value=""/>
      <map:parameter name="nopdf" value=""/>
      <map:parameter name="path" value=""/>
      <map:parameter name="config-file" value="{project:skinconf}"/>
    </map:transform>
    <map:serialize/>
  </map:resource>
then in <map:pipelines>
  <!--pipeline that "marries" the docs in the root dir with the skin to
  produce html-->
  <map:match pattern="*.html">
    <map:aggregate element="site">
      <map:part src="cocoon:/skinconf.xml"/>
      <map:part src="cocoon:/build-info"/>
      <map:part src="cocoon:/tab-"/>
      <map:part src="cocoon:/menu-"/>
      <map:part src="cocoon:/body-"/>
    </map:aggregate>
    <map:call resource="skinit">
      <map:parameter name="type" value="transform.site.xhtml"/>
      <map:parameter name="path" value=""/>
    </map:call>
  </map:match>
  <!--pipeline that "marries" the docs in all other dirs then root with
  the skin to produce html-->
  <map:match pattern="**/*.html">
    <map:aggregate element="site">
      <map:part src="cocoon:/skinconf.xml"/>
      <map:part src="cocoon:/build-info"/>
      <map:part src="cocoon://tab-.html"/>
      <map:part src="cocoon://menu-.html"/>
      <map:part src="cocoon://body-.html"/>
    </map:aggregate>
    <map:call resource="skinit">
      <map:parameter name="type" value="transform.site.xhtml"/>
      <map:parameter name="path" value=""/>
    </map:call>
  </map:match>
> -Yonik
salu2
--
thorsten
"Together we stand, divided we fall!"
Hey you (Pink Floyd)
|
|
| forrest output charset mismatch with httpd |

|
2006-02-01 02:11:07 |
Ross Gardler wrote:
> Yonik Seeley wrote:
> >Yonik Seeley wrote:
> >
> >>It seems like this should be fixed for everyone, not on a case-by-case
> >>basis.
> >
> >I ran down the list of apache sites I could find that use forrest.
>
> Thanks for your feedback on this, we are examining this on the dev list.
> Since you have done some very useful investigation into this can you
> please open an issue and link to this mail thread in the archives. That
> way you will be notified when we work out the fix and the issue will
> stop us forgetting it.
I don't think that there is anything for us to fix.
This is an "Apache HTTP Server" configuration issue.
> >All of the following have the charset in the HTTP headers mismatching
> >the declared charset in the document.
> >
> >http://xml.apache.org/
> >http://xml.apache.org/security/
> >http://cocoon.apache.org/
> >http://lenya.apache.org/
> >http://xmlgraphics.apache.org/
> >http://xmlgraphics.apache.org/fop/
> >http://people.apache.org/~vgritsenko/stats/
> >http://jakarta.apache.org/poi/
> >http://ws.apache.org/
> >http://xmlbeans.apache.org/
> >http://ws.apache.org/axis/
> >http://ws.apache.org/soap/
> >http://ws.apache.org/wsrf/
> >http://ws.apache.org/pubscribe/
> >http://jakarta.apache.org/tapestry/
> >http://gump.apache.org/
> >http://jakarta.apache.org/hivemind/
> >http://myfaces.apache.org/
> >http://db.apache.org/derby/
> >http://lucene.apache.org/
> >http://lucene.apache.org/nutch/
> >http://incubator.apache.org/solr/
> >http://incubator.apache.org/woden/
> >http://incubator.apache.org/stdcxx/
The .htaccess at forrest.apache.org has "AddDefaultCharset UTF-8".
Using wget I see that we have a proper match between the HTTP headers
and the Content-Type declared in the head section of our html.
I tried temporarily adding it to Cocoon's .htaccess and that
fixed theirs.
Using wget against incubator.apache.org/solr/ shows the same
as Forrest. So I don't see what the problem is.
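For anyone who wants to repeat that check on the other sites, here is a rough Python sketch of the comparison itself. It assumes you already have the Content-Type header value and the page bytes (for example from wget); the function names are hypothetical, and the regular expression is a simplification, not a real HTML parser.

```python
import re
from email.message import Message
from typing import Optional, Tuple


def header_charset(content_type: str) -> Optional[str]:
    """Return the lowercased charset parameter of a Content-Type value."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def meta_charset(html: bytes) -> Optional[str]:
    """Return the lowercased charset declared in a <meta> tag, if any.

    Simplified: looks for the first 'charset=' inside a <meta ...> tag.
    """
    match = re.search(rb'<meta[^>]+charset=["\']?([\w-]+)', html, re.IGNORECASE)
    return match.group(1).decode("ascii").lower() if match else None


def find_mismatch(content_type: str, html: bytes) -> Optional[Tuple[str, str]]:
    """Return (header, meta) when both charsets are declared and differ."""
    from_header = header_charset(content_type)
    from_meta = meta_charset(html)
    if from_header and from_meta and from_header != from_meta:
        return (from_header, from_meta)
    return None


page = (b'<html><head><meta http-equiv="Content-Type" '
        b'content="text/html; charset=UTF-8"></head></html>')
print(find_mismatch("text/html; charset=ISO-8859-1", page))
print(find_mismatch("text/html; charset=UTF-8", page))
```

A header without any charset parameter is treated as "no mismatch" here, since the browser can then fall back to the meta declaration.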
-David
|
|
| forrest output charset mismatch with httpd |

|
2006-02-01 04:11:58 |
On 1/31/06, David Crossley <crossley apache.org> wrote:
> I don't think that there is anything for us to fix.
> This is an "Apache HTTP Server" configuration issue.
Agreed. This isn't a forrest bug. It might be a nice feature to be
able to output latin-1 or ascii html though.
> Using wget I see that we have a proper match between the HTTP headers
> and the Content-Type declared in the head section of our html.
>
> I tried temporarily adding it to Cocoon's .htaccess and that
> fixed theirs.
>
> Using wget against incubator.apache.org/solr/ shows the same
> as Forrest. So I don't see what the problem is.
It works for me now too.
Something must have changed... browser cache, server restart,
.htaccess only read periodically?
So Solr is OK now, but what of all the other sites?
Here's an interesting link:
http://padawan.info/web/debugging_charset_encoding_mismatch_with_apache.html
-Yonik
|
|