Nicolas Grilly wrote:
> It seems the decode tool doesn't decode the filename of
> uploaded files.
>
> Is it the intended behavior? If not, how to change it to correctly
> decode the filename, and thus convert it to unicode?
>
> I guess we should add the decoding code in the function decode_params
> in file lib/encoding.py.
I wouldn't mind decoding the filename, as long as the charset is either
1) unambiguously declared in the payload, or 2) explicitly declared by
the developer.
AFAIK, declaring charset in the payload is already defined for
multipart/form-data. http://www.ietf.org/rfc/rfc2388.txt says:
    The original local file name may be supplied as well, either as a
    "filename" parameter either of the "content-disposition: form-data"
    header or, in the case of multiple files, in a "content-disposition:
    file" header of the subpart. The sending application MAY supply a
    file name; if the file name of the sender's operating system is not
    in US-ASCII, the file name might be approximated, or encoded using
    the method of RFC 2231.
and http://www.ietf.org/rfc/rfc2231.txt says:
    Specifically, an asterisk at the end of a parameter name acts as an
    indicator that character set and language information may appear at
    the beginning of the parameter value. A single quote is used to
    separate the character set, language, and actual value information in
    the parameter value string, and an percent sign is used to flag
    octets encoded in hexadecimal. For example:

        Content-Type: application/x-stuff;
         title*=us-ascii'en-us'This%20is%20%2A%2A%2Afun%2A%2A%2A
...so implementing that should be straightforward. However, I'd be
surprised if your user-agent is doing this. Is it?
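
For what it's worth, here is a rough sketch of decoding such an extended value. It is written in Python 3 syntax, and decode_rfc2231_value is a hypothetical helper name, not an existing CherryPy function. It assumes the value has already been pulled out of a "name*=" parameter, and it ignores the continuation syntax (name*0*, name*1*) that RFC 2231 also defines. Falling back to US-ASCII when the charset field is empty is my assumption, not something the RFC mandates.

```python
from urllib.parse import unquote_to_bytes


def decode_rfc2231_value(raw):
    """Decode one RFC 2231 extended value: charset'language'percent-encoded.

    Hypothetical helper for illustration only. Returns (text, language).
    """
    # The first two single quotes delimit charset and language; any further
    # quotes belong to the value itself, hence maxsplit=2.
    charset, language, encoded = raw.split("'", 2)
    # Percent-decode to raw octets first, then decode with the declared charset.
    # Assumption: an empty charset field falls back to US-ASCII.
    octets = unquote_to_bytes(encoded)
    return octets.decode(charset or "us-ascii"), language


text, language = decode_rfc2231_value(
    "us-ascii'en-us'This%20is%20%2A%2A%2Afun%2A%2A%2A"
)
print(text)      # This is ***fun***
print(language)  # en-us
```

A real implementation would also have to decide what to do when the declared charset is unknown or the octets don't match it.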
In the absence of an explicit declaration in the payload, the only hope
is for the developer to use or override the default of US-ASCII. If you
just use the default, there's little point in adding this to CP, since
unicode(val) uses the default encoding for Python, which tends to be
ASCII anyway:
    >>> unicode('\xF3')
    Traceback (most recent call last):
      File "<interactive input>", line 1, in ?
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xf3 in position 0: ordinal not in range(128)
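
To make option 2 concrete, a minimal sketch of what an explicitly declared charset could look like. It uses Python 3 syntax (bytes.decode rather than unicode()), and both decode_filename and its charset parameter are hypothetical names for illustration, not an existing CP API.

```python
def decode_filename(raw, charset="us-ascii"):
    """Decode an uploaded file's raw name with a developer-declared charset.

    Hypothetical helper; the US-ASCII default mirrors the default
    discussed above.
    """
    return raw.decode(charset)


# The developer explicitly declares ISO-8859-1, so byte 0xF3 decodes cleanly.
name = decode_filename(b"\xf3.txt", charset="iso-8859-1")

# With the default charset, the same bytes fail just like unicode() does.
try:
    decode_filename(b"\xf3.txt")
    failed = False
except UnicodeDecodeError:
    failed = True
```

Whether the declared charset actually matches what the user-agent sent is still the developer's responsibility; the sketch only moves the choice out of Python's default.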
Robert Brewer
System Architect
Amor Ministries
fumanchu amor.org