Thread: Path object design




2006-11-04 02:09:47
At 01:56 AM 11/4/2006 +0100, Andrew Dalke wrote:
>os.join assumes the base is a directory
>name when used in a join: "inserting '/' as needed" while RFC
>1808 says
>
>            The last segment of the base URL's path (anything
>            following the rightmost slash "/", or the entire path if no
>            slash is present) is removed
>
>Is my intuition wrong in thinking those should be the same?

Yes.  

Path combining and URL absolutization(?) are inherently different
operations with only superficial similarities.  One reason for this is that
a trailing / on a URL has an actual meaning, whereas in filesystem paths a
trailing / is an aberration and likely an actual error.

The path combining operation says, "treat the following as a subpath of the
base path, unless it is absolute".  The URL normalization operation says,
"treat the following as a subpath of the location the base URL is
*contained in*".

Because of this, os.path.join assumes a path with a trailing separator is
equivalent to a path without one, since that is the only reasonable way to
interpret treating the joined path as a subpath of the base path.
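A minimal sketch of that behavior, using the standard library's posixpath
module (the POSIX flavor of os.path, chosen here only so the results are the
same on any platform).  The example paths are made up for illustration:

```python
import posixpath  # POSIX flavor of os.path; gives the same results on any OS

# A base with or without a trailing separator joins to the same result.
without_slash = posixpath.join("/foo", "bar")
with_slash = posixpath.join("/foo/", "bar")

# An absolute second argument is not treated as a subpath: it replaces the base.
absolute = posixpath.join("/foo", "/bar")

print(without_slash)  # /foo/bar
print(with_slash)     # /foo/bar
print(absolute)       # /bar
```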

But for a URL join, the path /foo and the path /foo/ are not only
*different paths* referring to distinct objects, but the operation wants to
refer to the *container* of the referenced object.  /foo might refer to a
directory, while /foo/ refers to some default content (e.g. index.html).
This is actually why Apache normally redirects you from /foo to /foo/ before
it serves up the index.html; relative URLs resolved against a base URL of
/foo would be joined onto / rather than /foo/, so they won't work right.
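To illustrate the difference, here is a small sketch using urljoin from
Python 3's urllib.parse (in the Python 2 of this thread's era it lived in the
urlparse module).  The host example.com and the file name bar.html are
placeholders, not anything from a real site:

```python
from urllib.parse import urljoin

# Base without a trailing slash: "foo" is the last segment, so it is removed
# and the relative reference lands next to it.
no_slash = urljoin("http://example.com/foo", "bar.html")

# Base with a trailing slash: the last segment is empty, so the relative
# reference lands inside /foo/.
slash = urljoin("http://example.com/foo/", "bar.html")

print(no_slash)  # http://example.com/bar.html
print(slash)     # http://example.com/foo/bar.html
```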

The URL approach is designed to make peer-to-peer linking in a given
directory convenient.  Instead of referring to './foo.html' (as one would
have to do with filenames), you can simply refer to 'foo.html'.  But the
cost of saving those characters in every link is that joining always takes
place on the parent, never the tail-end.  Thus directory URLs normally end
in a trailing /, and most tools tend to automatically redirect when
somebody leaves it off.  (Because otherwise the links would be wrong.)
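The "join on the parent" rule can be sketched by hand.  The helper
join_on_parent below is hypothetical and deliberately simplified: it only
covers plain relative names (no "..", "./", queries, or absolute references)
and is meant to mirror the quoted RFC 1808 step, not to replace urljoin.
The paths and host are invented for the example:

```python
from urllib.parse import urljoin


def join_on_parent(base_path, relative):
    """Hypothetical helper: drop the base's last segment, then append.

    Only handles simple relative names; real resolution has more cases.
    """
    # rpartition splits at the rightmost "/"; if there is no slash, both
    # parent and sep are empty, so the entire base path is removed.
    parent, sep, _last = base_path.rpartition("/")
    return parent + sep + relative


# A sibling link from a page in /docs/ stays in /docs/.
print(join_on_parent("/docs/index.html", "foo.html"))  # /docs/foo.html
# A directory URL with its trailing slash behaves the same way.
print(join_on_parent("/docs/", "foo.html"))            # /docs/foo.html
# Without the trailing slash, "docs" itself is the segment that is dropped.
print(join_on_parent("/docs", "foo.html"))             # /foo.html

# urljoin gives the matching result for the simple sibling case.
print(urljoin("http://example.com/docs/index.html", "foo.html"))
```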

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev