On Tue, 25 Mar 2008, Alexander V. Chernikov wrote:
> I have made patches solving first 4 problems. These patches are available at
> http://ipfw.ru/patches/
> unionfs2.diff fixes fs mounting onto upper layer,
> unionfs_lmount.diff fixes lower
> unionfs_threads.diff and unionfs_unix.diff fixes cases 2) and 3)
> unionfs_rename.diff fixes case with renaming
>
> Can anybody comment/review ?

Dear Alexander,

Unfortunately, I don't know too much about unionfs. However, I can comment on
the UNIX domain socket patch:

> --- sys/fs/unionfs/union_subr.c.orig 2008-03-13 23:10:32.000000000 +0300
> +++ sys/fs/unionfs/union_subr.c 2008-03-13 23:17:34.000000000 +0300
> @@ -160,6 +160,8 @@
>                 unp->un_path[cnp->cn_namelen] = '\0';
>         }
>         vp->v_type = (uppervp != NULLVP ? uppervp->v_type : lowervp->v_type);
> +       if (vp->v_type == VSOCK)
> +               vp->v_socket = (uppervp != NULLVP) ? uppervp->v_socket : lowervp->v_socket;
>         if ((lowervp != NULLVP) && (lowervp->v_type == VDIR))
>                 vp->v_mountedhere = lowervp->v_mountedhere;
>         vp->v_data = unp;

I'm a bit worried about this assignment, as it represents an untracked alias
for the socket. Let me explain why:

UNIX domain sockets may have file system bindings, allowing them to use the
file system namespace as a rendezvous for communication. Typical use is that
a socket is created, bind() is called on it with a path in some location like
/var/run/log. Other processes turn up and connect() to the path, causing a
file system lookup to reach the vnode of the socket, and then the socket code
follows vp->v_socket to find the socket to connect to. When a bound socket is
closed, we follow a back-pointer from the UNIX domain socket to the vnode, and
then clear the pointer. Doing this in a race-free manner is somewhat tricky,
and I'm not 100% convinced it's correct currently, although it appears to be
somewhat close to right.
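
To make the rendezvous concrete from the userland side, here is a minimal
sketch using only the standard socket(), bind(), listen() and connect() calls.
The helper names (make_addr, rendezvous_listen, rendezvous_connect) are
invented for this illustration and are not part of any FreeBSD API. It shows
nothing of the kernel's v_socket handling, only the behaviour visible to a
process.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: fill in a sockaddr_un for path, -1 if path is too long. */
static int
make_addr(struct sockaddr_un *sun, const char *path)
{
	if (strlen(path) >= sizeof(sun->sun_path))
		return (-1);
	memset(sun, 0, sizeof(*sun));
	sun->sun_family = AF_UNIX;
	strcpy(sun->sun_path, path);
	return (0);
}

/* Create a stream socket, bind it to path, and listen; returns fd or -1. */
int
rendezvous_listen(const char *path)
{
	struct sockaddr_un sun;
	int s;

	if (make_addr(&sun, path) == -1)
		return (-1);
	if ((s = socket(AF_UNIX, SOCK_STREAM, 0)) == -1)
		return (-1);
	if (bind(s, (struct sockaddr *)&sun, sizeof(sun)) == -1 ||
	    listen(s, 1) == -1) {
		close(s);
		return (-1);
	}
	return (s);
}

/* Attempt a connect() to path; returns 0 on success, otherwise errno. */
int
rendezvous_connect(const char *path)
{
	struct sockaddr_un sun;
	int s, error;

	if (make_addr(&sun, path) == -1)
		return (ENAMETOOLONG);
	if ((s = socket(AF_UNIX, SOCK_STREAM, 0)) == -1)
		return (errno);
	error = 0;
	if (connect(s, (struct sockaddr *)&sun, sizeof(sun)) == -1)
		error = errno;
	close(s);
	return (error);
}
```

While the descriptor returned by rendezvous_listen() is open,
rendezvous_connect() on the same path should succeed. Once it is closed, the
file remains in the namespace but no longer leads to a socket, and on the
systems I am aware of connect() then fails with ECONNREFUSED.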

The upshot of all this is that if you copy the pointer value to other vnodes,
such as vnodes on an upper layer, the UNIX domain socket code won't clear those
pointers before freeing the socket they point at. This means that the above
code snippet may lead to a v_socket pointer on a higher layer vnode pointing
at the right socket, the wrong socket, or possibly some other bit of freed and
maybe reused memory.
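
The aliasing problem can be modelled in a few lines of ordinary C. This is a
toy sketch, not kernel code: struct vnode_model, struct sock_model and the
model_* functions are made up for the illustration, and a so_freed flag stands
in for the actual free so that the example stays well defined.

```c
#include <stdbool.h>
#include <stddef.h>

struct sock_model;

/* Toy stand-ins for the kernel structures; not the real definitions. */
struct vnode_model {
	struct sock_model *v_socket;	/* forward pointer, vnode -> socket */
};

struct sock_model {
	struct vnode_model *so_vnode;	/* single back-pointer, socket -> vnode */
	bool so_freed;			/* stands in for the socket being freed */
};

/* Bind: set the forward pointer and the one tracked back-pointer. */
void
model_bind(struct sock_model *so, struct vnode_model *vp)
{
	so->so_vnode = vp;
	so->so_freed = false;
	vp->v_socket = so;
}

/* The assignment under discussion: copy v_socket into another vnode. */
void
model_alias(struct vnode_model *unionvp, const struct vnode_model *lowervp)
{
	unionvp->v_socket = lowervp->v_socket;
}

/* Close: only the vnode reachable through the back-pointer is cleared. */
void
model_close(struct sock_model *so)
{
	if (so->so_vnode != NULL) {
		so->so_vnode->v_socket = NULL;
		so->so_vnode = NULL;
	}
	so->so_freed = true;
}

/* True if vp still points at a socket that has already been torn down. */
bool
model_is_stale(const struct vnode_model *vp)
{
	return (vp->v_socket != NULL && vp->v_socket->so_freed);
}
```

After model_bind(), model_alias() and model_close() in that order, the vnode
that was bound has a NULL v_socket, while the aliased vnode still holds the
old pointer and model_is_stale() reports it. In the kernel the memory would
really be freed, which is where the "wrong socket" and "reused memory"
outcomes come from.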

You can imagine a number of schemes to replicate pointer changes around or
track the various outstanding references, but I think a more fundamental
question is whether this is in fact the right behavior at all. The premise of
unionfs is that writes flow up, but not down, and "connections" to sockets are
read-write events, not read events, most typically. If you're using unionfs
to take a template system and "broadcast it" to many jails, you probably don't
want all the jails talking to the same syslogd, you want them each talking to
their own. When syslogd in a jail finds a disconnected socket, which is
effectively what a NULL v_socket pointer means, in /var/run/log, it should be
unlinking it and creating a new socket, not reusing the existing file on disk.
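
The "unlink it and create a new socket" step looks roughly like the following
from userland. This is a sketch under assumptions: bind_fresh is a name
invented here, SOCK_DGRAM is chosen only because log sockets are commonly
datagram sockets, and a real daemon would do more error reporting.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Hypothetical helper: remove any leftover node at path, then bind a new
 * datagram socket there.  Returns the descriptor, or -1 with errno set.
 */
int
bind_fresh(const char *path)
{
	struct sockaddr_un sun;
	int s;

	if (strlen(path) >= sizeof(sun.sun_path)) {
		errno = ENAMETOOLONG;
		return (-1);
	}
	memset(&sun, 0, sizeof(sun));
	sun.sun_family = AF_UNIX;
	strcpy(sun.sun_path, path);

	/* A missing file is fine; any other unlink failure is an error. */
	if (unlink(path) == -1 && errno != ENOENT)
		return (-1);
	if ((s = socket(AF_UNIX, SOCK_DGRAM, 0)) == -1)
		return (-1);
	if (bind(s, (struct sockaddr *)&sun, sizeof(sun)) == -1) {
		close(s);
		return (-1);
	}
	return (s);
}
```

Calling bind_fresh() on a path where an earlier socket was bound and then
closed succeeds, because the stale file is removed first. Without the
unlink(), the second bind() would be expected to fail with EADDRINUSE.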

Robert N M Watson
Computer Laboratory
University of Cambridge
_______________________________________________
freebsd-fs@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-fs