Thread: Decoupling patch ids from metadata

Decoupling patch ids from metadata
user name
2006-12-21 13:48:20
On Wed, Dec 20, 2006 at 03:21:51PM +0100, Juliusz Chroboczek wrote:
> >> Why does darcs allow me to create a corrupt repo?  Is there any way
> >> to prevent this from happening in the first place?
> 
> > It's been a wishlist request for a while (I believe).  We don't want to
> > have a complete check by default, since that's an O(N) process where N is
> > the length of the repository history (unless we did something tricky), but
> > a check at the most recent patches would catch most such mistakes, and be
> > cheap.
> 
> We've discussed this at FOSDEM last, and you know I don't agree.
> 
> IMHO, the issue is with the way Darcs generates patch ids -- for some
> reason, a patch id is generated as a hash of the patch metadata.
> 
> Now if you look at the properties that a patch id must have, there are
> only two:
> 
>   (1) a patch id must be invariant w.r.t. commutation;
>   (2) a patch id must be globally unique.
> 
> The current approach passes (1), but fails (2).  A simpler approach
> would be to generate a patch id randomly; given a sufficiently large
> space (and a little care in generating random numbers), this would
> pass (2), and if the patch id is encoded in the patch itself, it would
> also pass (1).
> 
> David, I would like to argue that making patch ids arbitrary should be
> combined with the transition to hashed repositories.  We would simply
> need to:
> 
>   - add the patch id to the on-disk patch when it's not generated with
>     the current algorithm;
>   - add patch ids to inventories.
> 
> What do you think?
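
The two id schemes being compared can be sketched roughly as follows. This is an illustration in Python rather than darcs's own Haskell, and the metadata fields, the sample values, and the choice of SHA-1 are assumptions made for the example, not darcs's actual format or algorithm. It shows why an id derived from metadata is stable but not necessarily unique, while a random id is unique only if it is stored alongside the patch.

```python
import hashlib
import secrets


def metadata_id(author: str, date: str, name: str, log: str) -> str:
    """Derive an id by hashing patch metadata (hypothetical field set)."""
    blob = "\n".join([author, date, name, log]).encode("utf-8")
    return hashlib.sha1(blob).hexdigest()


def random_id(nbytes: int = 20) -> str:
    """Draw an id from the OS random source; it has to be stored with the patch."""
    return secrets.token_hex(nbytes)


# Two different patches recorded with identical metadata get the same derived
# id, so a derived id is not guaranteed to be globally unique.
a = metadata_id("alice@example.org", "20061220142151", "fix typo", "")
b = metadata_id("alice@example.org", "20061220142151", "fix typo", "")
assert a == b

# Random ids differ (with overwhelming probability) even for identical metadata.
assert random_id() != random_id()
```

A derived id can always be recomputed from the patch, which is why it survives commutation for free; a random id cannot be recomputed, so it survives only if it travels with the patch.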

I think that there's no reason to couple this change with a repository
format change.  All we need to do is add an extra line to the "long
comment" portion of the current patch ID, and tell darcs not to display it
to the user (unless perhaps they ask).  Then older darcs can interact fine
with newer darcs, and there's no format transition at all.

This is also nice, in that it gives us a chance to introduce a convention
for hiding darcs information in the long comment (e.g. lines at the
beginning starting with "darcs-internal-" or something like that).  So
perhaps we just have "darcs-internal-patchid: DDGDSGDSG" as the first (or
last?) line of the long comment.  Then when we want to add a hash of the
repo contents in tags (as in, the files and directories), we could do that
in the same way: "darcs-internal-pristine-hash: xxxxxx".  We just need to
teach darcs to hide such lines (a one-line modification to human_friendly),
and to generate them (which requires reading /dev/urandom or something).
-- 
David Roundy
http://www.darcs.net
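
A minimal sketch of the convention described above, written in Python for illustration. The function names `new_patchid_line` and `add_internal_line` are hypothetical, and `human_friendly` here is only a stand-in that borrows the name mentioned in the message; none of this is darcs's actual code. The id is placed as the first line of the long comment, which is one of the two placements suggested.

```python
import os

INTERNAL_PREFIX = "darcs-internal-"


def new_patchid_line(nbytes: int = 16) -> str:
    """Build a 'darcs-internal-patchid: <hex>' line from OS randomness."""
    return f"{INTERNAL_PREFIX}patchid: {os.urandom(nbytes).hex()}"


def add_internal_line(long_comment: str, line: str) -> str:
    """Put an internal line first in the long comment."""
    return line if not long_comment else f"{line}\n{long_comment}"


def human_friendly(long_comment: str) -> str:
    """Return the long comment with darcs-internal- lines hidden."""
    visible = [
        line
        for line in long_comment.split("\n")
        if not line.startswith(INTERNAL_PREFIX)
    ]
    return "\n".join(visible)


comment = add_internal_line("Fix the parser.\nSee the bug report.", new_patchid_line())
print(human_friendly(comment))
```

Because the extra line lives inside the long comment, a tool that does not know the convention would simply show it as part of the description, which is what makes the scheme compatible in both directions.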

_______________________________________________
darcs-users mailing list
darcs-usersdarcs.net
http://www.abridgegame.org/mailman/listinfo/darcs-users
Decoupling patch ids from metadata
user name
2006-12-21 15:43:07
> This is also nice, in that it gives us a chance to introduce a convention
> for hiding darcs information in the long comment (e.g. lines at the
> beginning starting with "darcs-internal-" or something like that).  So
> perhaps we just have "darcs-internal-patchid: DDGDSGDSG" as the first (or
> last?) line of the long comment.  Then when we want to add a hash of the
> repo contents in tags (as in, the files and directories), we could do that
> in the same way: "darcs-internal-pristine-hash: xxxxxx".  We just need to
> teach darcs to hide such lines (a one-line modification to human_friendly),

I don't know if this sounds too heavy-handed, but you could use
something like libuuid1 to generate the unique ids.  I believe that is a
bit more 'correct' than just reading /dev/urandom alone.  But it would
mean adding another dependency.

--
Zachary P. Landau <kapheinedivineinvasion.net>
GPG: gpg --recv-key 0xC9F82052 | http://divineinvasion.net/kapheine.asc
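
For comparison, a UUID-based id might look like the sketch below. It uses Python's standard `uuid` module as a stand-in for libuuid, and the helper name `uuid_patchid_line` is hypothetical. A version 4 UUID is itself built from random bits (CPython draws them from `os.urandom`), so what a UUID adds over raw random bytes is mainly a standard, self-describing format.

```python
import uuid


def uuid_patchid_line() -> str:
    """Build an internal patch-id line from a random (version 4) UUID."""
    return f"darcs-internal-patchid: {uuid.uuid4()}"


line = uuid_patchid_line()
value = line.split(": ", 1)[1]

# Parsing the value back confirms it is a well-formed version 4 UUID.
assert uuid.UUID(value).version == 4
```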
Decoupling patch ids from metadata
user name
2006-12-22 17:02:21
[David Roundy <droundydarcs.net>, Thu, 21 Dec 2006 05:48:20 -0800]:
> We just need to teach darcs to hide such lines (a one-line
> modification to human_friendly), and to generate them (which
> requires reading /dev/urandom or something).

Just a thought: In this case, shouldn't darcs also escape any
occurrences of these lines that the user may have entered as part of
the description?

If the user does not see darcs-internal-patchid in normal operation,
she may well be tempted to use this string herself.  In general,
hidden "magic" strings are a bad idea IMO, especially in a tool that
many people may want to use as a basis for implementing their own
protocols.

Albert.

Decoupling patch ids from metadata
user name
2006-12-23 13:14:26
On Fri, Dec 22, 2006 at 06:02:21PM +0100, Albert Reiner wrote:
> [David Roundy <droundydarcs.net>, Thu, 21 Dec 2006 05:48:20 -0800]:
> > We just need to teach darcs to hide such lines (a one-line
> > modification to human_friendly), and to generate them (which
> > requires reading /dev/urandom or something).
> 
> Just a thought: In this case, shouldn't darcs also escape any
> occurrences of these lines that the user may have entered as part of
> the description?
> 
> If the user does not see darcs-internal-patchid in normal operation,
> she may well be tempted to use this string herself.  In general,
> hidden "magic" strings are a bad idea IMO, especially in a tool that
> many people may want to use as a basis for implementing their own
> protocols.

Well, it won't hurt darcs at all if users *do* add such a string.  If we
really were concerned about this, we could easily add an escaping mechanism
to distinguish user-provided darcs-internal lines from darcs-added ones.
-- 
David Roundy
http://www.darcs.net
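
One possible shape for such an escaping mechanism, sketched in Python. The backslash marker and all function names are assumptions made for this example; the thread does not specify a scheme. User-entered lines that would collide with the internal prefix (or with the marker itself) get one extra marker on storage, so the transformation is reversible and a stored line that starts with the bare prefix can only have come from darcs.

```python
INTERNAL_PREFIX = "darcs-internal-"
ESCAPE = "\\"  # hypothetical escape marker, a single backslash


def escape_user_line(line: str) -> str:
    """Escape a user-entered line before storing it in the long comment."""
    if line.startswith(INTERNAL_PREFIX) or line.startswith(ESCAPE):
        return ESCAPE + line
    return line


def unescape_line(stored_line: str) -> str:
    """Undo escape_user_line when showing a stored line to the user."""
    if stored_line.startswith(ESCAPE):
        return stored_line[len(ESCAPE):]
    return stored_line


def is_darcs_added(stored_line: str) -> bool:
    """Only lines written by darcs start with the bare prefix."""
    return stored_line.startswith(INTERNAL_PREFIX)


user_line = "darcs-internal-patchid: chosen-by-user"
stored = escape_user_line(user_line)
assert not is_darcs_added(stored)
assert unescape_line(stored) == user_line
```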
Decoupling patch ids from metadata
user name
2006-12-27 22:27:51
>> We've discussed this at FOSDEM last, and you know I don't agree.

>> IMHO, the issue is with the way Darcs generates patch ids -- for some
>> reason, a patch id is generated as a hash of the patch metadata.

> I think that there's no reason to couple this change with a repository
> format change.  All we need to do is add an extra line to the "long
> comment" portion of the current patch ID, and tell darcs not to display it
> to the user (unless perhaps they ask).

I think we're thinking of different things.

You're arguing for the minimal change that will solve the patch
unicity problem -- just adding some random junk to the data being
hashed over to assure unicity, and hiding it in the UI.

I'm thinking of a more pervasive change -- decoupling completely the
patch id from the metadata.  Both the patch and the inventory would
contain an explicit patch id, which would be generated by the patch
originator.

In other words, I'm thinking of adding the patch id as an explicit
field to the PatchInfo structure.

                                        Juliusz
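
The difference between the two proposals can be made concrete with a small sketch. The Python record below is only a hypothetical analogue of darcs's PatchInfo (which is a Haskell type); the field names and sample values are assumptions. The point it illustrates is that the id is an ordinary stored field chosen once by the originator, not something recomputed from the other fields.

```python
import secrets
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class PatchInfo:
    """Hypothetical patch metadata record with an explicit id field."""

    author: str
    date: str
    name: str
    log: str = ""
    # Chosen once by the patch originator and stored with the patch and in
    # the inventory; it is never derived from the fields above.
    patch_id: str = field(default_factory=lambda: secrets.token_hex(20))


p1 = PatchInfo("alice@example.org", "20061227222751", "fix typo")
p2 = PatchInfo("alice@example.org", "20061227222751", "fix typo")

# Identical metadata no longer implies identical ids.
assert p1.patch_id != p2.patch_id

# Changing the metadata leaves the id alone, because the two are decoupled.
p3 = replace(p1, name="fix typos")
assert p3.patch_id == p1.patch_id
```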