Thread: What if darcs is generating too inclusive of hunks?

2006-11-09 16:30:36
I've been using darcs to manage my PHP web app for a while with great
success. However, I recently made a ton of changes to a file, and now
darcs is generating hunks that include too many changes (and, in some
cases, would duplicate many lines in the file). I am using v1.0.8.
E.g. I changed this code:
$smarty->assign("m_action", $_REQUEST["m_action"]);
switch ( $_REQUEST["m_action"]) {
case "summarize-calls-by-interviewer":
case "summarize-calls-by-project":
to this:
extract(R("m_action"), EXTR_OVERWRITE);
$smarty->assign("m_action", $m_action);
switch ($m_action) {
case "analyze-todays-calls":
[... a ton of new code ...]
case "summarize-calls-by-interviewer":
case "summarize-calls-by-project":
I want to generate a patch that includes only this:
- $smarty->assign("m_action", $_REQUEST["m_action"]);
- switch ( $_REQUEST["m_action"]) {
+ extract(R("m_action"), EXTR_OVERWRITE);
+ $smarty->assign("m_action", $m_action);
+ switch ($m_action) {
But darcs is generating a hunk that includes the code that I've
shortened to [... a ton of code ...] above because it finds the
"summarize-calls-by-interviewer" text below it. I know I should have
made these changes separately, but it's a little late now. Is there a
way to make darcs be less aggressive/exhaustive while it is diff'ing?
Drew Vogel
_______________________________________________
darcs-users mailing list
darcs-users darcs.net
http://www.abridgegame.org/mailman/listinfo/darcs-users

2006-11-09 19:35:27
On Thu, Nov 09, 2006 at 10:30:36 -0600, Drew Vogel wrote:
> But darcs is generating a hunk that includes the code that I've
> shortened to [... a ton of code ...] above because it finds the
> "summarize-calls-by-interviewer" text below it. I know I should have
> made these changes separately, but it's a little late now. Is there a
> way to make darcs be less aggressive/exhaustive while it is diff'ing?

Fine tuning is difficult on the algorithm level. See also the bug[s] on
the tracker requesting the addition of a hunk editor option, whereby you
can edit the diff lines that are included in the patch.

The real solution is to darcs get the repo to a /tmp dir, edit the parts
you want, and record. Then you can copy your file to a new place in your
original working dir, revert, pull from the tmp repo, move the file over
the updated one, and record the rest of the change.

It's cumbersome, but I do this often; I have the same problem as you.
--
Yuval Kogman <nothingmuch woobling.org>
http://nothingmuch.woobling.org 0xEBD27418

2006-11-09 22:04:31
On Thu, Nov 09, 2006 at 09:35:27PM +0200, Yuval Kogman wrote:
> On Thu, Nov 09, 2006 at 10:30:36 -0600, Drew Vogel wrote:
> > But darcs is generating a hunk that includes the code that I've
> > shortened to [... a ton of code ...] above because it finds the
> > "summarize-calls-by-interviewer" text below it. I know I should
> > have made these changes separately, but it's a little late now. Is
> > there a way to make darcs be less aggressive/exhaustive while it is
> > diff'ing?
[...]
> The real solution is to darcs get the repo to a /tmp dir, edit
> the parts you want, and record. then you can copy your file to a
> new place in your original working dir, revert, pull from the tmp
> repo, and then move the file over the updated one, and record the
> rest of the change.

I usually make a plain backup copy of the file, delete the unwanted
changes in the original, record the wanted changes, and restore the
deleted changes from the backup. Sometimes I just delete the unwanted
changes, record, and use undo in the text editor to get the deleted
change back (but that's a little scary).

But a way to split hunks up in the record dialogue would be awesome.
--
Tommy Pettersson <ptp lysator.liu.se>

2006-11-10 16:41:10
On 11/9/06, Yuval Kogman <nothingmuch woobling.org> wrote:
> On Thu, Nov 09, 2006 at 10:30:36 -0600, Drew Vogel wrote:
> > But darcs is generating a hunk that includes the code that I've
> > shortened to [... a ton of code ...] above because it finds the
> > "summarize-calls-by-interviewer" text below it. I know I should
> > have made these changes separately, but it's a little late now. Is
> > there a way to make darcs be less aggressive/exhaustive while it is
> > diff'ing?

I have also experienced cases where darcs seems to generate patches with
too many "false" changes.

Imagine that I was holding a repository in Darcs and Subversion at the
same time. Then, I think it is reasonable to say that, if the patches
Darcs generates are more noisy than the ones CVS and Subversion
generate, then that should be considered a bug in Darcs. I have never
had any notable problems with the way that Subversion's diff works,
which is why I think it is a good benchmark. This is analogous to the
GHC performance metric (if GHC is not faster than everything else, then
a bug should be filed).

> Fine tuning is difficult on the algorithm level. See also the bug[s]
> on the tracker requesting the addition of a hunk editor option,
> whereby you can edit the diff lines that are included in the patch.

I do not know if it is possible to improve the diff algorithm, but if it
is, it would be much better to improve the diff algorithm so that such a
tool was not needed.

> The real solution is to darcs get the repo to a /tmp dir, edit the
> parts you want, and record. then you can copy your file to a new place
> in your original working dir, revert, pull from the tmp repo, and then
> move the file over the updated one, and record the rest of the change.
>
> It's cumbersome, but I do this often, I have the same problem as you

That means at least three of us have recognized that the diff mechanism
is not working correctly. Let's collect some examples and file a bug.

- Brian

2006-11-10 17:04:35
On 10-11-06 17:41, Brian Smith wrote:
> On 11/9/06, Yuval Kogman <nothingmuch woobling.org> wrote:
[snip]
>     The real solution is to darcs get the repo to a /tmp dir, edit
>     the parts you want, and record. then you can copy your file to a
>     new place in your original working dir, revert, pull from the tmp
>     repo, and then move the file over the updated one, and record the
>     rest of the change.
>
>     It's cumbersome, but I do this often, I have the same problem as you
>
> That means at least three of us have recognized that the diff mechanism
> is not working correctly. Let's collect some examples and file a bug.

For the record: there is no such thing as a correct diff mechanism.
Diffs can usually be made in lots of different ways, even minimal diffs.
For example: if I have a function f like so
-----------------------
function f()
{
}
-----------------------
and I append a new function g
-----------------------
function g()
{
}
-----------------------
then the diff usually will be
-----------------------
}
function g()
{
-----------------------
which is not a nice diff (for someone who knows a programming language),
but it is minimal.

Moral: creating 'nice' diffs is an arcane art, and not a matter of being
correct or not (or even minimal or not).
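
To make the f/g example concrete, here is a minimal Python sketch. It is only an illustration, not anything darcs does internally; `apply_insert` and the two hand-written diffs are hypothetical names introduced here. It checks that the diff a programmer would write and the diff a tool typically reports both insert three lines and both produce the same file, so neither is more correct or more minimal than the other.

```python
def apply_insert(lines, index, new_lines):
    """Return a copy of lines with new_lines inserted before position index."""
    return lines[:index] + new_lines + lines[index:]


old = ["function f()", "{", "}"]
new = ["function f()", "{", "}", "function g()", "{", "}"]

# Diff A: what a programmer would write, appending all of g after f.
diff_a = (3, ["function g()", "{", "}"])

# Diff B: what a tool often reports, inserting three lines
# just before f's closing brace.
diff_b = (2, ["}", "function g()", "{"])

result_a = apply_insert(old, *diff_a)
result_b = apply_insert(old, *diff_b)

# Both diffs add exactly three lines and give the same new file.
print(result_a == new, result_b == new)
```

Because both candidates have the same size, a purely size-based criterion cannot choose between them; any preference has to come from an extra heuristic.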

(That said, it would be nice if the diff algorithm in darcs gave nicer
results in more cases, and/or if you could interactively influence the
algorithm.)
Greetings,
<><
Marnix

2006-11-10 17:39:08
On Fri, Nov 10, 2006 at 10:41:10AM -0600, Brian Smith wrote:
> I have also experienced cases where darcs seems to generate patches with
> too many "false" changes.
>
> Imagine that I was holding a repository in Darcs and Subversion at the same
> time. Then, I think it is reasonable to say that, if the patches Darcs
> generates are more noisy than the ones CVS and Subversion generate, then
> that should be considered a bug in Darcs. I have never had any notable
> problems with the way that Subversion's diff works, which is why I think it
> is a good benchmark. This is analogous to the GHC performance metric (If GHC
> is not faster than everything else, then a bug should be filed).

What do you mean by noisy? Or "false" changes? I have no idea what you
are talking about (which would make it hard to fix).
--
David Roundy
Dept. of Physics
Oregon State University

2006-11-10 20:58:35
David Roundy wrote:
> What do you mean by noisy? Or "false" changes? I have no idea what you are
> talking about (which would make it hard to fix).

While I cannot speak to that, the problem experienced by the original
poster is something I have also experienced very often.

Basically, when I am making changes to code (new feature), I might also
make other changes near it which are unrelated (minor bug-fix). When it
comes time to record, darcs will (correctly) generate a diff containing
both changes in a single hunk, and I have to manually edit the file to
undo one of the changes, record the other, and then redo the change and
record it as a second patch.

In many cases the changes do not depend on each other, and in these
cases it would be great if darcs allowed us to break the hunk into parts
which can be recorded separately.
Regards,
LL

2006-11-12 14:18:29
On Fri, Nov 10, 2006 at 09:39:08 -0800, David Roundy wrote:
> What do you mean by noisy? Or "false" changes? I have no idea what you are
> talking about (which would make it hard to fix).

Here is an example I get often:

hunk ./lib/MO/Compile/Attribute/Simple/Compiled.pm 75
-sub _generate_accessor_method {
+sub slot {
Shall I record this change? (7/?) [ynWsfqadjkc], or ? for help: y
hunk ./lib/MO/Compile/Attribute/Simple/Compiled.pm 78
- my $slot = ( $self->slots )[0];
+ my @slots = $self->slots;
+
+ warn "Attribute $self has more than one slot, but the generated accessor will only use the first one" if @slots > 1;
+
+ return $slots[0];
+}
+
+sub _generate_accessor_method {
+ my $self = shift;

The patch I would have expected is [hand typed]:

+sub slot {
+ my @slots = $self->slots;
+
+ warn "Attribute $self has more than one slot, but the generated accessor will only use the first one" if @slots > 1;
+
+ return $slots[0];
+}

sub _generate_accessor_method {
my $self = shift;
- my $slot = ( $self->slots )[0];
+ my $slot = $self->slot;

which represents much more closely the actual change. Since this always
seemed to me like nitpicking which is very hard to solve, I never
complained.
--
Yuval Kogman <nothingmuch woobling.org>
http://nothingmuch.woobling.org 0xEBD27418

2006-11-12 15:47:00
On Sun, Nov 12, 2006 at 04:18:29PM +0200, Yuval Kogman wrote:
> On Fri, Nov 10, 2006 at 09:39:08 -0800, David Roundy wrote:
>
> > What do you mean by noisy? Or "false" changes? I have no idea what
> > you are talking about (which would make it hard to fix).
>
> Here is an example I get often:
>
> hunk ./lib/MO/Compile/Attribute/Simple/Compiled.pm 75
> -sub _generate_accessor_method {
> +sub slot {
> Shall I record this change? (7/?) [ynWsfqadjkc], or ? for help: y
> hunk ./lib/MO/Compile/Attribute/Simple/Compiled.pm 78
> - my $slot = ( $self->slots )[0];
> + my @slots = $self->slots;
> +
> + warn "Attribute $self has more than one slot, but the generated accessor will only use the first one" if @slots > 1;
> +
> + return $slots[0];
> +}
> +
> +sub _generate_accessor_method {
> + my $self = shift;
>
> The patch I would have expected is [hand typed]:
>
> +sub slot {
> + my @slots = $self->slots;
> +
> + warn "Attribute $self has more than one slot, but the generated accessor will only use the first one" if @slots > 1;
> +
> + return $slots[0];
> +}
>
> sub _generate_accessor_method {
> my $self = shift;
> - my $slot = ( $self->slots )[0];
> + my $slot = $self->slot;

I suspect your hand-typed version is not quite correct: I think both
subs start with the line 'my $self = shift'. When Darcs encounters that
line in the new function, it matches it up with the existing identical
line. Darcs knows nothing about program structure, and so doesn't know
that it's moved the existing line into a different function. Either set
of edits looks equally good, and it has no way of knowing that you'd
prefer the latter.

Whether Darcs can be taught is another matter. It might help if the diff
algorithm at least favoured breaking hunks on blank lines (actually, I
can see an argument for /always/ starting a new hunk when hitting a
blank line). Maybe it could additionally avoid breaking the indent
structure if possible. I have a feeling stuff like that changes the
complexity from polynomial to exponential though...
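
One cheap approximation of the blank-line idea, sketched below in Python, is a post-processing step rather than a change to the diff algorithm itself: take a hunk that is a pure insertion, enumerate the placements that produce exactly the same file, and prefer one whose inserted block ends on a blank line. This is an assumption-laden illustration, not how darcs works; all function names are hypothetical, and it covers only pure insertions, not the general case discussed above.

```python
def apply_insert(lines, index, block):
    """Insert block before lines[index] and return the new list."""
    return lines[:index] + block + lines[index:]


def equivalent_insertions(old, index, block):
    """List every (index, block) insertion that yields the same file."""
    results = [(index, list(block))]
    # Slide up: if the line above equals the block's last line, rotate
    # that line to the front of the block and move up by one.
    i, b = index, list(block)
    while i > 0 and b and old[i - 1] == b[-1]:
        b = [b[-1]] + b[:-1]
        i -= 1
        results.append((i, b))
    # Slide down: the mirror image of the step above.
    i, b = index, list(block)
    while i < len(old) and b and old[i] == b[0]:
        b = b[1:] + [b[0]]
        i += 1
        results.append((i, b))
    return results


def prefer_blank_boundary(old, index, block):
    """Pick an equivalent insertion whose block ends on a blank line."""
    for i, b in equivalent_insertions(old, index, block):
        if b and b[-1].strip() == "":
            return i, b
    return index, list(block)  # nothing better found, keep the original


old = ["function f()", "{", "}", "", "function h()", "{", "}"]
# A tool-style hunk that lands inside f: inserted before f's closing brace.
tool_hunk = (2, ["}", "", "function g()", "{"])

chosen = prefer_blank_boundary(old, *tool_hunk)
print(chosen)
```

Since the sketch only chooses among placements that are already equivalent, it runs in time linear in the file length and never changes the size of the diff; it does nothing for hunks that mix removals and additions.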
-- Jamie Webb

2006-11-12 16:31:52
On Sun, Nov 12, 2006 at 04:18:29PM +0200, Yuval Kogman wrote:
> On Fri, Nov 10, 2006 at 09:39:08 -0800, David Roundy wrote:
> > What do you mean by noisy? Or "false" changes? I have no idea what
> > you are talking about (which would make it hard to fix).
>
> Here is an example I get often:
>
> hunk ./lib/MO/Compile/Attribute/Simple/Compiled.pm 75
> -sub _generate_accessor_method {
> +sub slot {
> Shall I record this change? (7/?) [ynWsfqadjkc], or ? for help: y
> hunk ./lib/MO/Compile/Attribute/Simple/Compiled.pm 78
> - my $slot = ( $self->slots )[0];
> + my @slots = $self->slots;
> +
> + warn "Attribute $self has more than one slot, but the generated accessor will only use the first one" if @slots > 1;
> +
> + return $slots[0];
> +}
> +
> +sub _generate_accessor_method {
> + my $self = shift;
>
> The patch I would have expected is [hand typed]:
>
> +sub slot {
> + my @slots = $self->slots;
> +
> + warn "Attribute $self has more than one slot, but the generated accessor will only use the first one" if @slots > 1;
> +
> + return $slots[0];
> +}
>
> sub _generate_accessor_method {
> my $self = shift;
> - my $slot = ( $self->slots )[0];
> + my $slot = $self->slot;
>
> which represents much more closely the actual change. Since this
> always seemed to me like nitpicking which is very hard to solve I
> never complained.

In your example you left out a + on the space between the two
subroutines. But it does look like in this example darcs doesn't
generate a true LCS (longest common subsequence), as even with the added
line, darcs is generating a non-minimal diff. This would be a result of
a performance improvement, which was added because a true LCS takes
O(N^2) time (which is problematic for people with multi-megabyte text
files). Perhaps we should have a heuristic that allows us to use a true
LCS on smaller files. (Note that if there was a whitespace change in the
_generate_accessor_method line, then darcs did make an optimal choice.)
The algorithm darcs now uses, by the way, is the same one used by GNU
diff. (But not original diff, which used a true LCS, but scaled worse
than GNU diff.)
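
For reference, the textbook dynamic-programming LCS that the O(N^2) figure refers to can be sketched as follows. This is a generic Python illustration, not darcs's actual code, and the sample lines are a shortened, hypothetical stand-in for the Perl example quoted above (the body of `slot` is abbreviated to a single `return 1;` line).

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two line lists.

    Classic dynamic programming: O(len(a) * len(b)) time, one row of memory.
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def minimal_edit_count(a, b):
    """Fewest removed plus added lines needed to turn a into b."""
    return len(a) + len(b) - 2 * lcs_length(a, b)


# Simplified stand-in for the example in this thread (assumed content).
old = [
    "sub _generate_accessor_method {",
    "my $self = shift;",
    "my $slot = ( $self->slots )[0];",
]
new = [
    "sub slot {",
    "my $self = shift;",
    "return 1;",
    "}",
    "",
    "sub _generate_accessor_method {",
    "my $self = shift;",
    "my $slot = $self->slot;",
]

# A true LCS keeps both the "sub _generate_accessor_method {" line and
# the "my $self = shift;" line that follows it.
print(lcs_length(old, new), minimal_edit_count(old, new))
```

In this simplified input, an alignment that matches only the first `my $self = shift;` line keeps one common line instead of two, which costs two extra edited lines. The quadratic table is what makes the exact answer expensive on multi-megabyte files.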

There's also often an ambiguity in deciding which lines were added
where, and there's no way to determine what the user means. We've tried
to use a heuristic to (among equivalently minimal descriptions of a
change) select a reasonable change, but it's tricky, as the "best"
choice is often very different, depending on
--
David Roundy
Dept. of Physics
Oregon State University