
Thread: Documentation of abs(3), div(3) etc.




Documentation of abs(3), div(3) etc.
Germany
2007-02-06 18:21:55
Hi,

the manpages for abs(3) and its variants define behavior for the "most
negative integer" whereas the standard explicitly states that the behavior
is undefined if the result cannot be represented.

This is a lie anyway because the code looks like

	return a < 0 ? -a : a;

whereas it obviously means

	return a < 0 ? -(unsigned)a : a;

This would yield the documented behavior but of course it doesn't change the
standard.

For div(3) et al. a similar issue is not documented at all.

	div(x, 0) -> undefined behavior
	div(INT_MIN, -1) -> undefined behavior

The latter is only true if the implementation uses two's complement, i.e.
-INT_MAX > INT_MIN. The standard isn't as explicit about this case but just
states that unrepresentable results cause undefined behavior. Maybe I've
overlooked some case; otherwise I'd prefer being explicit about this because
the reader might easily assume it's an esoteric and irrelevant caveat.
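
As a sketch of what a caller has to do to stay clear of both cases, consider
the following wrapper. The name checked_div and its return convention are
hypothetical and only serve as illustration; they are not part of libc or of
the attached patch.

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical wrapper, not part of libc: calls div() only when the
 * quotient is representable.  Returns 1 and stores the result in *out
 * on success, 0 if the call would have undefined behavior. */
static int checked_div(int num, int denom, div_t *out)
{
    if (denom == 0)
        return 0;   /* division by zero */
    if (num == INT_MIN && denom == -1 && -INT_MAX > INT_MIN)
        return 0;   /* quotient would be INT_MAX + 1 */
    *out = div(num, denom);
    return 1;
}
```

The test -INT_MAX > INT_MIN mirrors the two's complement condition above; on
an implementation where INT_MIN == -INT_MAX the second check never triggers.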

I have attached a patch for the manpages. I've also changed "BUGS" to
"CAVEAT" because these aren't bugs, it's broken by design.

-- 
Christian

  
Re: Documentation of abs(3), div(3) etc.
2007-02-07 12:33:39
Christian Biere wrote:
> Hi,
> 
> the manpages for abs(3) and its variants define behavior for the "most
> negative integer" whereas the standard explicitly states that the behavior
> is undefined if the result cannot be represented.
> 
> This is a lie anyway because the code looks like
> 
> 	return a < 0 ? -a : a;
> 
> whereas it obviously means
> 
> 	return a < 0 ? -(unsigned)a : a;

The C standard may say that the behavior is undefined, but that doesn't
prevent an implementation from defining it. Assuming that the developers
who wrote the man pages had checked the compiler's documentation as well
as the documentation of all the hardware platforms NetBSD supports, and
assuming further that all these platforms behave the same, there is
nothing wrong with that documentation.

Of course, there should be test suites for these functions, making sure
that the behavior indeed matches the documentation.

Roland

> I have attached a patch for the manpages. I've also changed "BUGS" to
> "CAVEAT" because these aren't bugs, it's broken by design.

You mean, two's complement is broken by design? ;)

Roland

Re: Documentation of abs(3), div(3) etc.
South Africa
2007-02-09 09:29:32
On Wed, 07 Feb 2007, Martijn van Buul wrote:
> * Christian Biere:
> > the manpages for abs(3) and its variants define behavior for the
> > "most negative integer" whereas the standard explicitly states that
> > the behavior is undefined if the result cannot be represented.

I'd like our man page to both document what we do, and document what
parts of that are non-standard.  At present, our abs(3) seems to
accurately document what we do, but not that part of it is undefined by
the standard.

> > This is a lie anyway because the code looks like
> >
> > 	return a < 0 ? -a : a;
> >
> > whereas it obviously means
> >
> > 	return a < 0 ? -(unsigned)a : a;
> 
> This is plain nonsense, on multiple grounds. First of all, you're casting
> a signed int (known to be negative) to an unsigned int, which is pretty
> broken to begin with, secondly, you're trying to negate the resulting
> unsigned number, which isn't any better.

The suggested replacement code is correct.  Unsigned arithmetic in C is
defined in terms of modular arithmetic in mathematics.

The original code would invoke undefined behaviour if it appeared in
user-written code.  (The mathematical result of -a might be outside the
range representable by a signed int, which gives undefined behaviour.)
Since the code in question is part of the implementation, I think we
don't need to worry about that.

--apb (Alan Barrett)

Re: Documentation of abs(3), div(3) etc.
South Africa
2007-02-09 15:07:32
On Fri, 09 Feb 2007, Martijn van Buul wrote:
> >> > This is a lie anyway because the code looks like
> >> >
> >> > 	return a < 0 ? -a : a;
> >> >
> >> > whereas it obviously means
> >> >
> >> > 	return a < 0 ? -(unsigned)a : a;
> >> This is plain nonsense, on multiple grounds. [...]
> > The suggested replacement code is correct. 
>
> It is not. It is nonsensical, in that it is in effect the same as the
> supposedly "broken" code.

Yes, the effect is the same as the supposedly broken code, so it's
pointless.  I thought you were saying that it would yield incorrect
results, and that's what I was disagreeing with.

--apb (Alan Barrett)

Re: Documentation of abs(3), div(3) etc.
United States
2007-02-09 16:07:20
Martijn van Buul <pinodohd.org> writes:
> My point was that the proposed change is pointless, doesn't
> change a single opcode, and obviously indicates a lack of understanding.
>
> If you really think that
>
> signed int a;
> return (signed int) ( - (unsigned)a );
>
> is in any way better than
>
> signed int a;
> return -a;
>
> then I kindly suggest you catch up with how C works.

Your comment may be technically correct, but it is also amazingly
impolitely phrased. I suggest that you apologize.

Perry

Re: Documentation of abs(3), div(3) etc.
Sweden
2007-02-10 08:47:42
On Fri, 9 Feb 2007, Martijn van Buul wrote:

> My point was that the proposed change is pointless, doesn't
> change a single opcode, and obviously indicates a lack of understanding.

It potentially does, depending on the surrounding code and the compiler
you are using.


> If you really think that
>
> signed int a;
> return (signed int) ( - (unsigned)a );
>
> is in any way better than
>
> signed int a;
> return -a;
>
> then I kindly suggest you catch up with how C works.

It is.  Signed integer arithmetic must not overflow in C:

   ISO/IEC 9899:1999 6.5p5:
   If an exceptional condition occurs during the evaluation of an
   expression (that is, if the result is not mathematically defined
   or not in the range of representable values for its type), the
   behavior is undefined.

so

   signed int a;
   return -a;

gives you undefined behavior if a = INT_MIN.  The compiler may therefore
do anything (the canonical example is erasing your hard disk, but the more
realistic one is doing optimizations based on assuming a != INT_MIN).


For

   signed int a;
   return (signed int) ( - (unsigned)a );

on the other hand, you do the operation on an unsigned type, which
is permitted by the standard.

   ISO/IEC 9899:1999 6.2.5p9:
   [...] A computation involving unsigned operands can never overflow,
   because a result that cannot be represented by the resulting unsigned
   integer type is reduced modulo the number that is one greater than the
   largest value that can be represented by the resulting type.

You also do conversions between possibly unrepresentable values.
This is completely defined in one direction

   ISO/IEC 9899:1999 6.3.1.3p2:
   Otherwise, if the new type is unsigned, the value is converted by
   repeatedly adding or subtracting one more than the maximum value
   that can be represented in the new type until the value is in the
   range of the new type.

and it is implementation-defined in the other direction

   ISO/IEC 9899:1999 6.3.1.3p3:
   Otherwise, the new type is signed and the value cannot be represented
   in it; either the result is implementation-defined or an
   implementation-defined signal is raised.

i.e. you need to check the documentation for your compiler, but I think
that all compilers on the market do what you would naively expect.
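
Code that does not want to depend on the implementation-defined direction
can test the range before converting. The following is a minimal sketch; the
helper name uint_to_int and its return convention are hypothetical.

```c
#include <limits.h>

/* Hypothetical helper: converts u to int without relying on the
 * implementation-defined out-of-range conversion of 6.3.1.3p3.
 * Returns 1 and stores the value in *out if it fits, 0 otherwise. */
static int uint_to_int(unsigned u, int *out)
{
    if (u > (unsigned)INT_MAX)
        return 0;       /* not representable as int */
    *out = (int)u;      /* in range, so the value is unchanged */
    return 1;
}
```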


And this is not abstract language layering pedantry -- it does affect
real code as compilers get better at doing value range analysis, whole
program optimizations, etc., and there have recently been looooong
discussions on the gcc lists where people have been surprised that gcc
has eliminated their "wrap-around" tests of the type "if (a+100 < a)"
(which would not have happened if it was done using unsigned arithmetic).
See e.g. the gcc bug report

   http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30475

and the different discussions at

   http://gcc.gnu.org/ml/gcc/2006-12/

(search for "wrap" and "overflow" in the subject).
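
A wrap-around test of that kind can be written so that the compiler is not
allowed to remove it, by checking against the limits before adding. This is
only a sketch; the helper name add_would_overflow is hypothetical.

```c
#include <limits.h>

/* Hypothetical helper: returns 1 if a + b would overflow int, else 0.
 * It only uses comparisons and subtractions that cannot themselves
 * overflow, so no undefined behavior is involved. */
static int add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;
    if (b < 0)
        return a < INT_MIN - b;
    return 0;
}
```

With this, "if (a+100 < a)" becomes "if (add_would_overflow(a, 100))".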

     /Krister

Re: Documentation of abs(3), div(3) etc.
Sweden
2007-02-10 09:24:03
On Fri, 9 Feb 2007, Alan Barrett wrote:

>>> the manpages for abs(3) and its variants define behavior for the
>>> "most negative integer" whereas the standard explicitly states that
>>> the behavior is undefined if the result cannot be represented.
>
> I'd like our man page to both document what we do, and document what
> parts of that are non-standard.  At present, our abs(3) seems to
> accurately document what we do, but not that part of it is undefined by
> the standard.

But the current code does in fact invoke undefined behavior when
called with the value of the most negative integer.

And gcc does recognize abs() and will generate code directly and optimize
it using the definition as written in the standard.  I.e. the libc abs()
will in general not be called, so users cannot rely on any "extensions"
anyway.


I think it is better to update the man page to say the same thing as
the standard, instead of uglifying the implementation with casts
and crippling gcc in random ways...
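
If the man page follows the standard here, callers that may see the most
negative integer have to guard the call themselves. A minimal sketch of such
a guard follows; the name checked_abs and its return convention are
hypothetical and not part of libc.

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical caller-side guard: calls abs() only when the result is
 * representable.  Returns 1 and stores the result in *out on success,
 * 0 if abs(a) would have undefined behavior. */
static int checked_abs(int a, int *out)
{
    if (a < -INT_MAX)
        return 0;   /* only INT_MIN on two's complement */
    *out = abs(a);
    return 1;
}
```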

    /Krister

Re: Documentation of abs(3), div(3) etc.
Sweden
2007-02-10 21:52:11
On Sat, 10 Feb 2007, Martijn van Buul wrote:

> Not sure if this is a real improvement, and it *certainly* doesn't
> make things work all of a sudden. There is one thing the
> implementation-defined behaviour cannot do: Suddenly make -INT_MIN fit
> into a signed int.

Please re-read the whole of my previous mail closely.  The point is to
disprove your statement:

> My point was that the proposed change is pointless, doesn't
> change a single opcode, and obviously indicates a lack of understanding.

by showing that abs(INT_MIN) is undefined in our current implementation,
but that it becomes defined (i.e. as abs(INT_MIN) == INT_MIN) by the
proposed change.

(But I want to point out that it would be silly to change the implementation
to make abs(INT_MIN) defined.  It is much better to just say that the
value is undefined as Christian Biere proposed.)



> In all likelihood, the result will be the same: After
>
> signed int foo = abs(INT_MIN);
>
> foo will probably end up containing an unchanged INT_MIN - which is
> what the manpage is hinting at

Kind of.  If you directly do a "printf("%d\n", foo);", it will
probably print the value of INT_MIN.  But let us look at a slightly
more interesting example.

Consider the following code containing the two implementations of abs:

   #include <stdio.h>

   static int abs1(int a)
   {
     return a < 0 ? -a : a;
   }

   static int abs2(int a)
   {
     return a < 0 ? -(unsigned)a  : a;
   }

   void bar(int a)
   {
     int foo;

     foo = abs1(a);
     if (foo < 0)
       printf("abs1(%d) < 0\n", a);

     foo = abs2(a);
     if (foo < 0)
       printf("abs2(%d) < 0\n", a);
   }

and we call the function bar() from another file:

   #include <limits.h>

   int main(void)
   {
     bar(INT_MIN);
     return 0;
   }

Compiling this with a current gcc such as

   gcc version 4.3.0 20070106 (experimental)

gives the result

   > ./a.out
   abs2(-2147483648) < 0

i.e. gcc manages to eliminate the first if-statement because it knows that
"if (abs1(a) < 0)" is always false (or undefined.  But it does not need to
care about undefined values as no valid C program may invoke undefined
behavior).  But the result of abs2(a) is defined for all possible values
of a, and thus gives the expected result.

    /Krister

Re: Documentation of abs(3), div(3) etc.
South Africa
2007-02-11 01:00:04
On Sat, 10 Feb 2007, Martijn van Buul wrote:
> * Perry E. Metzger:
> > Your comment may be technically correct, but it is also amazingly
> > impolitely phrased. I suggest that you apologize.
> 
> I must admit that, in retrospect, it was needlessly harsh and uncalled
> for, so I hereby apologise to Mr. Barrett, and to anyone else who
> felt offended.

Thank you, but I was not offended.

--apb (Alan Barrett)
