Thread: CMSG_* problems
|
|
| CMSG_* problems |
  Canada |
2007-02-11 00:57:56 |
[I sent this to tech-kern. Someone pointed out that it should go to
tech-userlevel as well.]

I started looking at some stuff relating to passing access rights
through AF_LOCAL sockets. This brought me up against the CMSG_* mess.

I've come to the conclusion that this API is rather broken - the most
charitable I can be towards it is "not very well thought out". I'm
writing here both to check my work - to ask whether there's something
lurking somewhere I missed that means the flaws I see aren't really
flaws at all - and to have a bash at coming up with improvements.

Specifically, it seems to me that the only ways to use the API without
making assumptions not promised by C involve requiring that the
msg_control buffer be suitably aligned for a struct cmsghdr, which
basically means that it must be malloc()ed, and malloc()ed specifically
for the purpose (not a non-initial part of a larger malloc()ed buffer).

This is because CMSG_DATA is the only provided way to find out where
the data for a control message lives, but CMSG_DATA works only if you
give it a struct cmsghdr, and then it works only assuming the data
follows it in memory the way it does in the control message. If you
don't assume the control buffer is aligned, and you copy a struct
cmsghdr's worth of bytes into a struct cmsghdr (which is the right way
to deal with possible misalignment), then using CMSG_DATA on that may
generate a pointer past the end of the object, something not, strictly,
permitted in C. (If the machine's choice of alignment requirements for
the CMSG_* interface, and struct padding conventions, collaborate
appropriately, the pointer may be only *just* past the end of the
object and thus legal, but this is not promised.)

Alternatively, and this is apparently what RFC 2292 and the CMSG_*
macros expect, you can cast your buffer pointer to a struct cmsghdr
pointer - at which point you must make sure it's aligned.

When I've had to write to the CMSG_* interface, I usually end up
assuming I can use pointers past the end of an object, locating the
data with something like

        bp + ((char *)CMSG_DATA(&cmh) - (char *)&cmh)

(where cmh is the struct cmsghdr I've copied the header into).
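
A minimal sketch of that workaround, for illustration only. The header
is copied into a local struct cmsghdr with memcpy, and the data offset
is taken from CMSG_DATA applied to the copy. The helper names
cmsg_data_offset and cmsg_copy_data are invented for this sketch, and
the offset computation depends on the same pointer-past-the-object
assumption described above, so it is not strictly portable C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: distance from the start of the control message at
 * bp to its data.  bp may have any alignment.  CMSG_DATA(&cmh) may point
 * past the end of cmh; that is the non-portable assumption in question.
 */
static size_t cmsg_data_offset(const unsigned char *bp)
{
    struct cmsghdr cmh;

    memcpy(&cmh, bp, sizeof cmh);   /* copy the header out; bp is never cast */
    return (size_t)((unsigned char *)CMSG_DATA(&cmh) - (unsigned char *)&cmh);
}

/* Hypothetical helper: copy n bytes of the message's data into out. */
static void cmsg_copy_data(const unsigned char *bp, void *out, size_t n)
{
    memcpy(out, bp + cmsg_data_offset(bp), n);
}
```

Because both helpers only ever memcpy from bp, they can be pointed at a
message that starts at an odd address inside a larger buffer.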

CMSG_FIRSTHDR and CMSG_NXTHDR have similar problems, because they too
assume they are being used on structs cmsghdr embedded in a control
message buffer. Fortunately, there is CMSG_SPACE, which allows you to
walk the buffer yourself.

Thus, my first question: is the above analysis missing anything?

If not, my proposal: the creation of macros akin to CMSG_DATA and
CMSG_NXTHDR which don't return pointers, but instead, take a struct
cmsghdr and return the distance from its beginning in the buffer to the
beginning of its data (CMSG_DATA-alike) or next cmsghdr
(CMSG_NXTHDR-alike). For the sake of concreteness, I suggest
CMSG_DATASKIP and CMSG_NXTSKIP as their names, though I'm by no means
wedded to those and would cheerfully entertain alternatives.
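
One possible shape for the two macros, sketched under assumptions
rather than taken from any implementation: it supposes the platform
provides CMSG_LEN and CMSG_SPACE, and that CMSG_LEN(0) equals the
offset CMSG_DATA uses (the case on common systems, but not something
the sketch can guarantee). CMSG_DATASKIP and CMSG_NXTSKIP do not exist
in any system header.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical macros, as proposed in the post.  Both take a pointer to a
 * struct cmsghdr that need not live inside a control buffer.
 *
 * CMSG_DATASKIP: bytes from the start of the header to the start of its data.
 * CMSG_NXTSKIP:  bytes from the start of the header to the next header.
 *
 * CMSG_NXTSKIP assumes cmsg_len >= CMSG_LEN(0); the caller must check that
 * first, and must do its own end-of-buffer checks.
 */
#define CMSG_DATASKIP(cmh) ((size_t)CMSG_LEN(0))
#define CMSG_NXTSKIP(cmh)  ((size_t)CMSG_SPACE((cmh)->cmsg_len - CMSG_LEN(0)))
```

Neither macro dereferences anything beyond the header it is handed, so
both work equally well on a header copied out of a received buffer and
on one being filled in before a message is built.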

The only complication I see is the case where CMSG_NXTHDR would return
a null pointer. Since my proposed amount-to-skip macro cannot know
from just the cmsghdr where the input cmsghdr falls in the buffer (and
indeed it may not be in a buffer yet, as when constructing messages), I
propose it take only the cmsghdr, not the msghdr, and not do any checks
for running past the end of the buffer. (In passing, I think our
current implementation - and the sample implementation given in the RFC
- is buggy, in that it will fail to return a null pointer if the last
control message ends exactly at the end of the control message buffer,
without padding. The > really needs to be >=. Our implementation also
does not handle a null pointer second argument as specified in 2292
section 4.3.2.)

/~\ The ASCII                             der Mouse
\ / Ribbon Campaign
 X  Against HTML               mouse rodents.montreal.qc.ca
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
|
|
| Re: CMSG_* problems |
  United States |
2007-02-11 12:02:23 |
James K. Lowden wrote:
> der Mouse wrote:
>> Specifically, it seems to me that the only ways to use the API without
>> making assumptions not promised by C involve requiring that the
>> msg_control buffer be suitably aligned for a struct cmsghdr, which
>> basically means that it must be malloc()ed, and malloc()ed specifically
>> for the purpose (not a non-initial part of a larger malloc()ed buffer).
>
> (Well, there's a lot in that "basically"; there are ways other than malloc
> to ensure alignment.)
>
> Why is it not sufficient to say the API guarantees suitable alignment?
> Surely it's better to align the buffers when constructed in such a way that
> they are easily accessible than to burden the client with possible bus
> errors.
>
> As I read RFC 2292 (today, for the first time) istm that was the authors'
> intent. The suggested implementations use an "implementation defined"
> ALIGN() macro.

I can assure you that the authors assumed the buffer used for control
messages would be properly aligned for dereferencing cmsghdr structures
and/or any other structure in the buffer.

--
Matt Thomas                     email: matt 3am-software.com
3am Software Foundry              www: http://3am-software.com/bio/matt/
Cupertino, CA              disclaimer: I avow all knowledge of this message.
|
|
| Re: CMSG_* problems |
  Canada |
2007-02-11 22:23:45 |
>> [...] requiring that the msg_control buffer be suitably aligned for
>> a struct cmsghdr, which basically means that it must be malloc()ed,

> (Well, there's a lot in that "basically"; there are ways other than
> malloc to ensure alignment.)

Not many, unless you're willing to go machine-dependent or
compiler-dependent. Vide infra.

> Why is it not sufficient to say the API guarantees suitable
> alignment?

It can't. The buffer is provided by the client of the API; the API
does not have the opportunity to guarantee anything about it.

You could just say the API *requires* suitable alignment; that is
probably the easiest "fix", but it really doesn't fix anything - it
just documents it, leaving the application author holding the bag, same
as the current mess.

> Surely it's better to align the buffers when constructed in such a way
> that they are easily accessible than to burden the client with
> possible bus errors.

Except, that can't really be done. The buffer alignment is determined
by the code that sets up the struct msghdr (msg_control in particular)
before calling sendmsg/recvmsg, and is therefore beyond the control of
the code backing the API.

> As I read RFC 2292 (today, for the first time) istm that was the
> authors' intent. The suggested implementations use an
> "implementation defined" ALIGN() macro.

Indeed. But they all assume the buffer is *already* aligned, which is
my point - that ALIGN does nothing but maintain the existing alignment
(and not even that, if it starts out misaligned on an architecture that
errors on unaligned accesses). The 2292 interface completely ignores
the issue of buffer alignment, which is one of the reasons I think it
is broken - it's usable only on machines with no alignment constraints,
or with malloc()ed buffers, or with various ugly hacks to ensure
alignment. (The only way I can think of to ensure alignment suitable
for a struct cmsghdr, without machine- or compiler-dependent hackery,
is to allocate the buffer as an array of struct cmsghdr, and that still
doesn't ensure the buffer is suitably aligned for the data fields of
the control messages.)
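
For illustration, a variant of the array-of-struct-cmsghdr idea: a
union with a struct cmsghdr member takes on that struct's alignment
without machine- or compiler-specific tricks. This is only a sketch.
The names fd_ctl_buf and fd_ctl_attach are invented here, it assumes
CMSG_SPACE can be used as an array size on the target system, and, as
noted above, it promises nothing about the alignment of the data, so
the payload is moved with memcpy rather than through a typed pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical buffer type: room for one control message carrying one fd. */
union fd_ctl_buf {
    struct cmsghdr hdr;                            /* only here to force alignment */
    unsigned char  bytes[CMSG_SPACE(sizeof(int))]; /* the actual control buffer */
};

/* Hypothetical helper: point mh at cb and fill in one SCM_RIGHTS message. */
static void fd_ctl_attach(struct msghdr *mh, union fd_ctl_buf *cb, int fd)
{
    struct cmsghdr *c;

    memset(cb, 0, sizeof *cb);
    mh->msg_control = cb->bytes;
    mh->msg_controllen = sizeof cb->bytes;

    c = CMSG_FIRSTHDR(mh);            /* safe: cb is aligned for struct cmsghdr */
    c->cmsg_level = SOL_SOCKET;
    c->cmsg_type = SCM_RIGHTS;
    c->cmsg_len = CMSG_LEN(sizeof fd);
    memcpy(CMSG_DATA(c), &fd, sizeof fd);   /* no assumption about data alignment */
}
```

This still leaves the burden on the application author: the union has
to be declared by whoever owns the buffer, which a library routine
handed an arbitrary pointer cannot arrange.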
-- der Mouse
|
|
| Re: CMSG_* problems |
  Canada |
2007-02-11 22:42:12 |
> I can assure you that the authors [of RFC2292] assumed the buffer
> used for control messages would be properly aligned for dereferencing
> cmsghdr structures and/or any other structure in buffer.

Which is exactly the problem I propose to fix - or, since that
interface is cast in stone by now, I propose to work around by
providing an interface permitting applications to process control
message buffers *without* making such assumptions.

It's ugly and fragile to assume such a thing, especially when it is not
documented anywhere (as far as I can see). Worse, one of the most
popular architectures (i386) silently patches up unaligned accesses,
leading app authors to assume their code is fine, only to get a nasty
surprise upon trying to run it on something like a SPARC which gets
upset over misaligned accesses.

-- der Mouse
|
|
| Re: CMSG_* problems |
  Canada |
2007-02-12 15:57:39 |
>> Knowing what "the maximally aligned type" *is* is machine- and/or
>> compiler-dependent.

> You can always do [a union of char, short, etc]

Except there is no guarantee that there aren't other types, with even
stricter alignments, that you don't know about - and which can end up
as the types underlying "advertised" types like socklen_t or ptrdiff_t
that you may well want to use in cmsg data.

In any case, I'm seeing people coming up with more and more convoluted
and awkward schemes by which application code can manage to use the
existing interface, if the author is sufficiently wizardly. This feels
a lot like pushback against my proposal, but nobody has actually come
right out and said "no, I don't like this", much less "...and here's
why".

It really seems to me that we should make it as easy as feasible to
help people write clean code, and playing fast and loose with pointer
puns in buffers passed through interfaces that don't document alignment
requirements doesn't qualify.

Am I correct in inferring that people really don't like the idea of
making the interface easy to use correctly, preferring to require
application authors to be sufficiently C-wizardly to (a) realize that
the current macros demand aligned buffers and (b) either come up with a
way to arrange that, or bite the bullet and arrange to use malloc?

If so, I'll go away. But if not, can we get on with talking about what
a better interface might be, instead of getting sidetracked into coming
up with more and more arcane methods for using the existing interface
portably (FSVO "portably")?

-- der Mouse
|
|
| Re: CMSG_* problems |
  United States |
2007-02-12 16:32:47 |
der Mouse wrote:
> Am I correct in inferring that people really don't like the idea of
> making the interface easy to use correctly, preferring to require
> application authors to be sufficiently C-wizardly to (a) realize that
> the current macros demand aligned buffers and (b) either come up with a
> way to arrange that, or bite the bullet and arrange to use malloc?

Use an array of intmax_t then. That is architecture dependent.
|
|
| Re: CMSG_* problems |
  Canada |
2007-02-13 01:43:32 |
>> This feels a lot like pushback against my proposal, but nobody has
>> actually come right out and said "no, I don't like this", much less
>> "...and here's why".

> Proposals to change RFC-defined interfaces rarely meet with immediate
> universal acclamation.

True enough, but what I'm seeing feels very different to me from
something like "regardless of all this, I'm very hesitant to mess with
something already cast into an RFC". (In which case I'd point out that
I'm not proposing to change anything specified in 2292; everything
specified there would, were I to get my way, work just as it always
did. All I propose doing is adding two additional macros, to make it
possible to write in a coding style I find significantly cleaner than
the one the existing macros are designed for.)

> I think you'd agree, assuming it can be made to work, that iterating
> over a series of pointers is preferable to grabbing chunks of data at
> offsets.

Consider the example from 2292:

    for (cmsgptr = CMSG_FIRSTHDR(&msg); cmsgptr != NULL;
         cmsgptr = CMSG_NXTHDR(&msg, cmsgptr)) {
        if (cmsgptr->cmsg_level == ... && cmsgptr->cmsg_type == ... ) {
            u_char  *ptr;

            ptr = CMSG_DATA(cmsgptr);
            /* process data pointed to by ptr */
        }
    }

I would very much prefer to write this as something more like

    dp = mh.msg_control;
    o = 0;
    while (o < mh.msg_controllen)
     { if (o+sizeof(struct cmsghdr) > mh.msg_controllen) ...error...
       bcopy(dp+o,&cmh,sizeof(struct cmsghdr));
       if (o+cmh.cmsg_len > mh.msg_controllen) ...error...
       ...check cmh.cmsg_level and/or cmh.cmsg_type...
       bcopy(dp+o+CMSG_DATASKIP(&cmh),...,
             cmh.cmsg_len-CMSG_DATASKIP(&cmh));
       ...process the copied-out data...
       o += CMSG_NXTSKIP(&cmh);
     }

(or you can skip the error tests in the loop, since the suggested
implementation of CMSG_NXTHDR in 2292 effectively does).
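
A compilable sketch of that offset-based loop, under stated assumptions:
CMSG_DATASKIP and CMSG_NXTSKIP are hypothetical (defined here in terms
of CMSG_LEN and CMSG_SPACE, which is one guess at how they could be
built), collect_fds is an invented name, and memcpy stands in for
bcopy. It gathers descriptors from SCM_RIGHTS messages without ever
casting the buffer pointer to a struct cmsghdr pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical macros from the proposal; not in any system header. */
#define CMSG_DATASKIP(cmh) ((size_t)CMSG_LEN(0))
#define CMSG_NXTSKIP(cmh)  ((size_t)CMSG_SPACE((cmh)->cmsg_len - CMSG_LEN(0)))

/*
 * Walk a control buffer of arbitrary alignment and copy out the file
 * descriptors carried by SOL_SOCKET/SCM_RIGHTS messages.
 * Returns the number of descriptors stored, or -1 on a malformed buffer.
 */
static int collect_fds(const unsigned char *dp, size_t len, int *fds, int maxfds)
{
    size_t o = 0;
    int n = 0;

    while (o < len) {
        struct cmsghdr cmh;
        size_t datalen, i;

        if (len - o < sizeof cmh)
            return -1;                       /* truncated header */
        memcpy(&cmh, dp + o, sizeof cmh);    /* header copied out, never aliased */
        if (cmh.cmsg_len < CMSG_DATASKIP(&cmh) || cmh.cmsg_len > len - o)
            return -1;                       /* impossible or truncated length */

        datalen = cmh.cmsg_len - CMSG_DATASKIP(&cmh);
        if (cmh.cmsg_level == SOL_SOCKET && cmh.cmsg_type == SCM_RIGHTS) {
            for (i = 0; i + sizeof(int) <= datalen && n < maxfds; i += sizeof(int))
                memcpy(&fds[n++], dp + o + CMSG_DATASKIP(&cmh) + i, sizeof(int));
        }
        o += CMSG_NXTSKIP(&cmh);             /* may step past len; loop then ends */
    }
    return n;
}
```

A caller would pass mh.msg_control and mh.msg_controllen as filled in
by recvmsg; nothing in the routine cares where the buffer came from or
how it is aligned.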

Does this mean I agree? I think "no", but I'm not sure.

Indeed, when I've had to manipulate control data buffers that's what
I've done, except that I've had to kludge around the lack of
CMSG_DATASKIP and CMSG_NXTSKIP.

In particular, this can be done in a library routine without having to
impose alignment restrictions on its client. (And if it takes a
half-dozen go-rounds for *us* to figure out how to get alignment right,
how much chance does the typical app author have? Especially since
"the typical app author" probably writes on the i386 port, on which you
*don't* have to get alignment right and the code will thus "work" even
if it's broken?)

> Do you think that assumption is acceptable? Would documenting and
> relying on intmax_t alignment suffice?

Suffice for what? It would certainly be better than what we have now,
but given the current lack of documentation that's not saying much. It
would not suffice for my own use; if that's all that's done, I'll
continue kludging around the lack of the macros I want the ways I have
been. They depend on a nonportability (pointers past the ends of
objects, usually), but I consider them preferable to - cleaner and far
less fragile than - overlaying the cmsghdr and data onto the
control-message buffer.

-- der Mouse
|
|
| Re: CMSG_* problems |
  South Africa |
2007-02-13 02:13:24 |
On Mon, 12 Feb 2007, Jason Thorpe wrote:
> On Feb 12, 2007, at 3:27 PM, der Mouse wrote:
>
> >Will be able to contain a pointer, sure. But not necessarily, will be
> >at least as strictly aligned as a pointer.
>
> You're completely wrong on this one. intmax_t is by definition at
> least as large and at least as alignment-strict as intptr_t, which in
> turn is by definition at least as large and at least as
> alignment-strict as a pointer.

I think Mouse is right. intmax_t is guaranteed to be at least as large
as any other integer type (C99 committee draft N1124 section 7.18.1.5),
but I don't see any guarantees about the alignment of intmax_t relative
to the alignment of any other integer type. intptr_t is guaranteed
to be convertible to and from any pointer (section 7.18.1.4), but I
see no guarantees about the alignment of intptr_t relative to the
alignment of any pointers, nor even any guarantees about the size of
intptr_t relative to the size of any pointers (for example, as Seebs
mentioned in another message, conversion between a pointer and an
intptr_t could involve removing or inserting some bits that are known to
have a constant value).
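
Since size and alignment are separate properties, one way to see what a
given compiler and target actually do is simply to print both. The
sketch below is illustrative only: it uses _Alignof and max_align_t,
which come from C11 and so postdate this thread, and show_alignments is
an invented name. The numbers it prints describe one implementation;
they are not guarantees.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Print the size and alignment of one type on this implementation. */
#define SHOW(T) \
    printf("%-12s size %2zu  align %2zu\n", #T, sizeof(T), (size_t)_Alignof(T))

/* Hypothetical helper: report the types discussed in the thread. */
static void show_alignments(void)
{
    SHOW(char);
    SHOW(short);
    SHOW(int);
    SHOW(long);
    SHOW(long long);
    SHOW(intmax_t);
    SHOW(intptr_t);
    SHOW(void *);
    SHOW(long double);
    SHOW(max_align_t);   /* C11: alignment at least that of every scalar type */
}
```

Where C11 is available, max_align_t gives a portable answer to the
"maximally aligned type" question for the standard scalar types, though
it does nothing for code that must build with older compilers.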
--apb (Alan Barrett)
|
|
| Re: CMSG_* problems |
  South Africa |
2007-02-13 02:21:28 |
On Mon, 12 Feb 2007, Matt Thomas wrote:
> C mandates that alignment for
> char <= short <= int <= long <= long long.

I can't find this requirement. Could you point out the relevant
part of the C99 or C89 standard?

--apb (Alan Barrett)
|
|
| Re: CMSG_* problems |
  Canada |
2007-02-14 23:39:29 |
> section 5.2.4.2 (Numerical limits) where the minimum and maximum
> sizes of integral types is specified. It actually says char < short
> <= int < long < long long. More precisely, the minimum number of
> bits for each type:
>       char    8
>       short   16
>       int     16
>       long    32
>       llong   64

Your "more precisely" makes it sound as though the table is what is
present and the rest is your interpretation.

This table does not, in itself, compel char < short or int < long or
long < long long. These are *minimums*; for example, nothing here
prevents long and long long from each being 64 bits - or for that
matter prevents all five of these types from being 64 bits wide.
(Other factors prevent that last, actually; int must be strictly wider
than unsigned char because of the existence of EOF.)

-- der Mouse