Thread: Losing BPF's

Losing BPF's | Canada | 2007-02-16 08:32:56

Hello,

I have found a potential bug in libpcap on OpenBSD, and likely on FreeBSD as
well. If you simultaneously start several programs that open pcap
connections, you can cause the system to lose track of some of its BPF's.
After you close all the pcap connections, some of the BPF's may still report
that they are busy.

You can reproduce this problem by doing the following:

1. Get a copy of the attached bpfmaker.c and bpfMaker.pl.
2. Run `gcc bpfmaker.c -o bpfmaker -lpcap`
3. Run `perl bpfMaker.pl <name of your ethernet card>`

This repeatedly opens several copies of bpfmaker at once and closes them
again, and lets you compare the number of BPF's in use on your system at the
start and end of the run.

I think the problem is that two programs simultaneously opening a live pcap
connection may cause two bpf device files to point to the same BPF in the OS,
or two BPF's in the OS to point to the same device file. Then, when one
connection is closed, the other one no longer cleans up properly when it
releases the device file in pcap_close().

To fix the problem I have placed a file lock around the following code in
pcap_open_live(), in the attached pcap-bpf.c:

    fd = bpf_open(p, ebuf);
    if (fd < 0)
        goto bad;

    p->fd = fd;
    p->snapshot = snaplen;

    if (ioctl(fd, BIOCVERSION, (caddr_t)&bv) < 0) {
        snprintf(ebuf, PCAP_ERRBUF_SIZE, "BIOCVERSION: %s",
            pcap_strerror(errno));
        goto bad;
    }
    if (bv.bv_major != BPF_MAJOR_VERSION ||
        bv.bv_minor < BPF_MINOR_VERSION) {
        snprintf(ebuf, PCAP_ERRBUF_SIZE,
            "kernel bpf filter out of date");
        goto bad;
    }

    /*
     * Try finding a good size for the buffer; 32768 may be too
     * big, so keep cutting it in half until we find a size
     * that works, or run out of sizes to try.  If the default
     * is larger, don't make it smaller.
     *
     * XXX - there should be a user-accessible hook to set the
     * initial buffer size.
     */
    if ((ioctl(fd, BIOCGBLEN, (caddr_t)&v) < 0) || v < 32768)
        v = 32768;
    for ( ; v != 0; v >>= 1) {
        /* Ignore the return value - this is because the call fails
         * on BPF systems that don't have kernel malloc.  And if
         * the call fails, it's no big deal, we just continue to
         * use the standard buffer size.
         */
        (void) ioctl(fd, BIOCSBLEN, (caddr_t)&v);

        (void)strncpy(ifr.ifr_name, device, sizeof(ifr.ifr_name));
        if (ioctl(fd, BIOCSETIF, (caddr_t)&ifr) >= 0)
            break;  /* that size worked; we're done */

        if (errno != ENOBUFS) {
            snprintf(ebuf, PCAP_ERRBUF_SIZE, "BIOCSETIF: %s: %s",
                device, pcap_strerror(errno));
            goto bad;
        }
    }

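The general shape of "a file lock around a block of code" can be sketched as follows. This is only an illustration of the idea, not the attached patch: the helper name with_file_lock(), the lock-file path, and the stub callback are all hypothetical, and the sketch uses flock(2) advisory locking, which only serializes processes that take the same lock.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical lock-file path; nothing in libpcap uses this name. */
#define BPF_LOCK_PATH "/tmp/pcap-bpf.lock"

/*
 * Run fn(arg) while holding an exclusive advisory lock on lock_path.
 * Returns fn's result, or -1 if the lock file could not be opened or
 * locked (so a callback that itself returns -1 is indistinguishable).
 */
static int
with_file_lock(const char *lock_path, int (*fn)(void *), void *arg)
{
    int lockfd, ret;

    lockfd = open(lock_path, O_RDWR | O_CREAT, 0600);
    if (lockfd < 0)
        return (-1);
    if (flock(lockfd, LOCK_EX) < 0) {   /* blocks until we own the lock */
        close(lockfd);
        return (-1);
    }
    ret = fn(arg);      /* the critical section, e.g. open through BIOCSETIF */
    (void)flock(lockfd, LOCK_UN);
    close(lockfd);
    return (ret);
}

/* Stand-in for the critical section: just counts how often it ran. */
static int
open_and_attach_stub(void *arg)
{
    ++*(int *)arg;
    return (0);
}
```

A caller would pass the real open-and-attach sequence as the callback, for example with_file_lock(BPF_LOCK_PATH, open_and_attach_stub, &counter).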

If you think this is an acceptable change, I can clean up the code to match
the project's coding conventions, add proper error reporting, and submit it
for approval. I don't know exactly how this process works.

Thanks,
Jonathan Steel
-
This is the tcpdump-workers list.
Visit https://cod.sandelman.ca/ to unsubscribe.

Re: Losing BPF's | United States | 2007-02-19 02:10:13

Jon Steel wrote:

> I have found a potential bug in libpcap on OpenBSD and likely FreeBSD as
> well. If you simultaneously open several programs that open pcap
> connections, you can cause the system to lose track of some of its
> BPF's. When you close all the pcap connections some of the BPF's may
> report that they are still busy.
>
> You can reproduce this problem by doing the following:
>
> 1. Get a copy of bpfmaker.c and bpfMaker.pl that have been attached.
> 2. run `gcc bpfmaker.c -o bpfmaker -lpcap`
> 3. run `perl bpfMaker.pl <name of your ethernet card>`
>
> This will simultaneously open several copies of bpfmaker and close them
> in a loop and lets you compare the number of bpf's being used on your
> system at the start and end of the program.
>
> I think the problem is that two programs simultaneously opening a live
> pcap connection may cause two bpf device files to point to the same BPF
> in the OS,

If so, that's an OS bug. /dev/bpfN is supposed to open the Nth BPF device; a
quick look at the OpenBSD bpfopen() appears to indicate that it does do that:

    int
    bpfopen(dev_t dev, int flag, int mode, struct proc *p)
    {
        struct bpf_d *d;

        /* create on demand */
        if ((d = bpfilter_create(minor(dev))) == NULL)
            return (ENXIO);
        /*
         * Each minor can be opened by only one process.  If the requested
         * minor is in use, return EBUSY.
         */
        if (!D_ISFREE(d))
            return (EBUSY);

        /* Mark "free" and do most initialization. */
        d->bd_bufsize = bpf_bufsize;
        d->bd_sig = SIGIO;

        D_GET(d);

        return (0);
    }

    ...

    struct bpf_d *
    bpfilter_lookup(int unit)
    {
        struct bpf_d *bd;

        LIST_FOREACH(bd, &bpf_d_list, bd_list)
            if (bd->bd_unit == unit)
                return (bd);
        return (NULL);
    }

    struct bpf_d *
    bpfilter_create(int unit)
    {
        struct bpf_d *bd;

        if ((bd = bpfilter_lookup(unit)) != NULL)
            return (bd);
        if ((bd = malloc(sizeof(*bd), M_DEVBUF, M_NOWAIT)) != NULL) {
            bzero(bd, sizeof(*bd));
            bd->bd_unit = unit;
            D_MARKFREE(bd);
            LIST_INSERT_HEAD(&bpf_d_list, bd, bd_list);
        }
        return (bd);
    }

> or two BPF's in the OS to point to the same device file.

If by "device file" you mean a /dev/bpfN file, then if that happened, it
would also be a bug. I don't see how that could happen, given the OpenBSD BPF
code.

I can't reproduce this on OS X 10.4 - I get

    $ sudo ./bpfMaker.pl en1
    BPF's at startup: 0
    BPF's upon ending: 0

so it's not inherent to BPF (10.4's libpcap doesn't do any file locking on
BPF devices - it relies on the opens being exclusive use, just as libpcap
on {Free,Net,Open,DragonFly}BSD does).

Are you certain that the loop in your script isn't missing any BPF
processes, so that some are left running?

Re: Losing BPF's | United States | 2007-02-19 02:18:40

Guy Harris wrote:

> I can't reproduce this on OS X 10.4 - I get
>
> $ sudo ./bpfMaker.pl en1
> BPF's at startup: 0
> BPF's upon ending: 0

...with a version of bpftest.c fixed so that, if pcap_open_live() fails, it
returns before calling pcap_loop() (otherwise, it dumps core, which takes a
significant amount of time on OS X - OS X core files are huge).

I've attached the source to that version.

Re: Losing BPF's | Canada | 2007-02-19 15:12:49

I did some more digging and I think I've narrowed the problem down a bit
more. It does appear to be a kernel issue, so pcap is off the hook for today.

For those interested, the problem occurs for the following reasons. When you
call open() on OpenBSD, it does not lock the file unless you tell it to. This
means multiple pcap connections can get a file descriptor for the same
/dev/bpfN file. The problem then occurs on the following line, which comes
soon after opening the file:

    ioctl(fd, BIOCSETIF, (caddr_t)&ifr);

Now the kernel gets confused somewhere, because it has multiple connections
pointing to the same /dev/bpfN file, and so the device cannot be closed
properly.

The ioctl call ends up calling the kernel function below (where the problem
should lie). I'll move this over to a post I've put up on the OpenBSD message
board.

Thanks for your help

    /*
     * Detach a file from its current interface (if attached at all) and
     * attach to the interface indicated by the name stored in ifr.
     * Return an errno or 0.
     */
    int
    bpf_setif(struct bpf_d *d, struct ifreq *ifr)
    {
        struct bpf_if *bp, *candidate = NULL;
        int s, error;

        /*
         * Look through attached interfaces for the named one.
         */
        for (bp = bpf_iflist; bp != 0; bp = bp->bif_next) {
            struct ifnet *ifp = bp->bif_ifp;

            if (ifp == 0 ||
                strcmp(ifp->if_xname, ifr->ifr_name) != 0)
                continue;

            /*
             * We found the requested interface.
             */
            if (candidate == NULL ||
                candidate->bif_dlt > bp->bif_dlt)
                candidate = bp;
        }

        if (candidate != NULL) {
            /*
             * Allocate the packet buffers if we need to.
             * If we're already attached to requested interface,
             * just flush the buffer.
             */
            if (d->bd_sbuf == 0) {
                error = bpf_allocbufs(d);
                if (error != 0)
                    return (error);
            }
            s = splnet();
            if (candidate != d->bd_bif) {
                if (d->bd_bif)
                    /*
                     * Detach if attached to something else.
                     */
                    bpf_detachd(d);

                bpf_attachd(d, candidate);
            }
            bpf_reset_d(d);
            splx(s);
            return (0);
        }
        /* Not found. */
        return (ENXIO);
    }

Re: Losing BPF's | United States | 2007-02-19 15:49:38

Jon Steel wrote:

> I did some more digging and I think I've narrowed the problem down a bit
> more. It does appear to be a kernel issue. pcap's off the hook for today.
>
> For those interested, the problem occurs for the following reasons: When
> you call open() on OpenBSD it does not lock the file unless you tell it
> to. This means multiple pcap connections can get a file descriptor for
> the same /dev/bpfN file.

When you call open() on a special file (such as /dev/bpfN) on OpenBSD, does
it call the driver's open routine, even if the device is already open?

If so, then that open routine:

    int
    bpfopen(dev_t dev, int flag, int mode, struct proc *p)
    {
        struct bpf_d *d;

        /* create on demand */
        if ((d = bpfilter_create(minor(dev))) == NULL)
            return (ENXIO);
        /*
         * Each minor can be opened by only one process.  If the requested
         * minor is in use, return EBUSY.
         */
        if (!D_ISFREE(d))
            return (EBUSY);

        /* Mark "free" and do most initialization. */
        d->bd_bufsize = bpf_bufsize;
        d->bd_sig = SIGIO;

        D_GET(d);

        return (0);
    }

would fail with EBUSY if the device is already open.

sys_open() (for the open() system call) calls vn_open() on all opens, which
calls VOP_OPEN() on all opens. For special files, that should go to
spec_open(), which appears to call the driver open routine
(*cdevsw[maj].d_open) for all opens.

Thus, there should be no need to lock the device; the BPF driver should, in
effect, be doing it for you. That's the way BPF has worked on BSD for quite
a while (since it was first introduced, I think). Libpcap explicitly
requires that, as do other users of BPF - they keep opening successive BPF
devices until they get one that they can successfully open.
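
The "keep opening successive BPF devices" pattern can be sketched roughly as below. This is an illustrative sketch rather than libpcap's actual source: the names open_first_free_bpf(), open_fn and fake_open() are made up here, and the opener is passed in as a function pointer only so that the loop can be exercised without root access or a real /dev/bpfN.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/* Two-argument opener, so a stub can replace open(2) when experimenting. */
typedef int (*open_fn)(const char *path, int flags);

/*
 * Try /dev/bpf0, /dev/bpf1, ... and stop at the first unit that opens,
 * or at the first error other than EBUSY.  Returns the descriptor, or
 * -1 with errno left as set by the last open attempt.
 */
static int
open_first_free_bpf(open_fn do_open, int max_units)
{
    char path[32];
    int fd = -1, n;

    for (n = 0; n < max_units; n++) {
        (void)snprintf(path, sizeof(path), "/dev/bpf%d", n);
        fd = do_open(path, O_RDONLY);
        if (fd >= 0 || errno != EBUSY)
            break;      /* success, or a real error */
    }
    return (fd);
}

/* Stand-in for open(2): pretends /dev/bpf0 and /dev/bpf1 are busy. */
static int
fake_open(const char *path, int flags)
{
    (void)flags;
    if (strcmp(path, "/dev/bpf0") == 0 || strcmp(path, "/dev/bpf1") == 0) {
        errno = EBUSY;
        return (-1);
    }
    return (42);        /* arbitrary fake descriptor */
}
```

In real use the opener would be a thin wrapper around open(2). The loop is only safe if the driver's open really is exclusive, which is the point under discussion.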

Re: Losing BPF's | India | 2007-02-19 17:16:14

> If so, then that open routine:
>
> int
> bpfopen(dev_t dev, int flag, int mode, struct proc *p)
> {
>     struct bpf_d *d;
>
>     /* create on demand */
>     if ((d = bpfilter_create(minor(dev))) == NULL)
>         return (ENXIO);
>     /*
>      * Each minor can be opened by only one process.  If the requested
>      * minor is in use, return EBUSY.
>      */
>     if (!D_ISFREE(d))
>         return (EBUSY);
>
>     /* Mark "free" and do most initialization. */
>     d->bd_bufsize = bpf_bufsize;
>     d->bd_sig = SIGIO;
>
>     D_GET(d);
>
>     return (0);
> }
>
> would fail with EBUSY if the device is already open.

There seems to be a race condition in the above code: the check for the
descriptor being free and the call to D_GET to mark the descriptor as being
used are not atomic. So could two closely spaced calls to bpfopen cause bpf
to use the same device twice?
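
To illustrate what an atomic "check and mark" looks like, here is a small user-space sketch using C11 atomics. It is not kernel code and not what OpenBSD does; struct slot, slot_claim() and slot_release() are hypothetical names standing in for a BPF minor and its in-use flag.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical user-space stand-in for one BPF minor device. */
struct slot {
    atomic_int in_use;  /* 0 = free, 1 = claimed */
};

/*
 * Claim the slot in a single atomic step.  Returns 0 on success, or
 * EBUSY if another caller already holds it.  Because the test and the
 * update happen in one compare-and-swap, two callers can never both
 * see the slot as free.
 */
static int
slot_claim(struct slot *s)
{
    int expected = 0;

    if (!atomic_compare_exchange_strong(&s->in_use, &expected, 1))
        return (EBUSY);
    return (0);
}

/* Mark the slot free again, as a close routine would. */
static void
slot_release(struct slot *s)
{
    atomic_store(&s->in_use, 0);
}
```

With a separate "is it free?" test followed by a "mark it used" store, a second caller could run between the two steps; the compare-and-swap removes that window.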
regards
maneesh

Re: Losing BPF's | United States | 2007-02-20 13:20:13

maneeshs wrote:

> There seems to be a race condition in the above code, the check for the
> descriptor being free and the call to D_GET to mark the descriptor as
> being used is not atomic. So two closely spaced calls to bpfopen could
> cause bpf to use the same device twice ?

Yes, if, in OpenBSD, either

1) you're on a multi-processor machine

or

2) the kernel is preemptible

and there's no big lock around the kernel or around opens or around
special-file opens or something else above the BPF open routine.

FreeBSD has a single global BPF mutex (bpf_mtx).