Thread: Losing BPF's

Losing BPF's
Canada
2007-02-16 08:32:56
Hello

I have found a potential bug in libpcap on OpenBSD and likely FreeBSD as
well. If you simultaneously open several programs that open pcap
connections, you can cause the system to lose track of some of its
BPF's. When you close all the pcap connections some of the BPF's may
report that they are still busy.

You can reproduce this problem by doing the following:

1. Get a copy of bpfmaker.c and bpfMaker.pl that have been attached.
2. run `gcc bpfmaker.c -o bpfmaker -lpcap`
3. run `perl bpfMaker.pl <name of your ethernet card>`

This will simultaneously open several copies of bpfmaker and close them
in a loop and lets you compare the number of bpf's being used on your
system at the start and end of the program.

I think the problem is that two programs simultaneously opening a live
pcap connection may cause two bpf device files to point to the same BPF
in the OS, or two BPF's in the OS to point to the same device file. So
when one connection is closed, the other one will no longer properly
clean up when it releases the device file in pcap_close().

To fix the problem I have placed a file lock around the following code
in pcap_open_live() in the attached pcap-bpf.c:

    fd = bpf_open(p, ebuf);
    if (fd < 0)
        goto bad;

    p->fd = fd;
    p->snapshot = snaplen;

    if (ioctl(fd, BIOCVERSION, (caddr_t)&bv) < 0) {
        snprintf(ebuf, PCAP_ERRBUF_SIZE, "BIOCVERSION: %s",
            pcap_strerror(errno));
        goto bad;
    }
    if (bv.bv_major != BPF_MAJOR_VERSION ||
        bv.bv_minor < BPF_MINOR_VERSION) {
        snprintf(ebuf, PCAP_ERRBUF_SIZE,
            "kernel bpf filter out of date");
        goto bad;
    }

    /*
     * Try finding a good size for the buffer; 32768 may be too
     * big, so keep cutting it in half until we find a size
     * that works, or run out of sizes to try.  If the default
     * is larger, don't make it smaller.
     *
     * XXX - there should be a user-accessible hook to set the
     * initial buffer size.
     */
    if ((ioctl(fd, BIOCGBLEN, (caddr_t)&v) < 0) || v < 32768)
        v = 32768;
    for ( ; v != 0; v >>= 1) {
        /* Ignore the return value - this is because the call fails
         * on BPF systems that don't have kernel malloc.  And if
         * the call fails, it's no big deal, we just continue to
         * use the standard buffer size.
         */
        (void) ioctl(fd, BIOCSBLEN, (caddr_t)&v);

        (void)strncpy(ifr.ifr_name, device, sizeof(ifr.ifr_name));
        if (ioctl(fd, BIOCSETIF, (caddr_t)&ifr) >= 0)
            break;    /* that size worked; we're done */

        if (errno != ENOBUFS) {
            snprintf(ebuf, PCAP_ERRBUF_SIZE, "BIOCSETIF: %s: %s",
                device, pcap_strerror(errno));
            goto bad;
        }
    }
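
The patched pcap-bpf.c itself is not reproduced in this message, so the following is only a minimal sketch of what "a file lock around the code" could look like, assuming an advisory flock() on a separate lock file. The helper name with_open_lock(), the lock-file path, and the demo_section() stand-in are all hypothetical and are not part of libpcap.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of libpcap): run `section` while holding
 * an exclusive advisory lock on `lock_path`, so that cooperating
 * processes execute the section one at a time.
 * Returns the section's result, or -1 if the lock could not be taken.
 */
static int
with_open_lock(const char *lock_path, int (*section)(void *), void *arg)
{
    int lock_fd, result;

    assert(lock_path != NULL && section != NULL);

    lock_fd = open(lock_path, O_RDWR | O_CREAT, 0600);
    if (lock_fd < 0)
        return (-1);
    if (flock(lock_fd, LOCK_EX) < 0) {  /* blocks until we own the lock */
        close(lock_fd);
        return (-1);
    }

    result = section(arg);      /* e.g. the bpf_open() ... BIOCSETIF code */

    flock(lock_fd, LOCK_UN);
    close(lock_fd);
    return (result);
}

/* Stand-in for the open/ioctl sequence; it only counts invocations. */
static int
demo_section(void *arg)
{
    int *calls = arg;

    return (++*calls);
}
```

Because flock() is advisory, this only serializes processes that all take the same lock; a program that opens the device without going through the helper is not held back.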



If you think that this is an acceptable change, I can clean up the code
to the standard coding conventions used in the project, do proper error
reporting, and submit that code for approval. I don't know exactly how
this process works.

Thanks

Jonathan Steel

-
This is the tcpdump-workers list.
Visit https://cod.sandelman.ca/ to unsubscribe.

  
  
  
Re: Losing BPF's
United States
2007-02-19 02:10:13
Jon Steel wrote:

> I have found a potential bug in libpcap on OpenBSD and likely FreeBSD
> as well. If you simultaneously open several programs that open pcap
> connections, you can cause the system to lose track of some of its
> BPF's. When you close all the pcap connections some of the BPF's may
> report that they are still busy.
>
> You can reproduce this problem by doing the following:
>
> 1. Get a copy of bpfmaker.c and bpfMaker.pl that have been attached.
> 2. run `gcc bpfmaker.c -o bpfmaker -lpcap`
> 3. run `perl bpfMaker.pl <name of your ethernet card>`
>
> This will simultaneously open several copies of bpfmaker and close
> them in a loop and lets you compare the number of bpf's being used on
> your system at the start and end of the program.
>
> I think the problem is that two programs simultaneously opening a live
> pcap connection may cause two bpf device files to point to the same
> BPF in the OS,

If so, that's an OS bug.  /dev/bpfN is supposed to open the Nth BPF
device; a quick look at the OpenBSD bpfopen() appears to indicate that
it does do that:

int
bpfopen(dev_t dev, int flag, int mode, struct proc *p)
{
         struct bpf_d *d;

         /* create on demand */
         if ((d = bpfilter_create(minor(dev))) == NULL)
                 return (ENXIO);
         /*
          * Each minor can be opened by only one process.  If the requested
          * minor is in use, return EBUSY.
          */
         if (!D_ISFREE(d))
                 return (EBUSY);

         /* Mark "free" and do most initialization. */
         d->bd_bufsize = bpf_bufsize;
         d->bd_sig = SIGIO;

         D_GET(d);

         return (0);
}

	...

struct bpf_d *
bpfilter_lookup(int unit)
{
         struct bpf_d *bd;

         LIST_FOREACH(bd, &bpf_d_list, bd_list)
                 if (bd->bd_unit == unit)
                         return (bd);
         return (NULL);
}

struct bpf_d *
bpfilter_create(int unit)
{
         struct bpf_d *bd;

         if ((bd = bpfilter_lookup(unit)) != NULL)
                 return (bd);
         if ((bd = malloc(sizeof(*bd), M_DEVBUF, M_NOWAIT)) != NULL) {
                 bzero(bd, sizeof(*bd));
                 bd->bd_unit = unit;
                 D_MARKFREE(bd);
                 LIST_INSERT_HEAD(&bpf_d_list, bd, bd_list);
         }
         return (bd);
}

> or two BPF's in the OS to point to the same device file.

If by "device file" you mean the /dev/bpfN file, then that would also be
a bug if it happened.  I don't see how that could happen, given the
OpenBSD BPF code.

I can't reproduce this on OS X 10.4 - I get

	$ sudo ./bpfMaker.pl en1
	BPF's at startup:        0
	BPF's upon ending:        0

so it's not inherent to BPF (10.4's libpcap doesn't do any file locking
on BPF devices - it relies on the opens being exclusive use, just as
libpcap on {Free,Net,Open,DragonFly}BSD does).

Are you certain that the loop in your script isn't missing any BPF
processes, so that some are left running?

Re: Losing BPF's
United States
2007-02-19 02:18:40
Guy Harris wrote:

> I can't reproduce this on OS X 10.4 - I get
> 
>     $ sudo ./bpfMaker.pl en1
>     BPF's at startup:        0
>     BPF's upon ending:        0

...with a version of bpftest.c fixed so that, if pcap_open_live() fails,
it returns before calling pcap_loop() (otherwise, it dumps core, which
takes a significant amount of time on OS X - OS X core files are huge).

I've attached the source to that version.


Re: Losing BPF's
Canada
2007-02-19 15:12:49
I did some more digging and I think I've narrowed the problem down a bit
more. It does appear to be a kernel issue. pcap's off the hook for today.

For those interested, the problem occurs for the following reasons: When
you call open() on OpenBSD it does not lock the file unless you tell it
to. This means multiple pcap connections can get a file descriptor for
the same /dev/bpfN file. The problem then occurs on the following line,
which comes soon after opening the file:

ioctl(fd, BIOCSETIF, (caddr_t)&ifr);

Now the kernel gets confused somewhere because it has multiple
connections pointing to the same /dev/bpfN file and so it cannot be
closed properly.

The ioctl call will end up calling the function below in the kernel
(where the problem should lie). I'll move this over to a post I've put
up on the OpenBSD message board.

Thanks for your help

/*
 * Detach a file from its current interface (if attached at all) and attach
 * to the interface indicated by the name stored in ifr.
 * Return an errno or 0.
 */
int
bpf_setif(struct bpf_d *d, struct ifreq *ifr)
{
        struct bpf_if *bp, *candidate = NULL;
        int s, error;

        /*
         * Look through attached interfaces for the named one.
         */
        for (bp = bpf_iflist; bp != 0; bp = bp->bif_next) {
                struct ifnet *ifp = bp->bif_ifp;

                if (ifp == 0 ||
                    strcmp(ifp->if_xname, ifr->ifr_name) != 0)
                        continue;

                /*
                 * We found the requested interface.
                 */
                if (candidate == NULL || candidate->bif_dlt > bp->bif_dlt)
                        candidate = bp;
        }

        if (candidate != NULL) {
                /*
                 * Allocate the packet buffers if we need to.
                 * If we're already attached to requested interface,
                 * just flush the buffer.
                 */
                if (d->bd_sbuf == 0) {
                        error = bpf_allocbufs(d);
                        if (error != 0)
                                return (error);
                }
                s = splnet();
                if (candidate != d->bd_bif) {
                        if (d->bd_bif)
                                /*
                                 * Detach if attached to something else.
                                 */
                                bpf_detachd(d);

                        bpf_attachd(d, candidate);
                }
                bpf_reset_d(d);
                splx(s);
                return (0);
        }
        /* Not found. */
        return (ENXIO);
}



Re: Losing BPF's
United States
2007-02-19 15:49:38
Jon Steel wrote:
> I did some more digging and I think I've narrowed the problem down a
> bit more. It does appear to be a kernel issue. pcap's off the hook for
> today.
>
> For those interested, the problem occurs for the following reasons:
> When you call open() on OpenBSD it does not lock the file unless you
> tell it to. This means multiple pcap connections can get a file
> descriptor for the same /dev/bpfN file.

When you call open() on a special file (such as /dev/bpfN) on OpenBSD,
does it call the driver's open routine, even if the device is already
open?

If so, then that open routine:

int
bpfopen(dev_t dev, int flag, int mode, struct proc *p)
{
         struct bpf_d *d;

         /* create on demand */
         if ((d = bpfilter_create(minor(dev))) == NULL)
                 return (ENXIO);
         /*
          * Each minor can be opened by only one process.  If the requested
          * minor is in use, return EBUSY.
          */
         if (!D_ISFREE(d))
                 return (EBUSY);

         /* Mark "free" and do most initialization. */
         d->bd_bufsize = bpf_bufsize;
         d->bd_sig = SIGIO;

         D_GET(d);

         return (0);
}

would fail with EBUSY if the device is already open.

sys_open() (for the open() system call) calls vn_open() on all opens,
which calls VOP_OPEN() on all opens.  For special files, that should go
to spec_open(), which appears to call the driver open routine
(*cdevsw[maj].d_open) for all opens.

Thus, there should be no need to lock the device; the BPF driver should,
in effect, be doing it for you.  That's the way BPF has worked on BSD
for quite a while (since it was first introduced, I think).  Libpcap
explicitly requires that, as do other users of BPF - they keep opening
successive BPF devices until they get one that they can successfully
open.
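
The "keep opening successive devices" pattern can be sketched roughly as follows. This is an illustration in the spirit of what is described above, not a copy of libpcap's code; the function name open_first_free() and its prefix and max_units parameters are assumptions made for the example.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>

/*
 * Try <prefix>0, <prefix>1, ... and return the first descriptor that
 * opens.  EBUSY means "someone else has this unit", so move on; any other
 * error (ENOENT when we run out of device nodes, EACCES, ...) ends the
 * search.  `prefix` and `max_units` are illustrative parameters.
 */
static int
open_first_free(const char *prefix, int max_units)
{
    char path[64];
    int fd = -1, n;

    for (n = 0; n < max_units; n++) {
        snprintf(path, sizeof(path), "%s%d", prefix, n);
        fd = open(path, O_RDONLY);
        if (fd >= 0 || errno != EBUSY)
            break;      /* success, or a failure that is not "in use" */
    }
    return (fd);        /* -1 if nothing could be opened */
}
```

A caller on a BSD system would pass something like "/dev/bpf" as the prefix. The loop is only correct if the driver really does return EBUSY for a unit that is already open, which is exactly the guarantee under discussion here.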

Re: Losing BPF's
India
2007-02-19 17:16:14
> If so, then that open routine:
>
> int
> bpfopen(dev_t dev, int flag, int mode, struct proc *p)
> {
>         struct bpf_d *d;
>
>         /* create on demand */
>         if ((d = bpfilter_create(minor(dev))) == NULL)
>                 return (ENXIO);
>         /*
>          * Each minor can be opened by only one process.  If the requested
>          * minor is in use, return EBUSY.
>          */
>         if (!D_ISFREE(d))
>                 return (EBUSY);
>
>         /* Mark "free" and do most initialization. */
>         d->bd_bufsize = bpf_bufsize;
>         d->bd_sig = SIGIO;
>
>         D_GET(d);
>
>         return (0);
> }
>
> would fail with EBUSY if the device is already open.

There seems to be a race condition in the above code: the check for the
descriptor being free and the call to D_GET to mark the descriptor as
being used are not atomic. So two closely spaced calls to bpfopen could
cause bpf to use the same device twice?
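
To make the distinction concrete, here is a small userspace analogy in C11, not kernel code: the check and the mark are folded into a single atomic test-and-set, so only one of several racing callers can see the device as free. The struct and function names (demo_dev, demo_open, demo_close) are hypothetical.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a BPF descriptor's "in use" state. */
struct demo_dev {
    atomic_flag in_use;
};

/*
 * Claim the device in one atomic step: test-and-set returns the previous
 * value, so exactly one caller sees "was free" even if several race.
 * A separate "if free ... then mark used" pair has a window between the
 * two steps in which a second caller can also see "free".
 */
static int
demo_open(struct demo_dev *d)
{
    if (atomic_flag_test_and_set(&d->in_use))
        return (EBUSY);     /* somebody already holds it */
    return (0);
}

static void
demo_close(struct demo_dev *d)
{
    atomic_flag_clear(&d->in_use);
}
```

Whether the window in the real bpfopen() can actually be hit depends on what serializes kernel code above the driver, which this sketch does not model.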


regards
maneesh


Re: Losing BPF's
United States
2007-02-20 13:20:13
maneeshs wrote:

> There seems to be a race condition in the above code, the check for the
> descriptor being free and the call to D_GET to mark the descriptor as
> being used is not atomic. So two closely spaced calls to bpfopen could
> cause bpf to use the same device twice ?

Yes, if, in OpenBSD, either

	1) you're on a multi-processor machine

or

	2) the kernel is preemptible

and there's no big lock around the kernel or around opens or around
special-file opens or something else above the BPF open routine.

FreeBSD has a single global BPF mutex (bpf_mtx).
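
As a rough userspace illustration of the single-global-lock approach (this is a pthread analogy and not FreeBSD's actual code; demo_mtx, demo_bpf, demo_bpf_open and demo_bpf_close are made-up names), the check and the mark both happen while one mutex is held:

```c
#include <errno.h>
#include <pthread.h>

/* One lock for all demo devices, in the spirit of a global BPF mutex. */
static pthread_mutex_t demo_mtx = PTHREAD_MUTEX_INITIALIZER;

struct demo_bpf {
    int busy;   /* protected by demo_mtx */
};

static int
demo_bpf_open(struct demo_bpf *d)
{
    int error = 0;

    pthread_mutex_lock(&demo_mtx);
    if (d->busy)
        error = EBUSY;      /* already claimed by another opener */
    else
        d->busy = 1;        /* check and mark under the same lock */
    pthread_mutex_unlock(&demo_mtx);
    return (error);
}

static void
demo_bpf_close(struct demo_bpf *d)
{
    pthread_mutex_lock(&demo_mtx);
    d->busy = 0;
    pthread_mutex_unlock(&demo_mtx);
}
```

A single lock is coarse, since opens of unrelated devices also wait on each other, but it closes the gap between the test and the mark.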