Thread: network i/o errors in 3.1 ?




network i/o errors in 3.1 ?
2007-01-26 02:06:33
Hi,
I upgraded my NetBSD network to 3.1.
Since then I see network input and output errors on my
10MBit/simplex lance ethernet interfaces.
The 100MBit full-duplex hme's are working without input or
output errors.
Unfortunately I also added a 3Com SuperStack 3300 switch,
replacing its predecessor, which stopped working during the
upgrade.

I'm unsure how to interpret this.
What are input/output errors? Yes, my question is that basic.
I have a clear picture of collisions, but I don't know what is
counted by the other counters.
It is said that output errors indicate an interface going bad,
but it is unlikely that all my Suns' le0 devices are going bad
at the same time.

Any ideas, any pointers?

	Thanks AHA
-- 
NetBSD: If you happen to have any problem with your uptime.


Re: network i/o errors in 3.1 ?
2007-01-27 06:22:39
On Fri, Jan 26, 2007 at 09:06:33AM +0100, Andreas_Hallmann
wrote:
> Hi,
> I upgraded my NetBSD network to 3.1.
> Since then I see network input and output errors on my
> 10MBit/simplex lance ethernet interfaces.
> The 100MBit full-duplex hme's are working without input or
> output errors.
> Unfortunately I also added a 3Com SuperStack 3300 switch,
> replacing its predecessor, which stopped working during the
> upgrade.
>
> I'm unsure how to interpret this.
> What are input/output errors? Yes, my question is that basic.
> I have a clear picture of collisions, but I don't know what is
> counted by the other counters.
> It is said that output errors indicate an interface going bad,
> but it is unlikely that all my Suns' le0 devices are going bad
> at the same time.
>
> Any ideas, any pointers?

Reading the sources, input errors can be one of:
- received packet larger than the maximum ethernet frame size
- lack of memory for storing the new packet
- various receive errors signaled by the chip, but a message is
  logged as well

Output errors can be:
- device timeout (but then you get a message in the logs)
- various transmit errors signaled by the chip, including
  excessive collisions; some of them are logged

You can try
options LEDEBUG
in your kernel config file (you must run make clean before
rebuilding) to get more information.

-- 
Manuel Bouyer <bouyerantioche.eu.org>
     NetBSD: 26 years of experience will always make the difference
--

Re: (lance ethernet) network i/o errors in 3.1 ?
2007-02-07 08:22:26
On Sat, Jan 27, 2007 at 01:22:39PM +0100, Manuel Bouyer
wrote:
> On Fri, Jan 26, 2007 at 09:06:33AM +0100, Andreas_Hallmann wrote:
> > Hi,
> > I upgraded my NetBSD network to 3.1.
> > Since then I see network input and output errors on my
> > 10MBit/simplex lance ethernet interfaces.
> > The 100MBit full-duplex hme's are working without input or
> > output errors.
> > Unfortunately I also added a 3Com SuperStack 3300 switch,
> > replacing its predecessor, which stopped working during the
> > upgrade.
> >
> > I'm unsure how to interpret this.
> > What are input/output errors? Yes, my question is that basic.
> > I have a clear picture of collisions, but I don't know what is
> > counted by the other counters.
> > It is said that output errors indicate an interface going bad,
> > but it is unlikely that all my Suns' le0 devices are going bad
> > at the same time.
> >
> > Any ideas, any pointers?
>
> Reading the sources, input errors can be one of:
> - received packet larger than the maximum ethernet frame size
> - lack of memory for storing the new packet
> - various receive errors signaled by the chip, but a message is
>   logged as well
>
> Output errors can be:
> - device timeout (but then you get a message in the logs)
> - various transmit errors signaled by the chip, including
>   excessive collisions; some of them are logged
>
> You can try
> options LEDEBUG
> in your kernel config file (you must run make clean before
> rebuilding) to get more information.

Hi Manuel,
thanks for the advice.
With

	options LEDEBUG

after an uptime of more than 4 days I get, along with many
collisions, lots of

	le0: missed packet

I dug through the sources but am still unsure. Does it mean that
those 8 receive buffers are insufficient?
Can I increase the number of receive buffers for le0, or are
they part of the lance chip?

Moreover, looking a bit deeper into my dmesg, I wonder why my
onboard le0 is bound to sbus0 and my sbus le1 is bound to
ledma0.
Does that mean I have DMA support for le1 but not for le0?
A look at an SS5 shows le0 at ledma0, but this machine also
suffers an equivalent Ierrs and Colls rate.

Any hints?

Thankx AHA

----- netstat -in -----
Name  Mtu   Network       Address              Ipkts Ierrs    Opkts Oerrs  Colls
le0   1500  <Link>        08:00:20:12:b3:f0  4561760    24  6353925     2 393287
le0   1500  192.168.48/24 192.168.48.26      4561760    24  6353925     2 393287
le0   1500  fe80::/64     fe80::a00:20ff:fe  4561760    24  6353925     2 393287
le0   1500  2001:4b88:100 2001:4b88:1008:0:  4561760    24  6353925     2 393287
le0   1500  2001:4b88:100 2001:4b88:1008::1  4561760    24  6353925     2 393287
le1   1500  <Link>        08:00:20:12:b3:f0   862846     0   596264     2    566
le1   1500  192.168.47/24 192.168.47.26       862846     0   596264     2    566
le1   1500  fe80::/64     fe80::a00:20ff:fe   862846     0   596264     2    566
le1   1500  2001:4b88:100 2001:4b88:1008:1:   862846     0   596264     2    566
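
As a rough sketch, the rates implied by these counters can be worked
out as follows. The figures are copied from the <Link> rows of the
netstat output; the dictionary layout and the `rates` helper are
illustrative names only, and "collisions per 100 output packets" is
just one convenient way to normalise the Colls column.

```python
# Counters copied from the <Link> rows of the netstat -in output.
counters = {
    "le0": {"ipkts": 4561760, "ierrs": 24, "opkts": 6353925, "oerrs": 2, "colls": 393287},
    "le1": {"ipkts": 862846, "ierrs": 0, "opkts": 596264, "oerrs": 2, "colls": 566},
}


def rates(c):
    """Return (input error %, output error %, collisions per 100 output packets)."""
    return (
        100.0 * c["ierrs"] / c["ipkts"],
        100.0 * c["oerrs"] / c["opkts"],
        100.0 * c["colls"] / c["opkts"],
    )


for name, c in counters.items():
    ierr, oerr, coll = rates(c)
    print(f"{name}: Ierrs {ierr:.5f}%  Oerrs {oerr:.5f}%  Colls {coll:.2f} per 100 Opkts")
```

On these numbers le0 sees roughly 6 collisions per 100 output packets
while le1 stays below 0.1, and the input and output error counts are
tiny next to the packet counts on both interfaces.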

----- dmesg -----
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005
    The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California.  All rights reserved.

NetBSD 3.1_STABLE (AHASS2_SCSI3) #4: Tue Jan 30 15:10:27 CET 2007
	toorkukalda:/export/work/build.objs/v7/3.0/export/netbsd/netbsd-3-0/src/sys/arch/sparc/compile/AHASS2_SCSI3
total memory = 65432 KB
avail memory = 60504 KB
bootpath: /sbus1,f8000000/esp0,800000/sd3,0
mainbus0 (root): SUNW,Sun 4/75: hostid 554300e2
cpu0 at mainbus0: cache chip bug; trap page uncached: CY7C601  40 MHz, TMS390C602A FPU
cpu0: 64K byte write-through, 32 bytes/line, hw flush: cache enabled
memreg0 at mainbus0 ioaddr 0xf4000000
clock0 at mainbus0 ioaddr 0xf2000000: mk48t02
timer0 at mainbus0 ioaddr 0xf3000000 ipl 10: delay constant 17
auxreg0 at mainbus0 ioaddr 0xf7400003
zs0 at mainbus0 ioaddr 0xf1000000 ipl 12 softpri 6
zstty0 at zs0 channel 0 (console i/o)
zstty1 at zs0 channel 1
zs1 at mainbus0 ioaddr 0xf0000000 ipl 12 softpri 6
kbd0 at zs1 channel 0: baud rate 1200
ms0 at zs1 channel 1: baud rate 1200
audioamd0 at mainbus0 ioaddr 0xf7201000 ipl 13 softpri 4
audio0 at audioamd0: full duplex
sbus0 at mainbus0 ioaddr 0xf8000000: clock = 20 MHz
dma0 at sbus0 slot 0 offset 0x400000: DMA rev 1+
esp0 at sbus0 slot 0 offset 0x800000 level 3: ESP100A, 25MHz, SCSI ID 7
scsibus0 at esp0: 8 targets, 8 luns per target
le0 at sbus0 slot 0 offset 0xc00000 level 4 (ipl 5): address 08:00:20:12:b3:f0
le0: 8 receive buffers, 2 transmit buffers
cgsix at sbus0 slot 1 offset 0x0 level 7 not configured
dma1 at sbus0 slot 3 offset 0x200000: DMA rev 2
esp1 at dma1 slot 3 offset 0x400000 level 3: ESP200, 40MHz, SCSI ID 7
scsibus1 at esp1: 8 targets, 8 luns per target
ledma0 at sbus0 slot 3 offset 0x200010: DMA rev 2
le1 at ledma0 slot 3 offset 0x600000 level 5: address 08:00:20:12:b3:f0
le1: 8 receive buffers, 2 transmit buffers
bpp0 at sbus0 slot 3 offset 0xc00000 level 2: DMA rev 2
fdc0 at mainbus0 ioaddr 0xf7200000 ipl 11 softpri 4: chip 82072
fd0 at fdc0 drive 0: 1.44MB 80 cyl, 2 head, 18 sec
IPsec: Initialized Security Association Processing.
[...]
le0: missed packet
le0: missed packet
le0: missed packet
le0: missed packet
le0: missed packet
le0: missed packet
le0: missed packet
le0: missed packet
le0: missed packet
le0: missed packet
le0: missed packet
le0: missed packet
le0: missed packet
le0: missed packet
le0: missed packet
le0: missed packet
le0: missed packet
le0: missed packet
le1: lost carrier on UTP port, switching to AUI port
le0: lost carrier
le1: lost carrier on AUI port, switching to UTP port
le0: lost carrier
le0: missed packet
le0: missed packet
le0: missed packet
le0: missed packet
le0: missed packet
le0: missed packet
---------------------------------------------------------------------
> 
> -- 
> Manuel Bouyer <bouyerantioche.eu.org>
>      NetBSD: 26 years of experience will always make the difference
> --

-- 
NetBSD: If you happen to have any problem with your uptime.


Re: (lance ethernet) network i/o errors in 3.1 ?
2007-02-07 15:14:38
On Wed, Feb 07, 2007 at 03:22:26PM +0100, Andreas_Hallmann
wrote:
> 
> After an uptime of more than 4 days I get, along with many
> collisions, lots of
>
> 	le0: missed packet
>
> I dug through the sources but am still unsure.
> Does it mean that those 8 receive buffers are insufficient?
> Can I increase the number of receive buffers for le0,
> or are they part of the lance chip?

The maximum number of rx buffers will be 128, the number
must be a power of 2.
I don't know exactly how the netbsd driver does things
though!

However I suspect you aren't getting interrupts served often
enough.
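
A toy sketch of both points, under loudly stated assumptions: the
power-of-2 rule is modelled on the Am7990 convention of storing the
ring length as its base-2 logarithm in a 3-bit field (which would give
the 128 limit), and this has not been checked against the NetBSD le
driver. The `missed_packets` model (one packet per tick, ring emptied
only when the interrupt is finally served) and both function names are
made up for illustration.

```python
def rlen_for(nbuf):
    """Return log2(nbuf) if nbuf is a power of 2 in 1..128, else None."""
    if 1 <= nbuf <= 128 and nbuf & (nbuf - 1) == 0:
        return nbuf.bit_length() - 1
    return None


def missed_packets(nbuf, service_interval, ticks=1000):
    """Toy model: one packet arrives per tick, and the host empties the
    ring only every `service_interval` ticks. A packet that arrives
    while all `nbuf` buffers are full is counted as missed."""
    filled = 0
    missed = 0
    for t in range(1, ticks + 1):
        if filled < nbuf:
            filled += 1          # packet stored in a free buffer
        else:
            missed += 1          # ring full: the chip has nowhere to put it
        if t % service_interval == 0:
            filled = 0           # interrupt served: all buffers handed back
    return missed


for nbuf in (8, 16):
    print(nbuf, rlen_for(nbuf), missed_packets(nbuf, service_interval=12))
```

In this model a ring of 8 starts dropping packets as soon as the
service interval exceeds 8 ticks, while a ring of 16 rides out the
same delay. A larger ring only buys time, though; it does not fix slow
interrupt service.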
 
> Moreover, looking a bit deeper into my dmesg, I wonder why my
> onboard le0 is bound to sbus0 and my sbus le1 is bound to ledma0.
> Does that mean I have DMA support for le1 but not for le0?

No - the AMD Lance chipset always runs in DMA mode. It really expects
to acquire the host bus (from the cpu) in order to access memory.
For the sbus systems the LSI Logic 'DMA' part acts as the normal bus
master for the on-board bus, and relays transactions from the lance
onto the sbus.
In order to do burst transfers on the sbus (and to get enough
bandwidth) the DMA part has (IIRC) two 32 byte buffers into which it
pre-fetches TX data and buffers RX data.  The caching and read-ahead
algorithm is carefully designed to work with the transfers the lance
actually makes.

The DMA chip has interfaces for the lance, scsi, parallel port and the
ROM (I think that is all).  There is one on the motherboard that has
all the onboard devices connected; the sbus ethernet card will have
one of its own with the other devices absent.

There are several different versions of the DMA chip (and I don't know
which sun motherboards have which). The early one requires that
ethernet rx buffers be on a 32 byte boundary (which sucks because they
then need a software misaligned copy).  The DMA2 part handles that ok.
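
One plausible reading of why a 32 byte boundary forces a copy, sketched
below as an assumption rather than something taken from the driver: a
14 byte ethernet header in front of a 32-byte-aligned buffer leaves the
payload 2 bytes off a 4 byte boundary, so it has to be copied to an
aligned location before the network stack can use it. The constant and
function names are illustrative.

```python
ETHER_HDR_LEN = 14   # destination MAC (6) + source MAC (6) + EtherType (2)
DMA_ALIGN = 32       # boundary the early DMA part is said to require


def align_up(addr, boundary=DMA_ALIGN):
    """Round addr up to the next multiple of boundary (a power of 2)."""
    return (addr + boundary - 1) & ~(boundary - 1)


def payload_misalignment(buf_addr, word=4):
    """Byte offset of the payload (just past the header) from a word boundary."""
    return (buf_addr + ETHER_HDR_LEN) % word


buf = align_up(0x1001)                       # hypothetical buffer address
print(hex(buf), payload_misalignment(buf))   # aligned buffer, misaligned payload
print(payload_misalignment(buf + 2))         # a +2 offset would fix the payload
```

Starting the buffer 2 bytes past the boundary would word-align the
payload, which is exactly what a strict 32 byte requirement rules out.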

The other problem is one of sbus latency and priorities.  Somewhere
there ought to be a register to set the priorities of the sbus slots'
master accesses.  Usually the motherboard port is high priority, and
the others low priority.  Under heavy load (and probably requiring 2
dual cpu modules) the sbus devices can get starved of accesses to main
memory.  This causes the lance to time out its bus transfers - which
exposes some silicon bugs in the nmos lance itself :-(
However you aren't seeing those because the system tends to lock solid
on the next access to the lance!
(And I don't remember seeing any of the required code in the netbsd le
or ledma driver.)

	David

-- 
David Laight: davidl8s.co.uk
