Thread: Lucent GBE (4 x VC4) clues needed




Lucent GBE (4 x VC4) clues needed
user name
2006-09-21 13:12:17
(oops technical question in nanog, wearing my asbestos suit)

Consider this topology

GSR - 3750 --(GE over 4xVC4) - NSE100 - NSE100 --(GE over 4xVC4) -- 3550 - GSR

All other fibres are dark fibres, except marked.

When we ping across either NSE100 <-> GSR leg with no background
traffic, there is no packet loss. With even a few Mbps of background
traffic, let's say 10 Mbps, we get 1-5% packet loss on 1500-byte
packets and a bit less on small packets. As background traffic
increases, packet loss increases quickly.

We tried replacing (GSR-3750) with a 7600, but the same issue persisted.

We've measured both Lucent GBE legs by placing a loop at the far end
and pushing tests from EXFO and Smartbits gear through the loop;
no errors are detected in the RFC tests.

There isn't very much that can be configured in the Lucent, and we've
tried pretty much every setting. We've tried setting autonegotiation
on and off on every device in the path, without any change in the
observed behaviour. We've also tried using 1xVC4, again without any
change. All VC4s in a given leg use the same path.

Even though we test packet loss by pinging from router link to router
link, transit traffic sees the same loss. We've tried turning PXF off
in the NSE100. Packets between NSE100 <-> NSE100 over dark fibre are
not lost.

We're pretty much utterly without clues. All I can think of is some
obscure IFG issue: the NSE100 would have less than perfect IFG timing,
which would confuse the Lucent about which bits belong to which frame.
Does stuff like this really happen?

The NSE100 drops bad IP packets in PXF and there is only a shared
counter, so I can't tell whether I'm getting CRC errors for IP; I just
lose the packets. But IS-IS is not handled in PXF, and I get
%CLNS-4-LSPCKSUM and %CLNS-3-BADPACKET messages over both Lucent legs,
but not between the NSE100s. So I assume the packets are not dropped,
but broken.


I swear next time I'll complain about some political issue,
thanks,
-- 
  ++ytti
Lucent GBE (4 x VC4) clues needed
user name
2006-09-21 13:32:03
> -----Original Message-----
> From: owner-nanogmerit.edu [mailto:owner-nanogmerit.edu] On 
> Behalf Of Saku Ytti
> Sent: Thursday, September 21, 2006 9:12 AM
> To: nanogmerit.edu
> Subject: Lucent GBE (4 x VC4) clues needed
> 
> 
> (oops technical question in nanog, wearing my asbestos suit)
>
> Consider this topology
>
> GSR - 3750 --(GE over 4xVC4) - NSE100 - NSE100 --(GE over
> 4xVC4) -- 3550 - GSR
>
> All other fibres are dark fibres, except marked.
>
> When we ping either NSE100 <-> GSR leg, when there is no
> background traffic there is no packet loss. If there is even
> few Mbps, lets say 10Mbps of background traffic we get 1-5%
> packet loss on 1500 bytes, and bit less packet loss on small
> packets. As background traffic increases packet loss quickly
> increases.
>
> We tried to replace (GSR-3750) with 7600, but same issue persisted.
>
> We've measured both Lucent GBE legs with having loop in other
> end and pushing tests from EXFO and Smartbits gear through
> the loop, no errors can be detected in RFC tests.
>
> There isn't very much that can be configured in the Lucent,
> and we've tried pretty much every setting. We've tried to set
> autonego on and off in every gear in the path, without any
> changes to observed behaviour. We've also tried to use
> 1xVC4, without any changes to the behaviour. All VC4's in
> given leg are using same path.
>  Even though we test the packet loss pinging from router link
> to router link, same packet loss is experienced for transit
> traffic also. We've tried to turn PXF off in NSE100. Packets
> between NSE100 <-> NSE100 over dark fibre are not lost.
>
> We're pretty much utterly without clues. All I can think of
> is some obscure IFG issue, that is, NSE100 would have less
> than perfect timing for IFG which would confuse Lucent
> regarding what is part of which frame. Does stuff like this
> really happen?
>
> NSE100 drops bad IP packets in PXF and there is only shared
> counter, so I can't tell if I get CRC for IP, I just lose
> the packets. But IS-IS is not handled in PXF, and I get
> %CLNS-4-LSPCKSUM and %CLNS-3-BADPACKET messages over both
> Lucent legs, but not between the NSE100's.
> So I assume the packets are not dropped, but broken.
>
>
> I swear next time I'll complain about some political issue, thanks,
> --
>   ++ytti
> 

Silly question (considering that you stated that IS-IS, which is not
handled by PXF, is borked also), but did you try disabling PXF?

There's a reason why Cisco discontinued every product that "features"
it.  It's broken.
Lucent GBE (4 x VC4) clues needed
user name
2006-09-21 13:36:42
On (2006-09-21 06:32 -0700), David Temkin wrote:

> > traffic also. We've tried to turn PXF off in NSE100. Packets

> Silly question (considering that you stated that IS-IS is borked also,
> which is not handled by PXF - but did you try disabling PXF?

Not a silly question at all; it was just a longer mail than many
people care to read (including me).

> There's a reason why Cisco discontinued every product that "features"
> it.  It's broken.

It's not broken; PXF is just Cisco's name for an NPU. Two PXFs don't
necessarily have anything in common apart from being NPUs. In essence,
the CRS-1 uses NPUs afaik; of course Cisco doesn't call them PXF, due
to bad publicity. A cooler word for NPU-style design is probably
cell processor, which makes me feel warm already about my NSE100s.
Yes, you can design a broken NPU; the NSE-1 was a good example of that.

Thanks,
-- 
  ++ytti
Lucent GBE (4 x VC4) clues needed
user name
2006-09-21 17:49:21
Saku Ytti wrote:
> (oops technical question in nanog, wearing my asbestos suit)
>
> Consider this topology
>
> GSR - 3750 --(GE over 4xVC4) - NSE100 - NSE100 --(GE over 4xVC4) -- 3550 - GSR
>
> All other fibres are dark fibres, except marked.
>
> When we ping either NSE100 <-> GSR leg, when there is no background traffic
> there is no packet loss. If there is even few Mbps, lets say 10Mbps of
> background traffic we get 1-5% packet loss on 1500 bytes, and bit
> less packet loss on small packets. As background traffic increases
> packet loss quickly increases.

[SNIP]

> There isn't very much that can be configured in the Lucent, and we've
> tried pretty much every setting. We've tried to set autonego on
> and off in every gear in the path, without any changes to observed
> behaviour.

Did you try power cycling the Lucents after changing the auto-neg
settings? I've seen some broken autoneg implementations in the past on
managed media converters that didn't change settings immediately. It's
worth a shot as you seem to be all out of other ideas ;)

Sam
Lucent GBE (4 x VC4) clues needed
user name
2006-09-21 18:00:14
On (2006-09-21 18:49 +0100), Sam Stickland wrote:
 
> Did you try power cycling the Lucents after changing the auto-neg
> settings? I've seen some broken autoneg implementations in the past on
> managed media converters that didn't change settings immediately. It's
> worth a shot as you seem to be all out of other ideas ;)

I brought the adjacent ports on the IP gear down and up. We could
verify from the Lucent's management interface that autonegotiation
wasn't performed after the down/up, whereas before the down/up
autonegotiation was marked as done even though we had configured the
Cisco interfaces as 'force-up'. So clearly it needed to see a link
down/up.
We didn't power cycle the Lucent, as it would mean bringing down tens
of 10G waves. But pulling and reseating the GBE module would have been
an option (three countries are involved, so a bit inconvenient, but
possible). Country A - Country B is one Lucent leg.
Country B - Country C is the other Lucent leg.

Anyhow, thanks for the thoughts; any help I can get is much
appreciated. Of course we have full support agreements with both
vendors, which we will probably have to try sooner or later, but
it'll be a long battle over whose problem it really is.

-- 
  ++ytti
Lucent GBE (4 x VC4) clues needed
user name
2006-10-21 11:03:12
On (2006-09-21 16:12 +0300), Saku Ytti wrote:

> (oops technical question in nanog, wearing my asbestos suit)
>
> Consider this topology
>
> GSR - 3750 --(GE over 4xVC4) - NSE100 - NSE100 --(GE over 4xVC4) -- 3550 - GSR

This should have been Nortel GBE, not Lucent, my bad.

Anyhow, for the sake of the archive I wanted to report that it's the
Nortel 4xVC4 that corrupts packets. It mostly seems to corrupt the
source MAC, and always the same bits: any L2 device will learn mostly
the same MAC with a few different vendor codes. We can also see this
in Wireshark on a fibre splitter. (It's not limited strictly to the
source MAC, but it's not random by any means.)
It's not broken hardware (unless by design), as it can be seen on both
of the production legs and we've recreated the same problem in the lab.
Most likely a software issue in the Nortel.
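
A minimal sketch of how the "always same bits" pattern could be
tallied from captured source MACs, assuming you know the address the
sender really uses. The addresses below are made up for illustration,
and flipped_bits/tally are hypothetical helper names, not part of any
capture tool.

```python
from collections import Counter


def mac_to_int(mac: str) -> int:
    """Parse a MAC written with ':', '-' or '.' separators into a 48-bit integer."""
    return int(mac.replace(":", "").replace("-", "").replace(".", ""), 16)


def flipped_bits(expected: str, observed: str) -> list:
    """Bit positions that differ (bit 0 = least significant bit of the last octet)."""
    diff = mac_to_int(expected) ^ mac_to_int(observed)
    return [bit for bit in range(48) if (diff >> bit) & 1]


def tally(expected: str, observed_macs) -> Counter:
    """Count how often each bit position is flipped across all observations."""
    counts = Counter()
    for mac in observed_macs:
        counts.update(flipped_bits(expected, mac))
    return counts


# Hypothetical data: the real sender address and two corrupted copies of it.
real = "00:11:22:33:44:55"
seen = ["00:15:22:33:44:55", "00:55:22:33:44:55"]
for bit, hits in tally(real, seen).most_common():
    print(f"bit {bit}: flipped in {hits} of {len(seen)} frames")
```

If the corruption is systematic, a handful of bit positions should
dominate the tally; random line errors would spread across all 48.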

-- 
  ++ytti
Lucent GBE (4 x VC4) clues needed
user name
2006-11-15 10:25:07
> > Consider this topology
> > 
> > GSR - 3750 --(GE over 4xVC4) - NSE100 - NSE100 --(GE over 4xVC4) -- 3550 - GSR
> 
> This should have been Nortel GBE, not Lucent my bad.

My first guess was right; it was the Lucent system after all.

We've now solved the issue: the problem is in the Lucent GBE card at
hardware revision S1:7, which is broken by design. S1:3 and S1:6
work, and we should be able to test S1:8 soon, but we expect
it to work also.

The symptoms were that it flipped bits (not randomly, though we
couldn't figure out why certain positions saw bit flips) and then
calculated a new, correct CRC for the Ethernet frame after it had
flipped the bit.
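
To illustrate why this kind of fault passes link-level tests but still
trips upper-layer checks such as the IS-IS LSP checksum, here is a
small sketch. It assumes the Ethernet FCS is the usual CRC-32 (the
same one zlib computes) and uses an RFC 1071 style ones' complement
sum as a stand-in for the upper-layer checksum; IS-IS itself uses a
Fletcher checksum. The frame contents and helper names are made up.

```python
import struct
import zlib

HDR = 14  # destination MAC + source MAC + EtherType


def fcs(frame: bytes) -> bytes:
    """Ethernet FCS: CRC-32 of the frame, least significant byte first."""
    return struct.pack("<I", zlib.crc32(frame) & 0xFFFFFFFF)


def fcs_ok(wire: bytes) -> bool:
    """Check the trailing 4-byte FCS against the rest of the frame."""
    return fcs(wire[:-4]) == wire[-4:]


def inet_checksum(data: bytes) -> int:
    """RFC 1071 ones' complement sum (stand-in for an upper-layer checksum)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def upper_ok(wire: bytes) -> bool:
    """Payload is followed by its 2-byte checksum, then the 4-byte FCS."""
    stored = struct.unpack("!H", wire[-6:-4])[0]
    return inet_checksum(wire[HDR:-6]) == stored


def flip_bit(data: bytes, bit: int) -> bytes:
    buf = bytearray(data)
    buf[bit // 8] ^= 1 << (bit % 8)
    return bytes(buf)


# Hypothetical frame: made-up addresses, EtherType and payload.
header = bytes.fromhex("0180c2000014" "001122334455" "0800")
payload = b"hypothetical LSP contents " * 4
frame = header + payload + struct.pack("!H", inet_checksum(payload))
wire = frame + fcs(frame)

in_flight = flip_bit(wire, HDR * 8 + 3)   # bit error on the line, FCS untouched
damaged = flip_bit(frame, HDR * 8 + 3)    # device flips a bit...
rewritten = damaged + fcs(damaged)        # ...then regenerates the FCS

for name, w in [("clean", wire), ("in flight", in_flight), ("rewritten", rewritten)]:
    print(f"{name:10} fcs_ok={fcs_ok(w)} upper_ok={upper_ok(w)}")
```

The in-flight flip fails the FCS check and would show up as a CRC
error on the receiving port. The rewritten frame passes the FCS check,
so no interface counter moves; only the upper-layer checksum notices.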

-- 
  ++ytti