Thread: MTU changes affecting BGP sessions




MTU changes affecting BGP sessions
user name
2006-07-26 14:14:43
Hi everybody,

I'd like to know if anyone has seen the following issue.

I've got an M320 (Junos 7.5R2.8) with a few BGP peerings. These peerings
run on VLANs over an L2 infrastructure built on Ethernet switches (Cisco
and Foundry). Some of these switches are configured to support jumbo
frames, around 9000 bytes; some are not yet.

Then I changed the physical MTU on the M320 GigE interface to 9000 bytes.
Most peerings just flapped and came up fine, but one started to flap
continuously. It stays up for about 90 seconds (BGP hold time?) and goes
down again.

I've tried the "mtu-discovery" BGP command on my side, but it didn't help.
No clue about the other side so far.

Did anyone see this kind of thing? Any hints about a solution?

Another interesting fact: there is another BGP peer (multihop 2) in this
same problematic VLAN, and it works fine after the MTU change. But I'm
almost sure it's a Juniper, so no interop issues ;)

Raniery Pontes

-----------------------------------------

Log example:

Jul 26 10:57:30  jm320_sp rpd[3001]: bgp_event: peer xxx (External AS xxx) old state OpenConfirm event RecvKeepAlive new state Established

Jul 26 10:59:00  jm320_sp rpd[3001]: bgp_traffic_timeout: NOTIFICATION sent to xxx (External AS xxx): code 4 (Hold Timer Expired Error), Reason: holdtime expired for xxx (External AS xxx), socket buffer sndcc: 0 rcvcc: 0 TCP state: 4, snd_una: 3621665251 snd_nxt: 3621665251 snd_wnd: 16200 rcv_nxt: 3247700880 rcv_adv: 3247759208, keepalive timer 0

Jul 26 10:59:00  jm320_sp rpd[3001]: bgp_event: peer xxx (External AS xxx) old state Established event HoldTime new state Idle
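
A quick check of the timing in the log above, as a sketch only: the two timestamps are copied from the log, the year is supplied from the message date because syslog omits it, and the 90-second figure is the Junos default BGP hold time (the value actually negotiated with this peer is not shown in the thread).

```python
from datetime import datetime

# Timestamps copied from the Established and Hold Timer Expired log lines.
# The year (2006) is an assumption taken from the message date.
established = datetime.strptime("2006 Jul 26 10:57:30", "%Y %b %d %H:%M:%S")
hold_expired = datetime.strptime("2006 Jul 26 10:59:00", "%Y %b %d %H:%M:%S")

uptime_seconds = int((hold_expired - established).total_seconds())

# Junos default BGP hold time; the negotiated value for this peer is unknown.
DEFAULT_JUNOS_HOLD_TIME = 90

print(uptime_seconds)
print(uptime_seconds == DEFAULT_JUNOS_HOLD_TIME)
```

If the session goes Idle exactly one hold time after reaching Established, that would be consistent with nothing from the peer being received once the session is up.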


_______________________________________________
juniper-nsp mailing list juniper-nsppuck.nether.net

https://puck.nether.net/mailman/listinfo/juniper-nsp
MTU changes affecting BGP sessions
user name
2006-07-26 14:56:17
Sounds like the problem session is exceeding some path MTU, resulting in
discards (and session loss) when large route tables are transferred. IIRC
the BGP mtu-discovery option should result in packets with the DF bit set,
but due to a bug this behavior only occurred on the initiator side of the
connection; the receiver would use its interface MTU and not set the DF
bit, which prevented accurate PMTU discovery and resulted in fragmented
packets, which in this case were tossed by a FW filter. I imagine an L2
switch would just chuck the jumbo frames with similar results, but I'm not
sure how PMTU discovery is supposed to work without explicit ICMP
"fragmentation required" error messages... Anyway, this problem is
described in PR 67373.

Also, I'm curious whether you are using "system internet-options
path-mtu-discovery"? This knob is used in conjunction with the BGP
mtu-discovery option.


I would confirm the MSS negotiated for the problem session and go from
there:

show system connections extensive | find <bgp-problem-session-address> | match mss
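
To interpret the MSS value that command reports, a rough sketch of the usual IPv4 arithmetic may help. It assumes 20-byte IP and TCP headers with no options, and it treats its input as the IP MTU rather than the physical interface MTU; the function names are made up for illustration.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def expected_mss(ip_mtu: int) -> int:
    """MSS a host would normally derive from a given IP MTU."""
    return ip_mtu - IPV4_HEADER - TCP_HEADER


def implied_ip_mtu(mss: int) -> int:
    """IP MTU implied by an observed MSS, under the same assumptions."""
    return mss + IPV4_HEADER + TCP_HEADER


print(expected_mss(1500))    # standard Ethernet payload
print(expected_mss(9000))    # jumbo IP MTU
print(implied_ip_mtu(1460))  # what an observed MSS of 1460 points to
```

An MSS near 8960 on the problem session would suggest both ends believe the path carries 9000-byte IP packets, which only holds if every switch in between does too.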

A fix for PR 67373 went into 7.5R3. Any chance you can upgrade to see if
that resolves it?

FYI: the PR indicates that a learned MSS can persist for longer than 5
minutes, and that if you revert the interface MTU, the problem session may
need to be deactivated for more than 5 minutes before it will come up
correctly. This delay allows the previous MSS value to age out so that the
new interface MTU value is again used to set the MSS.


HTH



MTU changes affecting BGP sessions
user name
2006-07-26 14:53:36
If your router is trying to send frames larger than the actual MTU
supported by the infrastructure or the remote side, the frames will simply
be dropped. It seems likely your initial update packets are being dropped
and never received by the remote side. This would cause your session to
time out. The fact that snd_nxt=snd_una in the log message adds credence
to this theory, since I believe that usually indicates you are trying to
retransmit unacknowledged frames.
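
A small sketch of why such a session can reach Established and still time out: small packets fit through every hop, while full-size segments do not. The hop MTUs below are hypothetical (the thread does not say which switch lacks jumbo support), and the 8960-byte payload assumes an MSS derived from a 9000-byte IP MTU.

```python
IP_TCP_HEADERS = 40   # 20-byte IPv4 + 20-byte TCP headers, no options
BGP_KEEPALIVE = 19    # a KEEPALIVE is just the fixed BGP header (RFC 4271)
JUMBO_MSS = 8960      # assumed MSS for a 9000-byte IP MTU


def delivered(ip_packet_size: int, hop_mtus: list[int]) -> bool:
    """An L2 switch silently drops anything larger than it supports."""
    return all(ip_packet_size <= mtu for mtu in hop_mtus)


# Hypothetical path: one switch in the middle without jumbo frames enabled.
hop_mtus = [9000, 1500, 9000]

keepalive_packet = IP_TCP_HEADERS + BGP_KEEPALIVE
full_segment_packet = IP_TCP_HEADERS + JUMBO_MSS

print(keepalive_packet, delivered(keepalive_packet, hop_mtus))
print(full_segment_packet, delivered(full_segment_packet, hop_mtus))
```

Because the drop happens at layer 2, no ICMP error comes back to the sender, so TCP keeps retransmitting the same oversized segment until the hold timer expires.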

-Jon


MTU changes affecting BGP sessions
user name
2006-07-26 16:45:02
So far, I've got three things to investigate:

- "system internet-options path-mtu-discovery" (I wasn't using it)
- PR 67373 (and maybe a JUNOS upgrade)
- Adjust jumbo frames in every switch in the path

Much better than 4 hours ago ...

Thanks a lot everyone.

Raniery



Harry Reynolds wrote:
> Sounds like the problem session is exceeding some path MTU, resulting in
> discards (and session loss) when large route tables are transferred. IIRC
> the BGP mtu-discovery option should result in packets with the DF bit set,
> but due to a bug this behavior only occurred on the initiator side of the
> connection; the receiver would use its interface MTU and not set the DF
> bit, which prevented accurate PMTU discovery and resulted in fragmented
> packets, which in this case were tossed by a FW filter. I imagine an L2
> switch would just chuck the jumbo frames with similar results, but I'm not
> sure how PMTU discovery is supposed to work without explicit ICMP
> "fragmentation required" error messages... Anyway, this problem is
> described in PR 67373.
>
> Also, I'm curious whether you are using "system internet-options
> path-mtu-discovery"? This knob is used in conjunction with the BGP
> mtu-discovery option.
>
> I would confirm the MSS negotiated for the problem session and go from
> there:
>
> show system connections extensive | find <bgp-problem-session-address> | match mss
>
> A fix for PR 67373 went into 7.5R3. Any chance you can upgrade to see if
> that resolves it?
>
> FYI: the PR indicates that a learned MSS can persist for longer than 5
> minutes, and that if you revert the interface MTU, the problem session may
> need to be deactivated for more than 5 minutes before it will come up
> correctly. This delay allows the previous MSS value to age out so that the
> new interface MTU value is again used to set the MSS.
>
> HTH


MTU changes affecting BGP sessions
user name
2006-07-26 17:26:04
> Date: Wed, 26 Jul 2006 13:45:02 -0300
> From: Raniery Pontes <ranieryrnp.br>
> Sender: juniper-nsp-bouncespuck.nether.net
>
> So far, I've got three things to investigate:
>
> - "system internet-options path-mtu-discovery" (I wasn't using it)
> - PR 67373 (and maybe a JUNOS upgrade)
> - Adjust jumbo frames in every switch in the path
>
> Much better than 4 hours ago ...

I believe you stated that the interface MTU was set to 9000. It is
possibly worth noting that the IP MTU should really be 9000 and the
interface MTU should be at least 9022 to comply with the jumbo frame size
used on most international R&E networks:

mtu 9022;
family inet {
    mtu 9000;
    address xxx.yyy.zzz.aaa;
}
family inet6 {
    mtu 9000;
    address 2001:tttt::1/64;
}

If MPLS will be used, an even larger interface MTU may be needed. We set
our interface MTU to 9192, which is the maximum.
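
The arithmetic behind those numbers can be sketched as follows. The breakdown of the 22 bytes (Ethernet header, one 802.1Q tag, FCS) is an assumption that happens to total 22; the thread does not say which overheads are being counted. The 4 bytes per MPLS label is standard, and the function name is illustrative only.

```python
ETHERNET_HEADER = 14  # destination MAC + source MAC + EtherType
DOT1Q_TAG = 4         # one 802.1Q VLAN tag
FCS = 4               # frame check sequence (assumed to be counted here)
MPLS_LABEL = 4        # bytes per MPLS label stack entry


def required_interface_mtu(ip_mtu: int, vlan_tags: int = 1,
                           mpls_labels: int = 0) -> int:
    """Interface MTU needed to carry ip_mtu bytes of IP, per the assumptions above."""
    return (ip_mtu + ETHERNET_HEADER + FCS
            + vlan_tags * DOT1Q_TAG + mpls_labels * MPLS_LABEL)


print(required_interface_mtu(9000))                 # plain tagged IP
print(required_interface_mtu(9000, mpls_labels=2))  # with a two-label stack

# Label headroom left when the interface MTU is set to 9192.
headroom_labels = (9192 - required_interface_mtu(9000)) // MPLS_LABEL
print(headroom_labels)
```

Setting the interface MTU well above the IP MTU, as with 9192, leaves room for label stacks without shrinking the 9000-byte IP MTU that peers rely on.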
-- 
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: obermanes.net    Phone: +1 510 486-8634
Key fingerprint: 059B 2DDF 031C 9BA3 14A4  EADA 927D EBB3 987B 3751