Thread: Re: latency (was: RE: cooling door)




Re: latency (was: RE: cooling door)
Sweden
2008-03-30 03:17:19
On Sat, 29 Mar 2008, Frank Coluccio wrote:

> Understandably, some applications fall into a class that requires
> very-short distances for the reasons you cite, although I'm still not
> comfortable with the setup you've outlined. Why, for example, are you
> showing two Ethernet switches for the fiber option (which would
> naturally double the switch-induced latency), but only a single switch
> for the UTP option?

Yes, I am showing a case where you have switches in each rack so each
rack is uplinked with a fiber to a central aggregation switch, as
opposed to having a lot of UTP from the rack directly into the
aggregation switch.

> Now, I'm comfortable in ceding this point. I should have made
> allowances for this type of exception in my introductory post, but
> didn't, as I also omitted mention of other considerations for the sake
> of brevity. For what it's worth, propagation over copper is faster than
> propagation over fiber, as copper has a higher nominal velocity of
> propagation (NVP) rating than does fiber, but not by enough to cause
> the difference you've cited.

The 2/3 speed of light in fiber as opposed to propagation speed in
copper was not in my mind.

> As an aside, the manner in which o-e-o and e-o-e conversions take
> place when transitioning from electronic to optical states, and back,
> affects latency differently across differing link assembly approaches
> used. In cases where 10Gbps

My opinion is that the major factors of added end-to-end latency in my
example are that the packet has to be serialised three times as opposed
to once and that there are three lookups instead of one. Lookups take
time, and putting the packet on the wire takes time.
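
A rough sketch of that arithmetic, for illustration only. The link speed
and the per-lookup time below are assumed placeholder values, not
figures from this thread, and the function names are hypothetical.

```python
def serialization_delay_s(frame_bytes: int, link_bps: float) -> float:
    """Time needed to clock one frame onto the wire."""
    return frame_bytes * 8 / link_bps


def added_latency_s(hops: int, frame_bytes: int, link_bps: float,
                    lookup_s: float) -> float:
    """Store-and-forward model: each switch hop re-serialises the frame
    and performs one forwarding lookup."""
    return hops * (serialization_delay_s(frame_bytes, link_bps) + lookup_s)


FRAME_BYTES = 1500
LINK_BPS = 1e9     # assumption: Gigabit Ethernet on every link
LOOKUP_S = 5e-6    # assumption: placeholder lookup time, not measured

one_switch = added_latency_s(1, FRAME_BYTES, LINK_BPS, LOOKUP_S)
three_switches = added_latency_s(3, FRAME_BYTES, LINK_BPS, LOOKUP_S)

print(f"1 switch:   {one_switch * 1e6:.1f} us")
print(f"3 switches: {three_switches * 1e6:.1f} us")
```

Under this simple model the added latency scales linearly with the
number of switch hops, whatever the actual lookup time turns out to be.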

Back in the 10 megabit/s days, there were switches that did cut-through,
i.e. if the output port was not being used the instant the packet came
in, it could start to send out the packet on the outgoing port before it
was completely taken in on the incoming port (when the header was
received, the forwarding decision was taken and the equipment would
start to send the packet out before it was completely received from the
input port).
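
To make the difference concrete, here is a minimal comparison of how
long a switch must wait before it can start forwarding. The 14-byte
header length (destination MAC, source MAC, EtherType) is an assumption;
real cut-through devices may wait for more bytes than that.

```python
def wait_before_forwarding_s(bytes_needed: int, link_bps: float) -> float:
    """Time for the given number of bytes to arrive on the input port."""
    return bytes_needed * 8 / link_bps


LINK_BPS = 10e6      # 10 megabit/s Ethernet
FRAME_BYTES = 1500
HEADER_BYTES = 14    # assumption: forwarding decision needs only this much

# Store-and-forward waits for the whole frame; cut-through only for the header.
store_and_forward = wait_before_forwarding_s(FRAME_BYTES, LINK_BPS)
cut_through = wait_before_forwarding_s(HEADER_BYTES, LINK_BPS)

print(f"store-and-forward: {store_and_forward * 1e6:.1f} us")
print(f"cut-through:       {cut_through * 1e6:.1f} us")
```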

> By chance, is the "deserialization" you cited earlier, perhaps related
> to this inverse muxing process? If so, then that would explain the
> disconnect, and if it is so, then one shouldn't despair, because there
> is a direct path to avoiding this.

No, it's the store-and-forward architecture used in all modern equipment
(that I know of). A packet has to be completely taken in over the wire
into a buffer, a lookup has to be done as to where this packet should be
put out, it needs to be sent over a bus or fabric, and then it has to be
clocked out on the outgoing port from another buffer. This adds latency
in each switch hop on the way.

As Adrian Chadd mentioned in the email sent after yours, this can of
course be handled by modifying or creating new protocols that handle
this fact. It's just that with what is available today, this is a
problem. Each directory listing or file access takes a bit longer over
NFS with added latency, and this reduces performance in current
protocols.

Programmers who do client/server applications are starting to notice
this and I know of companies that put latency-inducing applications in
the development servers so that the programmer is exposed to the same
conditions in the development environment as in the real world. This
means for some that they have to write more advanced SQL queries to get
everything done in a single query instead of asking multiple and
changing the queries depending on what the first query result was.
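
A back-of-the-envelope model of why this matters. All numbers here
(round-trip times, per-query server time, query counts) are assumptions
chosen only to illustrate the shape of the problem.

```python
def total_time_s(round_trips: int, rtt_s: float,
                 server_s_per_query: float) -> float:
    """Each dependent query costs one network round trip plus server time."""
    return round_trips * (rtt_s + server_s_per_query)


LAN_RTT_S = 0.0002   # assumption: 0.2 ms on the developer's own switch
WAN_RTT_S = 0.020    # assumption: 20 ms to the production site

# Chatty design: 20 small dependent queries of 1 ms server time each.
chatty_lan = total_time_s(20, LAN_RTT_S, 0.001)
chatty_wan = total_time_s(20, WAN_RTT_S, 0.001)

# Batched design: one larger query, assumed to cost 5 ms on the server.
batched_wan = total_time_s(1, WAN_RTT_S, 0.005)

print(f"chatty on LAN:  {chatty_lan * 1e3:.1f} ms")
print(f"chatty on WAN:  {chatty_wan * 1e3:.1f} ms")
print(f"batched on WAN: {batched_wan * 1e3:.1f} ms")
```

With these assumed values the chatty design looks fine on the
developer's LAN and only degrades once real round-trip times are
involved, which is the effect the latency-inducing tools are meant to
expose.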

Also, protocols such as SMB and NFS that use message blocks over TCP
have to be abandoned and replaced with real streaming protocols and
large window sizes. Xmodem wasn't a good idea back then, it's not a good
idea now (even though the blocks now are larger than the 128 bytes of
20-30 years ago).
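
The underlying limit can be sketched as follows: if only one block (or
one window) may be outstanding per round trip, throughput cannot exceed
that amount of data divided by the round-trip time. This simplified
model ignores serialization time and protocol overhead, and the RTT,
link speed and larger window sizes are assumptions.

```python
def throughput_ceiling_bps(outstanding_bytes: int, rtt_s: float,
                           link_bps: float) -> float:
    """Upper bound when at most outstanding_bytes may be unacknowledged."""
    return min(link_bps, outstanding_bytes * 8 / rtt_s)


RTT_S = 0.050     # assumption: 50 ms round trip
LINK_BPS = 1e9    # assumption: 1 Gbit/s path

for label, size in [("128-byte block (Xmodem-style)", 128),
                    ("64 KiB window", 64 * 1024),
                    ("4 MiB window", 4 * 1024 * 1024)]:
    bps = throughput_ceiling_bps(size, RTT_S, LINK_BPS)
    print(f"{label:32s} {bps / 1e6:10.3f} Mbit/s")
```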

-- 
Mikael Abrahamsson    email: swmikeswm.pp.se

Re: latency (was: RE: cooling door)
United States
2008-03-30 09:34:31
swmikeswm.pp.se (Mikael Abrahamsson) writes:

> ...
> Back in the 10 megabit/s days, there were switches that did
> cut-through, ie if the output port was not being used the instant the
> packet came in, it could start to send out the packet on the outgoing
> port before it was completely taken in on the incoming port (when the
> header was received, the forwarding decision was taken and the
> equipment would start to send the packet out before it was completely
> received from the input port).

had packet sizes scaled with LAN transmission speed, i would agree.  but
the serialization time for 1500 bytes at 10MBit was ~1.2ms, and went
down by a factor of 10 for FastE (~120us), another factor of 10 for GigE
(~12us) and another factor of 10 for 10GE (~1.2us).  even those of us
using jumbograms are getting less serialization delay at 10GE (~7us)
than we used to get on a DEC LANbridge 100 which did cutthrough after
the header (~28us).
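
These figures can be reproduced with a few lines of arithmetic. The
9000-byte jumbogram size is an assumption (the message does not state
one), and the ~28us LANbridge figure is taken from the text above rather
than computed.

```python
LINKS_BPS = {
    "10 Mbit/s": 10e6,
    "FastE": 100e6,
    "GigE": 1e9,
    "10GE": 10e9,
}


def serialization_delay_us(frame_bytes: int, link_bps: float) -> float:
    """Microseconds needed to clock frame_bytes onto a link."""
    return frame_bytes * 8 / link_bps * 1e6


for name, bps in LINKS_BPS.items():
    print(f"1500 bytes at {name:10s} {serialization_delay_us(1500, bps):8.1f} us")

JUMBO_BYTES = 9000   # assumption: a common jumbo frame size
jumbo_10ge_us = serialization_delay_us(JUMBO_BYTES, LINKS_BPS["10GE"])
print(f"{JUMBO_BYTES} bytes at 10GE {jumbo_10ge_us:8.1f} us")
```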

> ..., it's the store-and-forward architecture used in all modern
> equipment (that I know of). A packet has to be completely taken in
> over the wire into a buffer, a lookup has to be done as to where this
> packet should be put out, it needs to be sent over a bus or fabric,
> and then it has to be clocked out on the outgoing port from another
> buffer. This adds latency in each switch hop on the way.

you may be right about the TCAM lookup times having an impact, i don't
know if they've kept pace with transmission speed either.  but someone's
theory here yesterday was that software (kernel and IP stack)
architecture is more likely to be at fault.  there are still plenty of
cases of "queue it here, it'll go out next time the device or timer
interrupt handler fires" and this can be in the ~1ms or even ~10ms
range.  this doesn't show up on file transfer benchmarks since packet
trains usually do well, but miss an ACK, or send a ping, and you'll see
a shelf.

> As Adrian Chadd mentioned in the email sent after yours, this can of
> course be handled by modifying or creating new protocols that handle
> this fact. It's just that with what is available today, this is a
> problem. Each directory listing or file access takes a bit longer over
> NFS with added latency, and this reduces performance in current
> protocols.

here again it's not just the protocols, it's the application design,
that has to be modernized.  i've written plenty of code that tries to
cut down the number of bytes of RAM that get copied or searched, which
ends up not going faster on modern CPUs (or sometimes going slower)
because of the minimum transfer size between L2 and DRAM.  similarly, a
program that sped up on a VAX 780 when i taught it to match the size
domain of its disk I/O to the 512-byte size of a disk sector, either
fails to go faster on modern high-bandwidth I/O and log structured file
systems, or actually goes slower.

in other words you don't need NFS/SMB, or E-O-E, or the WAN, to erode
what used to be performance gains through efficiency.  there's plenty
enough new latency (expressed as a factor of clock speed) in the path to
DRAM, the path to SATA, and the path through ZFS, to make it necessary
that any application that wants modern performance be re-oriented to
take a modern (which in this case means, streaming) approach.
correspondingly, applications which take this approach, don't suffer as
much when they move from SATA to NFS or iSCSI.

> Programmers who do client/server applications are starting to notice
> this and I know of companies that put latency-inducing applications in
> the development servers so that the programmer is exposed to the same
> conditions in the development environment as in the real world.  This
> means for some that they have to write more advanced SQL queries to get
> everything done in a single query instead of asking multiple and
> changing the queries depending on what the first query result was.

while i agree that turning one's SQL into transactions that are more
like applets (such that, for example, you're sending over the content
for a potential INSERT that may not happen depending on some SELECT,
because the end-to-end delay of getting back the SELECT result is so
much higher than the cost of the lost bandwidth from occasionally
sending a useless INSERT) will take better advantage of modern hardware
and software architecture (which means in this case, streaming), it's
also necessary to teach our SQL servers that ZFS "recordsize=128k" means
what it says, for file system reads and writes.  a lot of SQL users who
have moved to a streaming model using a lot of transactions have merely
seen their bottleneck move from the network into the SQL server.
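
One way the "send the INSERT along with its condition" idea might look,
sketched with Python's built-in sqlite3 module purely for illustration.
The table, column names and helper function are hypothetical, and an
in-memory SQLite database has no network round trip at all; the point is
only the shape of the single statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT PRIMARY KEY, email TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'alice@example.com')")


def add_user_if_absent(conn: sqlite3.Connection, name: str,
                       email: str) -> bool:
    """Ship the row together with its condition in one statement, so the
    server decides whether to insert instead of the client issuing a
    SELECT, waiting for the reply, and then issuing an INSERT."""
    cur = conn.execute(
        "INSERT INTO users (name, email) "
        "SELECT ?, ? WHERE NOT EXISTS "
        "(SELECT 1 FROM users WHERE name = ?)",
        (name, email, name),
    )
    return cur.rowcount == 1


inserted_bob = add_user_if_absent(conn, "bob", "bob@example.com")
inserted_alice = add_user_if_absent(conn, "alice", "other@example.com")
user_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(inserted_bob, inserted_alice, user_count)
```

The second call sends an INSERT payload that ends up unused, which is
the bandwidth-for-latency trade described above.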

> Also, protocols such as SMB and NFS that use message blocks over TCP
> have to be abandoned and replaced with real streaming protocols and
> large window sizes. Xmodem wasn't a good idea back then, it's not a
> good idea now (even though the blocks now are larger than the 128
> bytes of 20-30 years ago).

i think xmodem and kermit moved enough total data volume (expressed as a
factor of transmission speed) back in their day to deserve an honourable
retirement.  but i'd agree, if an application is moved to a new
environment where everything (DRAM timing, CPU clock, I/O bandwidth,
network bandwidth, etc) is 10X faster, but the application only runs 2X
faster, then it's time to rethink more.  but the culprit will usually
not be new network latency.
-- 
Paul Vixie

RE: latency (was: RE: cooling door)
United States
2008-03-30 11:11:40
> -----Original Message-----
> From: owner-nanogmerit.edu [mailto:owner-nanogmerit.edu] On Behalf Of
> Paul Vixie
> Sent: Sunday, March 30, 2008 10:35 AM
> To: nanogmerit.edu
> Subject: Re: latency (was: RE: cooling door)
> 
> 
> swmikeswm.pp.se (Mikael Abrahamsson) writes:
> 
> > Programmers who do client/server applications are starting to notice
> > this and I know of companies that put latency-inducing applications
> > in the development servers so that the programmer is exposed to the
> > same conditions in the development environment as in the real world.
> > This means for some that they have to write more advanced SQL queries
> > to get everything done in a single query instead of asking multiple
> > and changing the queries depending on what the first query result
> > was.
> 
> while i agree that turning one's SQL into transactions that are more
> like applets (such that, for example, you're sending over the content
> for a potential INSERT that may not happen depending on some SELECT,
> because the end-to-end delay of getting back the SELECT result is so
> much higher than the cost of the lost bandwidth from occasionally
> sending a useless INSERT) will take better advantage of modern
> hardware and software architecture (which means in this case,
> streaming), it's also necessary to teach our SQL servers that ZFS
> "recordsize=128k" means what it says, for file system reads and
> writes.  a lot of SQL users who have moved to a streaming model using
> a lot of transactions have merely seen their bottleneck move from the
> network into the SQL server.

I have seen first hand (worked for a company and diagnosed issues with
their applications from a network perspective, prompting a major
re-write of the software), where developers work with their SQL servers,
application servers, and clients all on the same L2 switch.  They often
do not duplicate the environment they are going to be deploying the
application into, and therefore assume that the "network" is going to
perform the same.  So, when there are problems they blame the network.
Often the root problem is the architecture of the application itself and
not the "network."  All the servers and client workstations have Gigabit
connections to the same L2 switch, and they are honestly astonished when
there are issues running the same application over a typical enterprise
network with clients of different speeds (10/100/1000, full and/or half
duplex).  Surprisingly, to me, they even expect the same performance out
of a WAN.

Application developers today need a "network" guy on their team.  One
who can help them understand how their proposed application architecture
would perform over various customer networks, and that can make
suggestions as to how the architecture can be modified to allow the
performance of the application to take advantage of the networks'
capabilities.  Mikael (seems to) complain that developers have to put
latency inducing applications into the development environment.  I'd say
that those developers are some of the few who actually have a clue, and
are doing the right thing.

> > Also, protocols such as SMB and NFS that use message blocks over TCP
> > have to be abandoned and replaced with real streaming protocols and
> > large window sizes. Xmodem wasn't a good idea back then, it's not a
> > good idea now (even though the blocks now are larger than the 128
> > bytes of 20-30 years ago).
> 
> i think xmodem and kermit moved enough total data volume (expressed as
> a factor of transmission speed) back in their day to deserve an
> honourable retirement.  but i'd agree, if an application is moved to a
> new environment where everything (DRAM timing, CPU clock, I/O
> bandwidth, network bandwidth, etc) is 10X faster, but the application
> only runs 2X faster, then it's time to rethink more.  but the culprit
> will usually not be new network latency.
> --
> Paul Vixie

It may be difficult to switch to a streaming protocol if the underlying
data sets are block-oriented.

Fred Reimer, CISSP, CCNP, CQS-VPN, CQS-ISS
Senior Network Engineer
Coleman Technologies, Inc.
954-298-1697


RE: latency (was: RE: cooling door)
Sweden
2008-03-30 11:29:57
On Sun, 30 Mar 2008, Fred Reimer wrote:

> application to take advantage of the networks' capabilities.   Mikael
> (seems to) complain that developers have to put latency inducing
> applications into the development environment.  I'd say that those
> developers are some of the few who actually have a clue, and are doing
> the right thing.

I was definitely not complaining; I brought it up as an example where
developers have clue and where they're doing the right thing.

I've too often been involved in customer complaints which ended up being
the fault of Microsoft SMB and the customers having the firm idea that
it must be a network problem since MS is a world standard and that can't
be changed. Even proposing to change TCP Window settings to get FTP
transfers quicker is met with the same scepticism.

Even after describing to them about the propagation delay of light in
fiber and the physical limitations, they're still very suspicious about
it all.
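
For reference, the physical limit and its effect on TCP window sizing
can be estimated as below. The 2/3 velocity factor is the figure used
earlier in this thread; the 1000 km distance and the 100 Mbit/s link
speed are assumptions, and real paths add switching and queueing delay
on top of this.

```python
C_VACUUM_M_PER_S = 299_792_458
FIBER_VELOCITY_FACTOR = 2 / 3   # roughly 2/3 the speed of light in fiber


def fiber_rtt_s(distance_km: float) -> float:
    """Round-trip propagation delay over a fiber path of this length."""
    one_way_s = distance_km * 1000 / (C_VACUUM_M_PER_S * FIBER_VELOCITY_FACTOR)
    return 2 * one_way_s


def window_needed_bytes(link_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes in flight needed to fill the link."""
    return link_bps * rtt_s / 8


rtt = fiber_rtt_s(1000)                     # assumption: 1000 km of fiber
window = window_needed_bytes(100e6, rtt)    # assumption: 100 Mbit/s link

print(f"RTT over 1000 km of fiber: {rtt * 1e3:.2f} ms")
print(f"window to fill 100 Mbit/s: {window / 1024:.0f} KiB")
```

With these assumed values the required window exceeds the 64 KiB that
fits in an unscaled TCP window field, which is why window settings come
up in these discussions.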

-- 
Mikael Abrahamsson    email: swmikeswm.pp.se

RE: latency (was: RE: cooling door)
United States
2008-03-30 12:18:27
Thanks for the clarification; that's why I put the "seems to" in the
reply.

Fred Reimer, CISSP, CCNP, CQS-VPN, CQS-ISS
Senior Network Engineer
Coleman Technologies, Inc.
954-298-1697
