Thread: Initial 6.1 questions
|
|
| Initial 6.1 questions |

|
2006-06-12 14:21:04 |
I'm just setting up to evaluate 6.1 for a
project, and before I tune I hoped to get some
feedback on why some things are the way they are.
First, why is the default for HZ now 1000? It
seems that 900 extra clock interrupts aren't a
performance enhancement.
Is there a reason that ITR isn't a tunable in the
em driver? It seems more usable generally to end
users than the delays.
Running a simple test with a traffic generator
(firing udp packets to a blackhole), the system
overhead with a single processor goes up from 10%
to 15% when running a kernel with SMP enabled
(and nothing else different). I have ITR set to
6000 interrupts per second. That seems like an
awful lot of overhead. Is there some problem
running an SMP-enabled kernel when only 1
processor is present, or is there really 50%
extra overhead on an SMP scheduler? I'll have a
dual core in a few days to test with.
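For reference, the arithmetic behind these figures can be sketched as follows. This is a back-of-the-envelope Python sketch, not a measurement: the function names are made up for illustration, and it assumes one clock interrupt per tick on a single CPU and treats the ITR value as the actual NIC interrupt rate.

```python
# Back-of-the-envelope arithmetic for the figures quoted above.
# Assumptions (not taken from the kernel): one clock interrupt per tick
# on a single CPU, and the NIC really fires at the configured ITR rate.


def relative_increase(before_pct, after_pct):
    """Relative growth between two percentages, e.g. 10% -> 15% is 0.5."""
    return (after_pct - before_pct) / before_pct


def extra_clock_interrupts(old_hz, new_hz):
    """Additional clock interrupts per second when HZ goes from old to new."""
    return new_hz - old_hz


def clock_share(hz, nic_itr):
    """Fraction of all interrupts (clock + NIC) that are clock ticks."""
    return hz / (hz + nic_itr)


if __name__ == "__main__":
    print(relative_increase(10.0, 15.0))      # SMP vs. non-SMP overhead
    print(extra_clock_interrupts(100, 1000))  # HZ=100 -> HZ=1000
    print(round(clock_share(1000, 6000), 3))  # clock share at HZ=1000
    print(round(clock_share(100, 6000), 3))   # clock share at HZ=100
```

Under these assumptions, going from 10% to 15% is a 50% relative increase, and with ITR at 6000 the clock accounts for one interrupt in seven at HZ=1000, against well under 2% at HZ=100.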
Lastly, is there a utility similar to cpustat in
DragonflyBSD which shows the per-cpu usage stats?
Thanks,
DT
_______________________________________________
freebsd-performance freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-performance
To unsubscribe, send any mail to
"freebsd-performance-unsubscribe freebsd.org"
|
|
| Initial 6.1 questions |

|
2006-06-12 15:00:30 |
On Mon, 12 Jun 2006, Danial Thom wrote:
> first, why is the default for HZ now 1000? It seems that 900 extra clock
> interrupts aren't a performance enhancement.

This is a design change that is in the process of being reconsidered. I
expect that HZ will not be 1000 in 7.x, but can't tell you whether it will go
back to 100, or some middle ground. There are a number of benefits to a
higher HZ, not least is more accurate timing of some network timer events.
Since I don't have my hands in the timer code, I can't speak to what the
decision process here is, or when any change might happen, but I do expect to
see some change.
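To illustrate the timer-accuracy point: with a tick-driven timer, a timeout can only expire on a tick boundary, so the tick length 1/HZ bounds the precision. The sketch below is a simplified model in Python with hypothetical helper names; it rounds a timeout up to whole ticks and is not the kernel's actual conversion routine.

```python
import math


def tick_ms(hz):
    """Length of one clock tick in milliseconds for a given HZ."""
    return 1000.0 / hz


def rounded_timeout_ms(timeout_ms, hz):
    """Round a requested timeout up to a whole number of ticks.

    Simplified model: the real kernel conversion may add further slack.
    """
    ticks = math.ceil(timeout_ms * hz / 1000)
    return ticks * 1000 / hz


if __name__ == "__main__":
    for hz in (100, 1000):
        print(hz, tick_ms(hz), rounded_timeout_ms(25, hz))
```

In this model a 25 ms timeout is stretched to 30 ms at HZ=100 but stays at 25 ms at HZ=1000, which is the kind of precision a retransmit-style timer gains from the higher rate.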

> Running a simple test with a traffic generator (firing udp packets to a
> blackhole), the system overhead with a single processor goes up from 10% to
> 15% when running a kernel with SMP enabled (and nothing else different). I
> have ITR set to 6000 interrupts per second. That seems like an awful lot of
> overhead. Is there some problem running an SMP-enabled kernel when only 1
> processor is present, or is there really 50% extra overhead on an SMP
> scheduler? I'll have a dual core in a few days to test with.

I don't know about the particular number, but there is a significant overhead
to building in SMP support currently -- in particular, you pick up a lot of
atomic instructions which increase the cost of locking operations even
without contention. Some of that overhead reduces as the workload goes up, as
there's coalescing of work under locked regions, reduced context switch rates
as work is performed in batches, etc. There is currently extremely active
work in the area of reducing the overhead of scheduling and context switching,
being driven in part by the 32-processor support in Sun4v. I don't expect to
see large portions of that merged to RELENG_6, but it will be in RELENG_7.
Again, not my area of expertise, but there is work going on in this area.

Finally, there is a known performance problem involving loopback network
traffic and preemption, which results in additional context switches. You may
want to try disabling preemption and see if/how that impacts your numbers.
There has been quite a bit of discussion of this problem, and I expect to
see a solution for it in the near future. This problem does not manifest for
remote traffic, only loopback traffic.

Robert N M Watson
Computer Laboratory
University of Cambridge
|
|
| Initial 6.1 questions |

|
2006-06-12 19:57:54 |
--- Robert Watson <rwatson FreeBSD.org> wrote:
> On Mon, 12 Jun 2006, Danial Thom wrote:
>
> > first, why is the default for HZ now 1000? It seems that 900 extra clock
> > interrupts aren't a performance enhancement.
>
> This is a design change that is in the process of being reconsidered. I
> expect that HZ will not be 1000 in 7.x, but can't tell you whether it will
> go back to 100, or some middle ground. There are a number of benefits to a
> higher HZ, not least is more accurate timing of some network timer events.
> Since I don't have my hands in the timer code, I can't speak to what the
> decision process here is, or when any change might happen, but I do expect
> to see some change.

Will anything break if I tweak this downward?

> > Running a simple test with a traffic generator (firing udp packets to a
> > blackhole), the system overhead with a single processor goes up from 10%
> > to 15% when running a kernel with SMP enabled (and nothing else
> > different). I have ITR set to 6000 interrupts per second. That seems like
> > an awful lot of overhead. Is there some problem running an SMP-enabled
> > kernel when only 1 processor is present, or is there really 50% extra
> > overhead on an SMP scheduler? I'll have a dual core in a few days to test
> > with.
>
> I don't know about the particular number, but there is a significant
> overhead to building in SMP support currently -- in particular, you pick up
> a lot of atomic instructions which increase the cost of locking operations
> even without contention. Some of that overhead reduces as the workload goes
> up, as there's coalescing of work under locked regions, reduced context
> switch rates as work is performed in batches, etc. There is currently
> extremely active work in the area of reducing the overhead of scheduling
> and context switching, being driven in part by the 32-processor support in
> Sun4v. I don't expect to see large portions of that merged to RELENG_6, but
> it will be in RELENG_7. Again, not my area of expertise, but there is work
> going on in this area.
>
> Finally, there is a known performance problem involving loopback network
> traffic and preemption, which results in additional context switches. You
> may want to try disabling preemption and see if/how that impacts your
> numbers. There has been quite a bit of discussion of this problem, and I
> expect to see a solution for it in the near future. This problem does not
> manifest for remote traffic, only loopback traffic.

I'm sending this traffic from an external device, receiving on an em
controller with blackhole set to 1. So I assume this loopback issue doesn't
apply to this test?

DT
|
|
| Initial 6.1 questions |

|
2006-06-12 20:01:28 |
Danial Thom wrote:
> --- Robert Watson <rwatson FreeBSD.org> wrote:
>
>>On Mon, 12 Jun 2006, Danial Thom wrote:
>>
>>>first, why is the default for HZ now 1000? It seems that 900 extra clock
>>>interrupts aren't a performance enhancement.
>>
>>This is a design change that is in the process of being reconsidered. I
>>expect that HZ will not be 1000 in 7.x, but can't tell you whether it will
>>go back to 100, or some middle ground. There are a number of benefits to a
>>higher HZ, not least is more accurate timing of some network timer events.
>>Since I don't have my hands in the timer code, I can't speak to what the
>>decision process here is, or when any change might happen, but I do expect
>>to see some change.
>
> Will anything break if I tweak this downward?

I run a number of high-load production systems that do a lot of network and
filesystem activity, all with HZ set to 100. It has also been shown in the
past that certain things in the network area were not fixed to deal with a
high HZ value, so it's possible that it's even more stable/reliable with an
HZ value of 100.

My personal opinion is that HZ should go back down to 100 in 7-CURRENT
immediately, and only be incremented back up when/if it's proven to be the
right thing to do. And, I say that as someone who (errantly) pushed for the
increase to 1000 several years ago.

Scott
|
|
| Initial 6.1 questions |

|
2006-06-12 20:02:51 |
On Mon, 12 Jun 2006, Danial Thom wrote:
>> This is a design change that is in the process of being reconsidered. I
>> expect that HZ will not be 1000 in 7.x, but can't tell you whether it will
>> go back to 100, or some middle ground. There are a number of benefits to a
>> higher HZ, not least is more accurate timing of some network timer events.
>> Since I don't have my hands in the timer code, I can't speak to what the
>> decision process here is, or when any change might happen, but I do expect
>> to see some change.
>
> Will anything break if I tweak this downward?

No, shouldn't do. I wouldn't go below 100 though, as things like process
statistics, involuntary context switches, etc, are all affected.

>> Finally, there is a known performance problem involving loopback network
>> traffic and preemption, which results in additional context switches. You
>> may want to try disabling preemption and see if/how that impacts your
>> numbers. There has been quite a bit of discussion of this problem, and
>> I expect to see a solution for it in the near future. This problem does
>> not manifest for remote traffic, only loopback traffic.
>
> I'm sending this traffic from an external device, receiving on an em
> controller with blackhole set to 1. So I assume this loopback issue doesn't
> apply to this test?

The above comments only refer to traffic being sent over if_loop interfaces or
certain other deferred work scenarios. Basically, deferring of work to the
netisr from a user thread rather than an interrupt thread results in a
premature context switch.

Robert N M Watson
Computer Laboratory
University of Cambridge
|
|
| Initial 6.1 questions |

|
2006-06-12 20:08:12 |
On Mon, 12 Jun 2006, Scott Long wrote:
> I run a number of high-load production systems that do a lot of network and
> filesystem activity, all with HZ set to 100. It has also been shown in the
> past that certain things in the network area were not fixed to deal with a
> high HZ value, so it's possible that it's even more stable/reliable with an
> HZ value of 100.
>
> My personal opinion is that HZ should go back down to 100 in 7-CURRENT
> immediately, and only be incremented back up when/if it's proven to be the
> right thing to do. And, I say that as someone who (errantly) pushed for the
> increase to 1000 several years ago.

I think it's probably a good idea to do it sooner rather than later. It may
slightly negatively impact some services that rely on frequent timers to do
things like retransmit timing and the like. But I haven't done any
measurements.

Robert N M Watson
Computer Laboratory
University of Cambridge
|
|
| Initial 6.1 questions |

|
2006-06-12 20:32:48 |
On Mon, Jun 12, 2006 at 09:08:12PM +0100, Robert Watson wrote:
> On Mon, 12 Jun 2006, Scott Long wrote:
>
> >I run a number of high-load production systems that do a lot of network
> >and filesystem activity, all with HZ set to 100. It has also been shown
> >in the past that certain things in the network area were not fixed to
> >deal with a high HZ value, so it's possible that it's even more
> >stable/reliable with an HZ value of 100.
> >
> >My personal opinion is that HZ should go back down to 100 in 7-CURRENT
> >immediately, and only be incremented back up when/if it's proven to be the
> >right thing to do. And, I say that as someone who (errantly) pushed for
> >the increase to 1000 several years ago.
>
> I think it's probably a good idea to do it sooner rather than later. It
> may slightly negatively impact some services that rely on frequent timers
> to do things like retransmit timing and the like. But I haven't done any
> measurements.

As you know, but for the benefit of the list, restoring HZ=100 is
often an important performance tweak on SMP systems with many CPUs
because of all the sched_lock activity from statclock/hardclock, which
scales with HZ and NCPUS.

Kris
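As a rough illustration of the scaling described above: if every CPU handles its own clock tick and each tick takes sched_lock once, acquisitions per second grow as HZ times the number of CPUs. The Python sketch below is only that model; the one-acquisition-per-tick figure is an assumption rather than a measured value, and the function name is hypothetical.

```python
def tick_lock_acquisitions_per_sec(hz, ncpus, per_tick=1):
    """Clock-driven lock acquisitions per second across all CPUs.

    Assumes each CPU processes `hz` ticks per second and takes the lock
    `per_tick` times per tick (an illustrative guess, not a kernel figure).
    """
    return hz * ncpus * per_tick


if __name__ == "__main__":
    for ncpus in (1, 4, 32):
        for hz in (100, 1000):
            print(ncpus, hz, tick_lock_acquisitions_per_sec(hz, ncpus))
```

In this model, dropping HZ from 1000 to 100 cuts the clock-driven traffic on the lock by a factor of ten at any CPU count, and the absolute saving grows with the number of CPUs.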
|
|
| Initial 6.1 questions |

|
2006-06-12 23:15:52 |
On Tuesday 13 June 2006 04:32, Kris Kennaway wrote:
> On Mon, Jun 12, 2006 at 09:08:12PM +0100, Robert Watson wrote:
> > On Mon, 12 Jun 2006, Scott Long wrote:
> > >I run a number of high-load production systems that do a lot of network
> > >and filesystem activity, all with HZ set to 100. It has also been shown
> > >in the past that certain things in the network area were not fixed to
> > >deal with a high HZ value, so it's possible that it's even more
> > >stable/reliable with an HZ value of 100.
> > >
> > >My personal opinion is that HZ should go back down to 100 in 7-CURRENT
> > >immediately, and only be incremented back up when/if it's proven to be
> > >the right thing to do. And, I say that as someone who (errantly) pushed
> > >for the increase to 1000 several years ago.
> >
> > I think it's probably a good idea to do it sooner rather than later. It
> > may slightly negatively impact some services that rely on frequent timers
> > to do things like retransmit timing and the like. But I haven't done any
> > measurements.
>
> As you know, but for the benefit of the list, restoring HZ=100 is
> often an important performance tweak on SMP systems with many CPUs
> because of all the sched_lock activity from statclock/hardclock, which
> scales with HZ and NCPUS.
>
> Kris

sched_lock is another big bottleneck: if you have 32 CPUs, in theory
you have 32X context switch speed, but now it still has only 1X speed.
There is also code abusing sched_lock. The M:N bits dynamically insert
a thread into the thread list at context switch time. This is a bug: it
means the thread list in a proc has to be protected by the scheduler lock,
and delivering a signal to a process has to hold the scheduler lock and
find a thread. If the proc has many threads, this will introduce
long scheduler latency. A proc lock is not enough to find a thread,
which is a bug. There is other code abusing the scheduler lock which
really could use its own lock.

David Xu
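The 32X versus 1X point can be illustrated with a simple model: if every context switch must hold one global lock for a fixed time, the system-wide switch rate is capped by that lock no matter how many CPUs exist, whereas per-CPU locks would let the cap scale with the CPU count. The Python sketch below assumes a hypothetical 2 microsecond hold time and ignores contention overhead; the names are illustrative, not kernel APIs.

```python
def max_switches_per_sec(ncpus, hold_us, global_lock):
    """Upper bound on context switches per second, system-wide.

    hold_us is the (assumed) time the scheduler lock is held per switch.
    With one global lock, switches are fully serialized; with one lock
    per CPU, each CPU can switch independently.
    """
    per_lock = 1_000_000 / hold_us
    return per_lock if global_lock else per_lock * ncpus


if __name__ == "__main__":
    hold_us = 2.0  # hypothetical lock hold time per context switch
    for ncpus in (1, 2, 32):
        print(ncpus,
              max_switches_per_sec(ncpus, hold_us, global_lock=True),
              max_switches_per_sec(ncpus, hold_us, global_lock=False))
```

With these assumed numbers the global-lock ceiling stays the same from 1 to 32 CPUs, while the per-CPU ceiling is 32 times higher on the 32-CPU system.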
|
|
| Initial 6.1 questions |

|
2006-06-12 23:19:52 |
On Tue, 13 Jun 2006, David Xu wrote:
:On Tuesday 13 June 2006 04:32, Kris Kennaway wrote:
:> On Mon, Jun 12, 2006 at 09:08:12PM +0100, Robert Watson wrote:
:> > On Mon, 12 Jun 2006, Scott Long wrote:
:> > >I run a number of high-load production systems that do a lot of network
:> > >and filesystem activity, all with HZ set to 100. It has also been shown
:> > >in the past that certain things in the network area were not fixed to
:> > >deal with a high HZ value, so it's possible that it's even more
:> > >stable/reliable with an HZ value of 100.
:> > >
:> > >My personal opinion is that HZ should go back down to 100 in 7-CURRENT
:> > >immediately, and only be incremented back up when/if it's proven to be
:> > >the right thing to do. And, I say that as someone who (errantly) pushed
:> > >for the increase to 1000 several years ago.
:> >
:> > I think it's probably a good idea to do it sooner rather than later. It
:> > may slightly negatively impact some services that rely on frequent timers
:> > to do things like retransmit timing and the like. But I haven't done any
:> > measurements.
:>
:> As you know, but for the benefit of the list, restoring HZ=100 is
:> often an important performance tweak on SMP systems with many CPUs
:> because of all the sched_lock activity from statclock/hardclock, which
:> scales with HZ and NCPUS.
:>
:> Kris
:
:sched_lock is another big bottleneck: if you have 32 CPUs, in theory
:you have 32X context switch speed, but now it still has only 1X speed.
:There is also code abusing sched_lock. The M:N bits dynamically insert
:a thread into the thread list at context switch time. This is a bug: it
:means the thread list in a proc has to be protected by the scheduler lock,
:and delivering a signal to a process has to hold the scheduler lock and
:find a thread. If the proc has many threads, this will introduce
:long scheduler latency. A proc lock is not enough to find a thread,
:which is a bug. There is other code abusing the scheduler lock which
:really could use its own lock.
:
:David Xu

Given that various scenarios for locking bottlenecks seem to occur on
systems with different numbers of CPUs, has there been any research done
on providing "best fit" profiles for varied N cpu systems?

Cheers,
Andrew
--
arr watson.org
|
|
| Initial 6.1 questions |

|
2006-06-12 23:21:08 |
Sorry to reply to myself ...
On Mon, 12 Jun 2006, Andrew R. Reiter wrote:
:On Tue, 13 Jun 2006, David Xu wrote:
:
::On Tuesday 13 June 2006 04:32, Kris Kennaway wrote:
::> On Mon, Jun 12, 2006 at 09:08:12PM +0100, Robert Watson wrote:
::> > On Mon, 12 Jun 2006, Scott Long wrote:
::> > >I run a number of high-load production systems that do a lot of network
::> > >and filesystem activity, all with HZ set to 100. It has also been shown
::> > >in the past that certain things in the network area were not fixed to
::> > >deal with a high HZ value, so it's possible that it's even more
::> > >stable/reliable with an HZ value of 100.
::> > >
::> > >My personal opinion is that HZ should go back down to 100 in 7-CURRENT
::> > >immediately, and only be incremented back up when/if it's proven to be
::> > >the right thing to do. And, I say that as someone who (errantly) pushed
::> > >for the increase to 1000 several years ago.
::> >
::> > I think it's probably a good idea to do it sooner rather than later. It
::> > may slightly negatively impact some services that rely on frequent timers
::> > to do things like retransmit timing and the like. But I haven't done any
::> > measurements.
::>
::> As you know, but for the benefit of the list, restoring HZ=100 is
::> often an important performance tweak on SMP systems with many CPUs
::> because of all the sched_lock activity from statclock/hardclock, which
::> scales with HZ and NCPUS.
::>
::> Kris
::
::sched_lock is another big bottleneck: if you have 32 CPUs, in theory
::you have 32X context switch speed, but now it still has only 1X speed.
::There is also code abusing sched_lock. The M:N bits dynamically insert
::a thread into the thread list at context switch time. This is a bug: it
::means the thread list in a proc has to be protected by the scheduler lock,
::and delivering a signal to a process has to hold the scheduler lock and
::find a thread. If the proc has many threads, this will introduce
::long scheduler latency. A proc lock is not enough to find a thread,
::which is a bug. There is other code abusing the scheduler lock which
::really could use its own lock.
::
::David Xu
:
:Given that various scenarios for locking bottlenecks seem to occur on
:systems with different numbers of CPUs, has there been any research done
:on providing "best fit" profiles for varied N cpu systems?

Meaning that at compile time a profile would be selected for a given system,
as a good-effort attempt at providing a "best fit" for locking on that
system.

:
:Cheers,
:Andrew
:
:--
:arr watson.org
--
arr watson.org