List Info

Thread: Search performance issues and profiling/debugging search?




Search performance issues and profiling/debugging search?
country flaguser name
Israel
2007-10-23 14:09:25
Hi All..

We are running a small search-test script to test speed of
searches.
We iterate through a list of 2500 searches, executing each.
After a 
while we started noticing that the speed or recurring
searches is not 
always faster than the original.
We started running each search, three times in a row.
We were surprised to see that many times, sequentially
identical 
searches take as much time as the original or even more,
which doesn't 
make much sense considering things should be cached in such
cases.

Some example:
16/7500: Searching for: no recip...*************************
: [Time: 
1.053 sec] [Results: 25 of 18924]
17/7500: Searching for: no recip...*************************
: [Time: 
1.053 sec] [Results: 25 of 18862]
18/7500: Searching for: no recip...*************************
: [Time: 
1.655 sec] [Results: 25 of 17197]  
...
52/7500: Searching for: coupons...*************************
: [Time: 
0.106 sec] [Results: 25 of 78205]
53/7500: Searching for: coupons...*************************
: [Time: 
0.106 sec] [Results: 25 of 79775]
54/7500: Searching for: coupons...*************************
: [Time: 
0.688 sec] [Results: 25 of 72030]          
...
172/7500: Searching for:
microsoft...************************* : [Time: 
0.368 sec] [Results: 25 of 245689]
173/7500: Searching for:
microsoft...************************* : [Time: 
0.29 sec] [Results: 25 of 235657]
174/7500: Searching for:
microsoft...************************* : [Time: 
1.453 sec] [Results: 25 of 233288]        
...
478/7500: Searching for: map of the
world...************************* : 
[Time: 15.703 sec] [Results: 25 of 1355]
479/7500: Searching for: map of the
world...************************* : 
[Time: 10.004 sec] [Results: 25 of 1230]
480/7500: Searching for: map of the
world...************************* : 
[Time: 8.902 sec] [Results: 25 of 1117]    

Especially with the first and last examples, we notice that
search for - 
no recip - takes more than one second, regardless of how
many times we 
execute it, and - map of the world - takes over 8 seconds
constantly.
Some notes about the environment of the search test:
* we open the database only once, so save time.
* hardware, quad xeon, 16GB mem
* data across 10 databases over 5xSATA2 drives
* Database contains 150M docs, each about 3K
* The server is not doing anything beside the search
testing.
Also:
* the Astrix signs in the debug info represent fetching
document data, 
score, matching words and other details from the mset
items.
* Estimates vary, although its exactly the same search done
right one 
after the other with no changes to the DB (no data added).
This is not 
really a big issue.

My questions/concerns:
1. Why would a sequential search take as much time?
Shouldn't all data 
be cached as the query is "warmed up" already? Why
would a third search 
for "microsoft" take so long, not to mention
"map of the world"?
2. Even in the fastest queries, we very rarely see repeated
search 
taking less than 0.1 seconds, even those with the simplest
of terms. 
Does that make sense?
3. For such a hardware we use for testing, 16GB ram, quad
xeon, 5xSATA2 
[500GB each] disks.. How much would you expect it to
"carry" assuming a 
typical document is 3K and we store positional, stemmed
information with 
12-16 values per item and a short integer number in the
document data?
4. Anyone has ideas how we can profile things to see why we
have such 
performance issues? We would be happy to dig deeper to find
out where 
the problems arise from.

Looking forward to hearing your feedback..
Ron

_______________________________________________
Xapian-discuss mailing list
Xapian-discusslists.xapian.org
http://lists.xapian.org/mailman/listinfo/xapian-discuss

Re: Search performance issues and profiling/debugging search?
country flaguser name
United Kingdom
2007-10-23 16:25:47
Ron Kass wrote:
> * Estimates vary, although its exactly the same search
done right one 
> after the other with no changes to the DB (no data
added). This is not 
> really a big issue.

This is the issue which looks oddest to me, however (though
there are 
other oddities).  If I understood you correctly, each search
is 
performed on the same Database object, without reopening the
Database 
object between each search.  This should result in exactly
the same 
results (and estimates) for each repeated search, since a
Database 
object accesses a snapshot of the underlying database. 
Since this isn't 
happening, there must be something we don't understand about
your setup, 
and this is the first thing to resolve.

Are you using the network (ie, "remote") backend? 
If so, the problem 
could be that some of the database connections are timing
out occasionally.

Is the database mounted locally, or over a network filing
system?

If the database isn't being modified while searches are in
progress, 
and/or Database objects aren't being reopened between
searches, 
something very weird is happening.  Try doing the search on
just a 
single database, and see if you get the same effects.  If
not, add in 
databases gradually until you do.


Other than that, it would be useful to separate timings for
the search 
(ie, the get_mset() call) from the calls which fetch the
document data, 
to see where most of the time is being spent.


 > 4. Anyone has ideas how we can profile things to see
why we have such
 > performance issues? We would be happy to dig deeper to
find out where
 > the problems arise from.

oprofile is your best bet for seeing where the time is being
spent.

-- 
Richard

_______________________________________________
Xapian-discuss mailing list
Xapian-discusslists.xapian.org
http://lists.xapian.org/mailman/listinfo/xapian-discuss

Re: Search performance issues and profiling/debugging search?
country flaguser name
United Kingdom
2007-10-23 17:08:38
On Tue, Oct 23, 2007 at 10:25:47PM +0100, Richard Boulton
wrote:
> Ron Kass wrote:
> >* Estimates vary, although its exactly the same
search done right one 
> >after the other with no changes to the DB (no data
added). This is not 
> >really a big issue.
> 
> This is the issue which looks oddest to me, however
(though there are 
> other oddities).

It would be interesting to work out why this is happening. 
I suspect
doing so will also reveal why the timings vary.

> If I understood you correctly, each search is 
> performed on the same Database object, without
reopening the Database 
> object between each search.  This should result in
exactly the same 
> results (and estimates) for each repeated search, since
a Database 
> object accesses a snapshot of the underlying database.

I have a theory.  We sort containers of postlists when
building the
postlist tree from the query.  If we don't always use a
stable sort,
then the exact tree built may vary between runs, which I
think could
easily affect both the estimated number of matches and also
how much
work the query requires (and so how long it takes).

If this is happening, we should see if using a stable sort
for such
cases imposes any overhead.  I suspect it wouldn't since
this is a
one-off cost per query, and we're sorting a small number of
items
(at most the number of terms in the query).

Another possibility is that there's some dependence on an
undefined
value.  The testsuite is clean under valgrind, but I'm sure
there's
code which doesn't have testsuite coverage.

Ron: assuming this is on Linux, if you run a few of these
varying cases
under valgrind, does it report any problems?

Cheers,
    Olly

_______________________________________________
Xapian-discuss mailing list
Xapian-discusslists.xapian.org
http://lists.xapian.org/mailman/listinfo/xapian-discuss

Re: Search performance issues and profiling/debugging search?
country flaguser name
Israel
2007-10-23 21:04:07
Hi Richard & Olly.


Yes, the database is NOT being reopened. The databases are
accessed 
locally, not over the net. No data is being added to the
databases while 
testing the searches.

Indeed I found the estimates to be strange. Wasn't sure why,
but was 
more concerned with performance and the fact that things
were not 
"warming up".


I don't know anything yet about oprofile, will have to dig
deeper there. 
(any pinpoints would be handy)


What was not clear from your answers is if it makes sense
that second 
(and third) time searches take that long. Also, we are
getting 15+ and 
sometimes 50+ seconds searches on some simple (non phrased)
search terms 
(like: [map of the world] [Time: 15.703 sec] [Results: 25 of
1355]; 
[Time: 10.004 sec] [Results: 25 of 1230]; [Results: 25 of
1117]; and 
there are ones that take even more time)


I'm attaching few link to outputs of more details debugs.



First I ran a test on all 8 databases (1, 2, 4, 6, 7, 8, 9
and 10)

http://www.pidgintech.com/other/fts/test/test_
1-2-4-6-7-8-9-10.txt

This resulted in Segmentation fault.



Then I removed database 10 and reran.

http://www.pidgintech.com/other/fts/test/test_1-2
-4-6-7-8-9.txt

This ran successfully. However, Estimates varied: [16044],
[17314], [17219]

And overall times were quite long, both for original search
and for 
sequential searches: [Total Time: 0.96 sec], [Total Time:
0.96 sec], 
[Total Time: 0.941 sec]



Then I ran it on each of the databases separately. Note that
by this 
time it is likely (or at least I though it should be) that
the blocks 
are cached in memory from the previous test (of 1,2,4,6,7,8
and 9).

http://www.pidgintech.com/other/fts/test/test_isolat
ed_1.txt

Summary for DB #1: Estimates fixed on [3015]; Times did not
speed up for 
sequential searches: [Total Time: 0.187 sec], [Total Time:
0.183 sec], 
[Total Time: 0.199 sec]


http://www.pidgintech.com/other/fts/test/test_isolat
ed_2.txt

Summary for DB #2: Estimates fixed on [789]; Times became
faster, 
although first time is extremely long considering the single
DB and its 
size: [Total Time: 0.891 sec], [Total Time: 0.082 sec],
[Total Time: 
0.083 sec]


http://www.pidgintech.com/other/fts/test/test_isolat
ed_4.txt

Summary for DB #4: Estimates fixed on [3217] ; Times became
faster, 
although first time is extremely long, and the others are
still long 
(>0.2sec) considering the single DB: [Total Time: 1.18
sec], [Total 
Time: 0.215 sec], [Total Time: 0.212 sec]


http://www.pidgintech.com/other/fts/test/test_isolat
ed_6.txt

Summary for DB #6: Estimates varied [4447], [3023], [3023];
Times were 
extremely long in the first two searches, and even in the
last one, 
still long (>0.2sec): [Total Time: 1.961 sec], [Total
Time: 4.299 sec], 
[Total Time: 0.204 sec]


http://www.pidgintech.com/other/fts/test/test_isolat
ed_7.txt

Summary for DB #7: Estimates varied [4954], [4954], [3593] ;
Times 
varied and were relatively long overall: [Total Time: 0.81
sec], [Total 
Time: 0.429 sec], [Total Time: 0.779 sec]


http://www.pidgintech.com/other/fts/test/test_isolat
ed_8.txt

Summary for DB #8: Estimates varied [805], [562], [562] ;
Times varied 
and were relatively long for the first 2 tries: [Total Time:
0.431 sec], 
[Total Time: 1.077 sec], [Total Time: 0.063 sec]


http://www.pidgintech.com/other/fts/test/test_isolat
ed_9.txt

Summary for DB #9: Estimates fixed on [3415] ; Times became
faster, 
although first time is extremely long, and the others are
still long 
(>0.2sec): [Total Time: 3.137 sec], [Total Time: 0.222
sec], [Total 
Time: 0.212 sec]


Now I tied #10 (the one who, when added to a group with
add_database, 
causes seg fault), just in case.. and, on its own, it
worked.

http://www.pidgintech.com/other/fts/test/test_isola
ted_10.txt

Summary for DB #10: Estimates fixed on [3415] ; Times were
overall fast: 
[Total Time: 0.108 sec], [Total Time: 0.072 sec], [Total
Time: 0.075 sec]



The following output is output of delve on the 8 databases,
including 
delve on the specific terms (no, recip) to try to figure out
why we are 
getting segmentation fault when database 10 is combined with
the others. 
Note that it manages to run through the delve. I wonder
where the 
segmentation fault is coming from. It happens during the get
of the MSET

    [rootfts1 maintenance]# delve /fts/Database_1/
    number of documents = 20810000
    average document length = 204.054

    [rootfts1 maintenance]# delve /fts/Database_2/
    number of documents = 6020000
    average document length = 203.334

    [rootfts1 maintenance]# delve /fts/Database_4/
    number of documents = 22365000
    average document length = 204.573

    [rootfts1 maintenance]# delve /fts/Database_6/
    number of documents = 21875000
    average document length = 204.567

    [rootfts1 maintenance]# delve /fts/Database_7/
    number of documents = 24480000
    average document length = 205.822

    [rootfts1 maintenance]# delve /fts/Database_8/
    number of documents = 3870000
    average document length = 209.001

    [rootfts1 maintenance]# delve /fts/Database_9/
    number of documents = 23870000
    average document length = 205.659

    [rootfts1 maintenance]# delve /fts/Database_10/
    number of documents = 5325000
    average document length = 202.337

    [rootfts1 maintenance]# delve /fts/Database_10/ -1 -t Zno
| tail -n 5
    5324926
    5324929
    5324952
    5324991
    5324995

    [rootfts1 maintenance]# delve /fts/Database_10/ -1 -t
Zrecip | tail
    -n 5
    5321034
    5324077
    5324081
    5324178
    5324179


We have used a very minimal code, stripped of all the
"other stuff" we 
had in it before. Its simply openning the DB once, executing
the search 
and timing things.


This is very strange (and frustrating).. I hope I gave
enough details. 
If not, please let me know.

Since in some cases, on the nodes that show variance in
estimates, two 
searches show the same estimate while the other is
different, it can be 
that this variance is on all the nodes if tried enough
times.

I think, that speed is overall very poor, especially for
sequential 
searches, considering it should all be in OS cache.


Last notes.. Olly..

1. Regarding your stable-sort, theory, if its something we
can test, let 
me know how.

2. What undefined value might anything depend on?

3. I don't know the testsuite and valgrind and what you
refer to 
regarding that.. I'll have a look into that and into
oprofile. If you 
have any pinpoints, that would help.


Looking forward to your thoughts..

Ron







Richard Boulton wrote:

> Ron Kass wrote:
>> * Estimates vary, although its exactly the same
search done right one 
>> after the other with no changes to the DB (no data
added). This is 
>> not really a big issue.
>
> This is the issue which looks oddest to me, however
(though there are 
> other oddities).  If I understood you correctly, each
search is 
> performed on the same Database object, without
reopening the Database 
> object between each search.  This should result in
exactly the same 
> results (and estimates) for each repeated search, since
a Database 
> object accesses a snapshot of the underlying database. 
Since this 
> isn't happening, there must be something we don't
understand about 
> your setup, and this is the first thing to resolve.
>
> Are you using the network (ie, "remote")
backend?  If so, the problem 
> could be that some of the database connections are
timing out 
> occasionally.
>
> Is the database mounted locally, or over a network
filing system?
>
> If the database isn't being modified while searches are
in progress, 
> and/or Database objects aren't being reopened between
searches, 
> something very weird is happening.  Try doing the
search on just a 
> single database, and see if you get the same effects. 
If not, add in 
> databases gradually until you do.
>
>
> Other than that, it would be useful to separate timings
for the search 
> (ie, the get_mset() call) from the calls which fetch
the document 
> data, to see where most of the time is being spent.
>
>
> > 4. Anyone has ideas how we can profile things to
see why we have such
> > performance issues? We would be happy to dig
deeper to find out where
> > the problems arise from.
>
> oprofile is your best bet for seeing where the time is
being spent.
>

Olly Betts wrote:
> On Tue, Oct 23, 2007 at 10:25:47PM +0100, Richard
Boulton wrote:
>   
>> Ron Kass wrote:
>>     
>>> * Estimates vary, although its exactly the same
search done right one 
>>> after the other with no changes to the DB (no
data added). This is not 
>>> really a big issue.
>>>       
>> This is the issue which looks oddest to me, however
(though there are 
>> other oddities).
>>     
>
> It would be interesting to work out why this is
happening.  I suspect
> doing so will also reveal why the timings vary.
>
>   
>> If I understood you correctly, each search is 
>> performed on the same Database object, without
reopening the Database 
>> object between each search.  This should result in
exactly the same 
>> results (and estimates) for each repeated search,
since a Database 
>> object accesses a snapshot of the underlying
database.
>>     
>
> I have a theory.  We sort containers of postlists when
building the
> postlist tree from the query.  If we don't always use a
stable sort,
> then the exact tree built may vary between runs, which
I think could
> easily affect both the estimated number of matches and
also how much
> work the query requires (and so how long it takes).
>
> If this is happening, we should see if using a stable
sort for such
> cases imposes any overhead.  I suspect it wouldn't
since this is a
> one-off cost per query, and we're sorting a small
number of items
> (at most the number of terms in the query).
>
> Another possibility is that there's some dependence on
an undefined
> value.  The testsuite is clean under valgrind, but I'm
sure there's
> code which doesn't have testsuite coverage.
>
> Ron: assuming this is on Linux, if you run a few of
these varying cases
> under valgrind, does it report any problems?
>
> Cheers,
>     Olly
>   

_______________________________________________
Xapian-discuss mailing list
Xapian-discusslists.xapian.org
http://lists.xapian.org/mailman/listinfo/xapian-discuss

Re: Search performance issues and profiling/debugging search?
country flaguser name
United Kingdom
2007-10-24 01:58:13
On Wed, Oct 24, 2007 at 04:04:07AM +0200, Ron Kass wrote:
> I don't know anything yet about oprofile, will have to
dig deeper there. 
> (any pinpoints would be handy)

I've written a quick guide (something I've been meaning to
do for a
while):

http://wiki.xa
pian.org/ProfilingXapian
    
> What was not clear from your answers is if it makes
sense that second 
> (and third) time searches take that long.

It suggests that the speed isn't limited by I/O but rather
by CPU.
Profiling data would show us why and hopefully enable such
cases to be
improved.

> First I ran a test on all 8 databases (1, 2, 4, 6, 7,
8, 9 and 10)
> 
> http://www.pidgintech.com/other/fts/test/test_
1-2-4-6-7-8-9-10.txt
> 
> This resulted in Segmentation fault.

Where does it seg fault?  You can find out by running under
gdb like so:

gdb --args <the command to run your test>

And then the seg fault should drop you back to the gdb
prompt, where
you can type "bt" to show a backtrace.

> 1. Regarding your stable-sort, theory, if its something
we can test, let 
> me know how.

I could provide a patch, but you'd need to be using SVN HEAD
because the
code in this area has changed a lot since 1.0.3.

SVN HEAD is also a lot faster on some cases (more than twice
as fast in
tests replaying tweakers.net query logs against their
database).  So if
you're going to profile it would be a lot more useful to
profile that.

> 2. What undefined value might anything depend on?

An uninitialised variable somewhere perhaps.

> 3. I don't know the testsuite and valgrind and what you
refer to 
> regarding that..

Don't worry about the testsuite.  Just try running your test
code under
valgrind:

valgrind <the command to run your test>

It'll run rather more slowly and use quite a bit more memory
than
normal.  Any problems valgrind spots are reported to stderr
by default.

Cheers,
    Olly

_______________________________________________
Xapian-discuss mailing list
Xapian-discusslists.xapian.org
http://lists.xapian.org/mailman/listinfo/xapian-discuss

Re: Search performance issues and profiling/debugging search?
country flaguser name
Israel
2007-10-24 07:49:05
Hi Olly.


We've installed both valgrind and oprofile, and managed to
get valgrind 
working (but not oprofile.. see error below.. still working
on it, but 
if you have an insight it would be helpful)

Anyway, we have actually used xapian-compress on the
databases to see if 
it helps. It appears to have rid of the segmentation fault
error on 
database 10, but the slowness and the variations in
estimates still exist.


I'm linking to two outputs of valgrind and valgrind
--leak-check=full

http://www.pidgintech.com/other/fts/test/val
grind_2-4-6-7-8-9-10.txt

http://www.pidgintech.com/other/fts/tes
t/valgrind-full_2-4-6-7-8-9-10.txt


And here is the error we are getting when trying to start
-deamon oprofile

    Oct 24 01:21:27 fts1 kernel: BUG: soft lockup detected
on CPU#3!
    Oct 24 01:21:27 fts1 kernel:  [<c043ea0f>]
softlockup_tick+0x98/0xa6
    Oct 24 01:21:27 fts1 kernel:  [<c0408b7d>]
timer_interrupt+0x504/0x557
    Oct 24 01:21:27 fts1 kernel:  [<c043ec43>]
handle_IRQ_event+0x27/0x51
    Oct 24 01:21:27 fts1 kernel:  [<c043ed00>]
__do_IRQ+0x93/0xe8
    Oct 24 01:21:27 fts1 kernel:  [<c040672b>]
do_IRQ+0x93/0xae
    Oct 24 01:21:27 fts1 kernel:  [<c053a04d>]
evtchn_do_upcall+0x64/0x9b
    Oct 24 01:21:27 fts1 kernel:  [<c0404ec5>]
hypervisor_callback+0x3d/0x48
    Oct 24 01:21:27 fts1 kernel:  [<c0407fd1>]
raw_safe_halt+0x8c/0xaf
    Oct 24 01:21:27 fts1 kernel:  [<c0402bca>]
xen_idle+0x22/0x2e
    Oct 24 01:21:27 fts1 kernel:  [<c0402ce9>]
cpu_idle+0x91/0xab
    Oct 24 01:21:27 fts1 kernel: 
    =======================                     

Best regards,

Ron



Olly Betts wrote:

> On Wed, Oct 24, 2007 at 04:04:07AM +0200, Ron Kass
wrote:
>   
>> I don't know anything yet about oprofile, will have
to dig deeper there. 
>> (any pinpoints would be handy)
>>     
>
> I've written a quick guide (something I've been meaning
to do for a
> while):
>
> http://wiki.xa
pian.org/ProfilingXapian
>     
>   
>> What was not clear from your answers is if it makes
sense that second 
>> (and third) time searches take that long.
>>     
>
> It suggests that the speed isn't limited by I/O but
rather by CPU.
> Profiling data would show us why and hopefully enable
such cases to be
> improved.
>
>   
>> First I ran a test on all 8 databases (1, 2, 4, 6,
7, 8, 9 and 10)
>>
>> http://www.pidgintech.com/other/fts/test/test_
1-2-4-6-7-8-9-10.txt
>>
>> This resulted in Segmentation fault.
>>     
>
> Where does it seg fault?  You can find out by running
under gdb like so:
>
> gdb --args <the command to run your test>
>
> And then the seg fault should drop you back to the gdb
prompt, where
> you can type "bt" to show a backtrace.
>
>   
>> 1. Regarding your stable-sort, theory, if its
something we can test, let 
>> me know how.
>>     
>
> I could provide a patch, but you'd need to be using SVN
HEAD because the
> code in this area has changed a lot since 1.0.3.
>
> SVN HEAD is also a lot faster on some cases (more than
twice as fast in
> tests replaying tweakers.net query logs against their
database).  So if
> you're going to profile it would be a lot more useful
to profile that.
>
>   
>> 2. What undefined value might anything depend on?
>>     
>
> An uninitialised variable somewhere perhaps.
>
>   
>> 3. I don't know the testsuite and valgrind and what
you refer to 
>> regarding that..
>>     
>
> Don't worry about the testsuite.  Just try running your
test code under
> valgrind:
>
> valgrind <the command to run your test>
>
> It'll run rather more slowly and use quite a bit more
memory than
> normal.  Any problems valgrind spots are reported to
stderr by default.
>
> Cheers,
>     Olly
>   
_______________________________________________
Xapian-discuss mailing list
Xapian-discusslists.xapian.org
http://lists.xapian.org/mailman/listinfo/xapian-discuss

Re: Search performance issues and profiling/debugging
country flaguser name
United Kingdom
2007-10-24 07:58:42
Ron Kass wrote:
>     Oct 24 01:21:27 fts1 kernel:  [<c0402bca>]
xen_idle+0x22/0x2e

I see that you're running this on a Xen server, is this a
Dom0 or
a DomU instance?  Either way I can envisage that the other
VMs
could well be confusing your results.

_______________________________________________
Xapian-discuss mailing list
Xapian-discusslists.xapian.org
http://lists.xapian.org/mailman/listinfo/xapian-discuss

Re: Search performance issues and profiling/debugging
country flaguser name
Israel
2007-10-24 08:35:52
Hi Chris
Not sure what you mean by "other VMs could well be
confusing your results"
We use XenServer on this machine, but we have only one
instance (DomU), 
and only this instance is running everything locally. So
there are no 
other VMs to confuse things, and even if there were, they
have nothing 
to do with the VM we run the test on or with the test
itself.
(Can you clarify what you mean?)

Cheers,
Ron

Chris Good wrote:
> Ron Kass wrote:
>   
>>     Oct 24 01:21:27 fts1 kernel: 
[<c0402bca>] xen_idle+0x22/0x2e
>>     
>
> I see that you're running this on a Xen server, is this
a Dom0 or
> a DomU instance?  Either way I can envisage that the
other VMs
> could well be confusing your results.
>
> _______________________________________________
> Xapian-discuss mailing list
> Xapian-discusslists.xapian.org
> http://lists.xapian.org/mailman/listinfo/xapian-discuss
>   
_______________________________________________
Xapian-discuss mailing list
Xapian-discusslists.xapian.org
http://lists.xapian.org/mailman/listinfo/xapian-discuss

Re: Search performance issues and profiling/debugging
country flaguser name
United Kingdom
2007-10-24 08:48:27
Ron Kass wrote:
> Not sure what you mean by "other VMs could well be
confusing your results"
> We use XenServer on this machine, but we have only one
instance (DomU), 
> and only this instance is running everything locally.
So there are no 
> other VMs to confuse things, and even if there were,
they have nothing 
> to do with the VM we run the test on or with the test
itself.
> (Can you clarify what you mean?)

If you have multiple VMs sharing the same hardware then
activity on one
will obviously affect the performance on other VMs.  Since
you're running
a lone DomU other DomUs aren't going to be competing for
resources but 
it's possible that something in Dom0 is getting swapped in
and running.

How are you accessing your drives, is DomU accessing the raw
devices or is
it mapped via virtual files from Dom0?

Is it possible to run these tests either directly from Dom0
or even better
with a non-xen kernel?

Given your current configuration of a single VM xen isn't
adding anything 
so removing it would eliminate any side-effects of it.  I
also suspect
that it would cure your oprofile issue.

Chris


_______________________________________________
Xapian-discuss mailing list
Xapian-discusslists.xapian.org
http://lists.xapian.org/mailman/listinfo/xapian-discuss

Re: Search performance issues and profiling/debugging
country flaguser name
Israel
2007-10-24 09:04:22
Hi Chris

We prefer to work with Xen at the moment as in the future we
will likely 
use Xapian over Xen anyway.

What you refer to as competing resources, might effect
performance but 
not the variations in estimates. Also, the fact that the
search is 
consistently slow in repeating searches for "no
recip" but not for other 
searches, hints that it is not a Xen resource competing
issue.
Again, the only thing that this machine is doing is this
test. Dom0 does 
nothing. Its not stressed, its not using any resources at
all.

Although we should never rule out something completely
without checking, 
I believe quite strongly that the issues we are seeing are
not coming 
from Xen, as per this instance it is a regular dedicated
Linux (centos 
5) machine and the resources are fully dedicated to it.

Cheers,
Ron


Chris Good wrote:
> Ron Kass wrote:
>   
>> Not sure what you mean by "other VMs could
well be confusing your results"
>> We use XenServer on this machine, but we have only
one instance (DomU), 
>> and only this instance is running everything
locally. So there are no 
>> other VMs to confuse things, and even if there
were, they have nothing 
>> to do with the VM we run the test on or with the
test itself.
>> (Can you clarify what you mean?)
>>     
>
> If you have multiple VMs sharing the same hardware then
activity on one
> will obviously affect the performance on other VMs. 
Since you're running
> a lone DomU other DomUs aren't going to be competing
for resources but 
> it's possible that something in Dom0 is getting swapped
in and running.
>
> How are you accessing your drives, is DomU accessing
the raw devices or is
> it mapped via virtual files from Dom0?
>
> Is it possible to run these tests either directly from
Dom0 or even better
> with a non-xen kernel?
>
> Given your current configuration of a single VM xen
isn't adding anything 
> so removing it would eliminate any side-effects of it. 
I also suspect
> that it would cure your oprofile issue.
>
> Chris
>
>
> _______________________________________________
> Xapian-discuss mailing list
> Xapian-discusslists.xapian.org
> http://lists.xapian.org/mailman/listinfo/xapian-discuss
>   
_______________________________________________
Xapian-discuss mailing list
Xapian-discusslists.xapian.org
http://lists.xapian.org/mailman/listinfo/xapian-discuss

[1-10] [11-15]

about | contact  Other archives ( Real Estate discussion Medical topics )