Thread: Created: (SOLR-377) speed increase for writers




Created: (SOLR-377) speed increase for writers
2007-10-10 13:11:50
speed increase for writers
--------------------------

                 Key: SOLR-377
                 URL: https://issues.apache.org/jira/browse/SOLR-377
             Project: Solr
          Issue Type: Improvement
            Reporter: Yonik Seeley


When Solr is writing the response of large cached documents,
the bottleneck is string encoding.
A buffered writer implementation that doesn't do any
synchronization could offer some good speedups.
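
For readers unfamiliar with the idea, here is a minimal sketch of what an unsynchronized buffered writer could look like. It is not the FastWriter from the attached patch: the class name, constructor, and buffer handling are all assumptions made for illustration. The point is only that no lock is taken on any write path, which is safe when a single thread writes each response.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical sketch of an unsynchronized buffered Writer.
// NOT the FastWriter from fastwriter.patch; names and sizes are assumptions.
final class UnsyncBufferedWriter extends Writer {
    private final Writer sink;  // underlying writer, e.g. the servlet response writer
    private final char[] buf;   // buffer used by one thread only; no lock is taken
    private int pos;            // number of chars currently held in buf

    UnsyncBufferedWriter(Writer sink, int bufferSize) {
        if (bufferSize <= 0) throw new IllegalArgumentException("bufferSize must be > 0");
        this.sink = sink;
        this.buf = new char[bufferSize];
    }

    @Override
    public void write(int c) throws IOException {
        if (pos >= buf.length) flushBuffer();
        buf[pos++] = (char) c;
    }

    @Override
    public void write(char[] cbuf, int off, int len) throws IOException {
        if (len >= buf.length) {            // too large to buffer: pass it straight through
            flushBuffer();
            sink.write(cbuf, off, len);
            return;
        }
        if (len > buf.length - pos) flushBuffer();
        System.arraycopy(cbuf, off, buf, pos, len);
        pos += len;
    }

    @Override
    public void write(String s, int off, int len) throws IOException {
        while (len > 0) {                   // copy in chunks that fit the remaining space
            if (pos == buf.length) flushBuffer();
            int n = Math.min(len, buf.length - pos);
            s.getChars(off, off + n, buf, pos);
            pos += n;
            off += n;
            len -= n;
        }
    }

    // Push buffered chars to the sink without flushing the sink itself.
    private void flushBuffer() throws IOException {
        if (pos > 0) {
            sink.write(buf, 0, pos);
            pos = 0;
        }
    }

    @Override
    public void flush() throws IOException {
        flushBuffer();
        sink.flush();
    }

    @Override
    public void close() throws IOException {
        flush();
        sink.close();
    }
}

final class UnsyncBufferedWriterDemo {
    // Writes text, a comma, and text again through the sketch and returns what reached the sink.
    static String roundTrip(String text, int bufferSize) {
        StringWriter out = new StringWriter();
        try (Writer w = new UnsyncBufferedWriter(out, bufferSize)) {
            w.write(text);
            w.write(',');
            w.write(text.toCharArray());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello", 4));
    }
}
```

A deliberately tiny buffer size, as in the demo, forces every write path to cross a buffer boundary.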


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue
online.


Updated: (SOLR-377) speed increase for writers
2007-10-10 13:24:50
     [ https://issues.apache.org/jira/browse/SOLR-377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley updated SOLR-377:
------------------------------

    Attachment: fastwriter.patch

Attaching patch... It adds an optimized unsynchronized buffered
writer, changes some ResponseWriters' use of strings to
characters, removes buffering of strings in JSON, etc.
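
To illustrate the "characters instead of strings" change, the sketch below escapes a JSON string by writing each character straight to the Writer rather than assembling an escaped String first. The class and method names are hypothetical and this is not the patched JSON writer; it only shows the general technique, which pays off mainly when the Writer underneath is buffered.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical sketch: escape a JSON string directly onto a Writer,
// with no intermediate StringBuilder holding the escaped value.
final class JsonStringWriter {
    static void writeJsonString(Writer out, String val) throws IOException {
        out.write('"');
        for (int i = 0; i < val.length(); i++) {
            char ch = val.charAt(i);
            switch (ch) {
                case '"':
                case '\\':
                    out.write('\\');
                    out.write(ch);
                    break;
                case '\n': out.write('\\'); out.write('n'); break;
                case '\r': out.write('\\'); out.write('r'); break;
                case '\t': out.write('\\'); out.write('t'); break;
                default:
                    if (ch < 0x20) {
                        // Rare path: other control characters become \\uXXXX escapes.
                        out.write(String.format("\\u%04x", (int) ch));
                    } else {
                        out.write(ch);
                    }
            }
        }
        out.write('"');
    }

    // Convenience wrapper for trying the sketch out.
    static String render(String val) {
        StringWriter sink = new StringWriter();
        try {
            writeJsonString(sink, val);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toString();
    }

    public static void main(String[] args) {
        System.out.println(render("a\"b\nc"));
    }
}
```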

Speed differences with *very* large documents:
json: 24% faster
ruby: 500% faster (ruby didn't buffer in a StringBuilder
like JSON did)
python: 0% (bottleneck for these huge fields is buffering in
the StringBuilder to see if we should prepend a 'u'...
always prepending a 'u' and not buffering resulted in a ~20%
improvement)
xml: 8% faster
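
The "always prepend a 'u'" variant mentioned for python could look roughly like the sketch below: the value is streamed as a Python 2 unicode literal, so nothing has to be buffered to decide on the prefix. This is an assumption-laden illustration with hypothetical names, not the python writer's code, and the thread does not say this variant was committed.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical sketch: always emit a Python 2 unicode literal (u'...'),
// so the writer never has to buffer the value to decide on the prefix.
final class PythonUnicodeWriter {
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    static void writeStr(Writer out, String val) throws IOException {
        out.write('u');
        out.write('\'');
        for (int i = 0; i < val.length(); i++) {
            char ch = val.charAt(i);
            if (ch == '\'' || ch == '\\') {
                out.write('\\');
                out.write(ch);
            } else if (ch < 0x20 || ch > 0x7e) {
                // Control and non-ASCII chars become \\uXXXX escapes;
                // surrogate pairs are emitted as two separate escapes.
                out.write('\\');
                out.write('u');
                out.write(HEX[(ch >> 12) & 0xf]);
                out.write(HEX[(ch >> 8) & 0xf]);
                out.write(HEX[(ch >> 4) & 0xf]);
                out.write(HEX[ch & 0xf]);
            } else {
                out.write(ch);
            }
        }
        out.write('\'');
    }

    // Convenience wrapper for trying the sketch out.
    static String render(String val) {
        StringWriter sink = new StringWriter();
        try {
            writeStr(sink, val);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toString();
    }

    public static void main(String[] args) {
        System.out.println(render("caf\u00e9's"));
    }
}
```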

With smaller documents, the speedups are likely to be
greater because small writes like value separators would
matter more.

If there are no objections, I'll commit in a few days.



Resolved: (SOLR-377) speed increase for writers
2007-10-14 13:39:50
     [ https://issues.apache.org/jira/browse/SOLR-377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley resolved SOLR-377.
-------------------------------

    Resolution: Fixed

committed.



Updated: (SOLR-377) speed increase for writers
2007-10-17 19:16:50
     [ https://issues.apache.org/jira/browse/SOLR-377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pieter Berkel updated SOLR-377:
-------------------------------

    Attachment: SOLR-377-phpresponsewriter.patch

Sorry I've been a bit slow catching up with this issue.
Please find attached a trivial patch to
PHPResponseWriter.java that takes advantage of the new
FastWriter code; it should provide speed improvements
similar to the JSON writer (perhaps slightly less).

No fastwriter optimisation is necessary for
PHPSerializedResponseWriter as there is no need to escape
strings before they are written.
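
As background on why no escaping is needed: PHP's serialized format prefixes every string with its length in bytes, so the reader does not scan for a closing quote. The sketch below is a hypothetical helper, not code from PHPSerializedResponseWriter, and it assumes the response is encoded as UTF-8.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of a PHP-serialized string: s:<byte length>:"<raw value>";
// The length prefix is why embedded quotes need no escaping.
final class PhpSerializedString {
    static String serialize(String val) {
        // Assumption: the response is sent as UTF-8, so the length is the UTF-8 byte count.
        int byteLen = val.getBytes(StandardCharsets.UTF_8).length;
        return "s:" + byteLen + ":\"" + val + "\";";
    }

    public static void main(String[] args) {
        System.out.println(serialize("say \"hi\""));
    }
}
```

Because the length counts bytes rather than characters, any mismatch between the declared length and the bytes actually delivered makes the output unreadable on the PHP side.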




Commented: (SOLR-377) speed increase for writers
2007-10-17 20:42:50
    [ https://issues.apache.org/jira/browse/SOLR-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12535805 ]

Yonik Seeley commented on SOLR-377:
-----------------------------------

 Thanks Pieter, I just committed the PHP changes.



Commented: (SOLR-377) speed increase for writers
2007-10-19 15:05:50
    [ https://issues.apache.org/jira/browse/SOLR-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12536332 ]

Dave Lewis commented on SOLR-377:
---------------------------------

After this patch, using PHPSerializedResponseWriter returns
output that is unreadable by my PHP application.  I know
that doesn't make any sense, but I'm looking into it now.




Commented: (SOLR-377) speed increase for writers
2007-10-19 15:46:50
    [ https://issues.apache.org/jira/browse/SOLR-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12536339 ]

Yonik Seeley commented on SOLR-377:
-----------------------------------

What container are you using?
Jetty used to have a bug where the Writer it returns to the
servlet had issues with chars > 127 if you used
writer.write(string,off,len).




Commented: (SOLR-377) speed increase for writers
2007-10-19 16:04:50
    [ https://issues.apache.org/jira/browse/SOLR-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12536343 ]

Yonik Seeley commented on SOLR-377:
-----------------------------------

FYI, I haven't been able to reproduce any problems along
these lines using the Jetty version that's bundled (and I
set the FastWriter buffer size artificially low to exercise
the boundary handling).
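
A boundary check of this kind can be sketched as follows. Since FastWriter's constructor is not shown in this thread, the sketch uses java.io.BufferedWriter with an explicit size as a stand-in; the class and method names are hypothetical.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical sketch: exercise buffer-boundary handling by shrinking the buffer.
// java.io.BufferedWriter stands in for FastWriter here.
final class BoundaryCheck {
    // Returns true if every buffer size from 1 to maxBufferSize reproduces the input exactly.
    static boolean survivesSmallBuffers(String[] pieces, int maxBufferSize) {
        StringBuilder expected = new StringBuilder();
        for (String p : pieces) {
            expected.append(p);
        }
        for (int size = 1; size <= maxBufferSize; size++) {
            StringWriter sink = new StringWriter();
            try {
                Writer w = new BufferedWriter(sink, size);
                for (String p : pieces) {
                    w.write(p, 0, p.length());   // the write(string,off,len) path
                }
                w.flush();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            if (!sink.toString().equals(expected.toString())) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] pieces = {"a:1:{", "s:4:\"na\u00efve\";", "}"};
        System.out.println(survivesSmallBuffers(pieces, 8));
    }
}
```

Including a character above 127 in the pieces, as the demo does, also covers the kind of input the Jetty question was about.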




Commented: (SOLR-377) speed increase for writers
2007-10-19 16:29:50
    [ https://issues.apache.org/jira/browse/SOLR-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12536347 ]

Yonik Seeley commented on SOLR-377:
-----------------------------------

OK, I think it was a lack of flushing the buffer in the
FastWriter.
I've checked in a patch... can you try with the trunk
version?
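
The actual fix is not shown in this thread, but the general failure mode is easy to reproduce with any buffered writer: whatever is still sitting in the buffer never reaches the client unless someone flushes. The sketch below uses java.io.BufferedWriter as a stand-in for FastWriter; the names are hypothetical.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical sketch: a short payload stays in the buffer until flush() is called.
// java.io.BufferedWriter stands in for FastWriter here.
final class FlushDemo {
    // Returns what the sink holds before and after the explicit flush.
    static String[] beforeAndAfterFlush(String payload) {
        StringWriter sink = new StringWriter();
        try {
            Writer w = new BufferedWriter(sink, 8192);
            w.write(payload);
            String before = sink.toString();   // payload shorter than the buffer: nothing yet
            w.flush();
            String after = sink.toString();    // now the sink has the whole payload
            return new String[] {before, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] r = beforeAndAfterFlush("a:0:{}");
        System.out.println("before=[" + r[0] + "] after=[" + r[1] + "]");
    }
}
```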



Commented: (SOLR-377) speed increase for writers
2007-10-19 16:35:50
    [ https://issues.apache.org/jira/browse/SOLR-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12536350 ]

Dave Lewis commented on SOLR-377:
---------------------------------

That appears to have been it, trunk works great!  Thanks!



