Thread: Created: (SOLR-344) New Java API
|
|
| Created: (SOLR-344) New Java API |
  United States |
2007-08-28 04:17:30 |
New Java API
------------
Key: SOLR-344
URL: https://issues.apache.org/jira/browse/SOLR-344
Project: Solr
Issue Type: Improvement
Components: clients - java, search, update
Affects Versions: 1.3
Reporter: Jonathan Woods
The core Solr codebase urgently needs to expose a new Java
API designed for use by Java running in Solr's JVM and
ultimately by core Solr code itself. This API must be (i)
object-oriented ('typesafe'), (ii) self-documenting, (iii)
at the right level of granularity, (iv) designed
specifically to expose the value which Solr adds over and
above Lucene.
This is an urgent issue for two reasons:
- Java-Solr integrations represent a use-case which is
nearly as important as the core Solr use-case in which
non-Java clients interact with Solr over HTTP
- a significant proportion of questions on the mailing lists
are clearly from people who are attempting such integrations
right now.
This point in Solr development - some way out from the 1.3
release - might be the right time to do the development and
refactoring necessary to produce this API. We can do this
without breaking any backward compatibility from the point
of view of XML/HTTP and JSON-like clients, and without
altering the core Solr algorithms which make it so
efficient. If we do this work now, we can significantly
speed up the spread of Solr.
Eventually, this API should be part of core Solr code, not
hived off into some separate project nor in a
non-first-class package space. It should be capable of
forming the foundation of any new Solr development which
doesn't need to delve into low level constructs like DocSet
and so on - and any new development which does need to do
just that should be a candidate for incorporation into the
API at some level. Whether or not it will ever be worth
re-writing existing code is a matter of opinion; but the
Java API should be such that if it had existed before core
plug-ins were written, it would have been natural to use it
when writing them.
I've attached a PDF which makes the case for this API.
Apologies for delivering it as an attachment, but I wanted
to embed pics and a bit of formatting.
I'll update this issue in the next few days to give a
prototype of this API to suggest what it might look like at
present. This will build on the work already done in Solrj
and SearchComponents
(https://issues.apache.org/jira/browse/SOLR-281), and will be a
patch on an up-to-date revision of Solr trunk.
[PS:
1. Having written most of this, I then properly looked at
SearchComponents/SOLR-281 and read
http://www.nabble.com/forum/ViewPost.jtp?post=11050274&framed=y,
which says much the same thing
albeit more quickly! And weeks ago, too. But this proposal
is angled slightly differently:
- it focusses on the value of creating an API not only for
internal Solr consumption, but for local Java clients
- it focusses on designing a Java API without constantly
being hobbled by HTTP-Java
- it's suggesting that the SearchComponents work should
result in a Java API which can be used as much by third
party Java as by ResponseBuilder.
2. I've made some attempt to address Hoss's point
(http://www.nabble.com/search-components-%28plugins%29-tf3898040.html#6551097579454875774)
- that an API like this would need to maintain
enough state e.g. to allow an initial search to later be
faceted, highlighted etc without going back to the start
each time - but clearly the proof of the pudding will be in
the prototype.
3. Again, I've just discovered SOLR-212
(DirectSolrConnection). I think all my comments about Solrj
apply to this, useful though it clearly is.]
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue
online.
|
|
| Updated: (SOLR-344) New Java API |
  United States |
2007-08-28 04:19:30 |
[ https://issues.apache.org/jira/browse/SOLR-344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Woods updated SOLR-344:
--------------------------------
Attachment: New Java API for Solr.pdf
|
|
| Commented: (SOLR-344) New Java API |
  United States |
2007-09-05 16:53:34 |
[ https://issues.apache.org/jira/browse/SOLR-344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12525231 ]
Hoss Man commented on SOLR-344:
-------------------------------
I've only had a chance to skim the attached PDF ... I've
printed it out in the hopes that I'll find some time to read
in depth your specific ideas about what the ideal Solr API
should be; but there are a few things that jumped out at me
that I wanted to address while they were on my mind...
-- Motivation --
- Direct Java is "Better" -
A key assumption in this proposal seems to be that "if
you are writing a Java app, and you want to use Solr, you
should not use the HTTP interface." I would argue
strongly against this assumption. There are *lots* of
reasons why it makes sense to treat Solr as a webservice and
interact with it over HTTP instead of having a tight
coupling with your Java application: redundancy, load
balancing, ... Even if someone had a situation where they
only had one machine in their entire operation, and all of
their applications ran on that machine, I would still suggest
installing a servlet container and using Solr that way
because it's likely they will have more than one application
that will want to deal with their index. Solr can make a
lot of good optimizations and assumptions that go right out
the window if you try to embed Solr in 2 different apps
reading and writing to the same physical index directory.
Even if compelling stats can be presented that the
HTTP+XML/JSON overhead is in fact a bottleneck, I would
still think that pursuing something like an RMI based
client/server API in addition to the HTTP API would make
more sense than encouraging people to use it directly in the
JVM of their other applications. Even the plugin model (for
embedding your custom Java code into Solr) is something I
only recommend in situations where it makes a lot of sense
for that logic to be tied closely with the Solr or Lucene
internals (ie: as part of the TokenStream, or dealing with
the DocSets before they are cached, etc...)
The #1 "value add" that Solr has over Lucene is
the client/server abstraction ... there are certainly other
value adds -- some small (like added TokenFilters) and some
big (like the IndexSchema concept) -- and many of these
could probably be refactored into the Lucene core (or a
Lucene contrib) so they could be reused by other Lucene
applications in addition to Solr ... but Solr *is* an
application.
Arguing that you shouldn't bother using a client/server
relationship to deal with Solr if your application is
written in Java is like arguing that you shouldn't bother
using a client/server relationship to deal with MySQL if
your application is written in C.
- Demand for Direct Access -
The statement "a significant proportion of questions on
the mailing lists are clearly from people who are attempting
such integrations right now." does not serve as a clear
call to action ... even if a significant number of recent
questions have related to embedded Solr (and I'm not
convinced the number is that significant) that one data
point alone does not clearly indicate that it is
important/urgent to make this easier to do. It just
indicates that the people who are attempting to do this have
questions about how to do it ... which isn't that surprising
considering it's a relatively new concept that hasn't really
been documented. Some of these people may just be assuming
that they *need* to embed Solr in their existing Java
applications because they don't realize it's intended to be
used as a server.
The java-user Lucene list gets lots of questions from
people who misunderstand the demo code in the Lucene
distribution and think Lucene is an application that they
can run on the command line to index files and search them
-- that doesn't mean that the Lucene-Java project should
revamp itself to focus on producing an application instead
of a library, it means the Lucene-Java community has to help
educate users about: a) how they can use the Lucene library
to build their own apps; and b) what apps are built on top
of the Lucene library that might be useful to them.
I think it would probably be more beneficial for the
community as a whole if people spent more time/energy
documenting the benefits/mechanisms of using Solr as a
server, or improving the client APIs to make communicating
with a Solr server faster/easier than it would to dedicate a
lot of resources solely towards making Solr more of a
library and less of an application.
-- Strategy for Making Changes --
All that said -- I agree with you that a lot of improvements
can and should be made to the internal APIs. Not because I
think we need to make it easier to embed Solr, but to make
it easier for new developers to work on the Solr internals
(or to write plugins). If embedding Solr gets easier as a
result -- great, but I don't see that as a compelling reason
for change.
Somewhere in your doc, you advocated the importance of a top
down complete API overhaul instead of approaching things
piecemeal (forgive me for not remembering exactly how you
put it, I'm not trying to put words in your mouth I just
remember there being a sentiment like this) ... while I
think it would definitely make sense to have some
discussions on solr-dev about what the big problems are with
the internal APIs and come up with a high level picture of
what the ideal API might be so we can aim for it, the best
way to get there is with small patches that focus on a
single area.
I say this from experience as someone who has submitted
patches to projects, and as a committer who has to review
patches: big patches that change a lot of things take a lot
more work/discussion/thought to review and generally spend a
lot longer sitting in JIRA than shorter, more focused patches
(some day I'll sit down and do the math and write out
"Hoss's Patch Size Theorem" but for now take my
word for it that there's an exponential factor in there
somewhere). The best way to proceed is probably to start by
tackling individual pieces of functionality, adding the API
you think there should be, and refactoring the current code
to implement/use that API (leaving the old one around as
deprecated).
-- Loose APIs vs Tight APIs --
While I agree there are a lot of places where things like
NamedList are overused, don't discount the value add that
this kind of "pass through" API allows ... the
decision to use things like the SolrParams class in some
utility classes was made consciously in a lot of cases, in
order to make it easier for these utilities to grow and
evolve without their callers needing to be aware of these
new changes ... SimpleFacets for example takes in a generic
SolrParams and returns a NamedList so that as new
functionality is added and new params are added to control
that functionality, existing request handlers don't have to
be specifically aware of all those param names in order to get
that functionality. They can be if they want: they can
construct a SolrParams instance just for driving
SimpleFacets behavior instead of passing through the main
request params, it's their choice ... but a very specific
API, where every query param was mapped to a constructor arg
or a setter method or a command pattern object or something
else that had a tighter coupling would require changes in
RequestHandlers anytime something like date faceting was
added (or even facet.mincount).
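To make the pass-through argument concrete, here is a minimal sketch. It uses plain Java maps as stand-ins for SolrParams and NamedList, and the class name, method names, and toy counting logic are all hypothetical; it is not how SimpleFacets is actually implemented. The point is only that the "handler" forwards its params untouched, so a param such as facet.mincount can be honoured by the utility without the handler knowing it exists.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a loosely typed utility such as SimpleFacets. */
public class LooseFacetSketch {

    /**
     * Counts occurrences of each value, then drops entries below the
     * optional "facet.mincount" param. Unknown params are simply ignored.
     */
    public static Map<String, Integer> facetCounts(List<String> values,
                                                   Map<String, String> params) {
        int minCount = Integer.parseInt(params.getOrDefault("facet.mincount", "0"));
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String v : values) {
            counts.merge(v, 1, Integer::sum);
        }
        counts.values().removeIf(c -> c < minCount);
        return counts;
    }

    /** A "request handler" that forwards params without inspecting them. */
    public static Map<String, Object> handle(List<String> values,
                                             Map<String, String> requestParams) {
        Map<String, Object> response = new LinkedHashMap<>();
        response.put("facet_counts", facetCounts(values, requestParams));
        return response;
    }

    public static void main(String[] args) {
        List<String> values = List.of("red", "blue", "red", "green", "red", "blue");
        // No facet params at all: every value is reported.
        System.out.println(handle(values, Map.of()));
        // The handler code is unchanged, yet the new param takes effect.
        System.out.println(handle(values, Map.of("facet.mincount", "2")));
    }
}
```

A tightly typed alternative would need a new constructor argument or setter on the utility, and a matching change in every handler, each time such a param was introduced.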
If I remember correctly, you pointed out in the mailing list
that things like SimpleFacets or the highlighting utils
shouldn't return NamedLists -- they should return more
specific FacetResults/HighlightResults objects ... I would
definitely be on board with patches like that. Refactoring the
code to use a well typed response object certainly would
make the code easier to understand, and new getters can
always be added for pulling out new types of information as
they are added -- the important thing is that result objects like
this would need to be able to translate themselves back into
simple objects that can be understood by ResponseWriters so
that the various RequestHandlers/ResponseWriters don't
*need* to be aware of their details.
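One way such a result object might look is sketched below. FacetResultsSketch and its methods are hypothetical names invented for illustration (no such class is claimed to exist in Solr), and nested java.util maps stand in for the simple objects a ResponseWriter understands. Typed getters serve Java callers, while toSimpleObjects() lets generic output code stay unaware of the typed class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical typed facet result; not an existing Solr class. */
public final class FacetResultsSketch {

    // field name -> (facet value -> count), in insertion order
    private final Map<String, Map<String, Integer>> countsByField = new LinkedHashMap<>();

    /** Records the count for one value of one field. */
    public void add(String field, String value, int count) {
        countsByField.computeIfAbsent(field, f -> new LinkedHashMap<>()).put(value, count);
    }

    /** Typed getter: returns 0 when the field or value is unknown. */
    public int count(String field, String value) {
        Map<String, Integer> counts = countsByField.get(field);
        if (counts == null) {
            return 0;
        }
        return counts.getOrDefault(value, 0);
    }

    /** Translates this object into plain nested maps for generic writers. */
    public Map<String, Object> toSimpleObjects() {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Map<String, Integer>> e : countsByField.entrySet()) {
            out.put(e.getKey(), new LinkedHashMap<String, Integer>(e.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        FacetResultsSketch results = new FacetResultsSketch();
        results.add("color", "red", 3);
        results.add("color", "blue", 2);

        // Typed access for Java callers.
        System.out.println(results.count("color", "red"));   // prints 3

        // Generic output code only ever sees maps, strings and numbers.
        System.out.println(results.toSimpleObjects());        // prints {color={red=3, blue=2}}
    }
}
```

New getters can be added to the typed class later without touching any code that only consumes the simple-object form.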
|
|
| Commented: (SOLR-344) New Java API |
  United States |
2007-09-05 23:53:33 |
[ https://issues.apache.org/jira/browse/SOLR-344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12525287 ]
Jonathan Woods commented on SOLR-344:
-------------------------------------
Hoss - I take on board a lot of what you say, and I
appreciate the fact you even skimmed the PDF without
immediately accusing me of hubris! I'll come back to you in
a couple of days' time, when I've finished hacking my way
through old Lucene-based code I hoped to have (and maybe
should have) thrown away.