Description
CloudSolrServer (and LBHttpSolrServer) do not allow the user to set underlying HttpSolrServer and HttpClient settings without arcane java enchantments.
Attachments
Attachments
- SOLR-4715-incomplete.patch
- 21 kB
- Shawn Heisey
- SOLR-4715-incomplete.patch
- 23 kB
- Shawn Heisey
- SOLR-4715-httpclient.patch
- 8 kB
- Shalin Shekhar Mangar
- SOLR-4715-httpclient.patch
- 11 kB
- Shalin Shekhar Mangar
Issue Links
- is blocked by
-
SOLR-4919 Allow setting ResponseParser and RequestWriter on LBHttpSolrServer
- Resolved
Activity
Well the reason i am looking for xml response form solrj is, i am having a system which understands xml and json,off course i can use javabin, but in that case i need to manually convert the result to xml before feeding to my system which solr already provides in xml/json. since solr provides xml response i designed my system to directly take xml/json response and hence.
My initial inclination is to NOT provide additional constructors, but to provide a number of getters and setters. In addition to methods for setting timeouts and common httpclient properties, I would include getHttpClient and possibly something with a name like getHttpSolrServer or getInnerSolrServer. For CloudSolrServer, most of these new methods would just send/request the same information to/from LBHttpSolrServer.
Should I change my approach? I haven't written any code yet.
hupadhyay, the following code *MIGHT* allow you to change the response parser back to XML before this issue is implemented. I have not tested this, and I would be very curious about whether it works for you. It also changes a couple of HttpClient parameters, but you could remove those two lines.
import org.apache.http.client.HttpClient; import org.apache.solr.client.solrj.ResponseParser; import org.apache.solr.client.solrj.SolrServer; import org.apache.solr.client.solrj.SolrServerException; import org.apache.solr.client.solrj.impl.CloudSolrServer; import org.apache.solr.client.solrj.impl.HttpClientUtil; import org.apache.solr.client.solrj.impl.LBHttpSolrServer; import org.apache.solr.client.solrj.impl.XMLResponseParser; import org.apache.solr.common.params.ModifiableSolrParams; public class TestStuff { void test() throws MalformedURLException { String zkHost = ""; ModifiableSolrParams params = new ModifiableSolrParams(); params.set(HttpClientUtil.PROP_MAX_CONNECTIONS, 1000); params.set(HttpClientUtil.PROP_MAX_CONNECTIONS_PER_HOST, 200); HttpClient client = HttpClientUtil.createClient(params); ResponseParser parser = new XMLResponseParser(); LBHttpSolrServer lbServer = new LBHttpSolrServer(client, parser, "http://localhost/solr"); lbServer.removeSolrServer("http://localhost/solr"); SolrServer server = new CloudSolrServer(zkHost, lbServer); } }
I have tried some minimal testing with this code for setting the response parser and httpclient params, and it appears to work.
I've run into a challenge in creating a patch for this issue. The response parser object in LBHttpSolrServer is final. If there's a really good reason for this object to be final, then creating setParser and setRequestWriter methods could be really challenging.
After a little bit of thought, I'm thinking the reason the ResponseParser object is final is so that there are no thread visibility problems, because it can't ever change. The following ideas would require removing that final modifier and adding an object for a shared RequestWriter.
For CloudSolrServer: If no requests have been processed yet, then the LBHttpSolrServer object will not yet have any internal HttpSolrServer objects, so passing through setParser and setRequestWriter calls should be perfectly safe. We can block these methods once the first request gets processed, or we can just pass them through and rely on the following:
For LBHttpSolrServer, we can do one of three things with setParser and setRequestWriter if there are any ServerWrapper objects (and therefore HttpSolrServer objects): 1) Throw an exception. 2) Ignore the request. 3) Make the requested change on all HttpSolrServer objects.
Haven't looked at the code, so this may be from left field. But I'm not enthused about setters that work only some of the time, would it be possible to make a c'tor that took a RequestWriter and Parser? Note that I'm also not enthused about having 15 different c'tors, but it might be the lesser of two evils.
Of the three, I think (1) is the way to go if you must choose between them. (2) fails mysteriously and (3) may be tricky, only worth doing if you can come up with a use case where it's desirable to change mid-stream and it would be too expensive to create a new CloudSolrServer.
FWIW
erickerickson, that was my thought too. I don't like option 2, even if it spits out a WARN log, but it's one way of handling things, so I mentioned it. I think we would struggle to find a valid use case for changing parser/writer midstream. The only good time to set these things is at or right after object creation.
Due to how the existing ctors work on the LB object, it seems like a good option to add one more ctor with both parser and writer, but like you, I want to avoid ctor explosion. Perhaps I could deprecate the one with just the parser and remove it in trunk. I'd need to make sure that all ctors will work with null values so users can override one but not the other, and log something at INFO (or maybe DEBUG) saying that the default was chosen for null input.
My instinct after this discussion is to throw an exception at the LB level if you attempt to change parser/writer after ServerWrapper objects have been created. This means that the signature for the setter methods will include an exception, but SolrServer doesn't have these methods, so there are no inheritance problems with this. The javadocs will need to be very good.
Getting into the code, here's something to note: The javadoc on HttpSolrServer#setParser indicates that it is not thread safe, so I think that throwing the exception is the right thing to do.
In-progress patch. Still a lot left to do, but this is the general direction I'm going. It compiles, but at least one test fails as a direct result.
I appear to have done something really wrong. I'll have a new patch up soon. The following applies to the latest patch, but I think that it also applies to the one that's already attached:
The first part of LBHttpSolrServerTest#testLBHttpSolrServerHttpClientResponseParserStringArray fails with my patch, because when a null value is passed in, the object will create the parser/writer rather than assign the null. Is there ever a valid reason to have a null parser?
A lot of other solrj tests fail, mostly due to parsing errors in XMLResponseParser, but also due sometimes to NullPointerException.
The patch also appears to be responsible for a failure in the core test named TestCloudManagedSchema:
[junit4:junit4] > Throwable #1: java.lang.RuntimeException: java.io.UnsupportedEncodingException: text/plain; charset=ISO-8859-1 [junit4:junit4] > at __randomizedtesting.SeedInfo.seed([6B54507467721A2F:EAB2DE6C102D7A13]:0) [junit4:junit4] > at org.apache.solr.schema.TestCloudManagedSchema$RawResponseParser.processResponse(TestCloudManagedSchema.java:104) [junit4:junit4] > at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:440) [junit4:junit4] > at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:213) [junit4:junit4] > at org.apache.solr.schema.TestCloudManagedSchema.getFileContentFromZooKeeper(TestCloudManagedSchema.java:87) [junit4:junit4] > at org.apache.solr.schema.TestCloudManagedSchema.doTest(TestCloudManagedSchema.java:67) [junit4:junit4] > at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:815) [junit4:junit4] > at java.lang.Thread.run(Thread.java:722) [junit4:junit4] > Caused by: java.io.UnsupportedEncodingException: text/plain; charset=ISO-8859-1 [junit4:junit4] > at sun.nio.cs.StreamDecoder.forInputStreamReader(StreamDecoder.java:71) [junit4:junit4] > at java.io.InputStreamReader.<init>(InputStreamReader.java:100) [junit4:junit4] > at org.apache.commons.io.IOUtils.copy(IOUtils.java:1435) [junit4:junit4] > at org.apache.commons.io.IOUtils.toString(IOUtils.java:585) [junit4:junit4] > at org.apache.solr.schema.TestCloudManagedSchema$RawResponseParser.processResponse(TestCloudManagedSchema.java:102) [junit4:junit4] > ... 45 more
Taking a look at the actual patch, I think I've discovered my basic problem. Apparently when you want to stream results, you use a null parser – so there is indeed a use for that.
The null parser wasn't the reason for all the failures. It was me trying to replace the deprecated HttpComponents call (EntityUtils.getContentCharSet) that gets the character set from the response.
Updated patch, still incomplete.
Is there a valid use case for having a null parser? The test for LBHttpSolrServer implies that there is.
I had the same realization while trying to update flux (clojure solr client) for cloud features. How about a builder class/methods to avoid the explosion of constructor types? We can do something about this in 5.0
An example:
new CloudSolrServer.Builder().zkConnectString(zkHosts).httpClient(client).responseParser(parser).requestWriter(writer).defaultCollection(collection).build();
Just add a constructor which accepts HttpClient as a param . It'll be consistent with the other implementations
Here's a patch which adds constructors for HttpClient. Setting other properties such as request writers and response parsers has already been done with SOLR-5223.
Merged some javadoc improvements in LBHttpSolrServer that Shawn had in the other patches.
Commit 1632049 from shalin@apache.org in branch 'dev/trunk'
[ https://svn.apache.org/r1632049 ]
SOLR-4715: Add CloudSolrServer constructors which accept a HttpClient instance
Commit 1632050 from shalin@apache.org in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1632050 ]
SOLR-4715: Add CloudSolrServer constructors which accept a HttpClient instance
Oops, I committed wrong javadocs for the setParser method in LbHttpSolrServer. Removing them now.
Commit 1632164 from shalin@apache.org in branch 'dev/trunk'
[ https://svn.apache.org/r1632164 ]
SOLR-4715: Fixing precommit and removing incorrect javadocs
Commit 1632165 from shalin@apache.org in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1632165 ]
SOLR-4715: Fixing precommit and removing incorrect javadocs
Is there a reason that you can't use javabin? It is more efficient than XML and should work perfectly for any SolrJ code where you are using java objects.
There could be an aspect of this that I am not seeing, but CloudSolrServer really has no need for XML responses at this time. It could be argued that allowing for a different response parser is a good idea, so I'll leave this issue open and let someone more experienced decide whether it should be closed or pursued.
The only compelling reason I know about to use XML responses in SolrJ is when you need to connect to a Solr version running a different version of javabin. The javabin version changed from 1 to 2 when Solr 3.1.0 was released, in order to fix some bugs. I'm not aware of any reason at this time to make javabin version 3.
CloudSolrServer only exists in SolrJ 4.0 and later, and can only be used with Solr 4.0 and later, so there are no javabin issues. Due to how fast SolrCloud (and its Zookeeper integration) is changing, I would not recommend using mismatched Solr and SolrJ versions with CloudSolrServer. Although cross-version compatibility is a strong goal with SolrJ, SolrCloud is under pretty heavy development.
LBHttpSolrServer is a little different. It's been around forever, so the javabin incompatibility could be a problem. Thankfully, it has a contructor that will let you build it with a different response parser:
http://lucene.apache.org/solr/4_2_1/solr-solrj/org/apache/solr/client/solrj/impl/LBHttpSolrServer.html#LBHttpSolrServer%28org.apache.http.client.HttpClient,%20org.apache.solr.client.solrj.ResponseParser,%20java.lang.String...%29