Description
The test randomization recently added as part of SOLR-11507 has caused CloudSolrClientTest.testRetryUpdatesWhenClusterStateIsStale to fail semi-regularly on master. The test succeeds for me on only 3 out of 10 runs. It fails with the message:
[junit4] 2> 14848 ERROR (TEST-CloudSolrClientTest.testRetryUpdatesWhenClusterStateIsStale-seed#[64E89FBB977E15AA]) [ ] o.a.s.c.s.i.CloudSolrClient Request to collection [stale_state_test_col] failed due to (404) org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://127.0.0.1:38925/solr/stale_state_test_col_shard1_replica_n1: Expected mime type application/octet-stream but got text/html. <html>
[junit4] 2> <head>
[junit4] 2> <meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
[junit4] 2> <title>Error 404 </title>
[junit4] 2> </head>
[junit4] 2> <body>
[junit4] 2> <h2>HTTP ERROR: 404</h2>
[junit4] 2> <p>Problem accessing /solr/stale_state_test_col_shard1_replica_n1/update. Reason:
[junit4] 2> <pre> Can not find: /solr/stale_state_test_col_shard1_replica_n1/update</pre></p>
[junit4] 2> <hr /><a href="http://eclipse.org/jetty">Powered by Jetty:// 9.3.20.v20170531</a><hr/>
[junit4] 2> </body>
[junit4] 2> </html>
[junit4] 2> , retry? 0
[junit4] 2> 14851 INFO (TEST-CloudSolrClientTest.testRetryUpdatesWhenClusterStateIsStale-seed#[64E89FBB977E15AA]) [ ] o.a.s.SolrTestCaseJ4 ###Ending testRetryUpdatesWhenClusterStateIsStale
[junit4] 2> NOTE: reproduce with: ant test -Dtestcase=CloudSolrClientTest -Dtests.method=testRetryUpdatesWhenClusterStateIsStale -Dtests.seed=64E89FBB977E15AA -Dtests.slow=true -Dtests.locale=es-VE -Dtests.timezone=Indian/Chagos -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
[junit4] ERROR 5.86s | CloudSolrClientTest.testRetryUpdatesWhenClusterStateIsStale <<<
[junit4] > Throwable #1: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://127.0.0.1:38925/solr/stale_state_test_col_shard1_replica_n1: Expected mime type application/octet-stream but got text/html. <html>
[junit4] > <head>
[junit4] > <meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
[junit4] > <title>Error 404 </title>
[junit4] > </head>
[junit4] > <body>
[junit4] > <h2>HTTP ERROR: 404</h2>
[junit4] > <p>Problem accessing /solr/stale_state_test_col_shard1_replica_n1/update. Reason:
[junit4] > <pre> Can not find: /solr/stale_state_test_col_shard1_replica_n1/update</pre></p>
[junit4] > <hr /><a href="http://eclipse.org/jetty">Powered by Jetty:// 9.3.20.v20170531</a><hr/>
[junit4] > </body>
[junit4] > </html>
[junit4] > at __randomizedtesting.SeedInfo.seed([64E89FBB977E15AA:D0D9075374976386]:0)
[junit4] > at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:607)
[junit4] > at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:255)
[junit4] > at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:244)
[junit4] > at org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:483)
[junit4] > at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:413)
[junit4] > at org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:559)
[junit4] > at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1016)
[junit4] > at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:883)
[junit4] > at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:816)
[junit4] > at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:194)
[junit4] > at org.apache.solr.client.solrj.request.UpdateRequest.commit(UpdateRequest.java:233)
[junit4] > at org.apache.solr.client.solrj.impl.CloudSolrClientTest.testRetryUpdatesWhenClusterStateIsStale(CloudSolrClientTest.java:844)
[junit4] > at java.lang.Thread.run(Thread.java:748)
After some digging, it looks like the issue is that testRetryUpdatesWhenClusterStateIsStale implicitly relies on directUpdatesToLeadersOnly and parallelUpdates, which are now randomized when using SolrTestCaseJ4's CloudSolrClient creation helpers.
Attached is a patch ensuring that testRetryUpdatesWhenClusterStateIsStale sets those two update-related properties explicitly instead of taking the randomized defaults. Without the patch, this test passes roughly 5 out of 20 times. With the patch, it passes consistently (20 out of 20 runs).
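To illustrate the failure mode (this is not the patch itself), here is a minimal, self-contained toy model of a test helper that randomizes two settings versus a test that pins them. Everything in it is hypothetical: `UpdateOptions`, `randomizedDefaults`, and `pinned` are stand-ins invented for this sketch and are not SolrJ or SolrTestCaseJ4 APIs, and the pinned values `true, true` are placeholders rather than the values the real test needs.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

public class PinnedOptionsSketch {

    /** Hypothetical stand-in for the two update-related client settings. */
    static final class UpdateOptions {
        final boolean directUpdatesToLeadersOnly;
        final boolean parallelUpdates;

        UpdateOptions(boolean directUpdatesToLeadersOnly, boolean parallelUpdates) {
            this.directUpdatesToLeadersOnly = directUpdatesToLeadersOnly;
            this.parallelUpdates = parallelUpdates;
        }
    }

    /** Mimics a creation helper that draws both settings from the test's random source. */
    static UpdateOptions randomizedDefaults(Random random) {
        return new UpdateOptions(random.nextBoolean(), random.nextBoolean());
    }

    /** Mimics a test that sets both values explicitly, so the random source is irrelevant. */
    static UpdateOptions pinned(boolean leadersOnly, boolean parallel) {
        return new UpdateOptions(leadersOnly, parallel);
    }

    /** Counts how many distinct setting combinations show up across simulated runs. */
    static int distinctCombinations(int runs, boolean pin) {
        Random random = new Random(42L); // fixed seed so the sketch is repeatable
        Set<String> seen = new HashSet<>();
        for (int i = 0; i < runs; i++) {
            // Placeholder values for the pinned case; see the note above.
            UpdateOptions o = pin ? pinned(true, true) : randomizedDefaults(random);
            seen.add(o.directUpdatesToLeadersOnly + "/" + o.parallelUpdates);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println("randomized combinations: " + distinctCombinations(100, false));
        System.out.println("pinned combinations: " + distinctCombinations(100, true));
    }
}
```

A test whose assertions hold under only one of the combinations will pass or fail depending on the draw, which is consistent with the intermittent failures described above. Pinning the values removes that dependence on the seed.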