Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Description
While benchmarking for performance, we saw a sharp change in the graphs:
https://issues.apache.org/jira/browse/SOLR-16525?focusedCommentId=17676725&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17676725
It turns out that a commit (SOLR-16414) slipped through all testing and caused a regression: after a node is restarted, its replicas do not come up as active.
This affects the 9.1 release, so I am opening a new JIRA issue to track it.
Here's how to reproduce it:
git clone https://github.com/fullstorydev/solr-bench
cd solr-bench
# prerequisites on ubuntu:
sudo apt install openjdk-11-jdk
sudo apt install wget unzip zip ant ivy lsof git netcat make maven jq
# this is a patch to comment out the cleanup/final shutdown
wget https://termbin.com/yuu95
git apply yuu95
mvn clean compile assembly:single
./cleanup.sh && ./stress.sh -c aa4f3d98ab19c201e7f3c74cd14c99174148616d suites/stress-facets-local.json
If the 95th percentile is <10 or so, we have a problem. It should be >300 or so. Since we disabled cleanup, we can hit http://localhost:50000/solr/ to open the Solr UI. In this case, I see that querying the ecommerce-events collection shows that shard2 is down.
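The replica state can also be checked without the UI. The following is a rough sketch, not part of solr-bench: it assumes the node from the run above is still listening on http://localhost:50000/solr and that the Collections API CLUSTERSTATUS response has its usual cluster/collections/shards/replicas layout. The helper names inactive_replicas and fetch_cluster_status are made up for illustration.

```python
import json
import urllib.request


def inactive_replicas(cluster_status, collection):
    """Return (shard, replica, state) for every replica that is not active.

    cluster_status is the parsed JSON body of a CLUSTERSTATUS call
    (assumed layout: cluster -> collections -> <name> -> shards -> replicas).
    """
    shards = cluster_status["cluster"]["collections"][collection]["shards"]
    problems = []
    for shard_name, shard in sorted(shards.items()):
        for replica_name, replica in sorted(shard.get("replicas", {}).items()):
            state = replica.get("state", "unknown")
            if state != "active":
                problems.append((shard_name, replica_name, state))
    return problems


def fetch_cluster_status(base_url, collection):
    """Fetch CLUSTERSTATUS for one collection from a running Solr node."""
    url = (
        f"{base_url}/admin/collections"
        f"?action=CLUSTERSTATUS&collection={collection}&wt=json"
    )
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Assumption: port 50000 and the collection name come from the repro above.
    status = fetch_cluster_status("http://localhost:50000/solr", "ecommerce-events")
    for shard, replica, state in inactive_replicas(status, "ecommerce-events"):
        print(f"{shard}/{replica}: {state}")
```

If the regression is present, this should list at least one replica of shard2 in a non-active state; an empty result would suggest all replicas came back up.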