Uploaded image for project: 'Solr'
  1. Solr
  2. SOLR-4914

Factor out core discovery and persistence logic

Details

    • Improvement
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 6.0
    • 4.5, 6.0
    • None
    • None

    Description

      Alan Woodward has done some work to refactor how core persistence works that we should work on going forward that I want to separate from a shorter-term tactical problem (See SOLR-4910).

      I'm attaching Alan's patch to this JIRA and we'll carry it forward separately from 4910.

      Attachments

        1. SOLR-4914.patch
          202 kB
          Alan Woodward
        2. SOLR-4914.patch
          200 kB
          Alan Woodward
        3. SOLR-4914.patch
          197 kB
          Alan Woodward
        4. SOLR-4914.patch
          183 kB
          Alan Woodward
        5. SOLR-4914.patch
          158 kB
          Alan Woodward
        6. SOLR-4914.patch
          125 kB
          Alan Woodward
        7. SOLR-4914.patch
          120 kB
          Alan Woodward
        8. SOLR-4914.patch
          38 kB
          Erick Erickson

        Issue Links

          Activity

            romseygeek Alan Woodward added a comment -

            Thanks for opening this, Erick.

            I think it makes sense to combine persistence and discovery, as they're basically two sides of the same coin. So pre Solr 4.3, we loaded cores from solr.xml and persisted them back there. Now we allow Solr to crawl the filesystem looking for core definition files and we persist these.

            I'm thinking the combined interface should look something like this:

            public interface CoreDiscoverer {
            
                public List<CoreDescriptor> discover();
            
                public void persist(CoreDescriptor cd);
                public void delete(CoreDescriptor cd);
            
            }
            

            This also allows us to make discovery and persistence completely pluggable. If I want to store my core definitions in an external DB, or a plain text file, or whatever, I implement a PlainTextCoreDiscoverer and drop it onto the classpath, and refer to it from solr.xml.

            romseygeek Alan Woodward added a comment - Thanks for opening this, Erick. I think it makes sense to combine persistence and discovery, as they're basically two sides of the same coin. So pre Solr 4.3, we loaded cores from solr.xml and persisted them back there. Now we allow Solr to crawl the filesystem looking for core definition files and we persist these. I'm thinking the combined interface should look something like this: public interface CoreDiscoverer { public List<CoreDescriptor> discover(); public void persist(CoreDescriptor cd); public void delete(CoreDescriptor cd); } This also allows us to make discovery and persistence completely pluggable. If I want to store my core definitions in an external DB, or a plain text file, or whatever, I implement a PlainTextCoreDiscoverer and drop it onto the classpath, and refer to it from solr.xml.
            romseygeek Alan Woodward added a comment -

            Patch with my latest status. It won't compile yet, as I haven't updated all tests to reflect the new API. Will work on that next.

            This combines core discovery and persistence into the CoresListPersistor interface. All persistence logic is removed from CoreContainer and ConfigSolr and put into the two implementing classes of CorePropertiesPersistor and SolrXMLPersistor.

            CoreDescriptor is tidied up a bit, and made effectively immutable (would be nicer to make it really immutable, maybe with an ImmutableProperties class). The original pre-substitution parameters are stored as well as the values after substitution, which makes persistence a lot easier (you just read from the original values).

            Solr.xml persistence is also made a lot simpler, by just storing everything around the <cores/> tag as a flat string, and only updating the <core> tags. So you don't need to remember to add new solr.xml parameters to core persistence logic any more, and things like comments will be preserved.

            This is a pretty big patch, and there's still a fair amount to do, but I'd be grateful for some preliminary reviews. I think it simplifies the whole core discovery/persistence logic a lot.

            romseygeek Alan Woodward added a comment - Patch with my latest status. It won't compile yet, as I haven't updated all tests to reflect the new API. Will work on that next. This combines core discovery and persistence into the CoresListPersistor interface. All persistence logic is removed from CoreContainer and ConfigSolr and put into the two implementing classes of CorePropertiesPersistor and SolrXMLPersistor. CoreDescriptor is tidied up a bit, and made effectively immutable (would be nicer to make it really immutable, maybe with an ImmutableProperties class). The original pre-substitution parameters are stored as well as the values after substitution, which makes persistence a lot easier (you just read from the original values). Solr.xml persistence is also made a lot simpler, by just storing everything around the <cores/> tag as a flat string, and only updating the <core> tags. So you don't need to remember to add new solr.xml parameters to core persistence logic any more, and things like comments will be preserved. This is a pretty big patch, and there's still a fair amount to do, but I'd be grateful for some preliminary reviews. I think it simplifies the whole core discovery/persistence logic a lot.
            romseygeek Alan Woodward added a comment -

            Same patch as before, updated to trunk to include all the SOLR-4910 changes. Most of which I'm nuking

            romseygeek Alan Woodward added a comment - Same patch as before, updated to trunk to include all the SOLR-4910 changes. Most of which I'm nuking
            erickerickson Erick Erickson added a comment -

            Just make sure all the tests continue to pass

            erickerickson Erick Erickson added a comment - Just make sure all the tests continue to pass
            romseygeek Alan Woodward added a comment -

            Latest status. There are still some failing tests, and I need to add some new generic persistence tests.

            romseygeek Alan Woodward added a comment - Latest status. There are still some failing tests, and I need to add some new generic persistence tests.
            romseygeek Alan Woodward added a comment -

            Nearly there. ShardSplitTest is failing consistently, and I still need to work out what's going on there. There's also a nocommit in there concerning what to do when unloading a core discovered via a core.properties file - should we delete the file, or rename it? I think renaming it is probably the most user-friendly thing to do, as then you just have to rename it back again to make the core available next time you start up.

            romseygeek Alan Woodward added a comment - Nearly there. ShardSplitTest is failing consistently, and I still need to work out what's going on there. There's also a nocommit in there concerning what to do when unloading a core discovered via a core.properties file - should we delete the file, or rename it? I think renaming it is probably the most user-friendly thing to do, as then you just have to rename it back again to make the core available next time you start up.
            erickerickson Erick Erickson added a comment -

            bq: I think renaming it is probably the most user-friendly thing to do

            +1 unless deleteIndex=true in which case nuke it all?

            erickerickson Erick Erickson added a comment - bq: I think renaming it is probably the most user-friendly thing to do +1 unless deleteIndex=true in which case nuke it all?
            romseygeek Alan Woodward added a comment -

            I think if deleteIndex=true then it will already have been nuked.

            romseygeek Alan Woodward added a comment - I think if deleteIndex=true then it will already have been nuked.
            romseygeek Alan Woodward added a comment -

            Final patch. All tests pass.

            There are a couple of bits here which could do with following up:
            1) CoreAdminHandler.handleCreateAction is doing a bunch of stuff that really ought to be moved inside CoreContainer.
            2) I'm a bit concerned by what will happen if somebody creates two cores using the same instance dir when in core discovery mode - the second core's core.properties file will overwrite the first. This is only an issue in core discovery mode, though, as storing core definitions in solr.xml can deal with this fine.

            romseygeek Alan Woodward added a comment - Final patch. All tests pass. There are a couple of bits here which could do with following up: 1) CoreAdminHandler.handleCreateAction is doing a bunch of stuff that really ought to be moved inside CoreContainer. 2) I'm a bit concerned by what will happen if somebody creates two cores using the same instance dir when in core discovery mode - the second core's core.properties file will overwrite the first. This is only an issue in core discovery mode, though, as storing core definitions in solr.xml can deal with this fine.
            erickerickson Erick Erickson added a comment -

            A massive patch, so I haven't looked in great detail, but few comments:

            1> CoreAdminCreateDiscoverTest, there has been a change from testing that specific properties are preserved to just checking the number of properties. That test was written that way specifically because instanceDir was being preserved in the core.properties file. The way it is if we inadvertently preserved instanceDir but left out, say dataDir the test would pass. Can you put that back?

            2> TestCoreContainer.testPersist is removed, what replaces it?

            Next step: Removing persistence in 5.x!

            About your point 2, creating two cores with same instance dir. Last I knew there was a check at startup time to report this, but I agree it's a bit of a trap.

            erickerickson Erick Erickson added a comment - A massive patch, so I haven't looked in great detail, but few comments: 1> CoreAdminCreateDiscoverTest, there has been a change from testing that specific properties are preserved to just checking the number of properties. That test was written that way specifically because instanceDir was being preserved in the core.properties file. The way it is if we inadvertently preserved instanceDir but left out, say dataDir the test would pass. Can you put that back? 2> TestCoreContainer.testPersist is removed, what replaces it? Next step: Removing persistence in 5.x! About your point 2, creating two cores with same instance dir. Last I knew there was a check at startup time to report this, but I agree it's a bit of a trap.
            romseygeek Alan Woodward added a comment -

            re CoreAdminCreateDiscoverTest, the lines immediately prior to the size check are checks for the four properties in question, so that's covered.

            re testPersist, that's covered by TestSolrXMLPersistence and TestCoreDiscovery. But some more tests here would be good, I agree...

            re cores sharing instance dir, startup time will be too late to detect the problem here. Creating a new core via the CoreAdminHandler will overwrite any existing core.properties file. Maybe the way to deal with this is to call locator.persist() before we call create(), and throw an exception from CorePropertiesLocator. Will try and work something up.

            I also think that this may remove the need to deprecate persistence - it's a lot more maintainable now.

            romseygeek Alan Woodward added a comment - re CoreAdminCreateDiscoverTest, the lines immediately prior to the size check are checks for the four properties in question, so that's covered. re testPersist, that's covered by TestSolrXMLPersistence and TestCoreDiscovery. But some more tests here would be good, I agree... re cores sharing instance dir, startup time will be too late to detect the problem here. Creating a new core via the CoreAdminHandler will overwrite any existing core.properties file. Maybe the way to deal with this is to call locator.persist() before we call create(), and throw an exception from CorePropertiesLocator. Will try and work something up. I also think that this may remove the need to deprecate persistence - it's a lot more maintainable now.
            erickerickson Erick Erickson added a comment -

            bq: I also think that this may remove the need to deprecate persistence - it's a lot more maintainable now.

            I sure hope so, it was pretty ugly for a long time.

            But the deprecation also has another goal, moving all the core definitions out to discovery mode rather than this current hybrid. It's not clear to me if we need to continue persisting solr.xml if you never have to define cores there. But that's a discussion for another day I think.

            Thans for the effort here!

            Erick

            erickerickson Erick Erickson added a comment - bq: I also think that this may remove the need to deprecate persistence - it's a lot more maintainable now. I sure hope so, it was pretty ugly for a long time. But the deprecation also has another goal, moving all the core definitions out to discovery mode rather than this current hybrid. It's not clear to me if we need to continue persisting solr.xml if you never have to define cores there. But that's a discussion for another day I think. Thans for the effort here! Erick
            romseygeek Alan Woodward added a comment -

            Adds a test that you can't add two cores that share an instanceDir if you're using the CorePropertiesLocator.

            re solr.xml persistence, the cores list is the only thing that changes, everything else is immutable. Which is why this JIRA doesn't bother trying to reconstruct complicated XML from the config, it just keeps a copy of the existing XML and replaces the cores list on persistence. Much easier

            romseygeek Alan Woodward added a comment - Adds a test that you can't add two cores that share an instanceDir if you're using the CorePropertiesLocator. re solr.xml persistence, the cores list is the only thing that changes, everything else is immutable. Which is why this JIRA doesn't bother trying to reconstruct complicated XML from the config, it just keeps a copy of the existing XML and replaces the cores list on persistence. Much easier
            romseygeek Alan Woodward added a comment -

            I'm going to commit this tomorrow unless anybody objects, to trunk and 4.x. It's kind of a big change to get into 4.4 though.

            romseygeek Alan Woodward added a comment - I'm going to commit this tomorrow unless anybody objects, to trunk and 4.x. It's kind of a big change to get into 4.4 though.
            erickerickson Erick Erickson added a comment -

            -1 on trying to put it into 4.4

            erickerickson Erick Erickson added a comment - -1 on trying to put it into 4.4
            romseygeek Alan Woodward added a comment -

            Actually I think SOLR-5029 needs to be fixed before I commit this.

            romseygeek Alan Woodward added a comment - Actually I think SOLR-5029 needs to be fixed before I commit this.
            romseygeek Alan Woodward added a comment -

            New patch, taking into account SOLR-5029 as well. All tests pass, committing to trunk and 4x (but not 4.4!) shortly.

            romseygeek Alan Woodward added a comment - New patch, taking into account SOLR-5029 as well. All tests pass, committing to trunk and 4x (but not 4.4!) shortly.

            Commit 1502276 from romseygeek
            [ https://svn.apache.org/r1502276 ]

            SOLR-4914: Factor out core discovery and persistence logic

            jira-bot ASF subversion and git services added a comment - Commit 1502276 from romseygeek [ https://svn.apache.org/r1502276 ] SOLR-4914 : Factor out core discovery and persistence logic

            Commit 1502280 from romseygeek
            [ https://svn.apache.org/r1502280 ]

            SOLR-4914: Factor out core discovery and persistence logic

            jira-bot ASF subversion and git services added a comment - Commit 1502280 from romseygeek [ https://svn.apache.org/r1502280 ] SOLR-4914 : Factor out core discovery and persistence logic

            Commit 1502468 from romseygeek
            [ https://svn.apache.org/r1502468 ]

            SOLR-4914: Close OutputStreamWriter properly, use System.getProperty("line.separator") instead of \n

            Fixes Windows test failures.

            jira-bot ASF subversion and git services added a comment - Commit 1502468 from romseygeek [ https://svn.apache.org/r1502468 ] SOLR-4914 : Close OutputStreamWriter properly, use System.getProperty("line.separator") instead of \n Fixes Windows test failures.

            Commit 1502469 from romseygeek
            [ https://svn.apache.org/r1502469 ]

            SOLR-4914: Close OutputStreamWriter, use platform-independent newlines in tests

            jira-bot ASF subversion and git services added a comment - Commit 1502469 from romseygeek [ https://svn.apache.org/r1502469 ] SOLR-4914 : Close OutputStreamWriter, use platform-independent newlines in tests

            Commit 1502481 from romseygeek
            [ https://svn.apache.org/r1502481 ]

            SOLR-4914: Use Properties.store(OutputStream, String) for back compatibility

            jira-bot ASF subversion and git services added a comment - Commit 1502481 from romseygeek [ https://svn.apache.org/r1502481 ] SOLR-4914 : Use Properties.store(OutputStream, String) for back compatibility

            Commit 1502483 from romseygeek
            [ https://svn.apache.org/r1502483 ]

            SOLR-4914: Use Properties.store(OutputStream, String) for back compatibility

            jira-bot ASF subversion and git services added a comment - Commit 1502483 from romseygeek [ https://svn.apache.org/r1502483 ] SOLR-4914 : Use Properties.store(OutputStream, String) for back compatibility

            Commit 1502507 from romseygeek
            [ https://svn.apache.org/r1502507 ]

            SOLR-4914: Close input streams as well

            jira-bot ASF subversion and git services added a comment - Commit 1502507 from romseygeek [ https://svn.apache.org/r1502507 ] SOLR-4914 : Close input streams as well

            Commit 1502508 from romseygeek
            [ https://svn.apache.org/r1502508 ]

            SOLR-4914: Close input streams as well

            jira-bot ASF subversion and git services added a comment - Commit 1502508 from romseygeek [ https://svn.apache.org/r1502508 ] SOLR-4914 : Close input streams as well

            People

              romseygeek Alan Woodward
              erickerickson Erick Erickson
              Votes:
              1 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: