Details
- Type: New Feature
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
To enable better management of indexing-related operations, especially reindexing on large repository setups, we should implement some tooling as part of oak-run.
The tool would support:
- For a DocumentNodeStore setup, it would be possible to connect oak-run to a live cluster, and it would take care of indexing -> storing the index on disk -> merging the index -> importing it back at the end. This would ensure that the live setup faces minimal disruption and little additional load
- For a SegmentNodeStore setup, it would be possible to index on a cloned setup and then provide a way to copy the index back
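The flow described above can be illustrated with a small, self-contained sketch. It is not oak-run code: the class `OutOfBandIndexSketch`, its methods, and the tab-separated file format are hypothetical, and a plain map of path to text stands in for the repository. It only shows the shape of the idea, assuming the index is built from a read-only snapshot, written to local disk, and then imported into the live index in a single step. The merge step is left out for brevity.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration only; none of these names exist in Oak.
public class OutOfBandIndexSketch {

    /** Step 1: build a term -> paths index from a read-only snapshot (path -> text). */
    static Map<String, SortedSet<String>> buildIndex(Map<String, String> snapshot) {
        Map<String, SortedSet<String>> index = new TreeMap<>();
        for (Map.Entry<String, String> node : snapshot.entrySet()) {
            for (String term : node.getValue().toLowerCase().split("\\s+")) {
                if (!term.isEmpty()) {
                    index.computeIfAbsent(term, t -> new TreeSet<>()).add(node.getKey());
                }
            }
        }
        return index;
    }

    /** Step 2: store the index on local disk, one "term<TAB>path,path" line per term. */
    static void storeOnDisk(Map<String, SortedSet<String>> index, Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, SortedSet<String>> e : index.entrySet()) {
            lines.add(e.getKey() + "\t" + String.join(",", e.getValue()));
        }
        Files.write(file, lines, StandardCharsets.UTF_8);
    }

    /** Step 3: read the stored index and swap it into the live index in one step. */
    static void importIndex(Path file, Map<String, SortedSet<String>> liveIndex) throws IOException {
        Map<String, SortedSet<String>> loaded = new TreeMap<>();
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            String[] parts = line.split("\t", 2);
            loaded.put(parts[0], new TreeSet<>(Arrays.asList(parts[1].split(","))));
        }
        // The live side is touched only here, after all indexing work is done.
        liveIndex.clear();
        liveIndex.putAll(loaded);
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> snapshot = new TreeMap<>();
        snapshot.put("/content/a", "hello oak");
        snapshot.put("/content/b", "hello index");

        Map<String, SortedSet<String>> liveIndex = new TreeMap<>();
        Path file = Files.createTempFile("index-sketch", ".tsv");
        try {
            storeOnDisk(buildIndex(snapshot), file);
            importIndex(file, liveIndex);
        } finally {
            Files.deleteIfExists(file);
        }
        System.out.println(liveIndex);
    }
}
```

The point of the design is that the expensive work (steps 1 and 2) only reads from the snapshot, so the live setup sees a single write at import time.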
Future Enhancements
- Resumable traversal - It should be possible to reindex a large repo with resumable traversal, so that even if indexing breaks due to some issue it can resume from the last state (OAK-5833)
- Multithreaded traversal - Current indexing is single threaded and hence can take a long time for a large repo. The plan here is to support multithreaded indexing, where each thread is assigned a part of the repository tree to index and the indexes are merged at the end
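The multithreaded idea can be sketched as follows. This is a toy model under stated assumptions, not the planned Oak implementation: `ParallelTraversalSketch` and its methods are hypothetical names, the repository is a read-only map of path to text, and the caller picks the subtree roots. Each worker indexes one subtree, and the partial indexes are merged on the calling thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; none of these names exist in Oak.
public class ParallelTraversalSketch {

    /** Index only the nodes at or below one subtree root; run by a single worker. */
    static Map<String, SortedSet<String>> indexSubtree(Map<String, String> repo, String root) {
        Map<String, SortedSet<String>> partial = new TreeMap<>();
        for (Map.Entry<String, String> node : repo.entrySet()) {
            String path = node.getKey();
            if (!path.equals(root) && !path.startsWith(root + "/")) {
                continue; // outside this worker's part of the tree
            }
            for (String term : node.getValue().toLowerCase().split("\\s+")) {
                if (!term.isEmpty()) {
                    partial.computeIfAbsent(term, t -> new TreeSet<>()).add(path);
                }
            }
        }
        return partial;
    }

    /** Assign each subtree to a worker thread, then merge the partial indexes. */
    static Map<String, SortedSet<String>> indexInParallel(
            Map<String, String> repo, List<String> roots, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Map<String, SortedSet<String>>>> futures = new ArrayList<>();
            for (String root : roots) {
                // The repo map is only read, never modified, while workers run.
                Callable<Map<String, SortedSet<String>>> task = () -> indexSubtree(repo, root);
                futures.add(pool.submit(task));
            }
            Map<String, SortedSet<String>> merged = new TreeMap<>();
            for (Future<Map<String, SortedSet<String>>> future : futures) {
                for (Map.Entry<String, SortedSet<String>> e : future.get().entrySet()) {
                    merged.computeIfAbsent(e.getKey(), t -> new TreeSet<>()).addAll(e.getValue());
                }
            }
            return merged;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> repo = new TreeMap<>();
        repo.put("/content/a", "hello oak");
        repo.put("/content/b", "hello index");
        repo.put("/apps/x", "oak run");
        System.out.println(indexInParallel(repo, Arrays.asList("/content", "/apps"), 2));
    }
}
```

Merging on a single thread keeps the sketch free of shared mutable state; a real implementation would also need a way to split the tree into parts of comparable size.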
Issue Links
- is blocked by
  - OAK-6117 Enable lucene indexing via oak-run (Closed)
  - OAK-6409 Oak-run indexing: improved (user friendly) output (Closed)
- is related to
  - OAK-3680 Partial re-index from last known good state (Open)
  - OAK-6246 Support for out of band indexing with read only access to NodeStore (Closed)
  - OAK-2063 Index creation: interruption resilience (Resolved)
  - OAK-7847 complete "Indexing tooling via oak-run" in 1.8 (Closed)
- relates to
  - OAK-6453 Script to import oak-run generated indexing to older Oak setup (Resolved)