[IGNITE-16102] Store all RocksDB partitions in a single column family. - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 3.0.0-alpha3
Fix Version/s: 3.0.0-alpha5
Component/s: None
Labels:
- iep-74
- ignite-3

Description

The current storage implementation puts each partition in its own column family. This effectively means that every partition lives in its own database, sharing only the WAL and some in-memory resources. Given that each column family has multiple files for its LSM tree, the number of open file descriptors is larger than it needs to be.

Now, the idea is to have a single column family for all partitions within a table. We should also consider storing several tables in the same RocksDB instance, for similar reasons. You can think of it as cache groups in Ignite 2.x.

There's also an "optimization" that is missing from the code and should be implemented: using key hashes as prefixes.

What should be implemented:

First of all, the code will be heavily refactored. This should lead to simplifications in many places.

Otherwise, I see the following list of goals to achieve:

- The current implementation allows deriving the list of partitions from the list of column families. This will no longer be possible, so I suggest storing this list explicitly in the "meta" CF, in whatever format is convenient during implementation.
- There should be a compact "tableId" representation. IgniteUUID or even UUID is too much, I think, but it might work as a basis. This problem should be discussed.
- The binary representation of keys should now include the following information:
  - tableId - a fixed-length sequence of bytes to be used as a prefix
  - partitionId - 2 bytes that follow the tableId. This layout allows range queries over specific partitions of specific tables
  - key hash - 4 bytes. This is required to reduce comparison time for keys. Generally speaking, it's safe to assume that hashes will mostly differ for different keys, meaning that comparing hashes is usually enough to establish key inequality
  - the actual key payload goes after all these prefixes
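
The key layout described above can be sketched roughly as follows. This is only an illustration, not the code from the pull request: the class and method names are hypothetical, the 4-byte `tableId` width is an assumption (the issue leaves the compact representation open), and `Arrays.hashCode` stands in for whatever hash function is actually chosen. It assumes keys are ordered by unsigned lexicographic byte comparison, which is how RocksDB's default comparator behaves.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical helper illustrating the proposed key layout. */
final class PartitionKeyLayout {
    /** Assumed width of the compact table ID; the issue does not fix this size. */
    static final int TABLE_ID_BYTES = Integer.BYTES;

    /** tableId + 2-byte partitionId: the prefix shared by all keys of one partition. */
    static final int PREFIX_BYTES = TABLE_ID_BYTES + Short.BYTES;

    /** Builds tableId | partitionId | key hash | key payload (big-endian). */
    static byte[] encode(int tableId, int partitionId, byte[] keyPayload) {
        return ByteBuffer.allocate(PREFIX_BYTES + Integer.BYTES + keyPayload.length)
                .putInt(tableId)
                .putShort(checkPartition(partitionId))
                .putInt(Arrays.hashCode(keyPayload)) // placeholder hash function
                .put(keyPayload)
                .array();
    }

    /** Inclusive lower bound for a scan over one partition of one table. */
    static byte[] partitionLowerBound(int tableId, int partitionId) {
        return ByteBuffer.allocate(PREFIX_BYTES)
                .putInt(tableId)
                .putShort(checkPartition(partitionId))
                .array();
    }

    /**
     * Exclusive upper bound: the prefix incremented by one as an unsigned
     * big-endian number. Returns null if the prefix is all 0xFF bytes,
     * in which case the scan has no upper bound.
     */
    static byte[] partitionUpperBound(int tableId, int partitionId) {
        byte[] bound = partitionLowerBound(tableId, partitionId);
        for (int i = bound.length - 1; i >= 0; i--) {
            if (++bound[i] != 0) {
                return bound; // no carry into the next byte
            }
        }
        return null;
    }

    /** True if the stored 4-byte hashes differ, so the payloads must differ too. */
    static boolean hashesDiffer(byte[] encodedA, byte[] encodedB) {
        return ByteBuffer.wrap(encodedA).getInt(PREFIX_BYTES)
                != ByteBuffer.wrap(encodedB).getInt(PREFIX_BYTES);
    }

    private static short checkPartition(int partitionId) {
        if (partitionId < 0 || partitionId > 0xFFFF) {
            throw new IllegalArgumentException("partitionId must fit in 2 bytes");
        }
        return (short) partitionId;
    }
}
```

With such a layout, a scan over one partition would iterate from `partitionLowerBound` (inclusive) to `partitionUpperBound` (exclusive). Equal hashes still require comparing the payloads, since distinct keys can collide.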

Issue Links

causes: IGNITE-16370 Use precomputed hash of a SearchRow in storage (Resolved)

links to: GitHub Pull Request #562

People

Assignee: Aleksandr Polovtsev

Reporter: Ivan Bessonov

Reviewer: Ivan Bessonov

Votes: 0

Watchers: 1

Dates

Created: 10/Dec/21 13:03

Updated: 24/Jan/22 11:49

Resolved: 24/Jan/22 11:49

Time Tracking

Estimated: Not Specified

Remaining:

Logged: 1h 10m