Ignite / IGNITE-15351

Research possibility of having caching layer on top of RocksDB partitions

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • 3.0.0-alpha2
    • 3.0.0-alpha3
    • None
    • Release Notes Required

    Description

      In Ignite 2.x there is a concept of "Data Regions": a set of fixed-size in-memory caches that store data for a number of cache groups (ignoring the system region and similar details for now). It is very convenient and represents a core design feature of Apache Ignite as an In-Memory Database.

      Currently, the Page Memory subsystem has not yet been ported to the Ignite 3.x codebase. Instead, there is a RocksDB-based implementation that stores data persistently.

      However, this implementation is very simple and naive. It has no notion of an in-memory cache shared across multiple tables, so it cannot be called an In-Memory Database. We should investigate ways to add this concept back on top of the RocksDB implementation.

      There are several things to investigate here:

      • how to set up RocksDB properly and control its memory consumption. We should allow some configuration and provide a meaningful set of defaults;
      • how to put a cache on top of several RocksDB instances. This is actually pretty easy: pass a cache to "org.rocksdb.Options#setRowCache(org.rocksdb.Cache)"; LRU and Clock cache implementations are available. A way to configure it is still required;
      • how to introduce data regions into our system. I see something like this:
        • the list of regions is either a node or a cluster configuration;
        • the name of the region is a property of every individual table or table group (or whatever else we end up having).
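
      The second item above, one cache serving several storage instances, can be illustrated without RocksDB at all. The sketch below is a plain-Java LRU cache built on LinkedHashMap, not RocksDB's row cache; the class name SharedLruCache, the store ids and the loader callback are all hypothetical. It only shows the two properties that matter here: a single capacity limit shared by all stores, and keys namespaced per store so that equal keys from different partitions do not collide.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Illustrative LRU cache shared by several stores; NOT RocksDB's implementation. */
final class SharedLruCache {
    private final LinkedHashMap<String, String> entries;

    SharedLruCache(int capacity) {
        // accessOrder = true keeps the least recently used entry first.
        this.entries = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                // One limit for all stores: evict as soon as the shared capacity is exceeded.
                return size() > capacity;
            }
        };
    }

    /** Returns the cached value, asking the owning store (the loader) on a miss. */
    synchronized String get(String storeId, String key, Function<String, String> loader) {
        // Prefix with the store id so equal keys from different stores never collide.
        String cacheKey = storeId + '/' + key;
        String value = entries.get(cacheKey);
        if (value == null) {
            value = loader.apply(key);
            if (value != null) {
                entries.put(cacheKey, value);
            }
        }
        return value;
    }

    /** Checks presence without changing the LRU order. */
    synchronized boolean contains(String storeId, String key) {
        return entries.containsKey(storeId + '/' + key);
    }

    synchronized int size() {
        return entries.size();
    }
}
```

      With RocksDB itself, the analogous step is to create one cache object and pass the same instance to "setRowCache" on the options of every database that should share it.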

      The last proposal is a bit tricky, because it would not look like "create table with rocks engine with Clock cache..."; it would look like "create table in region Foo". We have to conceptualize all of this and, at the very least, come up with proper naming.
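
      A minimal sketch of the "create table in region Foo" idea, assuming a node-level list of named regions and tables that store only a region name. Every name here (RegionConfig, RegionRegistry, the cacheType and sizeBytes fields) is hypothetical and not part of any Ignite API; the issue leaves the actual region settings undefined.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical region definition: a name plus cache settings. */
final class RegionConfig {
    final String name;
    final String cacheType; // assumed values such as "lru" or "clock"
    final long sizeBytes;

    RegionConfig(String name, String cacheType, long sizeBytes) {
        this.name = name;
        this.cacheType = cacheType;
        this.sizeBytes = sizeBytes;
    }
}

/** Hypothetical registry: regions are defined once, tables refer to them by name. */
final class RegionRegistry {
    private final Map<String, RegionConfig> regions = new LinkedHashMap<>();
    private final Map<String, String> tableToRegion = new LinkedHashMap<>();

    void defineRegion(RegionConfig region) {
        if (regions.putIfAbsent(region.name, region) != null) {
            throw new IllegalArgumentException("Duplicate region: " + region.name);
        }
    }

    /** Models "create table in region Foo": the table knows nothing about the cache type. */
    void createTable(String table, String regionName) {
        if (!regions.containsKey(regionName)) {
            throw new IllegalArgumentException("Unknown region: " + regionName);
        }
        tableToRegion.put(table, regionName);
    }

    /** Resolves the settings a table inherits from its region. */
    RegionConfig regionOf(String table) {
        String regionName = tableToRegion.get(table);
        if (regionName == null) {
            throw new IllegalArgumentException("Unknown table: " + table);
        }
        return regions.get(regionName);
    }
}
```

      The point of the indirection is that storage details such as the cache implementation live in the region, so changing them does not touch table definitions.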

      Update 1

      • the only way to control RocksDB memory usage is to have a single DB instance. Every table will then have several column families:
        • one for table meta;
        • one for every partition;
        • one for every index;
      • data regions are part of the configuration of every individual node. They will have a name, a type and some other settings. The way tables choose the region remains to be defined;
      • there have to be common RocksDB settings outside of region settings, such as mem table size, WAL settings, etc.
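
      To make the column family layout from the first item concrete, here is a small sketch that enumerates the column families a single table would need in a shared DB instance. The naming scheme ("<table>.meta", "<table>.part-N", "<table>.idx-<name>") and the class ColumnFamilyLayout are assumptions made for illustration; the issue does not define any naming.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that lists the column families owned by one table. */
final class ColumnFamilyLayout {
    private ColumnFamilyLayout() {
    }

    static List<String> columnFamilies(String table, int partitions, List<String> indexes) {
        List<String> names = new ArrayList<>();
        // One column family for table meta.
        names.add(table + ".meta");
        // One column family for every partition.
        for (int p = 0; p < partitions; p++) {
            names.add(table + ".part-" + p);
        }
        // One column family for every index.
        for (String index : indexes) {
            names.add(table + ".idx-" + index);
        }
        return names;
    }
}
```

      A table with P partitions and I indexes therefore needs 1 + P + I column families in this layout.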

      Update 2

      • actually, there is a way to have a shared memory manager for several instances, so a single DB instance is not the only option.

            People

              ibessonov Ivan Bessonov
              ibessonov Ivan Bessonov

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 11h 50m