Description
The current 3.7 and trunk versions leak native memory when running Kafka Streams for several hours. This will likely kill any real workload over time, so it should be treated as a blocker for 3.7.
This was discovered in a long-running soak test. The attachment shows memory consumption, which steadily approaches 100%, at which point the JVM is killed.
Rerunning the same test with jemalloc native memory profiling, we see these allocated objects after a few hours:
(jeprof) top
Total: 13283138973 B
10296829713  77.5%  77.5% 10296829713  77.5% rocksdb::port::cacheline_aligned_alloc
 2487325671  18.7%  96.2%  2487325671  18.7% rocksdb::BlockFetcher::ReadBlockContents
  150937547   1.1%  97.4%   150937547   1.1% rocksdb::lru_cache::LRUHandleTable::LRUHandleTable
  119591613   0.9%  98.3%   119591613   0.9% prof_backtrace_impl
   47331433   0.4%  98.6%   105040933   0.8% rocksdb::BlockBasedTable::PutDataBlockToCache
   32516797   0.2%  98.9%    32516797   0.2% rocksdb::Arena::AllocateNewBlock
   29796095   0.2%  99.1%    30451535   0.2% Java_org_rocksdb_Options_newOptions
   18172716   0.1%  99.2%    20008397   0.2% rocksdb::InternalStats::InternalStats
   16032145   0.1%  99.4%    16032145   0.1% rocksdb::ColumnFamilyDescriptorJni::construct
   12454120   0.1%  99.5%    12454120   0.1% std::_Rb_tree::_M_insert_unique
The first hypothesis is that this is caused by the leaking `Options` object introduced in this line:
Introduced in this PR: https://github.com/apache/kafka/pull/14852
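To illustrate the kind of leak the hypothesis describes, here is a minimal sketch. It does not reproduce the code from the PR. `FakeNativeOptions` is a hypothetical stand-in for a JNI-backed object such as `org.rocksdb.Options`, which implements `AutoCloseable` and holds native memory that the Java garbage collector does not reclaim. The stand-in only counts open handles, and all class and method names below are assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class OptionsLeakSketch {

    // Hypothetical stand-in for a JNI-backed object such as org.rocksdb.Options.
    // It only counts "native" handles; it allocates no real native memory.
    static final class FakeNativeOptions implements AutoCloseable {
        private static final AtomicInteger LIVE = new AtomicInteger();
        private boolean closed;

        FakeNativeOptions() {
            LIVE.incrementAndGet(); // models the native allocation
        }

        @Override
        public void close() {
            if (!closed) {          // idempotent, models the native free
                closed = true;
                LIVE.decrementAndGet();
            }
        }

        static int live() { return LIVE.get(); }
        static void reset() { LIVE.set(0); }
    }

    // Leaky pattern: the handle is created and dropped without close().
    // The Java wrapper becomes garbage, but the native side is never freed.
    static void openWithoutClose(int times) {
        for (int i = 0; i < times; i++) {
            FakeNativeOptions options = new FakeNativeOptions();
            // ... options would be used here ...
        }
    }

    // Safe pattern: try-with-resources calls close() on every path.
    static void openWithClose(int times) {
        for (int i = 0; i < times; i++) {
            try (FakeNativeOptions options = new FakeNativeOptions()) {
                // ... options would be used here ...
            }
        }
    }

    public static void main(String[] args) {
        openWithoutClose(1000);
        System.out.println("live handles, no close(): " + FakeNativeOptions.live());

        FakeNativeOptions.reset();
        openWithClose(1000);
        System.out.println("live handles, with close(): " + FakeNativeOptions.live());
    }
}
```

In the sketch, the first path leaves 1000 handles open and the second leaves none. If the hypothesis is right, the fix would presumably follow the second pattern: whichever component creates the `Options` object also closes it once it is no longer needed.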
Attachments