Details
-
Bug
-
Status: Resolved
-
Normal
-
Resolution: Fixed
-
Normal
Description
During our upgrade of Cassandra from version 2.0.7 to 2.1.2 we experienced a serious problem regarding the setting unchecked_tombstone_compaction in combination with leveled compaction strategy.
In order to prevent tombstone-threshold-warnings we activated the setting for a specific table after the upgrade. Some time after that we observed new errors in our log files:
INFO [CompactionExecutor:184] 2014-12-11 12:36:06,597 CompactionTask.java:136 - Compacting [SSTableReader(path='/data/cassandra/data/system/compactions_in_progress/system-compactions_in_progress-ka-1848-Data.db'), SSTableReader(path='/ data/cassandra/data/system/compactions_in_progress/system-compactions_in_progress-ka-1847-Data.db'), SSTableReader(path='/data/cassandra/data/system/compactions_in_progress/system-compactions_in_progress-ka-1845-Data.db'), SSTableReader (path='/data/cassandra/data/system/compactions_in_progress/system-compactions_in_progress-ka-1846-Data.db')] ERROR [CompactionExecutor:183] 2014-12-11 12:36:06,613 CassandraDaemon.java:153 - Exception in thread Thread[CompactionExecutor:183,1,main] java.lang.AssertionError: /data/cassandra/data/metrigo_prod/new_user_data/metrigo_prod-new_user_data-tmplink-ka-705732-Data.db at org.apache.cassandra.io.sstable.SSTableReader.getApproximateKeyCount(SSTableReader.java:243) ~[apache-cassandra-2.1.2.jar:2.1.2] at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:146) ~[apache-cassandra-2.1.2.jar:2.1.2] at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48) ~[apache-cassandra-2.1.2.jar:2.1.2] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-2.1.2.jar:2.1.2] at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:75) ~[apache-cassandra-2.1.2.jar:2.1.2] at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59) ~[apache-cassandra-2.1.2.jar:2.1.2] at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:232) ~[apache-cassandra-2.1.2.jar:2.1.2] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_45] at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_45] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_45] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_45] at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
Obviously that error aborted the compaction and after some time the number of pending compactions became very high on every node. Of course, this in turn had a negative impact on several other metrics.
After reverting the setting we had to restart all nodes. After that compactions could finish again and the pending compactions could be worked off.