Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
master, 3.7.5, 3.8.2
None
Description
I encountered this in production:
java.lang.OutOfMemoryError: Java heap space
at java.base/java.util.Arrays.copyOf(Unknown Source)
at java.base/java.util.ArrayList.grow(Unknown Source)
at java.base/java.util.ArrayList.grow(Unknown Source)
at java.base/java.util.ArrayList.add(Unknown Source)
at java.base/java.util.ArrayList.add(Unknown Source)
at org.apache.james.mailbox.model.MessageRange.split(MessageRange.java:247)
at org.apache.james.mailbox.store.MessageBatcher.batchMessagesReactive(MessageBatcher.java:70)
at org.apache.james.mailbox.store.StoreMailboxManager.lambda$copyMessagesReactive$48(StoreMailboxManager.java:713)
at org.apache.james.mailbox.store.StoreMailboxManager$$Lambda/0x00007f12613caab8.apply(Unknown Source)
at reactor.core.publisher.MonoFlatMapMany$FlatMapManyMain.onNext(MonoFlatMapMany.java:163)
at reactor.core.publisher.MonoZip$ZipCoordinator.signal(MonoZip.java:297)
at reactor.core.publisher.MonoZip$ZipInner.onNext(MonoZip.java:478)
at reactor.core.publisher.FluxMap$MapSubscriber.onNext(FluxMap.java:122)
at reactor.core.publisher.FluxSwitchIfEmpty$SwitchIfEmptySubscriber.onNext(FluxSwitchIfEmpty.java:74)
at reactor.core.publisher.MonoZip$ZipCoordinator.signal(MonoZip.java:297)
at reactor.core.publisher.MonoZip$ZipInner.onNext(MonoZip.java:478)
at reactor.core.publisher.MonoFlatMap$FlatMapMain.secondComplete(MonoFlatMap.java:245)
at reactor.core.publisher.MonoFlatMap$FlatMapInner.onNext(MonoFlatMap.java:305)
at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onNext(FluxMapFuseable.java:129)
at reactor.core.publisher.Operators$ScalarSubscription.request(Operators.java:2571)
at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.request(FluxMapFuseable.java:171)
at reactor.core.publisher.MonoFlatMap$FlatMapInner.onSubscribe(MonoFlatMap.java:291)
at reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber.onSubscribe(FluxMapFuseable.java:96)
at reactor.core.publisher.MonoJust.subscribe(MonoJust.java:55)
at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:76)
at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:165)
at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onNext(FluxOnErrorResume.java:79)
at reactor.core.publisher.FluxMap$MapSubscriber.onNext(FluxMap.java:122)
at reactor.core.publisher.MonoPublishOn$PublishOnSubscriber.run(MonoPublishOn.java:181)
at reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:68)
at reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:28)
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
I was able to reproduce it: cf. screenshot
This was actually encountered with the following batchSizes:
copy=10
move=10
Aggressively increasing the batch size actually turned out to be a useful workaround:
copy=2000000000
move=2000000000
However I fear this means the overall batching process for MOVE and COPY makes little sense...
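A rough sketch of why a small batch size can exhaust the heap while a huge one does not. It assumes that `MessageRange.split` eagerly builds one list entry per batch (which the `ArrayList.add` frames in the trace suggest) and that the request covered a very wide UID range. The class name, the method and the upper bound used below are hypothetical, chosen only to illustrate the arithmetic:

```java
final class SplitCostSketch {

    // Number of sub-ranges obtained when the inclusive UID range [from, to]
    // is cut into consecutive chunks of at most batchSize UIDs.
    static long subRangeCount(long from, long to, long batchSize) {
        if (batchSize <= 0 || from > to) {
            throw new IllegalArgumentException("need batchSize > 0 and from <= to");
        }
        long span = Math.addExact(Math.subtractExact(to, from), 1L);
        return (span - 1) / batchSize + 1; // ceiling division without overflow
    }

    public static void main(String[] args) {
        long from = 1L;
        long to = 4_294_967_295L; // hypothetical upper bound, not taken from the report

        System.out.println(subRangeCount(from, to, 10L));            // 429496730
        System.out.println(subRangeCount(from, to, 2_000_000_000L)); // 3
    }
}
```

Under these assumptions the list size depends on the numeric width of the range rather than on how many messages exist, which would explain why raising the batch size made the error go away.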
I do think this could be handled in a purely reactive way:
- Fetch all the messages in the range
- window them using the batch size
- perform the update one window at a time
- and finally aggregate the resulting MessageRange
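The four steps above can be sketched as follows. This is not the James implementation: plain Java collections stand in for the reactive pipeline so that the example stays self-contained, and `UidRange`, `window`, `processInWindows` and `aggregate` are hypothetical names. In Reactor, the windowing step would presumably map to an operator such as `Flux.window(int)` or `Flux.buffer(int)`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

final class WindowedCopySketch {

    // Hypothetical stand-in for a UID range; not the James MessageRange class.
    record UidRange(long from, long to) {}

    // Step 2: cut the UIDs that actually exist into windows of at most batchSize.
    static List<List<Long>> window(List<Long> uids, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be greater than zero");
        }
        List<List<Long>> windows = new ArrayList<>();
        int start = 0;
        while (start < uids.size()) {
            // long arithmetic so that a very large batchSize cannot overflow
            int end = (int) Math.min((long) start + batchSize, uids.size());
            windows.add(uids.subList(start, end));
            start = end;
        }
        return windows;
    }

    // Step 3: apply the update to one window at a time, collecting one result per window.
    static List<UidRange> processInWindows(List<Long> uids, int batchSize,
                                           Function<List<Long>, UidRange> updateWindow) {
        List<UidRange> results = new ArrayList<>();
        for (List<Long> w : window(uids, batchSize)) {
            results.add(updateWindow.apply(w));
        }
        return results;
    }

    // Step 4: aggregate the per-window results into a single enclosing range.
    static Optional<UidRange> aggregate(List<UidRange> ranges) {
        if (ranges.isEmpty()) {
            return Optional.empty();
        }
        long min = ranges.stream().mapToLong(UidRange::from).min().getAsLong();
        long max = ranges.stream().mapToLong(UidRange::to).max().getAsLong();
        return Optional.of(new UidRange(min, max));
    }

    public static void main(String[] args) {
        // Step 1 (assumed): the sorted UIDs fetched for the requested range.
        List<Long> uids = List.of(1L, 2L, 3L, 5L, 8L, 13L, 21L);

        // The "update" here only reports the span it handled.
        List<UidRange> perWindow = processInWindows(uids, 3,
            w -> new UidRange(w.get(0), w.get(w.size() - 1)));

        System.out.println(perWindow.size());     // 3
        System.out.println(aggregate(perWindow)); // Optional[UidRange[from=1, to=21]]
    }
}
```

With this shape, the number of windows is bounded by the number of messages found rather than by the width of the requested range.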
I will try to have a shot at it later this week.
BTW, to my great displeasure, it was not possible to disable batching...
Caused by: java.lang.IllegalArgumentException: 'copyBatchSize' must be greater than zero
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:143)
at org.apache.james.mailbox.store.BatchSizes$Builder.copyBatchSize(BatchSizes.java:86)
at org.apache.james.modules.mailbox.CassandraSessionModule.getBatchSizesConfiguration(CassandraSessionModule.java:109)