[FLINK-34566] Flink Kubernetes Operator reconciliation parallelism setting not work - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Blocker
Resolution: Fixed
Affects Version/s: kubernetes-operator-1.7.0
Fix Version/s: kubernetes-operator-1.8.0
Component/s: Kubernetes Operator
Labels:
- pull-request-available

Flags: Important

Description

After upgrading JOSDK from version 4.3.0 to version 4.4.2 in ~~FLINK-33005~~, we can no longer increase the reconciliation parallelism: the effective maximum is only 10. With a large-scale Flink session cluster and many session jobs in a k8s cluster, this delays FlinkDeployment and SessionJob reconciliation by about 10-30 seconds.

After investigating and validating, I found that the cause is that the logic for creating the reconciliation thread pool in JOSDK changed significantly between these two versions.

v4.3.0:
The reconciliation thread pool was created as a FixedThreadPool (maximumPoolSize was the same as corePoolSize), so we passed the reconciliation thread count and got a thread pool that matched our expectations.

https://github.com/operator-framework/java-operator-sdk/blob/v4.3.0/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/api/config/ConfigurationServiceOverrider.java#L198

But in v4.4.2:

The reconciliation thread pool is created as a custom executor, for which both corePoolSize and maximumPoolSize can be passed. The problem is that we only set the maximumPoolSize of the thread pool, while its corePoolSize defaults to 10. As a result, the thread pool only ever runs 10 threads, and the majority of events wait in the workQueue for a while.

https://github.com/operator-framework/java-operator-sdk/blob/v4.4.2/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/api/config/ExecutorServiceManager.java#L37
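The effect described above can be reproduced with the JDK alone. The following is a minimal sketch, not JOSDK code: it assumes the pool is paired with an unbounded work queue (which the queued events described above suggest), and the class and method names are hypothetical. A `ThreadPoolExecutor` only starts threads beyond corePoolSize when its queue rejects a task, so with an unbounded queue the maximumPoolSize is never reached.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CorePoolSizeDemo {

    /** Submits `tasks` blocking tasks and returns how many threads the pool started. */
    public static int threadsStarted(int corePoolSize, int maximumPoolSize, int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                corePoolSize, maximumPoolSize, 1, TimeUnit.MINUTES,
                new LinkedBlockingQueue<>()); // assumption: unbounded queue, never "full"
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    try {
                        release.await(); // keep the worker busy until released
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // All tasks are submitted and none has finished yet.
            return pool.getPoolSize();
        } finally {
            release.countDown(); // let the blocked tasks complete
            pool.shutdown();
            try {
                pool.awaitTermination(10, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        // Only maximumPoolSize raised: the pool stays at the core size.
        System.out.println("core=10, max=50: " + threadsStarted(10, 50, 200));
        // Core and maximum both raised: the pool reaches the requested size.
        System.out.println("core=50, max=50: " + threadsStarted(50, 50, 200));
    }
}
```

Under these assumptions the first call reports 10 threads and the second reports 50, which matches the observed cap of 10 concurrent reconciliations.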

The solution is also simple: we can create the thread pool in the Flink Kubernetes Operator and pass it to JOSDK, so that we control the reconciliation thread pool directly.
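A minimal sketch of that idea, using only the JDK: the operator builds a pool whose corePoolSize and maximumPoolSize both equal the configured parallelism. The class name and the `createReconcilerPool` helper are hypothetical, and this is not necessarily the code merged in the linked pull request. How the pool is handed to JOSDK is left as a comment, since the exact call is an assumption here.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ReconcilerPoolSketch {

    /**
     * Hypothetical helper: builds a fixed-size pool, i.e. corePoolSize and
     * maximumPoolSize are both set to the requested parallelism.
     */
    public static ThreadPoolExecutor createReconcilerPool(int parallelism) {
        if (parallelism < 1) {
            throw new IllegalArgumentException("parallelism must be >= 1");
        }
        return new ThreadPoolExecutor(
                parallelism, parallelism,     // core == max, like a FixedThreadPool
                0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>()); // extra events queue until a thread is free
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = createReconcilerPool(50);
        System.out.println("core=" + pool.getCorePoolSize()
                + ", max=" + pool.getMaximumPoolSize());
        // Assumption: the operator would pass `pool` to JOSDK when overriding
        // its configuration (for example via an executor-service setter).
        pool.shutdown();
    }
}
```

Because the pool is owned by the operator, its size no longer depends on JOSDK's default corePoolSize.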

Attachments

- image-2024-03-04-10-58-37-679.png (04/Mar/24 02:58, 69 kB, Fei Feng)
- image-2024-03-04-11-17-22-877.png (04/Mar/24 03:17, 103 kB, Fei Feng)
- image-2024-03-04-11-31-44-451.png (04/Mar/24 03:31, 58 kB, Fei Feng)

Issue Links

links to GitHub Pull Request #790

People

Assignee: Fei Feng

Reporter: Fei Feng

Votes: 0

Watchers: 2

Dates

Created: 04/Mar/24 03:23

Updated: 08/Mar/24 10:37

Resolved: 08/Mar/24 10:37