Details
- Type: Improvement
- Status: Resolved
- Priority: P2
- Resolution: Fixed
Description
Improve the workflow of cluster management.
There is an option to configure a default cluster name. The existing user flows are:
1. Use the default cluster name to create a new cluster if none is in use;
2. Reuse an already-created cluster that has the default cluster name;
3. If the default cluster name is configured to a new value, re-apply flows 1 and 2 under the new name.
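The existing name-keyed flow above can be sketched as follows. This is a hypothetical illustration: `ClusterManager` and its methods are stand-ins for the notebook runtime's behavior, not the actual interactive Beam API.

```python
# Hypothetical sketch of the existing default-cluster-name flow.
# `ClusterManager` is illustrative, not the actual interactive Beam API.
class ClusterManager:
    def __init__(self, default_cluster_name):
        self.default_cluster_name = default_cluster_name
        self.clusters = {}  # cluster name -> cluster metadata

    def get_or_create(self):
        name = self.default_cluster_name
        if name in self.clusters:
            # Flow 2: reuse the created cluster that has the default name.
            return self.clusters[name]
        # Flow 1: create a new cluster under the default name.
        self.clusters[name] = {'name': name}
        return self.clusters[name]
```

Note that because reuse is keyed purely on the configured name (flow 3 simply repeats flows 1 and 2 under a new name), two notebook runtimes sharing a default name can step on each other, which motivates the change below.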
A better solution is to:
- Create a new cluster implicitly if there is none, or explicitly if the user wants one with specific provisioning;
- Always default to using the most recently created cluster.
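The proposed flow can be sketched as below. The names and types are illustrative assumptions, not the shipped implementation; the point is that identifiers are auto-generated and distinct, and the implicit path always resolves to the last created cluster.

```python
import uuid

# Hypothetical sketch of the proposed flow; `ClusterPool` is an
# illustrative assumption, not the actual interactive Beam implementation.
class ClusterPool:
    def __init__(self):
        self._clusters = []  # ordered by creation time

    def create(self, provisioning=None):
        # Explicit creation, optionally with user-specified provisioning.
        # Every cluster gets a distinct auto-generated identifier, so two
        # notebook runtimes can never collide on a shared default name.
        cluster = {'id': uuid.uuid4().hex, 'provisioning': provisioning or {}}
        self._clusters.append(cluster)
        return cluster

    def default(self):
        # Implicit path: create a cluster if none exists, otherwise always
        # reuse the most recently created one.
        return self._clusters[-1] if self._clusters else self.create()
```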
The reasons are:
- A cluster name is meaningless to the user when the cluster is just a medium to run OSS runners (as applications) such as Flink or Spark. The cluster could also be running anywhere on GCP, such as Dataproc, Kubernetes, or even Dataflow itself.
- Clusters should be uniquely identified and thus should always have distinct names. Clusters are managed (created/reused/deleted) behind the scenes by the notebook runtime when the user doesn't do so explicitly (the capability to explicitly manage clusters is still available). Reusing the same default cluster name is risky: one notebook runtime may delete a cluster while a different notebook runtime creates another cluster with the same name.
- The user should additionally have the capability to explicitly provision a cluster.
The current implementation provisions each cluster with 3 worker nodes at the location specified by GoogleCloudOptions; there is no explicit API to configure the number or shape of the workers.
We could use WorkerOptions to let customers explicitly provision a cluster, and expose an explicit API (with UX in the notebook extension) for customers to change the size of a cluster connected to their notebook (until we have an autoscaling solution with Dataproc for Flink).
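One way this could look is mapping WorkerOptions-style fields onto a Dataproc cluster spec. The field names below (`num_workers`, `machine_type`, `disk_size_gb`) mirror apache_beam's WorkerOptions, and the Dataproc keys mirror its InstanceGroupConfig; the mapping itself is an assumption for illustration, not the shipped code. The default matches the 3-worker behavior described above.

```python
# Hedged sketch: mapping WorkerOptions-style fields onto a Dataproc-style
# cluster spec. The mapping is an assumption, not the shipped code.
def cluster_spec_from_worker_options(num_workers=3,
                                     machine_type=None,
                                     disk_size_gb=None):
    worker_config = {'num_instances': num_workers}
    if machine_type:
        # Dataproc's InstanceGroupConfig identifies the worker shape by URI.
        worker_config['machine_type_uri'] = machine_type
    if disk_size_gb:
        worker_config['disk_config'] = {'boot_disk_size_gb': disk_size_gb}
    return {'worker_config': worker_config}
```

An explicit resize API could then rebuild this spec with a new `num_workers` and apply it to the cluster connected to the notebook.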
Attachments
Issue Links
- is a parent of: BEAM-14330 google.api_core.exceptions.MethodNotImplemented when tests run in parallel (Resolved)
- links to