Starting a Cluster

Predictive Learning can consume a great deal of resources. To get the most from your usage and contain costs, the most important step is to start the right cluster type. Starting a cluster creates a new environment.

Select a cluster type with enough memory and processing power to run your dataset analysis without timing out. Do not oversize it, however: a 25-node cluster with 750 GB RAM is larger than necessary to process a moderately sized dataset.

It is also important to turn off pop-up blockers in your browser so all buttons function correctly. It can take several minutes for a newly started cluster to reach 'Running' status.

Cluster Types

The information in this table describes the available cluster configuration types and their recommended uses.

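Cluster type | Node type | Best suited for
--- | --- | ---
Spark EMR + 5 nodes (80 Cores, 320 GB RAM) | General | General purpose tasks
Spark EMR + 10 nodes (160 Cores, 640 GB RAM) | General | General purpose tasks
Spark EMR + 25 nodes (400 Cores, 1600 GB RAM) | General | General purpose tasks
Spark EMR + 2 nodes (32 Cores, 244 GB RAM) | Memory optimized | Memory intensive tasks
Spark EMR + 5 nodes (80 Cores, 610 GB RAM) | Memory optimized | Memory intensive tasks
Spark EMR + 10 nodes (160 Cores, 1220 GB RAM) | Memory optimized | Memory intensive tasks
Spark EMR + 25 nodes (400 Cores, 3050 GB RAM) | Memory optimized | Memory intensive tasks
Spark EMR + 2 nodes (12 Cores, 24 GB RAM) | Compute optimized | CPU intensive tasks
Spark EMR + 2 nodes (32 Cores, 60 GB RAM) | Compute optimized | CPU intensive tasks
Spark EMR + 5 nodes (80 Cores, 150 GB RAM) | Compute optimized | CPU intensive tasks
Spark EMR + 10 nodes (160 Cores, 300 GB RAM) | Compute optimized | CPU intensive tasks
Spark EMR + 25 nodes (400 Cores, 750 GB RAM) | Compute optimized | CPU intensive tasks
GPU EMR (1 GPU, 12 GB GPU memory, 4 Cores, 61 GB RAM) | GPU and AI optimized | Tasks that require GPU
GPU EMR (1 GPU, 16 GB GPU memory, 8 Cores, 61 GB RAM) | GPU and AI optimized | Tasks that require GPU
GPU EMR (4 GPU, 64 GB GPU memory, 32 Cores, 244 GB RAM) | GPU and AI optimized | Tasks that require GPU
GPU EMR (8 GPU, 128 GB GPU memory, 64 Cores, 488 GB RAM) | GPU and AI optimized | Tasks that require GPU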
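
As a rough illustration of right-sizing, the sketch below transcribes the Spark EMR rows of the table above and picks the smallest configuration that meets a stated RAM and core requirement. The short labels ("general", "memory", "compute"), the `ClusterConfig` class, and the `smallest_fit` function are hypothetical names chosen for this example and are not part of the product. How much RAM a given dataset needs is an assumption you supply yourself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClusterConfig:
    node_type: str  # hypothetical short label for the node type
    nodes: int
    cores: int
    ram_gb: int


# Spark EMR rows transcribed from the cluster types table (GPU rows omitted).
CONFIGS = [
    ClusterConfig("general", 5, 80, 320),
    ClusterConfig("general", 10, 160, 640),
    ClusterConfig("general", 25, 400, 1600),
    ClusterConfig("memory", 2, 32, 244),
    ClusterConfig("memory", 5, 80, 610),
    ClusterConfig("memory", 10, 160, 1220),
    ClusterConfig("memory", 25, 400, 3050),
    ClusterConfig("compute", 2, 12, 24),
    ClusterConfig("compute", 2, 32, 60),
    ClusterConfig("compute", 5, 80, 150),
    ClusterConfig("compute", 10, 160, 300),
    ClusterConfig("compute", 25, 400, 750),
]


def smallest_fit(node_type: str, min_ram_gb: int,
                 min_cores: int = 0) -> Optional[ClusterConfig]:
    """Return the smallest listed configuration that meets the requirement,
    or None when no listed configuration of that node type is large enough."""
    candidates = [
        c for c in CONFIGS
        if c.node_type == node_type
        and c.ram_gb >= min_ram_gb
        and c.cores >= min_cores
    ]
    # Prefer fewer nodes first, then less RAM, to avoid oversizing.
    return min(candidates, key=lambda c: (c.nodes, c.ram_gb), default=None)


if __name__ == "__main__":
    print(smallest_fit("compute", min_ram_gb=100))
    print(smallest_fit("general", min_ram_gb=2000))
```

Ordering by node count first reflects the guidance above: start with the smallest cluster that can finish the analysis, and move up only if it times out.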

If You Run Out of Credits

If you run out of credits while a cluster is running, the cluster continues to run until it is shut down (manually or automatically). However, if you do not have enough credits for at least one hour, the system does not allow you to start a cluster. Instead, it displays a message that shows your remaining credits and the number of credits required by the cluster configuration you want to run.

To purchase more credits, contact your system administrator.
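
The one-hour rule described above can be sketched as a simple comparison. This is only an illustration of the stated rule, not the system's actual logic: the function name `check_credits`, the message wording, and the idea that each configuration has a single hourly credit cost are all assumptions, and the figures in the usage lines are made up.

```python
from typing import Tuple


def check_credits(remaining: float, hourly_cost: float) -> Tuple[bool, str]:
    """Return (allowed, message) for a start request.

    A start is allowed only when the remaining balance covers at least
    one hour of the chosen configuration (hourly_cost is an assumed rate).
    """
    if hourly_cost <= 0:
        raise ValueError("hourly_cost must be positive")
    allowed = remaining >= hourly_cost
    message = (
        f"Remaining credits: {remaining}. "
        f"Credits required for one hour: {hourly_cost}."
    )
    return allowed, message


if __name__ == "__main__":
    # Hypothetical figures for illustration only.
    print(check_credits(remaining=10, hourly_cost=12))
    print(check_credits(remaining=12, hourly_cost=12))
```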

How to Start a Cluster

Follow these steps to start a cluster:

  1. Select Manage Analytics Workspaces from the Analytics Workspace menu. The Manage Analytics Workspaces page opens.
  2. From the Cluster Type drop-down list, select the cluster type best suited to the size of the dataset you are processing.
  3. Click Start Cluster. The Cluster Status changes to "Starting". When the cluster is running, the status changes to "Running".

Stopping Clusters

Once your analysis is complete, you can stop the cluster manually. If you configured auto-shutdown when you started the cluster, it runs until the shutdown time, then terminates automatically.

Clusters cost money for as long as they run. If you do not define an auto-shutdown time, the cluster stops after two hours by default. If a job is in progress when the cluster shuts down, whether manually or automatically, the job terminates at that moment.
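
To check whether a job is likely to finish before the cluster shuts down, you can do the arithmetic ahead of time. The sketch below assumes the auto-shutdown setting can be expressed as a duration from the start time. That representation and the names `shutdown_time` and `job_finishes_in_time` are assumptions made for this example. Only the two-hour default comes from the text above.

```python
from datetime import datetime, timedelta
from typing import Optional

# Default runtime when no auto-shutdown time is defined (from the text above).
DEFAULT_RUNTIME = timedelta(hours=2)


def shutdown_time(started_at: datetime,
                  auto_shutdown_after: Optional[timedelta] = None) -> datetime:
    """Return when the cluster is expected to shut down automatically."""
    if auto_shutdown_after is None:
        auto_shutdown_after = DEFAULT_RUNTIME
    return started_at + auto_shutdown_after


def job_finishes_in_time(job_start: datetime, estimated_duration: timedelta,
                         shutdown_at: datetime) -> bool:
    """Return True when the estimated job end is at or before shutdown."""
    return job_start + estimated_duration <= shutdown_at


if __name__ == "__main__":
    started = datetime(2024, 1, 22, 9, 0)
    stop_at = shutdown_time(started)  # uses the two-hour default
    print(stop_at)
    # A 90-minute job started 45 minutes in would be cut off at shutdown.
    print(job_finishes_in_time(started + timedelta(minutes=45),
                               timedelta(minutes=90), stop_at))
```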

How to Stop a Cluster

Follow these steps to stop a cluster:

  1. Select Manage Analytics Workspaces from the Analytics Workspace menu. The Manage Analytics Workspaces page opens.
  2. Click the Stop Cluster button. When the cluster stops, the cluster status message changes to "Stopped". This may take a few minutes.

Predictive Learning Essentials

The Essentials package provides a single, non-clustered instance with Jupyter Notebook.

The Open Jupyter button activates when a cluster is running. Click it to start Jupyter.


Last update: January 22, 2024