Filtering a Dataset
Filtering the data in a workspace lets you refine the dataset before you save it.
See Filtering Workspace Data for information on using the query builder.
Filter Transformation Panel Illustration
This image shows the Filter Transformation panel when it first opens:
Define the Filter Tab Illustration
This image shows the Define the Filter tab on the Filter Dataset panel:
Columns to Carryover Tab Illustration
This image shows the Columns to Carryover tab on the Filter Dataset panel:
How to Filter a Dataset
Follow these steps to define a workspace dataset filter:
- Navigate to the Manage Workspace page and start a cluster. When the cluster is up and running, the status changes to Running and the cluster button changes to Stop Cluster.
- Open an existing workspace or create a new one. The workspace opens.
- Click the + Add Transformation button. The Select a Transformation dialog box opens.
- Select Filter and click the Select button. A Filter Dataset panel opens.
- Click the Select button and select a dataset from the drop-down list. The dataset loads in the panel.
- Use the query builder on the Define the Filter tab to select the parameters for the dataset filter. See the Filtering Workspace Data topic for information on using the interface.
- Click the Columns to Carryover tab and select the columns to include in the filtered dataset. The columns are ordered by their index in the list, and duplicate names are adjusted to avoid conflicts.
- Click the Run button at the top of the Filter Dataset panel. A message indicates that the transformation is running. When the transformation completes, the Run button changes to a Save as Dataset button.
- Click the Save as Dataset button to save the filtered dataset. The name defaults to the name of the dataset you selected, but you can change it.
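
The steps above are performed in the user interface, but the underlying idea can be sketched in plain Python. The example below is only an illustration and is not the product's implementation: the function names `filter_dataset` and `dedupe_names`, the sample rows, and the `_1`, `_2` suffix rule for duplicate names are all assumptions made for this sketch. It shows a row filter, carried-over columns ordered by their index, and duplicate column names adjusted so they do not conflict.

```python
from typing import Callable


def dedupe_names(names: list[str]) -> list[str]:
    """Make column names unique by appending _1, _2, ...

    The suffix rule is a hypothetical choice for this sketch; the actual
    adjustment applied by the product is not documented here.
    """
    used: set[str] = set()
    result: list[str] = []
    for name in names:
        candidate, suffix = name, 0
        while candidate in used:
            suffix += 1
            candidate = f"{name}_{suffix}"
        used.add(candidate)
        result.append(candidate)
    return result


def filter_dataset(
    rows: list[dict],
    predicate: Callable[[dict], bool],
    carry_over: list[tuple[int, str]],
) -> list[dict]:
    """Keep rows matching the predicate and only the carried-over columns.

    carry_over holds (index, column_name) pairs; output columns are
    ordered by index, mirroring the ordering described in the steps.
    """
    ordered = [name for _, name in sorted(carry_over, key=lambda pair: pair[0])]
    output_names = dedupe_names(ordered)
    return [
        {out: row[src] for src, out in zip(ordered, output_names)}
        for row in rows
        if predicate(row)
    ]


# Hypothetical sample data standing in for a workspace dataset.
rows = [
    {"id": 1, "region": "EU", "amount": 120},
    {"id": 2, "region": "US", "amount": 80},
    {"id": 3, "region": "EU", "amount": 45},
]

filtered = filter_dataset(
    rows,
    predicate=lambda r: r["region"] == "EU" and r["amount"] > 50,
    carry_over=[(2, "amount"), (0, "id"), (1, "amount")],
)
print(filtered)
```

Here the `amount` column is selected twice, so the second occurrence is renamed, and the `id` column comes first because it has the lowest index in the carry-over list.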