Data Management covers creating and maintaining data sources and datasets. Follow a link to jump to a topic.
- Creating a Data Source
- Searching Data Sources
- Uploading a Data Source File
- Deleting a Data Source
- Creating a Folder
- Cutting and Pasting Files
- Creating a Dataset
- Sharing a Dataset
- Deleting a Dataset
A data source is a configuration that specifies where to pull data from and how much data to pull for model execution. The jobs that users run in PrL use data derived from data sources. A job can use a data source as its input or as its output. Predictive Learning lets data scientists consume data from multiple data stores by providing the following data sources:
- Integrated Data Lake (IDL): a reference to folders in IDL that lets models read data from and write data to IDL.
- Internet of Things (IoT): a configuration for reading time series data from an asset.
- Predictive Learning Storage: internal PrL storage to which larger files, for example Parquet files, can be uploaded so that models can refer to them.
- Data Exchange: internal PrL storage through which models can refer to folder structures and smaller files (less than 100 MB).
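As a rough illustration of how the two internal storages differ, the sketch below picks one based on file size. The enum, the function name, and the interpretation of 100 MB as 100 * 1024 * 1024 bytes are assumptions made for this example; they are not part of PrL.

```python
from enum import Enum


class DataSourceKind(Enum):
    """The four data source types listed above."""
    IDL = "Integrated Data Lake"
    IOT = "Internet of Things"
    PRL_STORAGE = "Predictive Learning Storage"
    DATA_EXCHANGE = "Data Exchange"


# Assumption: "100 MB" is read as 100 * 1024 * 1024 bytes.
DATA_EXCHANGE_MAX_BYTES = 100 * 1024 * 1024


def suggest_internal_storage(file_size_bytes: int) -> DataSourceKind:
    """Hypothetical helper: choose between the two internal PrL storages."""
    if file_size_bytes < DATA_EXCHANGE_MAX_BYTES:
        return DataSourceKind.DATA_EXCHANGE
    return DataSourceKind.PRL_STORAGE
```

For example, `suggest_internal_storage(5 * 1024 * 1024)` returns `DataSourceKind.DATA_EXCHANGE`, while a 2 GB Parquet file would map to `DataSourceKind.PRL_STORAGE`.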
Example of Data Sources Page¶
Here is an example of the Data Sources Page that illustrates some of the actions you can take with data sources:
How to Create a Data Source¶
Follow these steps to create a data source:
- Click "Add Data Source" in the Data Management section.
- Select a data source location, and click "Next".
- Enter a name for the data source.
- For IoT, select the number of hours and click "Browse" to select an asset and aspect. For IDL, select the folder and file. For PrL storage, enter the path to your PrL storage account.
- Click "Save".
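The steps above imply that each location needs different values before the data source can be saved. The following minimal sketch models that as a checklist. The `DataSourceDraft` class, its field names, and the location labels are hypothetical and only mirror the wording of the steps; they do not describe a PrL API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataSourceDraft:
    """Hypothetical stand-in for the values entered in the dialog."""
    location: str                 # "IoT", "IDL", or "PrL storage"
    name: str
    hours: Optional[int] = None   # IoT only
    asset: Optional[str] = None   # IoT only
    aspect: Optional[str] = None  # IoT only
    folder: Optional[str] = None  # IDL only
    file: Optional[str] = None    # IDL only
    path: Optional[str] = None    # PrL storage only


# Values each location asks for, as described in the steps above.
REQUIRED_BY_LOCATION = {
    "IoT": ["hours", "asset", "aspect"],
    "IDL": ["folder", "file"],
    "PrL storage": ["path"],
}


def missing_fields(draft: DataSourceDraft) -> List[str]:
    """Return the names of the values that are still empty."""
    if draft.location not in REQUIRED_BY_LOCATION:
        raise ValueError(f"unknown location: {draft.location!r}")
    fields = ["name"] + REQUIRED_BY_LOCATION[draft.location]
    # Empty strings, None, and 0 hours all count as missing in this sketch.
    return [field for field in fields if not getattr(draft, field)]
```

An empty result means the draft has everything the steps ask for; for instance, an IoT draft with only a name and a number of hours still reports `asset` and `aspect` as missing.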
Actions Available for All Data Sources¶
All data sources allow users to:
Search for a data source: enter a data source name in the search bar at the top of the Data Sources table. Only data source names can be searched. As you type, the UI displays the data source locations that contain files with matching names. Click a data source location to expand it and view the actual files.
Actions Available for Data Exchange Data Sources¶
Only Data Exchange sources allow users to:
Create a folder: click the ellipses in the personal or shared Data Exchange row, and select "Create folder". The folder is added to the Data Exchange source you select. Currently, folders can only be added to Data Exchange data sources.
Upload a data source file: click the ellipses in the row of the data source where you want to add the file, and select "Upload file". Once you upload a file to PrL, you must open the folder to search the file contents.
- Only CSV and JSON files can be uploaded.
- The file must be smaller than 50 MB.
- Users can upload only one file at a time.
- File upload is subject to the tenant's Data Exchange storage quota. Once the tenant reaches the quota limit, upgrade the plan.
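The file type and size restrictions above can be checked locally before an upload is attempted. This is a minimal sketch: the function name is hypothetical, 50 MB is assumed to mean 50 * 1024 * 1024 bytes, and the tenant quota is not checked because it is only known on the server side.

```python
from pathlib import Path
from typing import List

ALLOWED_SUFFIXES = {".csv", ".json"}
# Assumption: "50 MB" is read as 50 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 50 * 1024 * 1024


def upload_problems(filename: str, size_bytes: int) -> List[str]:
    """Return the reasons a single file would be rejected (empty if none)."""
    problems = []
    if Path(filename).suffix.lower() not in ALLOWED_SUFFIXES:
        problems.append("only CSV and JSON files can be uploaded")
    if size_bytes >= MAX_UPLOAD_BYTES:
        problems.append("file must be smaller than 50 MB")
    return problems
```

The function takes one file at a time, which matches the one-file-per-upload restriction. In practice `size_bytes` could come from `Path(filename).stat().st_size`.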
Cut and paste files: only files can be cut and pasted, not folders. There is no file size limit for cutting and pasting, but only one file can be cut and pasted at a time.
Follow these steps to cut and paste a file:
- Open a Data Exchange folder to display its files.
- Click the ellipses in the row of the file you want to cut.
- Select "Cut". The "Paste" action is enabled once you have "Cut" a file.
- Click the ellipses in the row of the folder where you want to paste the file and select "Paste".
- Click the folder to view the pasted file.
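The cut-and-paste behavior described above can be pictured as a clipboard that holds at most one file. The toy model below is an illustration only: the class and its methods are hypothetical, and it assumes that the file leaves its source folder at paste time and that cutting a second file replaces the first.

```python
class DataExchangeClipboard:
    """Toy model of single-file cut and paste between folders."""

    def __init__(self, folders):
        # folders maps a folder name to the file names it contains.
        self.folders = {name: set(files) for name, files in folders.items()}
        self._cut = None  # (source folder, file name) or None

    @property
    def paste_enabled(self):
        # "Paste" becomes available only after a file has been cut.
        return self._cut is not None

    def cut(self, folder, filename):
        # Only files inside a folder can be cut, never folders themselves.
        if filename not in self.folders.get(folder, set()):
            raise ValueError(f"{filename!r} is not a file in {folder!r}")
        self._cut = (folder, filename)

    def paste(self, target_folder):
        if self._cut is None:
            raise RuntimeError("nothing has been cut")
        if target_folder not in self.folders:
            raise ValueError(f"unknown folder: {target_folder!r}")
        source, filename = self._cut
        self.folders[source].discard(filename)
        self.folders[target_folder].add(filename)
        self._cut = None
```

After `cut("inbox", "a.csv")` followed by `paste("archive")`, the file is listed under `archive` only and `paste_enabled` is false again.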
Download files: PrL also allows users to download uploaded files.
A dataset is a static set of data pulled from IoT for a particular asset and a particular time interval. Datasets are usually used in model development.
The Datasets page displays a table of the datasets you have access to, and a right side "Overview" panel which shows summary information about all of your datasets.
When you select a dataset in the table, the right side panel changes to "Details" and shows facts about the selected dataset.
Here is an example of the Datasets page, showing Overview information in the right side panel:
Actions Available on the Datasets Page¶
Here are some of the things you can do on the Datasets page:
- View dataset details by clicking a row in the dataset table. Details appear on the right side panel.
- Create a new dataset by clicking the "Create a Dataset" link.
- Refresh a dataset's contents by clicking the refresh icon.
- Share a dataset with other PrL users of the tenant by selecting the share icon from the ellipses menu in the row of the dataset. A dataset that was shared with you cannot be reshared.
- Delete a dataset by selecting the delete icon from the ellipses menu in the row of the dataset you want to delete.
How to Create a Dataset¶
Datasets used in PrL include aspect data coming from one of your assets over a time range you specify.
Follow these steps to create a dataset:
- Select "Browse Datasets" on the PrL landing page, then click "Add a Dataset"; or click the "Add Dataset" link from the Quick Actions area. The "Add an IoT Dataset" page displays.
- Click anywhere in the date range field and select start and end dates from the calendar pop-up window.
The time range can span a maximum of 90 days.
- Click "Browse" in the Inputs section.
- Select an asset and aspect.
- Click "Proceed". The "General" section displays.
- Enter a name for the new dataset and, optionally, a description.
- Click "Create a Dataset". The Datasets page displays the new dataset at the top of the table, with a status of "Running".
Once the dataset is created, the status changes to "Succeeded", or to "Error" if there was a problem with the process.
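The 90-day limit on the date range can be checked before a dataset is requested. The sketch below is one possible reading: the function name is hypothetical, and it assumes the limit applies to the difference between the end date and the start date.

```python
from datetime import date, timedelta

# Assumption: the limit is measured as end date minus start date.
MAX_RANGE = timedelta(days=90)


def check_date_range(start: date, end: date) -> None:
    """Raise ValueError if the range is reversed or longer than 90 days."""
    if end < start:
        raise ValueError("end date is before start date")
    if end - start > MAX_RANGE:
        raise ValueError("time range exceeds 90 days")
```

A call such as `check_date_range(date(2024, 1, 1), date(2024, 3, 31))` passes silently, because those dates are exactly 90 days apart; moving the end date one day later raises `ValueError`.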
Except where otherwise noted, content on this site is licensed under the MindSphere Development License Agreement.