Managing Integrated Data Lake data sources¶
If Integrated Data Lake is provisioned for your tenant, you can subscribe to files stored there. The data source synchronizes automatically whenever a subscribed file is updated.
To create an Integrated Data Lake data source, follow these steps:
1. Select either a file or an entire directory by browsing Integrated Data Lake.
Note
- The search field searches only the files located directly in the selected directory.
- For CSV files, the wizard tries to infer the delimiter, the date style and the column data types automatically. On the right-hand side, you can review the inferred settings and change them if necessary.
- For a selected directory, the synchronization includes all Parquet files within that directory and its sub-directories. Columns with the same name must have the same data type in all of these files.
2. Click "Next" to proceed to the "Save" step.
Note
In this step, you can specify the name of the data source, assign tags and select a project.
3. Click "Finish" to save the data source.
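The note in step 1 says the wizard tries to infer the delimiter and the column data types of a CSV file. The wizard's actual algorithm is not documented here; the following Python sketch only illustrates the general idea so that you can anticipate what the wizard is likely to propose. The function names, the candidate delimiters and the type names (`integer`, `double`, `string`) are assumptions made for this example.

```python
import csv

# Assumption: these are the delimiters worth trying; the wizard may use others.
CANDIDATE_DELIMITERS = [",", ";", "\t", "|"]


def infer_delimiter(sample: str) -> str:
    """Pick the candidate that splits every sampled line into the same
    number of fields, preferring the one that yields the most fields."""
    lines = [line for line in sample.splitlines() if line.strip()]
    best, best_fields = ",", 1
    for delimiter in CANDIDATE_DELIMITERS:
        field_counts = {len(row) for row in csv.reader(lines, delimiter=delimiter)}
        if len(field_counts) == 1:
            fields = field_counts.pop()
            if fields > best_fields:
                best, best_fields = delimiter, fields
    return best


def infer_column_type(values) -> str:
    """Return the narrowest illustrative type that fits every value."""
    for caster, type_name in ((int, "integer"), (float, "double")):
        try:
            for value in values:
                caster(value)
            return type_name
        except ValueError:
            continue
    return "string"


sample = "id;name;value\n1;pump;2.5\n2;valve;3\n"
delimiter = infer_delimiter(sample)
rows = list(csv.reader(sample.splitlines(), delimiter=delimiter))
header, data = rows[0], rows[1:]
column_types = {
    name: infer_column_type([row[i] for row in data])
    for i, name in enumerate(header)
}
print(delimiter, column_types)
```

A heuristic like this can guess wrong on small or irregular files, which is why reviewing the inferred settings in the wizard is worthwhile.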
Supported files and file sizes¶
A total of 20 MB of data is supported per Integrated Data Lake data source. Each tenant can subscribe to a maximum of 10 data sources. For CSV files, the supported encodings are UTF-8, UTF-16-LE and UTF-16-BE.
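Before subscribing, it can help to check files against these limits locally. The following Python sketch is one possible pre-check, not part of the product. It assumes that "20 MB" means 20 × 1024 × 1024 bytes (the exact byte count is not stated above), and all function names are hypothetical.

```python
import codecs

# Assumption: "20 MB" is read as 20 MiB; the product may count 20,000,000 bytes.
MAX_TOTAL_BYTES = 20 * 1024 * 1024
MAX_DATA_SOURCES_PER_TENANT = 10
SUPPORTED_ENCODINGS = ("utf-8", "utf-16-le", "utf-16-be")


def fits_size_limit(file_sizes) -> bool:
    """True if the combined size of all files in one data source is allowed."""
    return sum(file_sizes) <= MAX_TOTAL_BYTES


def can_add_data_source(existing_count: int) -> bool:
    """True if the tenant is still below the data source limit."""
    return existing_count < MAX_DATA_SOURCES_PER_TENANT


def can_decode(raw: bytes, encoding: str) -> bool:
    """True if the encoding is one of the supported ones and decodes the bytes."""
    try:
        normalized = codecs.lookup(encoding).name
    except LookupError:
        return False
    if normalized not in SUPPORTED_ENCODINGS:
        return False
    try:
        raw.decode(normalized)
    except UnicodeDecodeError:
        return False
    return True


mib = 1024 * 1024
print(fits_size_limit([15 * mib, 5 * mib]))                      # True
print(can_add_data_source(10))                                   # False
print(can_decode("Zähler;Wert\n".encode("utf-16-le"), "UTF-16-LE"))  # True
print(can_decode("Zähler;Wert\n".encode("latin-1"), "latin-1"))      # False
```

The check validates a declared encoding rather than guessing one, because short byte sequences often decode without error under several encodings.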
Date style¶
Almost all date and time formats are supported. For some dates and formats, however, the order of the year, month and day is ambiguous: for example, 03/04/2023 can mean either 3 April or 4 March. The date style defines this order.
The date style applies to all date and timestamp columns in the file. Different date styles within the same file are not supported.
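The effect of the date style can be illustrated with a short Python sketch. The style names `DMY`, `MDY` and `YMD` and the slash-separated formats are assumptions for this example; the names offered in the wizard may differ.

```python
from datetime import datetime

# Hypothetical date styles mapped to strptime formats for illustration only.
DATE_STYLE_FORMATS = {
    "DMY": "%d/%m/%Y",
    "MDY": "%m/%d/%Y",
    "YMD": "%Y/%m/%d",
}


def parse_with_style(text: str, style: str):
    """Parse a date string using the field order given by the date style."""
    return datetime.strptime(text, DATE_STYLE_FORMATS[style]).date()


# The same text yields two different dates depending on the style.
print(parse_with_style("03/04/2023", "DMY").isoformat())  # 2023-04-03
print(parse_with_style("03/04/2023", "MDY").isoformat())  # 2023-03-04

# A value that does not fit the chosen style cannot be parsed at all.
try:
    parse_with_style("25/04/2023", "MDY")
except ValueError as error:
    print("not a valid MDY date:", error)
```

Because one style is applied to the whole file, a file that mixes day-first and month-first dates would be read incorrectly or fail to parse for some rows.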