IoT Time Series Service¶
Idea¶
The IoT Time Series Service is used to create, read, update, and delete time series data. Time series data is stored against an asset and an aspect. A time series record consists of a timestamp, one or more values, and an optional quality indicator for each variable, which is defined by the aspect type. Within the IoT Time Series Service you can store and query time series data with a precision of 1 millisecond.
Access¶
For accessing this service you need to have the respective roles listed in IoT Time Series Service roles and scopes.
A user can only read data within their environment, subtenants and shared assets.
For accessing the Secure Data Sharing (SDS) protected APIs you need to have appropriate Policy Definitions in place. Please refer here for the list of supported APIs and Required Actions.
Prerequisites¶
Ingesting Time Series Data from a Field Device¶
An onboarded asset must be connected to Insights Hub and produce time series data. A valid mapping must exist from device data points to asset variables. For instructions refer to Uploading agent data.
Ingesting Time Series Data from an Application¶
An asset with respective variables must exist.
Basics¶
A time series record consists of a timestamp, one or more values, and an optional quality indicator for each value, as defined in the aspect type definition. If there are multiple variables in the aspect, they are expected to arrive in the same payload. Writing a time series record with the same timestamp, asset and aspect as an existing one completely overwrites the old record. There is no versioning of data.
The timestamp may be specified with millisecond precision. Measurement values must match the type that is defined in the aspect type. Time series data can be of type int, long, double, boolean, string, big_string (blobs), or timestamp. The maximum sizes of strings and big strings are defined in the aspect type, and can be up to 255 and 100,000 respectively. The maximum size of a time series record is 200,000 bytes. Each data type is validated: longs, doubles, and timestamps using eight bytes, integers using four bytes, strings and big strings using two bytes per character, and Booleans using one byte.
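The per-type sizes above can be turned into a rough size estimate for the values of one record. The following is a minimal sketch and not the service's actual validation logic: the function name estimate_record_size, the type map, and the choice to ignore the timestamp key and quality codes are assumptions made for illustration.

```python
# Byte sizes per data type, taken from the paragraph above.
FIXED_SIZES = {"int": 4, "long": 8, "double": 8, "timestamp": 8, "boolean": 1}
BYTES_PER_CHAR = 2            # strings and big strings
MAX_RECORD_BYTES = 200_000    # maximum size of one time series record


def estimate_record_size(values, variable_types):
    """Estimate the size in bytes of the values of one record.

    values: dict mapping variable name -> value
    variable_types: dict mapping variable name -> type name from the aspect type
    """
    total = 0
    for name, value in values.items():
        data_type = variable_types[name]
        if data_type in ("string", "big_string"):
            total += BYTES_PER_CHAR * len(value)
        else:
            total += FIXED_SIZES[data_type]
    return total
```

A caller could compare the result with MAX_RECORD_BYTES before sending a record, keeping in mind that the real check happens in the service.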
In the aspect type definition, there is a qualitycode field, which indicates whether quality code data will accompany the time series data. If the qualitycode value is true for a variable (property), the name of the quality code property is the same as the variable (property) name, extended by a _qc suffix. For example, the quality code property for the variable temperature would be temperature_qc.
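The naming rule for quality code properties can be sketched as follows. The helper names and the (name, qualitycode) tuple layout are hypothetical and only illustrate the _qc suffix rule described above.

```python
def quality_code_name(variable_name):
    """Return the quality code property name for a variable."""
    return f"{variable_name}_qc"


def expected_properties(variables):
    """List the property names a record may carry.

    variables: list of (name, qualitycode_enabled) tuples, a simplified
    stand-in for the variable definitions of an aspect type.
    """
    names = []
    for name, has_quality_code in variables:
        names.append(name)
        if has_quality_code:
            names.append(quality_code_name(name))
    return names
```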
Multiple records for an asset and aspect can be written in one call to the service. Records can be read and deleted by identifying the asset, aspect, and time range.
When writing or deleting data, the default behavior of the service is to queue the incoming request and perform the physical writes to the underlying data store asynchronously. If a user writes time series data and immediately tries to read it back, it may not yet be present in the data store and it will not be returned. If a user deletes data and then reads that time range, the data may not have been physically deleted yet, and could be returned in the response. Requests for an asset are processed in the order they are received.
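Because writes are applied asynchronously, a client that needs to read back freshly written data may have to retry the read. The following is a minimal sketch under stated assumptions: read_records is a hypothetical callable supplied by the caller that performs the actual read and returns a list of records, and the attempt count and delay are arbitrary example values.

```python
import time


def wait_until_visible(read_records, attempts=5, delay_seconds=1.0, sleep=time.sleep):
    """Call read_records() until it returns data or the attempts run out.

    Returns the records of the first non-empty read, or an empty list if
    nothing became visible within the given number of attempts.
    """
    for attempt in range(attempts):
        records = read_records()
        if records:
            return records
        # Pause between attempts, but not after the last one.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return []
```

Keeping the attempt count small avoids turning the read-back into the kind of continuous polling that the recommendations further down advise against.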
Features¶
The IoT Time Series Service exposes its API for realizing the following tasks:
- Read time series data for an asset or aspect
- Read time series data for a specified time range
- Read the latest uploaded data
- Create time series data for a single or multiple aspects
- Overwrite existing time series data
- Delete time series data for a single aspect of an asset within a given time range
Important
It is recommended to use GET Queries with Time range for better throughput and performance.
Usage Quota and Limits¶
The general limits for the IoT Time Series Service are listed in API rate limits.
Delete Limitations¶
The IoT Time Series DELETE endpoint is subject to the following limitations:
- Time Series data can only be deleted 7 days after it was submitted.
- The range of a delete request must be UTC hourly aligned.
- The range of a delete request cannot be greater than 366 days.
- The maximum number of delete requests per environment is 1 per hour.
- If data is ingested for a time frame while a delete operation on that time frame is in progress, or vice versa, data inconsistencies may occur.
- For the Private Cloud offering, delete is an asynchronous process and it can take up to 1 day for time series data to be deleted.
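Two of the limitations above, hourly alignment in UTC and the maximum range of 366 days, can be checked on the client before a delete request is sent. This is a minimal sketch: the function names are hypothetical, and the remaining limitations (the 7-day waiting period and the request quota) are not covered.

```python
from datetime import datetime, timedelta, timezone

MAX_DELETE_RANGE = timedelta(days=366)


def is_hour_aligned(moment):
    """True if the moment falls exactly on a full hour in UTC."""
    utc = moment.astimezone(timezone.utc)
    return utc.minute == 0 and utc.second == 0 and utc.microsecond == 0


def check_delete_range(start, end):
    """Return a list of problems; an empty list means both checks passed."""
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("start and end must be timezone-aware")
    problems = []
    if not (is_hour_aligned(start) and is_hour_aligned(end)):
        problems.append("range is not aligned to full UTC hours")
    if end <= start:
        problems.append("end must be after start")
    elif end - start > MAX_DELETE_RANGE:
        problems.append("range is longer than 366 days")
    return problems
```

Converting to UTC before checking the minutes matters for time zones with half-hour offsets, where a full local hour is not a full UTC hour.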
Best Practices & Recommendations¶
Data Ingest Recommendations¶
- Use the normal PUT endpoint for ingesting near real-time time series data. This endpoint accepts time series data for a single asset-aspect combination. Data ingested through this endpoint is persisted within single-digit seconds and can then be queried for reporting, dashboarding, or analytical purposes.
- Use the multi asset multi aspect PUT endpoint for ingesting near real-time time series data when you combine data for multiple asset-aspect combinations in one request. Data ingested through this endpoint is persisted within single-digit seconds and can then be queried for reporting, dashboarding, or analytical purposes.
- Use the bulk ingest endpoint for ingesting historian data or bulk data that has no near real-time expectations. Data ingested through this endpoint is persisted within 4 hours and can then be queried for reporting, dashboarding, or analytical purposes.
- It is best practice to include error handling for 429 responses in your code. The 429 status code means too many requests. The Retry-After header specifies the number of seconds after which the API call can be retried. Your code should stop making additional API requests until enough time has passed to retry.
Data Query Recommendations¶
- To fetch near real-time data for a specific time range within the last 90 days, two API endpoints are available:
- The normal raw data GET endpoint fetches a maximum of 2000 records in one call, with an expected latency of ~1 second.
- The bulk stream GET endpoint fetches as many records as it can pull within 60 seconds (the gateway timeout duration), so there is no latency expectation.
- To fetch data older than 90 days, up to a maximum of 1 year old, use the aggregate API endpoint with suitable input parameters for interval value and interval unit.
Generic Recommendations¶
- For live data ingestion use cases, the Time Series API of the IoT Time Series Service works well. For historical data older than 30 minutes with an overall data size > 1 MB, it is recommended to use IoT TS Bulk Services.
- The Time Series API limit is 2000 records per data query. If more than 2000 records are needed and the data is older than 4 hours, it is recommended to use the IoT TS Bulk Read Services.
- If the results of data queries are to be used further for analytical purposes, IDL Timeseries Import can be used instead of multiple calls to the Time Series API.
- Data query latency varies and can increase with asset modeling complexity (a larger number of variables, big string data types, date range, etc.). Modeling recommendations are available at Asset Management Service - Modeling Recommendation.
- Real-time and near real-time queries should target a recent time range to achieve low query latency. Queries for older data are subject to higher latency.
- It is recommended to use GET time series queries with a specified time range for better throughput and performance.
- Avoid repeating the same or similar time range queries during continuous polling. For example, a query for the last few months of data that is sent every minute puts a huge load on the system, while each call adds only the last minute of data compared with the previous one. Avoid such redundant queries as far as possible.
- Split longer time ranges into smaller time ranges to get faster query results.
- If data for multiple variables needs to be queried, a single API call is preferable to multiple calls.
- For applications that frequently query multiple days of data, it is recommended to break up the time range and run incremental, time range based queries.
- For aggregate queries, consumer applications should provide an option to select the aggregate window (intervalValue & intervalUnit).
- It is recommended not to run put and delete operations at the same time for a given tenant/asset/aspect/time range combination. Wait until the first operation has completed to avoid data inconsistencies.
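The recommendation to split longer time ranges into smaller ones can be sketched with a small generator. The function name split_time_range and the chunk size are illustrative choices, not part of the service API.

```python
from datetime import datetime, timedelta


def split_time_range(start, end, step):
    """Yield consecutive (chunk_start, chunk_end) pairs covering start to end.

    The last chunk is shortened so that it never extends past end.
    """
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    chunk_start = start
    while chunk_start < end:
        chunk_end = min(chunk_start + step, end)
        yield chunk_start, chunk_end
        chunk_start = chunk_end
```

For incremental polling, an application could remember the end of the last chunk it queried and use it as the start of the next query, so that each call only requests data it has not seen yet.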
Rate limit and retry Recommendation¶
- It is best practice to include error handling for 429 responses in your code. The 429 status code means too many requests. The Retry-After header specifies the number of seconds after which the API call can be retried. Your code should stop making additional API requests until enough time has passed to retry.
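A minimal sketch of such 429 handling is shown below. It is independent of any HTTP library: send is a hypothetical callable supplied by the caller that performs one request and returns a (status_code, headers, body) tuple, with headers as a plain dict keyed by "Retry-After". The attempt limit and the fallback wait time are assumptions.

```python
import time


def call_with_retry(send, max_attempts=5, default_wait=1.0, sleep=time.sleep):
    """Repeat send() while it answers 429, waiting as Retry-After asks.

    Returns (status_code, body) of the last response received.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429 or attempt == max_attempts:
            return status, body
        try:
            wait = float(headers.get("Retry-After", default_wait))
        except ValueError:
            # Header present but not a number of seconds: use the fallback.
            wait = default_wait
        # Make no further requests until the wait time has passed.
        sleep(wait)
```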
Reducing the number of API requests¶
- Optimize your code to eliminate any unnecessary API calls. For example, are some requests getting data items that aren't used in the application?
- Cache frequently used data. The application can cache data on the server or on the client. It can also save relatively static information in a database.
- Sideload related data. The application can avoid unnecessary API calls by sideloading one set of records with another. Sideloading lets you get two sets of records in a single request. For example, if data for multiple variables of an aspect is requested in separate parallel requests, these should be combined into a single request.
- Use bulk endpoints wherever possible, as they let you ingest or read data in a single API request. A bulk endpoint may not be as performant as, or behave like, a synchronous endpoint, but it can help avoid throttling in a few scenarios.
Regulating the request rate¶
- If you regularly exceed the rate limit, update your logic to distribute requests more evenly over a period of time. This is known as a throttling process or a throttling controller. Regulating the request rate can be done statically or dynamically. For example, monitor your request rate and regulate requests when the rate approaches the rate limit.
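A static form of such a throttling controller can be sketched as a pacer that enforces a minimum interval between requests. The class name RequestPacer and the rate value are hypothetical; the actual limits are those published for the service.

```python
import time


class RequestPacer:
    """Spaces out calls so that at most max_per_second requests start per second."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        if max_per_second <= 0:
            raise ValueError("max_per_second must be positive")
        self._interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next request may start

    def wait(self):
        """Block until the next request is allowed, then reserve its slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self._interval
```

An application would call wait() on a shared pacer before each API request. A dynamic variant could lower max_per_second whenever 429 responses start to appear.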
For Optimizing Read Rate¶
- Asset modeling recommendations
- Use "select" when querying raw or aggregate time series data. This reduces the response size to what is required and hence reduces the read/compute requirements. For aggregate select, you may refer to this link.
- Use the time series subscription notification feature for new data arrival to get notified when to make data queries, instead of polling continuously.
- Applications should be aware of throttling and should be able to handle such responses, either by distributing calls over time or by making their end users aware of it so that they do not make mistakes.
Limitations¶
- The maximum payload size for data upload for endpoint PUT/timeseries/{entityId}/{propertySetName} is 1 MB.
- The data ingest rate throttling limit is 100 KB/s per asset/aspect combination in the PUT Time Series API.
- For new P&P environments, historical data ingest is allowed for up to 366 days in the past and up to 31 days in the future.
- Double can have up to 38 digits of precision, and can be positive, negative, or zero.
- Positive range for Double : 1E-130 to 9.9999999999999999999999999999999999999E+125
- Negative range for Double : -9.9999999999999999999999999999999999999E+125 to -1E-130
- Requests are processed in order of arrival. Thus, the latest stored record might not reflect the latest request.
- Time series timestamps are reduced to millisecond precision for storage, query, and further data processing. For example, a timestamp with finer precision, such as nanoseconds, is converted and stored with millisecond precision.
- Storage of empty strings is not supported. The respective fields are ignored.
- Queries with the URL parameter latestValue are not available for String and Timestamp variables.
- Queries with the URL parameter latestValue do not consider records ingested more than 3 months or less than ~5 minutes ago. For Private Cloud Deployments, the latestValue parameter does not consider records ingested more than 1 month or less than 2 hours ago.
- Queries with the URL parameter latestValue return the latest value for near real-time data ingestion. If data ingestion is delayed and the ingested data is older than ~2 hours, fetching the latest values is delayed similarly.
- Queries without specified timestamps for data older than 12 months return an empty response.
- Queries with the 'limit' parameter work precisely only if 'select' is not used at the same time. If 'select' is used with 'limit', the response is further reduced according to the variables actually available in the records retrieved within the limit. The default limit is 2000 time series records.
- Queries with the 'limit' parameter work precisely only if there are no empty records (records that contain only a timestamp without a value for any aspect variable) in the requested range. If the range contains empty records, the response is further reduced by the number of empty records among the records retrieved within the limit. The default limit is 2000 time series records.
- After an asset model update (i.e. updating an asset, aspect, asset type, variables, etc.), it may take up to 1 minute for the internal sync to complete before time series data ingestion and read requests are processed successfully.
- Queries without To & From (that is, Get the latest record queries) work precisely only if there are no empty records (records that contain only a timestamp without a value for any aspect variable). If the most recently ingested records are empty records, the response will be empty.
- The maximum number of aspects (one or multiple assets) per data upload request is 5, with a maximum payload size of 100 KB and 100 records for endpoint PUT/timeseries.
- The '+' character used to include timezone in timestamp must be encoded to '%2B' in time range. For example, 2017-09-21T14:08:00%2B02:00 instead of 2017-09-21T14:08:00+02:00.
- Time series data ingested via IoT TS Bulk Services for the past 7 days cannot be retrieved via the GET endpoint of the Time Series Service.
- For time series data ingestion for the past 7 days, use either the IoT Time Series Service or IoT TS Bulk Services. If both methods are used, data consistency will be compromised.
- If a customer runs data ingest and delete operations at the same time for a given tenant/asset/aspect/time range combination, data inconsistencies may occur.
- For Private Cloud Deployments:
- Timeseries will not accept data older than (T - 168) hours.
- Timeseries will not accept future data later than (T + 0) hours.
- Usage metrics in UTS are not available for Timeseries Storage.
- For time series data ingestion, STRING or BIGSTRING values that contain the special character '\u0000' are not supported.
- Time series data deletion is an asynchronous job, and data may remain visible for a few minutes up to a few hours after the delete API is called.
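The encoding of the '+' character in a time range, listed among the limitations above, can be done with the Python standard library. This is a minimal sketch: the helper name encode_timestamp is hypothetical, and the query parameter names from and to in the usage note are an assumption based on the "To & From" wording above.

```python
from urllib.parse import quote


def encode_timestamp(timestamp):
    """Percent-encode a timestamp for use in a URL query string.

    ':' is kept readable; '+' is not in the safe set, so it becomes %2B.
    """
    return quote(timestamp, safe=":")
```

A query string for a time range could then be built as f"from={encode_timestamp(start)}&to={encode_timestamp(end)}", where start and end are timestamp strings such as 2017-09-21T14:08:00+02:00.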
To get the current list of known restrictions, go to release notes and choose the latest date. From there go to "MindAccess Developer Plan Subscribers and MindAccess Operator Plan Subscribers" and pick the IoT service you are interested in.
Example Scenario¶
A wind turbine is connected to Industrial IoT. The blades move with a constantly changing speed. A sensor measures the speed and sends the data via MindConnect to Insights Hub.
The IoT Time Series Service API allows you to write and read the speed data of the wind turbine.
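A record for this scenario could be assembled as in the following sketch. The variable name speed and the timestamp key _time are assumptions made for illustration; the actual property names come from the aspect type, and the exact payload format should be taken from the API specification.

```python
import json
from datetime import datetime, timezone


def build_record(timestamp, **values):
    """Build one time series record as a dict.

    The timestamp is converted to UTC and written with millisecond
    precision; the key name "_time" is an assumption.
    """
    utc = timestamp.astimezone(timezone.utc)
    time_text = utc.isoformat(timespec="milliseconds").replace("+00:00", "Z")
    return {"_time": time_text, **values}


def build_payload(records):
    """Serialize a list of records into a JSON array."""
    return json.dumps(records)
```

Several records for the same asset and aspect can be passed to build_payload together, since multiple records can be written in one call.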
Except where otherwise noted, content on this site is licensed under the Development License Agreement.