IoT Application Development with AWS, Azure, and GCP
Part 2: Provisioning and device management
By Chad Elliott, OCI Principal Software Engineer, and Kevin Stanley, OCI Partner and IIoT Practice Lead
February 2021
Introduction
Internet of Things (IoT) technology has progressed considerably in the last few years, making it easier to connect devices via the internet and enable them to send and receive data. It is a relevant concept for any system that must leverage physical devices to perform data collection or to control mechanical devices.
One of the most important aspects of IoT is that it defines a set of standards for connecting these devices, which should simplify their use in applications and avoid lock-in with proprietary applications and components. In reality, we still have a way to go to achieve standardization similar to what we see with browsers connected on the internet.
In our previous installment of the IoT application development articles, we gave an overview of the Industrial Internet Consortium (IIC) reference architecture for IoT and the three major cloud platforms that support IoT application development:
- Amazon Web Services (AWS)
- Microsoft Azure
- Google Cloud Platform (GCP)
For our analysis, we introduced the IoT platform functional categories outlined by the IIC categorized into the following supersets (as shown in Figure 1):
- User Management
- Business Management
- Security and Disaster Management
- Process Management
- Analytics & Data Management
- Provisioning and Device Management
The primary reason we focus on these functional areas is that the vast majority of these functions need to be addressed in a complete IoT solution. It is, thus, helpful if an IoT platform provides support for these capabilities, minimizing the amount of development and maintenance required to satisfy these key functional requirements.
In Part 1, we focused specifically on process management. In this installment we focus on another critical functional aspect of IoT applications, provisioning and device management.
Provisioning and device management includes capabilities for registering and managing a set of devices within a system, as well as discovering resources (e.g., data storage, databases, virtual machines, etc.) within our system and determining device location.
In this article we describe how each of the major platforms supports these capabilities and provide some conclusions and recommendations on which platform(s) might be best suited for your use cases.
Provisioning and Device Management
In any IoT application, you will need to interact with one or more physical devices to collect data and interact with the operational environment. Each cloud platform has a method of representing the physical IoT device within the framework of the cloud platform to enable communications with the device. Each platform provides a user interface (UI) or command line interface (CLI) to help you create and maintain the digital representations such that they match the real world configuration of your system. For a production environment of a medium- to large-scale application, these capabilities need to be performed using managed source code. For each of these capabilities, we will focus on the UI in this article and provide a link to the CLI documentation that corresponds to these functions.
AWS
AWS provides IoT Core as part of the cloud platform dedicated to IoT system development. The CLI is documented in the following location: https://docs.aws.amazon.com/cli/latest/reference/greengrass/.
Within IoT Core, devices are clustered together within a Greengrass Group. The Greengrass Group is a logical container for devices and the associated lambda functions that are used to manage topic subscriptions, lambda associations, device resources, and software deployments to devices. Up to 200 devices can be added to a group; they can be newly provisioned or added as pre-existing devices.
Using the UI, creating a Greengrass Group is as simple as typing a name and clicking a few buttons.
Once you've created your Greengrass Group and Core, you can add devices through the Devices link on the group page. Click the Add Device button to create a device or add an existing device that will be attached to the group.
You provide a device name and AWS can do the rest. It will generate a certificate, public key and private key using AWS IoT’s root CA, generate a default policy, and create a new IAM role with default permissions. The certificate, private key, and root CA can then be supplied to the physical device in some manner, in order for the device to provide credentials when communicating through the MQTT gateway (see the previous article for information on MQTT).
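The same steps can be scripted. The following is a minimal sketch, assuming the AWS SDK for Python (boto3) and an IoT policy that already exists; it mirrors what the console does for a plain IoT Core thing rather than reproducing the Greengrass console flow exactly. The function name `provision_thing` and its arguments are hypothetical, and the client is passed in so that the sketch can be exercised without AWS credentials.

```python
def provision_thing(iot, thing_name, policy_name):
    """Create a thing, issue it a certificate, and attach an existing policy.

    `iot` is expected to be a boto3 IoT client, e.g. boto3.client('iot').
    `policy_name` is assumed to name an IoT policy that already exists.
    """
    iot.create_thing(thingName=thing_name)

    # Ask AWS IoT to generate a key pair and an active certificate.
    creds = iot.create_keys_and_certificate(setAsActive=True)
    cert_arn = creds['certificateArn']

    # Authorize the certificate, then bind it to the thing.
    iot.attach_policy(policyName=policy_name, target=cert_arn)
    iot.attach_thing_principal(thingName=thing_name, principal=cert_arn)

    # These two values must be delivered to the physical device.
    return {
        'certificate_pem': creds['certificatePem'],
        'private_key': creds['keyPair']['PrivateKey'],
    }
```

The private key is only available in the response to the certificate request, so a script like this would need to store or transfer it securely right away.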
When adding devices, keep in mind that the device name must be unique within the region. Greengrass groups are isolated by region and thus will allow the same device names in different regions. However, devices within different groups in the same region cannot have the same name. This is due, in part, to the fact that each device has built-in topic names associated with it within the MQTT broker that include the device name. There is a single MQTT broker per region within a single account.
Once the device has been created, its name cannot be changed, but all other aspects of its configuration can be. Thing types, searchable attributes, security certificates, Thing groups, billing groups, and shadow documents can all be modified.
When you select a device from either the Greengrass Group or the Thing management page, you will see something like the image below that will allow you to update your device.
In the event that a device has ceased to function or has outlived its usefulness, it can be removed from the system. Devices can be removed from a group but will still exist and can be added to other groups. They can also be removed completely by detaching them from the group and then clicking on the Manage link to completely remove the “thing” that represents the device. Click the three horizontal dots in the top right corner of the device that you want to remove and then click Delete.
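For scripted clean-up, a sketch along the following lines may be useful. It assumes boto3 and that the device has already been removed from any Greengrass Group; the function name `remove_thing` is hypothetical. Detaching the certificates before deleting the thing is a cautious ordering chosen here, and the certificates themselves are left in place.

```python
def remove_thing(iot, thing_name):
    """Detach every certificate from a thing, then delete the thing.

    `iot` is expected to be a boto3 IoT client (boto3.client('iot')).
    Returns the list of principals (certificate ARNs) that were detached.
    """
    principals = iot.list_thing_principals(thingName=thing_name)['principals']
    for principal in principals:
        iot.detach_thing_principal(thingName=thing_name, principal=principal)
    iot.delete_thing(thingName=thing_name)
    return principals
```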
Azure
The CLI for provisioning within Azure IoT is documented in the following locations:
- https://docs.microsoft.com/en-us/cli/azure/ext/azure-iot/iot/dps?view=azure-cli-latest
- https://docs.microsoft.com/en-us/cli/azure/azure-cli-reference-for-iot
Of particular relevance to provisioning and device management are the digital twin (dt), device provisioning service (dps), and iot hub extensions.
The IoT Hub is the name for the centralized location for managing your IoT devices in Azure. You can have multiple IoT Hubs, and the name of each hub must be unique within the entire Azure system, not just your account or region. You choose the subscription, resource group, and region and provide the hub name.
An Azure subscription is a logical container used to provision resources in Azure. It holds the details of all your resources like virtual machines (VMs), databases, and more.
Once those are chosen, move on to the next page to select the size and scale.
This screen shows the pricing and scalability of the IoT Hub, including the number of allowed messages per day and IoT Edge and device management availability. The message allowance is defined in the subscription for this IoT Hub.
Once the IoT Hub is created, you must add a certificate authority; it can be in either .pem or .cer format. You can purchase a root CA or create your own using openssl for testing/development. Either way, your IoT devices will need to use the associated root CA when connecting to the MQTT broker.
Creating an IoT device is as simple as providing a name and certificate thumbprints.
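A certificate thumbprint is a hash of the certificate's binary (DER) encoding. The sketch below computes one with the Python standard library and then registers a device. The helper names are hypothetical, the choice of SHA-1 as the default hash is an assumption that should be checked against what your hub expects, and `registry` is assumed to be an `IoTHubRegistryManager` from the azure-iot-hub package, whose `create_device_with_x509` call is used as we understand it.

```python
import hashlib
import ssl


def certificate_thumbprint(pem_text, algorithm='sha1'):
    """Return the thumbprint (uppercase hex) of a PEM-encoded certificate."""
    der_bytes = ssl.PEM_cert_to_DER_cert(pem_text)
    return hashlib.new(algorithm, der_bytes).hexdigest().upper()


def register_x509_device(registry, device_id, primary_pem, secondary_pem):
    """Register a device using the thumbprints of two certificates.

    `registry` is assumed to be an azure.iot.hub.IoTHubRegistryManager.
    """
    return registry.create_device_with_x509(
        device_id,
        certificate_thumbprint(primary_pem),
        certificate_thumbprint(secondary_pem),
        'enabled',
    )
```

Supplying a secondary thumbprint allows the device certificate to be rotated without the device losing access.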
Once your device has been created and has been given credentials, it will be allowed to communicate through the IoT Hub.
GCP
The CLI for GCP IoT Core is documented in the following location:
https://cloud.google.com/sdk/gcloud/reference/iot
Within GCP, IoT devices are grouped together by a registry. A registry must be created prior to creating devices within the platform. Registries are created under IoT Core and must have a unique name within a single geographic region, i.e., asia-east1, us-central1, etc.
Once the registry is created, you can add devices to it.
When creating a device, you must, at a minimum, provide a device ID, which must be unique within the registry. The authentication section is optional, but without providing a public key, your device will not be able to connect to the Google Cloud.
You can provide the public key during or after device creation. Unlike AWS, you must manually create the public and private keys for your device using the openssl command or something similar; instructions on how to do this can be found in the Google Cloud IoT documentation.
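As a rough illustration of what a device definition contains, the sketch below builds the JSON body that the Cloud IoT Core REST API accepts when creating a device. The helper name `build_device` and the sample values are hypothetical; the field names (`id`, `credentials`, `publicKey`, `metadata`) and the `RSA_X509_PEM` format follow the REST Device resource as we understand it and should be checked against the current documentation.

```python
def build_device(device_id, public_key_pem=None,
                 key_format='RSA_X509_PEM', metadata=None):
    """Build the Device resource body for a Cloud IoT Core create request."""
    device = {'id': device_id}
    if public_key_pem is not None:
        # Without a credential the device can be created but cannot connect.
        device['credentials'] = [
            {'publicKey': {'format': key_format, 'key': public_key_pem}}
        ]
    if metadata:
        # Metadata is a flat map of string keys to string values.
        device['metadata'] = dict(metadata)
    return device
```

Leaving out the public key produces a valid device definition, which matches the option of adding the key after creation.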
As with AWS, once the device has been created on the platform, its name cannot be changed, but all other aspects of it can be. The device metadata, communication ability, logging, configuration, and public keys can all be modified as shown below.
When the need arises, devices can also be removed from a registry. Devices can be removed individually or as a group. Select one or more devices from the registry and select the DELETE button.
Resource Discovery
In an IoT platform, you will enable control and integration of the IoT devices using a set of resources within the cloud platform. This includes things like buckets for storing data received from the IoT devices, databases for efficiently organizing this data, and functional components that support the processing of the overall IoT system.
The need for locating and retrieving resources dynamically becomes more important as either the number of IoT devices deployed or the number of resources in use increases. Each cloud platform provides interfaces for locating resources based on a set of search criteria. Using these interfaces, you can create more resilient and dynamic systems composed of groups of devices and services built around the managed devices and the environment to which they are deployed.
AWS
Resource discovery within AWS is provided by AWS Cloud Map. It is designed to allow you to create names within namespaces that refer to resources, such as databases, Lambda functions, microservices, etc. Referring to these services by the Cloud Map registered name allows those resources to be swapped out for other resources without having to change code or configuration to refer explicitly to the new resources.
Begin by creating a namespace, which will be the top level for a group of one or more services. The Instance discovery section indicates how your applications will locate the registered instances of services within this namespace. API calls can be used to locate resources by name for all namespaces, and optionally, DNS queries may be used as well.
Once a namespace has been created, you can create a service within it, which can contain an optional health check to assist in identifying issues with the service.
Once the service has been created, you can register one or more instances of the service.
In our example, we’re creating a service instance that will be accessed via MQTT. For this example we provide a topic name as a custom attribute. This is not a required step, but it does allow us to define a service when we don't have an IP address for it.
Later, we’ll see how that custom attribute can be obtained and used for sending our MQTT message. If we had the need, we could provide multiple ways of accessing the service via additional custom attributes.
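The registration step described above can also be performed with the AWS SDK. This is a minimal sketch, assuming boto3 and a namespace that supports API-based discovery; the function name `register_mqtt_instance` is hypothetical, and `TOPIC` is simply the custom attribute name used in this article's example.

```python
def register_mqtt_instance(discovery, service_id, instance_id, topic):
    """Register a Cloud Map instance that is reached through an MQTT topic.

    `discovery` is expected to be boto3.client('servicediscovery').
    """
    response = discovery.register_instance(
        ServiceId=service_id,
        InstanceId=instance_id,
        Attributes={'TOPIC': topic},
    )
    # Registration is asynchronous; the operation ID can be polled later.
    return response['OperationId']
```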
Now that our instance has been registered, we can access the service via the service discovery client using the namespace and service name as reference. The Python code below shows a simplified use of the AWS SDK to get an instance of our CameraStatus service and send an MQTT message to it.
import json
import boto3
# Find a healthy instance of our service
discovery = boto3.client('servicediscovery')
response = discovery.discover_instances(NamespaceName='Primary',
                                        ServiceName='CameraStatus',
                                        HealthStatus='HEALTHY')
# Simplified: assumes at least one healthy instance was returned
instance = response['Instances'][0]
# Send a status request to the topic stored as a custom attribute
mqtt = boto3.client('iot-data')
mqtt.publish(topic=instance['Attributes']['TOPIC'],
             payload=json.dumps({'request': 'status'}))
Azure
Azure provides a Resource Graph Explorer that can be used to search for and display available resources. Using the portal interface, we can search the relevant resources and find instances of a specific type.
It also provides a Resource Management REST API to allow the dynamic discovery of resources within a subscription. Using the Python SDK, azure-mgmt-resource, we can perform a similar query as performed in the portal interface.
import os
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient
# Get credentials and the subscription to search
credentials = DefaultAzureCredential()
subscription_id = os.environ['AZURE_SUBSCRIPTION_ID']
# Get a list of all resources in the subscription
client = ResourceManagementClient(credentials, subscription_id)
resources = client.resources.list()
# Keep just the IoT Hubs, for instance (compare types case-insensitively)
hubs = [item for item in resources
        if item.type.lower() == 'microsoft.devices/iothubs']
This code could be used as a template to find other types of resources in order to implement a QoS mechanism, some type of scheduling algorithm, or even a composite service of dynamically discovered resources.
An example of a composite service would be a service that dynamically locates a REST API and an SQL instance, makes a request, and places a derived value from the response into the SQL instance.
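One way to structure such a composite service is to index the discovered resources by type and then look up the pieces that are needed. The sketch below is illustrative only: the helper names are hypothetical, and it works on any objects exposing `type` and `name` attributes, such as the items returned by the resource listing above.

```python
from collections import defaultdict


def group_by_type(resources):
    """Index resources by lower-cased type; each item needs .type and .name."""
    index = defaultdict(list)
    for item in resources:
        index[item.type.lower()].append(item)
    return index


def require(index, resource_type):
    """Return the first resource of a type, or raise if none was discovered."""
    matches = index.get(resource_type.lower(), [])
    if not matches:
        raise LookupError(f'no resource of type {resource_type} found')
    return matches[0]
```

A composite service could call `require(index, 'Microsoft.Web/sites')` and `require(index, 'Microsoft.Sql/servers')` at startup and fail early if either piece is missing.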
GCP
Google provides the Cloud Asset API to allow discovery of resources within a project or organization. Information about resources and the metadata associated with them is automatically maintained over a five-week period. That information can be searched or exported via the REST API or the gcloud CLI.
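A search can also be issued from Python. This is a minimal sketch, assuming the google-cloud-asset package, where `client` would be an `asset_v1.AssetServiceClient()`; the function name `find_assets` is hypothetical, and the request fields are used as we understand them from the client library.

```python
def find_assets(client, scope, asset_types, query=''):
    """Return (name, asset_type) pairs for resources matching a search.

    `client` is assumed to be google.cloud.asset_v1.AssetServiceClient().
    `scope` looks like 'projects/PROJECT_ID'.
    """
    results = client.search_all_resources(
        request={'scope': scope, 'query': query, 'asset_types': asset_types}
    )
    return [(result.name, result.asset_type) for result in results]
```

For example, `find_assets(client, 'projects/assemblyline', ['compute.googleapis.com/Instance'])` would list the compute instances visible in the project used elsewhere in this article.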
The types of resources that can be searched are limited and do not include all of the types that can be exported (see https://cloud.google.com/asset-inventory/docs/supported-asset-types for a complete list). Among these non-searchable resource types are Cloud Functions, Cloud Storage, Pub/Sub, Cloud SQL, and a few others. Discovery of these types can be achieved by exporting the resource data to a file in Google Storage or BigQuery. However, this data set is limited; it is just a snapshot at the time of export.
$ gcloud asset export --project=assemblyline --bigquery-table=20201125 \
    --bigquery-dataset=asset_export
This command will create a table within the asset_export dataset. It can take a number of minutes to complete, and the destination table must not exist prior to running the command.
In addition to search and export, you can also set up monitoring to receive notification of changes in assets. To configure monitoring, you must first create an asset feed.
$ gcloud asset feeds create cloud_assets --project=assemblyline \
    --pubsub-topic=projects/assemblyline/topics/asset_changes \
    --asset-types=cloudfunctions.googleapis.com/CloudFunction
assetTypes:
- cloudfunctions.googleapis.com/CloudFunction
condition: {}
feedOutputConfig:
  pubsubDestination:
    topic: projects/assemblyline/topics/asset_changes
name: projects/1044125457779/feeds/cloud_assets
With the above command, when something related to cloud functions changes, a message will be sent to the Pub/Sub topic. Using the GCP Pub/Sub console, we can see that these messages contain information such as asset type, function name, update time, etc.
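A subscriber to that topic would typically decode each message and act on a few fields. The sketch below shows one way to do so; the function name is hypothetical, and the JSON field names (`asset`, `name`, `assetType`, `updateTime`, `deleted`) are our understanding of the notification format and should be verified against a real message before being relied upon.

```python
import json


def summarize_asset_change(data):
    """Extract a few fields from a feed notification payload (bytes or str)."""
    message = json.loads(data)
    asset = message.get('asset', {})
    return {
        'name': asset.get('name'),
        'asset_type': asset.get('assetType'),
        'update_time': asset.get('updateTime'),
        # Assumed flag marking that the asset was removed.
        'deleted': bool(message.get('deleted', False)),
    }
```

Missing fields come back as `None` rather than raising, which keeps a subscriber running if the message shape differs from what is assumed here.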
IAM policies for specific types of assets (see https://cloud.google.com/asset-inventory/docs/supported-asset-types#analyzable_asset_types for a list of analyzable asset types) can also be analyzed through the Cloud Asset API. The analysis provides insights on what can be done with certain resources and by whom. For instance, by using the following gcloud command, we can see which users can start compute instances.
$ gcloud asset analyze-iam-policy --project=assemblyline \
    --permissions=compute.instances.start
---
ACLs:
- accesses:
  - permission: compute.instances.start
  identities:
  - name: user:person1@objectcomputing.com
  - name: user:person2@objectcomputing.com
  resources:
  - fullResourceName: //cloudresourcemanager.googleapis.com/projects/assemblyline
policy:
  attachedResource: //cloudresourcemanager.googleapis.com/projects/assemblyline
  binding:
    members:
    - user:person1@objectcomputing.com
    - user:person2@objectcomputing.com
    role: roles/owner
Your analysis request is fully explored. The ACLs matching your requests are listed per IAM policy binding, so there could be duplications.
Location Services
IoT devices have varying levels of hardware capabilities. Some devices will have a GPS receiver and some will not. If your device has that functionality and has unshielded access to orbiting satellites, providing device location is as simple as reading the GPS coordinates and either sending them periodically or upon demand through some data transmission protocol, such as MQTT. Otherwise, there are other methods available that will allow your device to provide its location, with varying degrees of accuracy.
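As an illustration of the GPS case, many receivers report position as NMEA text sentences. The sketch below, which assumes a `$GPGGA` sentence and does not verify its checksum, converts the reported position to decimal degrees and builds a JSON payload that could be published over MQTT. The function names and payload keys are hypothetical.

```python
import json


def nmea_to_degrees(value, hemisphere):
    """Convert NMEA (d)ddmm.mmmm plus N/S/E/W into signed decimal degrees."""
    raw = float(value)
    degrees = int(raw // 100)          # leading digits are whole degrees
    minutes = raw - degrees * 100      # remainder is minutes
    decimal = degrees + minutes / 60
    return -decimal if hemisphere in ('S', 'W') else decimal


def location_payload(gga_sentence):
    """Build a JSON payload from a $GPGGA sentence (checksum not verified)."""
    fields = gga_sentence.split(',')
    lat = nmea_to_degrees(fields[2], fields[3])
    lng = nmea_to_degrees(fields[4], fields[5])
    return json.dumps({'lat': round(lat, 6), 'lng': round(lng, 6)})
```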
An approximate device location may also be deduced from the device's IP address. Both GCP and Azure provide geolocation based on IP address through their respective map APIs. This GCP example uses the SDK to obtain approximate coordinates for the public IP address from which the device makes the request.
import googlemaps
gmaps = googlemaps.Client(key=API_KEY)
resp = gmaps.geolocate(consider_ip=True)
lat, lng = resp['location']['lat'], resp['location']['lng']
If the cloud vendor you're using doesn't provide geolocation based on IP address, there are many services available to do just that. The following snippet shows how to obtain the GPS coordinates associated with an IP address using ipapi.
import requests
url = f'http://api.ipapi.com/api/{ipaddress}?access_key={API_KEY}'
resp = requests.get(url)
data = resp.json()
lat, lng = data['latitude'], data['longitude']
Device location can also be static, and location information could be stored as part of the metadata associated with the device within the application you are developing.
If a device is communicating via a cellular service, it is possible to obtain the coordinates of the cell tower to which the device is connected. Given the cell tower ID and the location area code of the tower, along with the mobile country code and mobile network code of the carrier, we can use the Google Maps API to geolocate the tower and use that as a rough estimate of the device location.
import googlemaps
# cell_id, area_code, mcc, and mnc are values read from the device's modem
towers = [
    {
        "cellId": cell_id,
        "locationAreaCode": area_code,
        "mobileCountryCode": mcc,
        "mobileNetworkCode": mnc,
    }
]
gmaps = googlemaps.Client(key=API_KEY)
resp = gmaps.geolocate(cell_towers=towers)
lat, lng = resp['location']['lat'], resp['location']['lng']
The returned coordinates locate the cell tower rather than the IoT device itself, so they will not be exact, but they will be relatively close.
In AWS, the thing attributes can be used to provide location information as shown below.
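Thing attributes can also be written and read with the AWS SDK. The following is a minimal sketch, assuming boto3; the function names and the attribute names `latitude` and `longitude` are illustrative choices, not a convention defined by AWS. Attribute values are stored as strings, so the coordinates are formatted on the way in and parsed on the way out.

```python
def set_thing_location(iot, thing_name, latitude, longitude):
    """Store coordinates as thing attributes, keeping existing attributes.

    `iot` is expected to be a boto3 IoT client (boto3.client('iot')).
    """
    iot.update_thing(
        thingName=thing_name,
        attributePayload={
            'attributes': {
                'latitude': f'{latitude:.6f}',
                'longitude': f'{longitude:.6f}',
            },
            # Merge with the thing's current attributes instead of replacing.
            'merge': True,
        },
    )


def get_thing_location(iot, thing_name):
    """Read the coordinates back as a (latitude, longitude) pair of floats."""
    attributes = iot.describe_thing(thingName=thing_name)['attributes']
    return float(attributes['latitude']), float(attributes['longitude'])
```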
Similarly, GCP gives the option of providing optional metadata when creating devices.
Azure allows you to attach arbitrary data to a device using what it calls device twins, which can include metadata, configuration, and device state. Device twins are implicitly created when a device identity is created within your IoT Hub and are similarly deleted along with the device identity.
We can use the following Python code to get the twin data and pull out the information we need.
from azure.iot.hub import IoTHubRegistryManager
iothub = IoTHubRegistryManager(iothub_connection_str)
twin = iothub.get_twin('assemblyLine1')
# Tags are exposed as a dictionary on the twin object
location = twin.tags['location']
Conclusion
Extensiveness
Each of the three cloud providers includes the essential IoT capabilities for provisioning and resource management. Each platform provides both a user interface and a command line interface in support of its core IoT platform; however, AWS and Azure provide more flexibility and configurability in their IoT platforms with an extensive set of CLI commands for configuring devices, metadata, and associated resources.
Maintainability
AWS and Azure appear to have more comprehensive SDKs for resource discovery, which can reduce complexity in your own code base, allowing for more effective system abstraction, reusability, and flexibility.
Usability/Features
AWS Cloud Map provides tremendous flexibility in abstracting the physical devices into symbolic names, allowing users to swap out resources without any code or script modifications.
While the Azure UI is extremely useful for exploring provisioned devices and resources through a queryable interface that shows only the selected attributes, all of the same information is available through an SDK API.
GCP's strength is in its location services API, which is the most extensive, providing coordinates obtained from multiple sources, including WiFi access points, cell towers, and IP addresses.
Standardization
In our previous article we covered communication management, which includes the use of standards for interoperability in communications. Standardized communications protocols are an essential aspect of interoperability and portability, but they are not sufficient.
For the capabilities described in this article, very few aspects are supported by standards to simplify reuse or integration between the platforms. Each platform provides its own interfaces for defining devices, groups, and supporting resources around developed IoT systems.
For this reason, each of the big three cloud providers leaves it up to the developer to build the device integration capabilities needed for platform independence, including defining device metadata or resource descriptors.
Edge Deployment
Recent trends in IoT are focusing more on hybrid architectures including support for cloud computing along with edge computing. As IoT devices become more capable, it is possible to deploy some processing to the edge that was previously only possible in the cloud.
Both AWS and Azure provide robust support for deploying and maintaining computational resources on edge devices. Although edge processing was not covered in depth in this article, this capability can improve privacy and performance and reduce latency in IoT applications, so it is an important consideration in choosing an IoT platform.
We will dive deeper into the edge-deployment capabilities of the cloud IoT platforms in a subsequent article.
Software Engineering Tech Trends (SETT) is a regular publication featuring emerging trends in software engineering.