This directory contains a sample real-time stream processing architecture for IoT on Google Cloud Platform. It is a simplified version of the Solution Architecture for Real-Time Stream Processing for IoT that uses Google Cloud Functions instead of Google Cloud Dataflow to process the messages published to a Google Cloud Pub/Sub topic. Once the messages are processed, they are inserted into a table stored in Google Cloud Bigtable. Finally, the sample shows how to create a table in Google BigQuery that uses the Cloud Bigtable table as a federated data source.
The IoT device used in this example is a Raspberry Pi with a Rainbow HAT, which measures temperature and pressure. It uses the Pimoroni Python driver for Rainbow HAT.
- Google Cloud Platform
- Create Pub/Sub topic
- Create the Bigtable instance
- Create the table
- Create the Cloud Function
- Client
- Create federated table in BigQuery
- Next Steps
The data gathered from the IoT devices is going to be ingested and stored in Google Cloud Platform. The following steps explain how to set up all the components involved in this process.
This sample assumes that you already have a Google Cloud Platform project created. Check the documentation if you need help creating the project.
Execute the gcloud init command to initialize or reinitialize gcloud. This starts an interactive workflow that authorizes gcloud and other SDK tools to access Google Cloud Platform using your user account credentials, and sets properties in a gcloud configuration, including the current project and the default Google Compute Engine region and zone.
If you are using Cloud Shell you can skip this step.
gcloud init
Once your project is set up and you are authorized to access Google Cloud Platform, use gcloud to create the Pub/Sub topic by executing the following command, replacing <MY-TOPIC> with a name for the topic and <MY-IOT-PROJECT> with your project ID:
gcloud beta pubsub topics create <MY-TOPIC> --project <MY-IOT-PROJECT>
Follow the steps described in Creating a Cloud Bigtable Instance to create a Bigtable instance. For this sample you can use a DEVELOPMENT instance.
Configure cbt to use your project and instance by modifying the .cbtrc file, replacing [MY-IOT-PROJECT] with your project ID and [my-iot-instance] with your instance ID (used in step 3):
echo project = [MY-IOT-PROJECT] > ~/.cbtrc
echo instance = [my-iot-instance] >> ~/.cbtrc
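For scripted setups, the same two-line file can be generated programmatically. The following is a minimal Python sketch, assuming the "key = value" layout written by the echo commands above; the function names render_cbtrc and write_cbtrc and the example IDs are hypothetical.

```python
from pathlib import Path


def render_cbtrc(project: str, instance: str) -> str:
    """Return .cbtrc contents in the same layout the echo commands produce."""
    return f"project = {project}\ninstance = {instance}\n"


def write_cbtrc(project: str, instance: str, path: Path) -> Path:
    """Write the config to `path` (normally ~/.cbtrc), overwriting any existing file."""
    path.write_text(render_cbtrc(project, instance))
    return path


if __name__ == "__main__":
    # Print the contents instead of touching the real ~/.cbtrc.
    print(render_cbtrc("my-iot-project", "my-iot-instance"), end="")
```

To write the real file, call write_cbtrc with Path.home() / ".cbtrc" as the path.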
Execute the "cbt createtable" command to create a table called data in your new Cloud Bigtable instance.
cbt createtable data
Finally, execute the following command to create a column family named "data" on the "data" table. This column family stores the temperature and pressure readings.
cbt createfamily data data
NOTE: If the cbt command is not available, install it first:
gcloud components update
gcloud components install cbt
The code of the Cloud Function that processes the Pub/Sub messages is in the index.js file, and all its dependencies are defined in package.json.
Clone or download both files to a local directory (this can also be in Cloud Shell), and edit ZONE and INSTANCE on lines 16 and 17 to match the configuration of your Cloud Bigtable instance.
const INSTANCE = "my-iot-instance";
const ZONE = "us-central1-c";
Once your function is ready to be deployed, execute the following command to deploy it on Google Cloud Platform with Google Cloud Pub/Sub as the trigger, replacing <MY-STAGE-BUCKET> with the name of the Google Cloud Storage bucket used for staging (follow the steps described here if you need to create one), and <MY-TOPIC> with the name of the topic created in step 1:
gcloud beta functions deploy processMessage --stage-bucket <MY-STAGE-BUCKET> --trigger-topic <MY-TOPIC>
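index.js is not reproduced in this document, so the following is not the sample's code. It is a minimal Python sketch of the general pattern a Pub/Sub-triggered function follows: the message body arrives base64-encoded in the event's data field, is decoded and parsed, and is then mapped to cells in the "data" column family. The payload field names (temperature, pressure) and the helper names are assumptions. The values are converted to text because the BigQuery table definition later in this document reads both columns with TEXT encoding.

```python
import base64
import json

COLUMN_FAMILY = "data"  # column family created with `cbt createfamily data data`


def decode_message(event: dict) -> dict:
    """Decode the base64-encoded JSON body of a Pub/Sub event."""
    raw = base64.b64decode(event["data"])
    return json.loads(raw.decode("utf-8"))


def to_cells(reading: dict) -> dict:
    """Map a reading to {family: {qualifier: text value}}.

    The qualifier names match the BigQuery column definition; the
    payload field names are assumed to be the same.
    """
    return {
        COLUMN_FAMILY: {
            "temperature": str(reading["temperature"]),
            "pressure": str(reading["pressure"]),
        }
    }


if __name__ == "__main__":
    # Simulate the event a function would receive; the payload shape is assumed.
    body = json.dumps({"temperature": 21.5, "pressure": 1013.2}).encode("utf-8")
    event = {"data": base64.b64encode(body).decode("ascii")}
    print(to_cells(decode_message(event)))
```

The sketch stops before the write to Cloud Bigtable, since the row key layout and client calls depend on index.js.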
A Pub/Sub client running on the Raspberry Pi is responsible for publishing the telemetry obtained from the Rainbow HAT. Since the Rainbow HAT already offers a Python library for reading the sensors, the sample Pub/Sub client is written in Python as well. The client measures the temperature and pressure every half second, displays the temperature on the 14-segment display, and publishes the data to a Pub/Sub topic.
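The read-and-publish cycle described above can be sketched without the hardware or the Pub/Sub library by passing the sensor readers and the publisher in as callables. This is an illustrative sketch, not the contents of iot_client.py: the JSON field names, the function names, and the max_readings parameter are assumptions, and the display update is left out.

```python
import json
import time
from typing import Callable, Optional


def build_payload(temperature: float, pressure: float) -> bytes:
    """Serialize one reading as UTF-8 JSON; the field names are assumed."""
    return json.dumps({"temperature": temperature, "pressure": pressure}).encode("utf-8")


def run_client(
    read_temperature: Callable[[], float],
    read_pressure: Callable[[], float],
    publish: Callable[[bytes], None],
    interval_s: float = 0.5,
    max_readings: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Read, publish, wait; return the number of readings published.

    With max_readings=None the loop runs until interrupted.
    """
    sent = 0
    while max_readings is None or sent < max_readings:
        publish(build_payload(read_temperature(), read_pressure()))
        sent += 1
        sleep(interval_s)
    return sent


if __name__ == "__main__":
    # Stub sensors and a publisher that only prints, for a dry run.
    run_client(lambda: 21.5, lambda: 1013.2, print, max_readings=3)
```

On the device, the two readers would wrap the Rainbow HAT driver and publish would wrap the Pub/Sub client's publish call for the topic created in step 1.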
Follow the steps described in the gcloud Installation Guide to install gcloud on the Raspberry Pi. It is used to authenticate the Raspberry Pi against Google Cloud. NOTE: If you don't want to install gcloud on the device, you can use a service account and an application credential instead.
Execute the following command to authenticate against Google Cloud Platform.
gcloud auth application-default login
The code of the client that publishes the Pub/Sub messages is in the iot_client.py file, and all its dependencies are defined in requirements.txt.
Clone or download both files to your device, and edit the TOPIC-NAME on line 11 to match the name of the Pub/Sub topic created in step 1.
To install the pubsub client library:
pip install -r requirements.txt
Follow the steps described here to install the rainbow-hat library.
To execute the client:
python iot_client.py
You should see the current temperature on the display of the Rainbow HAT, and the JSON payload sent to Pub/Sub in the terminal output every 0.5 seconds.
Also, if you go to the Stackdriver Logging viewer, you should see the logs for the Cloud Function that is processing those messages.
To create the federated table, go to the BigQuery Console and create a dataset in your project.
Create a new table on that dataset with the following options:
- Location: Google Cloud Bigtable with this source
https://googleapis.com/bigtable/projects/<MY-PROJECT>/instances/<MY-CLOUD-BIGTABLE-INSTANCE>/tables/<MY-CLOUD-BIGTABLE-TABLE>
- On the Column Family and Qualifiers section, click on Edit as Text and paste the following JSON.
[
{
"familyId": "data",
"type": "BYTES",
"encoding": "TEXT",
"onlyReadLatest": false,
"columns": [
{
"qualifierString": "temperature",
"type": "FLOAT",
"encoding": "TEXT"
},
{
"qualifierString": "pressure",
"type": "FLOAT",
"encoding": "TEXT"
}
]
}
]
- Make sure to select "Read row key as string" on the Options section.
Click create table.
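Typing the source URI and the column definition by hand is error-prone, so both can be generated from the project, instance, and table names. The following Python sketch reproduces the URI pattern and the JSON shown above; the function names and the example IDs are hypothetical.

```python
import json


def bigtable_source_uri(project: str, instance: str, table: str) -> str:
    """Build the Cloud Bigtable source URI in the pattern shown above."""
    return (
        "https://googleapis.com/bigtable/"
        f"projects/{project}/instances/{instance}/tables/{table}"
    )


def column_families() -> list:
    """Return the column family definition for the "data" family."""

    def float_column(qualifier: str) -> dict:
        # Values are stored as text and read back as FLOAT.
        return {"qualifierString": qualifier, "type": "FLOAT", "encoding": "TEXT"}

    return [
        {
            "familyId": "data",
            "type": "BYTES",
            "encoding": "TEXT",
            "onlyReadLatest": False,
            "columns": [float_column("temperature"), float_column("pressure")],
        }
    ]


if __name__ == "__main__":
    print(bigtable_source_uri("my-iot-project", "my-iot-instance", "data"))
    print(json.dumps(column_families(), indent=2))
```

The printed JSON can be pasted into the Edit as Text box, and the printed URI into the source field.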
Finally, execute this query to get some data from your data table using SQL.
SELECT LEFT(rowkey, 10) AS key,
       data.temperature.cell.timestamp,
       data.temperature.cell.value,
       data.pressure.cell.value
FROM [<MY-PROJECT>:<MY-DATASET>.data]
LIMIT 100
Try to create a dashboard in Data Studio showing a summary of the data that you are gathering from your IoT devices.
Create a web application that uses the Pub/Sub client library and a subscription to your topic to show the measurements in real time.