Smart home analytics with Hubitat

I recently moved to a new house and (re-)started adding “smarts” to it, such as smart bulbs, smart switches, and various sensors. Trying to avoid YAML as much as I can, I opted to skip the Home Assistant route and instead continued with Hubitat Elevation, which is an offline-first home automation hub. (Hubs are often used to connect smart devices and set up automation flows based on triggers like button presses or motion detection.)

In my new house, I have more sensors, more opportunities for automation, and more need for information. For example, I would like to know how often (and for how long) the sump pump runs. I’d also like to figure out how to get the basement temperature closer to that of the first and second floors. The first step, then, is to gather data and get it into a format that I can make use of. That’s what this blog post is about.

The challenge

Hubitat is very good at connecting to various disparate ecosystems of devices. Almost any ZigBee or Z-Wave device is supported. In addition, custom apps and custom device drivers can be written to support other devices, including ones on a local (or remote) network.

Unfortunately, there is virtually no support for graphing/charting or analytics of any kind: you simply have a history of events for each connected device. Some community attempts have been made to remedy this limitation, but I wasn’t overly impressed with them. The interface just isn’t geared toward such things.

The challenge is getting all of the raw data out of the Hubitat and into a system that can process this data and output usable information. I happened to stumble upon a custom Hubitat app for exporting data to InfluxDB, and this is how my journey into smart home analytics began.

The tech stack

I’m a big fan of containerization, so Docker Engine is pretty much a must for me these days. As mentioned above, InfluxDB is the time series database being used, mostly because I found a Hubitat app that can forward events to it (so I didn’t have to write any custom integration code myself). For displaying the data, I chose Grafana because it’s a pretty popular solution for time series visualizations.

Naturally, the Hubitat Elevation hub is being used as the integration point for all of the smart home devices. I’m currently using the C-5 model, which is a bit outdated, but so far has been working just fine for my needs. Once support for Thread/Matter comes out for the new C-8 model, I might upgrade to it.

Configuration

InfluxDB and Grafana are both running in Docker containers, using a basic Docker Compose file to expose their ports for access (Compose’s default network lets the two containers talk to each other). The InfluxDB Logger app for Hubitat is then configured to send data from various sensors to the InfluxDB instance.

The Docker Compose file looks something like this:

version: '3'
services:
  influxdb:
    restart: always
    image: influxdb:2.7-alpine
    volumes:
      - influxdb-data:/var/lib/influxdb2
    ports:
      - "8086:8086"
    environment:
      - 'DOCKER_INFLUXDB_INIT_MODE=setup'
      - 'DOCKER_INFLUXDB_INIT_USERNAME=admin'
      - 'DOCKER_INFLUXDB_INIT_PASSWORD=my-admin-password'
      - 'DOCKER_INFLUXDB_INIT_ORG=org'
      - 'DOCKER_INFLUXDB_INIT_BUCKET=bucket'
      - 'DOCKER_INFLUXDB_INIT_ADMIN_TOKEN=my-admin-token'

  grafana:
    restart: always
    image: grafana/grafana-oss:9.4.7
    volumes:
      - grafana-data:/var/lib/grafana
    ports:
      - "80:3000"

volumes:
  influxdb-data:
  grafana-data:
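
With the containers up and the InfluxDB Logger pointed at them, a quick sanity check in the InfluxDB Data Explorer is to confirm that events are actually arriving. This listing uses the bucket name from the Compose file above:

```flux
import "influxdata/influxdb/schema"

// Lists the measurements the Hubitat logger has written so far
// (e.g. "temperature", "power"); an empty result means no events
// have made it into the bucket yet.
schema.measurements(bucket: "bucket")
```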

The next step is the fun part: getting visualizations working. As I’m using InfluxDB 2.x with its native Grafana integration, Grafana wants me to use the Flux query language. I mostly got things working by trial and error. I can pretty much guarantee that there are better ways to structure the queries, but I’m not familiar enough with InfluxDB’s internal data structures or the Flux language to do a better job of it at the moment.

Basic visualizations

A simple time series chart of temperatures is a great way to start. Here is the Flux query:

from(bucket: "bucket")
  // Limit results to the dashboard's selected time range.
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  // Keep only the temperature readings sent by the Hubitat logger.
  |> filter(fn: (r) =>
    r._measurement == "temperature" and
    r._field == "value"
  )

The above query yields something like this:

[Image: Grafana time series graph with temperature values and a less than pleasant legend]

To organize and rename the values to something more user-friendly, Grafana has a Transform tab for adding transformations to the retrieved data. In this case, I’m using a Join transform and an Organize transform:

[Image: Grafana Transform tab and two transformations in it]

With these transforms in place, the resulting graph looks pretty good!

[Image: New and improved Grafana graph, with a better legend]
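
As an alternative to the UI transforms, the renaming can be pushed into the query itself. This assumes the InfluxDB Logger tags each point with a deviceName tag holding the device’s human-readable name (true in my setup, but worth verifying in the Data Explorer):

```flux
from(bucket: "bucket")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) =>
    r._measurement == "temperature" and
    r._field == "value"
  )
  // Re-group so each output series is keyed only by the device's
  // name, which Grafana then picks up for the legend labels.
  |> keep(columns: ["_time", "_value", "deviceName"])
  |> group(columns: ["deviceName"])
```

The trade-off is that the query is now tied to the logger’s tagging scheme, whereas the Transform tab keeps the renaming visible (and editable) in the dashboard itself.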

Sump pump usage visualizations

My sump pump is connected to a smart switch with power monitoring. The power usage gets piped into InfluxDB. A raw graph of this isn’t entirely unhelpful:

[Image: Time series graph with sump pump wattage]

However, it would be nice to get this into a more useful format, such as how many times the sump pump actually turned on in a given period. Here is the Flux query I came up with for that:

from(bucket: "bucket")
  // Limit results to the dashboard's selected time range.
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  // Keep only the power readings from the smart switch.
  |> filter(fn: (r) =>
    r._measurement == "power" and
    r._field == "value"
  )
  // Count consecutive readings where the pump is drawing power;
  // the counter resets each time the pump turns off.
  |> stateCount(fn: (r) => (r._value > 0))
  // stateCount == 1 marks the first reading of each run, leaving
  // exactly one record per pump activation.
  |> filter(fn: (r) => r.stateCount == 1)
  |> map(fn: (r) => ({r with _value: r.stateCount}))
  |> keep(columns: ["_time", "_value"])
  // Sum the activations into hourly buckets for a bar chart.
  |> aggregateWindow(every: 1h, fn: sum, timeSrc: "_start")

This is how the graph looks for the past two days:

[Image: Bar graph with count of sump pump activations per hour]

Much better! Once again, there is probably a smarter way to do these queries; I’m simply not familiar enough with Flux to optimize them.
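
The “for how long” half of the question from the introduction can be approached with stateDuration(), the companion to stateCount(): it accumulates elapsed seconds while a predicate stays true and resets when it turns false. A sketch, with the same caveats as above (there may well be a cleaner way, and a run that straddles an hour boundary gets split between buckets):

```flux
from(bucket: "bucket")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) =>
    r._measurement == "power" and
    r._field == "value"
  )
  // Seconds elapsed since the pump turned on; -1 while it is off.
  |> stateDuration(fn: (r) => r._value > 0)
  |> filter(fn: (r) => r.stateDuration >= 0)
  |> map(fn: (r) => ({r with _value: r.stateDuration}))
  |> keep(columns: ["_time", "_value"])
  // The largest accumulated value in each hourly bucket is
  // (roughly) the longest single run during that hour.
  |> aggregateWindow(every: 1h, fn: max, timeSrc: "_start")
```

Taking max per window gives the longest run per hour; summing total run time per hour would take a bit more work, such as keeping only the final reading of each run before aggregating.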

Next steps

Now that I have information being graphed, I need to keep monitoring for a while and determine what patterns emerge. Based on those patterns, I should be able to make changes, when necessary, to improve the conditions in my house.