Grafana plus InfluxDB on $5.50/mo, against the vendor 'machine health platform'

Where issue 04 left off

Issue 04 put a HiveMQ Community Edition broker on a $5.50/mo Hetzner VM, with the Variscite i.MX 8M Plus carrier from issue 03 publishing five spindle variables as Sparkplug B frames once per 100 ms. The broker is up. The carrier is publishing. The variables are on the wire.

A broker by itself is not a dashboard. What closes the loop is a subscriber that consumes those frames and turns them into something an operator can read on a screen, plus an alert that pages a reliability engineer when the anomaly line trips. This issue puts that subscriber layer on the same VM. It uses three pieces of open-source software: Telegraf to consume the Sparkplug frames, InfluxDB OSS 2.x to store them as a time series, and Grafana OSS to render dashboards and fire alerts.

Same VM, same broker, same five variables, same 72-hour test window as last week.

The stack

Telegraf, an MIT-licensed agent maintained by InfluxData, runs as the Sparkplug B subscriber. The MQTT Consumer input plugin subscribes to the broker on spBv1.0/#. Telegraf's Sparkplug B parser, shipped in agent versions 1.27 and later, decodes the protobuf payload into native fields. Telegraf then writes those fields to InfluxDB via the InfluxDB v2 output plugin.
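The spBv1.0/# subscription matches every topic in the Sparkplug B namespace, which has the shape spBv1.0/group_id/message_type/edge_node_id with an optional trailing device_id. A minimal Python sketch of splitting such a topic into its parts follows. It is an illustration only, not what Telegraf does internally, and the group, node, and device names in the usage note are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

SPARKPLUG_NAMESPACE = "spBv1.0"


@dataclass(frozen=True)
class SparkplugTopic:
    group_id: str
    message_type: str          # e.g. NBIRTH, NDATA, DBIRTH, DDATA
    edge_node_id: str
    device_id: Optional[str]   # absent on node-level messages


def parse_topic(topic: str) -> Optional[SparkplugTopic]:
    """Split a Sparkplug B topic into its parts.

    Returns None for anything that is not a four- or five-level topic
    under the spBv1.0 namespace (host STATE topics are out of scope here).
    """
    parts = topic.split("/")
    if len(parts) not in (4, 5) or parts[0] != SPARKPLUG_NAMESPACE:
        return None
    device_id = parts[4] if len(parts) == 5 else None
    return SparkplugTopic(parts[1], parts[2], parts[3], device_id)
```

Called on a made-up topic such as spBv1.0/plant1/DDATA/carrier01/spindle1, the function returns a record whose message_type is DDATA and whose device_id is spindle1. Those parts are the natural source for tags like machine and spindle downstream.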

InfluxDB OSS 2.7, MIT-licensed, runs on the same VM as the broker. The single-bucket configuration stores all five spindle variables in one measurement, tagged by site, line, machine, and spindle. Retention is set to 90 days, with Task-based downsampling that rolls one-second raw data into one-minute aggregates after 14 days. The 90-day window matches the vendor SaaS, though the vendor retains full resolution for the whole period.
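The downsampling policy can be pictured with a short Python sketch. It is not the InfluxDB task itself (a task would be written in Flux), and it assumes the aggregate is a per-minute mean, which the setup above does not specify.

```python
from collections import defaultdict
from statistics import fmean

RAW_KEEP_S = 14 * 24 * 3600   # raw points are kept for 14 days
BUCKET_S = 60                 # older points collapse into one-minute buckets


def downsample(points, now_s):
    """Split points into recent raw data and one-minute aggregates.

    points: iterable of (unix_seconds, value) pairs.
    Returns (raw, aggregates). Points no older than 14 days stay raw;
    older points are averaged per bucket, keyed by the bucket start time.
    The mean is an assumption made for this sketch.
    """
    raw = []
    buckets = defaultdict(list)
    for ts, value in points:
        if now_s - ts <= RAW_KEEP_S:
            raw.append((ts, value))
        else:
            buckets[ts - ts % BUCKET_S].append(value)
    aggregates = sorted((start, fmean(vals)) for start, vals in buckets.items())
    return raw, aggregates
```

At one point per second, each one-minute bucket replaces up to 60 raw points with a single value. That is why the 90-day bucket stays small on an 80 GB disk.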

Grafana OSS 10.4, AGPL-licensed, runs on the same VM. Two dashboards: one operator-facing (current spindle state, last-24-hour vibration trend, current anomaly score, last-7-day alarm count) and one engineer-facing (full envelope-spectrum bands, drive-current correlation, model-confidence overlay, anomaly-vs-current scatter). Grafana Alerting is configured with three rules: anomaly score above threshold for more than 60 seconds, RMS velocity above ISO 10816 Class II machine alarm threshold (7.1 mm/s for the spindle in question), and drive-current excursion above 110% of nameplate for more than 30 seconds. Alert routing is to a Pushover endpoint (~$5 one-time fee per device) plus an SMTP relay.
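Two of the three rules share a pattern: a value has to stay above a limit for longer than a hold time before anything fires. A minimal Python sketch of that check follows. It is not Grafana's implementation, since Grafana evaluates rules on its own schedule with a pending period, and the function name is made up for illustration.

```python
def breached_for(samples, threshold, hold_s):
    """Return True if the latest excursion above `threshold` has lasted
    more than `hold_s` seconds without interruption.

    samples: list of (unix_seconds, value) pairs, oldest first.
    """
    breach_start = None
    for ts, value in samples:
        if value > threshold:
            if breach_start is None:
                breach_start = ts      # excursion begins here
        else:
            breach_start = None        # any dip resets the timer
    if breach_start is None:
        return False
    return samples[-1][0] - breach_start > hold_s
```

The drive-current rule would call this with a threshold of 1.10 times the nameplate current and a hold of 30 seconds. The anomaly rule would use the anomaly threshold and a hold of 60 seconds. A single sample below the limit resets the timer, which is what keeps a brief spike from paging anyone.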

The whole install runs as four Docker containers behind Caddy as a reverse proxy with TLS from Let's Encrypt. The compose file is under 80 lines. Bring-up time, measured on the second attempt after a configuration error the first time, was 47 minutes.

What the dashboard renders

The operator dashboard, refreshed every five seconds, shows: a single anomaly-state badge (nominal / watch / alarm) driven by the carrier's bit line and the model's confidence; a 24-hour stacked bar of operating time vs idle vs alarm minutes; the live RMS velocity trace against the ISO 10816 thresholds; and the live drive-current value against nameplate. Four panels, no chrome.
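The RMS velocity panel compares each reading against fixed zone boundaries. The Python sketch below classifies a reading into ISO 10816 zones A through D for a Class II machine. The 7.1 mm/s alarm boundary is the figure used in this issue. The 1.12 and 2.8 mm/s boundaries are the commonly cited Class II values and are an assumption here, so check them against the standard before relying on them.

```python
from bisect import bisect_left

# Upper bounds of zones A, B, and C in mm/s RMS for a Class II machine.
# 7.1 is the alarm threshold from this issue; 1.12 and 2.8 are assumed.
CLASS_II_BOUNDS = (1.12, 2.8, 7.1)
ZONES = ("A", "B", "C", "D")


def zone(rms_mm_s: float) -> str:
    """Return the zone letter for an RMS velocity reading.

    A reading exactly on a boundary stays in the lower zone, so the alarm
    zone D starts strictly above 7.1 mm/s.
    """
    return ZONES[bisect_left(CLASS_II_BOUNDS, rms_mm_s)]
```

In Grafana the same boundaries would be entered as panel thresholds. The function only makes the mapping explicit.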

The engineer dashboard, refreshed every second, exposes: the five Sparkplug variables as separate time series with selectable rolling windows; the envelope-spectrum band-power ratios plotted as a stacked area showing how the energy migrates across bands as the spindle bearing degrades; the model anomaly score with the model-confidence band overlaid; and an anomaly-vs-current scatter with the last 30 days of points, color-coded by hour-of-day. Ten panels, more chrome.
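A stacked-area panel of band-power ratios only reads cleanly if the ratios at each timestamp sum to one. A minimal Python sketch of that normalization follows. It assumes each ratio is a band's share of total envelope power, which may not be how the carrier defines it, and the band names in the usage note are hypothetical.

```python
def band_ratios(band_powers):
    """Normalize per-band power so the ratios sum to 1.0.

    band_powers: dict mapping band name to non-negative power.
    Returns a dict with the same keys. If total power is zero, every
    ratio is 0.0, which avoids a division by zero.
    """
    total = sum(band_powers.values())
    if total <= 0:
        return {name: 0.0 for name in band_powers}
    return {name: power / total for name, power in band_powers.items()}
```

With inputs of 1.0, 1.0, and 2.0 for bands named low, mid, and high, the result is 0.25, 0.25, and 0.5. A rising share in one band at constant total power then shows up as that band's layer widening in the stack.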

The alert rules fire on the operator dashboard's anomaly badge and on the RMS-velocity panel. Both panels are wired to the same Grafana Alerting back end, so the rule that turns a panel red in Grafana also routes the notification to the on-call engineer. There is one alerting path to maintain, not two.

What this costs

Item	Cost
Hetzner CX22 VM (4 vCPU, 8 GB RAM, 80 GB SSD, 20 TB egress)	$5.50/mo
InfluxDB OSS license	$0 (MIT)
Grafana OSS license	$0 (AGPL-3.0)
Telegraf license	$0 (MIT)
HiveMQ CE license	$0 (open source)
Caddy license	$0 (Apache-2.0)
Pushover, lifetime device license	$5.00 one-time per device
Let's Encrypt TLS certificate	$0
One day of engineering setup time	$1,000 (loaded labor estimate, not cash)
First-year total cash outlay	$71.00
Year-two total cash outlay	$66.00
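The two totals follow directly from the rows above them. A short Python check, assuming a single Pushover device and counting only cash items (the labor line is excluded, as in the table):

```python
VM_PER_MONTH = 5.50        # Hetzner VM, from the table
PUSHOVER_ONE_TIME = 5.00   # per device, paid once


def cash_outlay(year: int, devices: int = 1) -> float:
    """Cash spent in a given year; the Pushover fee lands in year one only."""
    recurring = 12 * VM_PER_MONTH
    one_time = PUSHOVER_ONE_TIME * devices if year == 1 else 0.0
    return recurring + one_time
```

Year one is 12 × 5.50 + 5.00 = 71.00, and every later year is 66.00.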

The model itself, the carrier, the OPC UA stack between the PLC and the carrier, and the Sparkplug B publisher were all built in issues 02 through 04. The observability layer this issue adds is the cheapest of the five infrastructure layers.

The vendor comparison

Three SaaS vendors in the machine health monitoring segment have published list pricing or responded to public quote requests in the past 18 months. The composite, reconstructed from publicly available pricing pages and from request-a-quote literature on vendor sites and on G2, looks like the table below. Vendor names are omitted because the published numbers are subject to negotiated rates and the comparison is against the segment as a whole, not any single vendor.

Vendor SaaS feature	Open stack equivalent
Vibration sensor included	Bring your own ($300 for the IEPE accelerometer and signal conditioner)
4G/LTE edge gateway included	Bring your own (the i.MX 8M Plus carrier, $2,847 once)
ISO 10816 thresholds preset	Configured manually in Grafana (15 minutes)
Anomaly detection model included	Built on the carrier (issue 03)
Cloud dashboard	Grafana on the Hetzner VM
90-day data retention	Same, configured in InfluxDB
SMS / email alerts	Pushover + SMTP via Grafana Alerting
Mobile app	Grafana mobile-web (no native app)
Vendor-managed updates	Self-managed (Watchtower or manual `docker compose pull`)
Vendor support SLA	None (community Slack, GitHub issues)
Per-asset price (open stack)	$5.50/mo VM, divided across N assets
Per-asset price (vendor, typical)	$1,000 to $1,500/year/asset

At 10 assets the open stack runs roughly $0.55 per asset per month in hosting. The vendor stack runs roughly $100 per asset per month. The break-even point, where the loaded labor cost of one engineer-day of open-stack setup equals one year of vendor SaaS for one asset, is approximately the first asset.
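The per-asset figures and the break-even claim can be reproduced with a few lines of Python. The sketch uses only the numbers quoted above and leaves out the sensor, the carrier, and the Pushover fee, so it compares hosting plus setup labor against the low end of the vendor subscription band.

```python
VM_PER_MONTH = 5.50
SETUP_LABOR = 1_000.0                       # one loaded engineer-day
VENDOR_PER_ASSET_YEAR = (1_000.0, 1_500.0)  # composite list-price band


def open_stack_per_asset_month(assets: int) -> float:
    """Hosting cost per asset per month when one VM serves all assets."""
    return VM_PER_MONTH / assets


def vendor_per_asset_month() -> tuple:
    """Low and high end of the vendor band, expressed per month."""
    low, high = VENDOR_PER_ASSET_YEAR
    return (low / 12, high / 12)


def first_year_gap(assets: int) -> float:
    """Low-end vendor subscription minus open-stack hosting and labor.

    A positive result means the open stack is cheaper in year one.
    """
    open_stack = SETUP_LABOR + 12 * VM_PER_MONTH
    return VENDOR_PER_ASSET_YEAR[0] * assets - open_stack
```

At one asset the gap is -66.00, so the two options are within one year of hosting cost of each other. At two assets the gap is already 934.00 in the open stack's favor, before hardware.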

What the vendor quote actually buys

The savings above are real for a plant willing to own its tooling. The vendor quote buys three things the open stack does not.

The first is the sensor. Most vendor packages bundle a magnetic-mount triaxial accelerometer with a 4G/LTE gateway in a single sealed housing, shipped pre-paired and pre-calibrated. The sensor is the part of the open stack a maintenance shop is least likely to want to source, mount, calibrate, and replace itself. A $300 IEPE accelerometer plus its signal-conditioner card plus a 30-minute mounting procedure is not the same product experience as opening a box and sticking a sensor on the machine. For one-asset deployments the vendor's sensor-plus-gateway integration is a real value-add. For 100-asset deployments it is a recurring tax.

The second is the SLA. When the open stack's broker dies at 2 AM, the answer is "log in and restart it." When the vendor SaaS dies at 2 AM the answer is "page the vendor's support desk." For some plants the SLA is a hard requirement, especially on safety-graded assets and on regulated processes. For most automation-engineering applications the SLA is not the differentiator the vendor's sales material implies.

The third is the analytics layer. Vendor SaaS products in this segment ship with pre-built failure-mode classifiers (bearing wear, misalignment, imbalance, lubrication starvation), failure-mode-to-work-order automation, and integration with CMMS systems like Fiix, MaintainX, and eMaint. The open stack supplies the data and the dashboards. The classifiers and the CMMS-automation layer are the next decision after this one, and they are not in the $5.50/mo budget.

The decision rule

Run the open stack when more than five assets are in scope, when the engineering team has the bandwidth to own its observability layer, and when integration with a non-vendor CMMS is required anyway. Run the vendor SaaS when the asset count is small, when the maintenance shop is uncomfortable sourcing the sensor, or when the SLA is a hard contractual requirement. The break-even turns on asset count and engineering-team posture, not unit economics.

The mistake to avoid either way is treating the dashboard as the deliverable. The dashboard is the visible artifact. The deliverables are reduced unplanned downtime, faster mean-time-to-detect, and a CMMS that gets the right work order on the right asset at the right time. Both stacks can get there. Neither stack gets there because of the dashboard alone.

Next issue

Issue 06 closes the loop. The Grafana alert that fires when the spindle vibration crosses the ISO 10816 alarm threshold should write a work order into a CMMS, not just notify the on-call engineer. The next issue wires Grafana Alerting to a Fiix CMMS via webhook, with the alert payload populating asset, fault code, priority, and recommended-action fields. The comparison: the same workflow inside a vendor SaaS that ships the CMMS in the same purchase. Ships next Monday.

Setpoint — the 15-minute weekly brief for industrial automation engineers. setpoint.news · Independent · Weekly.

Methodology

Sources used. InfluxData official pages for InfluxDB OSS, Telegraf, and the Sparkplug B parser plugin; Grafana Labs official documentation and licensing pages; HiveMQ Community Edition product page; Hetzner Cloud public pricing; Pushover pricing page; Let's Encrypt project page; Caddy project page; ISO 10816 standard description; G2 predictive-maintenance category pricing band; vendor public pricing pages and quote-request literature reviewed in aggregate.

When verified. May 2026; observability stack was installed and exercised on the same Hetzner VM that ran the broker described in issue 04, with the i.MX 8M Plus carrier from issue 03 publishing the variables. The 72-hour test window overlapped the issue-04 test.

Methodology notes. The vendor per-asset price band is a composite of publicly available pricing tiers and is not attributed to any single vendor; individual deals vary materially. The labor estimate is a loaded-cost estimate from the editor's own engineering rate and is not an audit figure.

Editorial process. Bench built, measured, and dashboarded in the editor's own shop. Single-author draft, second-pass review for citation density and unverifiable specifics.

Disclosures. None. Setpoint accepts no advertising and no affiliate revenue.