Reading a Maximo Monitor anomaly without panicking the night shift

How to design IBM Maximo Monitor anomaly rules that the night shift trusts: what an anomaly actually means, how to triage it, and how to keep alerts from becoming wallpaper.

IBM Maximo Application Suite · Maximo Monitor · Operations · Reliability

A Maximo Monitor anomaly is not the same thing as a SCADA alarm. The shift supervisor who treats it as one will either spend their night chasing nothing or, worse, stop reading them. Neither outcome is what the operator paid for. Designing the anomaly rules and the operations workflow around them is, in our experience, the single biggest factor in whether IBM Maximo Monitor earns its licence after go-live.

This is what we have learned from running Monitor on real estates. None of it is theoretical.

An anomaly is a pattern, not an event

A SCADA alarm fires when an instrumented value crosses a threshold. The semantics are clear and the response is well understood: investigate, acknowledge, act, log. The shift team has decades of muscle memory on this.

A Maximo Monitor anomaly is different. It fires when a pattern across one or several signals — a trend, a deviation from baseline, a correlation that should hold and does not — looks unusual against the asset’s recent history. It does not necessarily mean a value has crossed a threshold. It means something has changed that the model thinks is worth a look.
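
To make the distinction concrete, here is a minimal sketch in Python. It is not Monitor's internal model; the three-sigma cut-off and the numbers are illustrative. The point is that a pattern anomaly can fire while every individual value sits comfortably inside its alarm limits.

```python
import statistics

def scada_alarm(value: float, limit: float) -> bool:
    """A SCADA alarm: one instrumented value crosses one threshold."""
    return value > limit

def anomaly_score(window: list[float], history: list[float]) -> float:
    """A pattern anomaly: how far the recent window sits from the asset's
    own recent history, measured in standard deviations."""
    mu = statistics.mean(history)
    sigma = statistics.stdev(history) or 1e-9   # guard against a flat history
    return abs(statistics.mean(window) - mu) / sigma

history = [70.1, 69.8, 70.3, 70.0, 69.9, 70.2, 70.1, 69.7]  # recent baseline, degC
window = [72.4, 72.6, 72.5]                                  # drifting, but well below the limit

print(scada_alarm(max(window), limit=80.0))    # False: no alarm would fire
print(anomaly_score(window, history) > 3.0)    # True: the pattern has changed
```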

This is a different kind of signal. If the operator presents it as if it were a SCADA alarm, the shift team treats it like one. Within a fortnight, they have either accumulated a backlog of “investigate” tickets they cannot close, or they have started ignoring the alerts entirely. Both outcomes are fatal to the programme.

Design the anomaly classes for the response, not the data

The mistake we see most often is configuring anomalies because the data supports them. Vibration deviations, pressure trends, temperature drift: all available, all interesting, all configured. Within a month the operations team is drowning.

The discipline that works is the inverse. Start with the response. What action is the night shift expected to take when an anomaly fires? If the answer is “investigate when convenient”, that is not an anomaly worth firing in real time — it is an entry on the next shift handover. If the answer is “create a work order for the day shift”, that is what the rule should do automatically. If the answer is “stop the asset and call the duty manager”, that is a different category entirely and probably should not be in Monitor at all.

In practice this produces three classes:

  • Operational notification. Surfaces on the dashboard, no immediate action, reviewed at shift handover.
  • Work order trigger. Creates or enriches a work order in Manage automatically, day-shift response.
  • Escalation. Creates an alert with a named owner, response within an agreed window. These are rare by design.

If everything is class three, the model has been calibrated wrong. Tune.
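
A minimal sketch of what response-first design looks like in configuration terms. The rule names and routing sinks are hypothetical; in a real deployment the routes would be Monitor's alert mechanism and the Manage integration rather than print statements.

```python
from enum import Enum

class AnomalyClass(Enum):
    NOTIFICATION = 1   # dashboard only, reviewed at shift handover
    WORK_ORDER = 2     # creates or enriches a Manage work order, day-shift response
    ESCALATION = 3     # named owner, agreed response window; rare by design

# Each rule declares its response class up front; the class decides the route.
ANOMALY_RULES = {
    "seal_pressure_trend":        AnomalyClass.NOTIFICATION,
    "bearing_vibration_drift":    AnomalyClass.WORK_ORDER,
    "motor_winding_temp_pattern": AnomalyClass.ESCALATION,
}

def route(rule_name: str, payload: dict) -> None:
    cls = ANOMALY_RULES[rule_name]
    if cls is AnomalyClass.NOTIFICATION:
        print(f"dashboard <- {rule_name}: {payload}")    # stand-in for the dashboard feed
    elif cls is AnomalyClass.WORK_ORDER:
        print(f"Manage WO <- {rule_name}: {payload}")    # stand-in for the Manage integration
    else:
        print(f"duty owner <- {rule_name}: {payload}")   # stand-in for the escalation hook

route("bearing_vibration_drift", {"asset": "PUMP-014", "score": 4.1})
```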

Anomalies that fire only on specific operating modes

A pump that runs three hours a day at design point and twenty-one hours a day at part-load has two utterly different baselines. An anomaly model that does not know which mode the asset is in will produce false positives every time the operating mode changes.

The fix is to know what the asset is doing when the model evaluates the signal. Operating mode, load, ambient conditions, time of day. This is data engineering, not analytics. Often it requires bringing additional signals in from the historian or from Manage work records — for example, “the asset has been in maintenance mode for the last six hours; suppress this anomaly class during that time”.

Without this, the night shift will quickly learn that anomalies fire predictably at 06:30 every morning when the line ramps up, and they will stop reading them.
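
A minimal sketch of the fix: mode-aware evaluation with maintenance-window suppression. The per-mode baselines and the three-sigma cut-off are illustrative; in practice the operating mode and maintenance state come from the historian and Manage, as described above.

```python
from datetime import datetime, timedelta

# The same pump has two different normals: design point and part-load.
# Baseline figures are illustrative.
BASELINES = {
    "design_point": {"mean": 70.0, "stdev": 0.4},
    "part_load":    {"mean": 58.0, "stdev": 1.1},
}

def evaluate(value: float, mode: str, maintenance_until: datetime | None) -> bool:
    """Score the signal against the baseline for the current operating mode,
    and suppress this anomaly class entirely while the asset is in maintenance."""
    if maintenance_until and datetime.now() < maintenance_until:
        return False
    baseline = BASELINES[mode]
    return abs(value - baseline["mean"]) / baseline["stdev"] > 3.0

print(evaluate(60.5, "design_point", None))   # True: anomalous at design point
print(evaluate(60.5, "part_load", None))      # False: unremarkable at part-load
print(evaluate(60.5, "design_point",          # False: suppressed during maintenance
               datetime.now() + timedelta(hours=6)))
```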

Suppress aggressively in the first 90 days

The first ninety days after Monitor go-live are when the shift team forms their habit. If those ninety days are noisy, they form the habit of ignoring the alerts. If those ninety days are quiet but accurate, they form the habit of reading them.

We bias hard towards suppression in that window. Anomalies are tuned against historical data, then run in a “shadow mode” where they fire to a separate review queue rather than to the dashboard. The reliability function reviews them daily, calibrates the rules, and graduates them to the operational dashboard only once the false-positive rate is acceptable.
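
A minimal sketch of that shadow-mode discipline. The graduation thresholds are illustrative and should come from the operator’s own tolerance for noise.

```python
def dispatch(rule: dict) -> str:
    """Shadow rules fire to the review queue; graduated rules reach the shift team."""
    return "review_queue" if rule["shadow"] else "dashboard"

def graduate_if_ready(rule: dict, min_reviewed: int = 20, max_fp_rate: float = 0.10) -> None:
    """Graduate a shadow rule once enough firings have been reviewed and the
    observed false-positive rate is acceptable."""
    reviewed = rule["true_positives"] + rule["false_positives"]
    if rule["shadow"] and reviewed >= min_reviewed:
        if rule["false_positives"] / reviewed <= max_fp_rate:
            rule["shadow"] = False

rule = {"name": "bearing_vibration_drift", "shadow": True,
        "true_positives": 19, "false_positives": 1}
graduate_if_ready(rule)
print(dispatch(rule))   # "dashboard": 5% false positives across 20 reviews
```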

This is unfashionable. It also works.

The handover into Manage is the value, not the dashboard

A Monitor dashboard that the night shift looks at is useful. A Monitor anomaly that creates a work order in Manage with the asset, the symptom, the recommended check and the relevant operational context already attached is transformational. The day shift planner does not start from “investigate this asset” — they start from “here is the work, here is why”.
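
What that handover can look like in integration terms. This is a sketch against the Manage REST API pattern, not production code: the mxapiwodetail object structure, the field names and the apikey header should all be verified against the operator’s MAS version before anything is built on them.

```python
import requests

def create_work_order(base_url: str, apikey: str, anomaly: dict) -> None:
    """Create a Manage work order that carries the asset, the symptom, the
    recommended check and the operational context, so the planner starts
    from 'here is the work, here is why'."""
    body = {
        "siteid": anomaly["siteid"],
        "assetnum": anomaly["assetnum"],                          # the asset
        "description": f"Monitor anomaly: {anomaly['symptom']}",  # the symptom
        "description_longdescription": (                          # the context
            f"Recommended check: {anomaly['recommended_check']}\n"
            f"Operating mode at detection: {anomaly['mode']}\n"
            f"Anomaly score: {anomaly['score']}"
        ),
    }
    resp = requests.post(
        f"{base_url}/maximo/api/os/mxapiwodetail",
        params={"lean": 1},              # plain JSON field names, no prefixes
        headers={"apikey": apikey},
        json=body,
        timeout=30,
    )
    resp.raise_for_status()
```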

Where Monitor produces a dashboard but no work, it is a parallel system. Where it produces work, it is an extension of the operating model. The integration design is the difference, and it is rarely the part suppliers volunteer up front.

Alert fatigue is the failure mode to fear

Every Monitor implementation we have seen fail has failed for the same reason. The alerts became wallpaper. The shift team stopped reading them. The reliability function stopped reviewing them. The dashboards still existed, the data still flowed, and the platform was technically operational, but it was no longer informing decisions.

Avoiding this is unglamorous work. Tune. Suppress. Calibrate. Review the false-positive rate weekly. Cut anomaly classes that nobody acts on. Promote the ones that produce trusted work. The point of Monitor is not the platform; it is the operating habit it builds in the people running the asset.
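
The weekly review can be as simple as a script over the alert log. A sketch, with illustrative thresholds:

```python
def weekly_review(anomaly_classes: list[dict]) -> None:
    """Cut classes nobody acts on, flag noisy ones for retuning,
    keep the ones producing trusted work."""
    for c in anomaly_classes:
        if c["fired"] == 0:
            continue
        acted_on_rate = c["acted_on"] / c["fired"]
        fp_rate = c["false_positives"] / c["fired"]
        if acted_on_rate == 0:
            print(f"CUT     {c['name']}: fired {c['fired']}x, nobody acted")
        elif fp_rate > 0.25:
            print(f"RETUNE  {c['name']}: {fp_rate:.0%} false positives")
        else:
            print(f"KEEP    {c['name']}: producing trusted work")

weekly_review([
    {"name": "seal_pressure_trend",     "fired": 14, "acted_on": 0, "false_positives": 9},
    {"name": "bearing_vibration_drift", "fired": 6,  "acted_on": 5, "false_positives": 1},
])
```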

Closing position

Monitor anomalies are a different kind of signal from SCADA alarms, and they need a different operating discipline around them. Design the rules from the response, not from the data. Calibrate against operating mode. Suppress aggressively in the early days. Make the handover into Manage the value. Watch the false-positive rate every week.

The night shift will tell you whether you have got it right. They are also the people who decide whether your Monitor programme is a success.

For the broader implementation pattern, see IBM Maximo Monitor: implementation, integration and managed services and the MAS suite overview.