ISO 14224 in Practice: Failure Data That Earns Its Keep

ISO 14224 is the international standard for collecting and exchanging reliability and maintenance data for equipment. It was written for the petroleum, petrochemical and natural gas industries, but its taxonomy, equipment boundaries and failure mode definitions are used well beyond oil and gas, including in power, water, mining and heavy manufacturing. Most large operators say they have adopted it. Far fewer can show, on demand, that a single preventive maintenance (PM) task has been changed, a spare stocked or a refurbishment justified because of what ISO 14224 data revealed. That is the gap this article is about.

The point of the standard is not the binder. It is a shared “reliability language” that lets a planner, a reliability engineer, a regulator and an external benchmark such as OREDA describe the same failure in the same way. Treated as a compliance exercise, ISO 14224 becomes another schema that maintenance teams resent. Treated as the structural backbone of a reliability programme, it changes how work is scoped, how spares are held and how capital is defended.

What ISO 14224 actually specifies

The standard, current as ISO 14224:2016 and confirmed by ISO in 2022, sets out four things that matter at the front line.

An equipment taxonomy with up to nine hierarchical levels. The upper levels classify by industry, plant and section. The lower levels describe the equipment unit, sub-unit, component and maintainable item.
Equipment boundary definitions that say where one item of equipment ends and the next begins. This is the part that planners underestimate. Without an agreed boundary, two sites will record the same failure against different parents and reliability comparisons become meaningless.
A normative set of failure modes (for example external leakage, abnormal instrument reading, spurious operation, structural deficiency) and failure mechanisms.
Maintenance data definitions covering categories (corrective, preventive, modification), priorities, downtime and active repair time, so cost and availability can be derived from the same record.

The standard is deliberately not a configuration guide for a computerised maintenance management system (CMMS). It tells you what to record, with what precision, against what hierarchy. How that lands in IBM Maximo, SAP PM or any other system is an implementation decision. Done well, the data structure inside the CMMS embodies the standard, instead of the standard living in a parallel spreadsheet that no one updates.
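To make "what to record, against what hierarchy" concrete, here is a minimal sketch of a single failure record in Python. The field names, the tag and the example values are assumptions made for illustration; they are not the standard's wording and not the schema of any particular CMMS.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FailureRecord:
    """One failure event, tied to a position in the equipment hierarchy.

    Field names are illustrative, not taken from ISO 14224 or a CMMS.
    """
    equipment_class: str              # e.g. "pump"
    equipment_unit: str               # tag of the unit inside the agreed boundary
    sub_unit: str
    maintainable_item: str
    failure_mode: str                 # e.g. "external leakage"
    failure_mechanism: Optional[str]  # may be unknown when the record is raised
    maintenance_category: str         # "corrective", "preventive" or "modification"
    downtime_hours: float             # total time out of service
    active_repair_hours: float        # hands-on repair time within the downtime

    def __post_init__(self) -> None:
        if self.active_repair_hours > self.downtime_hours:
            raise ValueError("active repair time cannot exceed downtime")


def availability(records, period_hours):
    """Share of the period the unit was available, from recorded downtime."""
    down = sum(r.downtime_hours for r in records)
    return 1.0 - down / period_hours


# Hypothetical record: a seal leak that took the pump out for 36 hours.
record = FailureRecord(
    "pump", "P-101A", "shaft seal system", "mechanical seal",
    "external leakage", None, "corrective", 36.0, 6.0,
)
print(availability([record], period_hours=8760.0))
```

Because downtime and active repair time sit on the same record as the failure mode, availability and repair effort can be derived without joining a second data source, which is the property the maintenance data definitions are meant to secure.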

Why most adoptions stall short of the benefit

Programmes that claim ISO 14224 alignment but never see the operational return usually fail in one of five places.

Taxonomy without boundaries

Teams import a taxonomy from a vendor template, assign equipment classes to existing asset records, and stop. Without explicit equipment boundary statements, every site interprets “the pump” differently. One records the seal flush plan as part of the pump. The next records it as a separate utility. Failure rates calculated across the population are not comparable, so reliability engineering quietly stops using them.
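A small worked example shows how much the boundary matters. The events, item names and operating hours below are invented for illustration. The only point is that identical events produce different failure rates once two sites draw the boundary differently.

```python
# Hypothetical events for identical pumps at two sites over the same period.
# Each entry names the item that failed.
events = ["mechanical seal", "seal flush piping", "bearing", "seal flush piping"]

# Site A treats the seal flush system as part of the pump; site B does not.
site_a_boundary = {"mechanical seal", "bearing", "seal flush piping"}
site_b_boundary = {"mechanical seal", "bearing"}

operating_hours = 40_000.0  # assumed cumulative pump operating hours


def failure_rate(events, boundary, hours):
    """Failures per hour, counting only items inside the boundary."""
    inside = [e for e in events if e in boundary]
    return len(inside) / hours


rate_a = failure_rate(events, site_a_boundary, operating_hours)
rate_b = failure_rate(events, site_b_boundary, operating_hours)
print(f"site A: {rate_a:.2e} per hour, site B: {rate_b:.2e} per hour")
# prints "site A: 1.00e-04 per hour, site B: 5.00e-05 per hour"
```

Same pumps, same events: site A reports twice the failure rate of site B purely because of where the boundary was drawn.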

Failure code lists that nobody can navigate

ISO 14224 offers a comprehensive failure mode set. Dropped into a CMMS unedited, the list contains dozens of options per equipment class. A technician closing a work order at the end of a night shift will pick the first plausible entry, every time. The result is a few codes carrying most of the volume and a long tail of single-use codes that no analysis can use.
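One way to see whether a code list has this problem is to measure how concentrated its usage is. The sketch below is a rough diagnostic, not a method from the standard; the work-order data and the choice of the top three codes are assumptions.

```python
from collections import Counter


def code_concentration(codes, top_n=3):
    """Return (share of volume in the top_n codes, number of single-use codes)."""
    counts = Counter(codes)
    total = sum(counts.values())
    if total == 0:
        return 0.0, 0
    top_volume = sum(n for _, n in counts.most_common(top_n))
    single_use = sum(1 for n in counts.values() if n == 1)
    return top_volume / total, single_use


# Hypothetical failure mode entries from closed work orders on one class.
closed_codes = (
    ["external leakage"] * 12
    + ["other"] * 9
    + ["abnormal instrument reading"] * 5
    + ["spurious operation", "structural deficiency", "vibration", "noise"]
)

share, singles = code_concentration(closed_codes)
print(f"top 3 codes carry {share:.0%} of volume; {singles} codes used once")
```

A high share combined with a long list of single-use codes suggests the dropdown is being navigated by position rather than by meaning.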

Capture at close-out instead of at notification

Failure data captured only when a work order is closed is filtered through whatever the planner remembers and whatever the supervisor approves. The richest information, the symptom and the immediate effect, is best captured at the point of notification by the person who saw the equipment misbehave. Programmes that defer all coding to close-out lose the symptom and inherit the convenient code.

No feedback to the people doing the coding

Technicians and planners code thousands of work orders a year. If they never see the analysis that comes out the other end, they cannot tell whether their coding helped or hindered. Coding quality decays for the same reason any unobserved data quality decays: there is no signal that it matters.

Sample sizes that cannot support a decision

A single facility rarely sees enough failures on a given equipment class to compute defensible failure rates on its own. Without aggregation across a fleet, or reference to an external dataset such as OREDA, the estimates are too statistically thin to withstand challenge. Teams stop quoting them in capital cases, and the standard loses its strongest argument.
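How thin a small sample is can be put in numbers. The sketch below computes an exact two-sided interval for a failure rate, assuming failures arrive at a constant rate (a Poisson process), which is a simplification. It uses only the Python standard library and solves for the bounds by bisection; the function names and the example figures are illustrative.

```python
import math


def poisson_cdf(k, mu):
    """P(X <= k) for a Poisson variable with mean mu."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return total


def _solve_mean(k, target, hi):
    """Bisect for mu in [0, hi] such that poisson_cdf(k, mu) equals target."""
    lo = 0.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        # The CDF falls as mu rises, so a CDF above target means mu is too small.
        if poisson_cdf(k, mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def failure_rate_interval(failures, exposure_hours, confidence=0.90):
    """Exact two-sided interval for a constant failure rate, per hour.

    Intended for the small failure counts discussed here; very large counts
    would need a more numerically robust method.
    """
    alpha = 1.0 - confidence
    ceiling = failures + 10.0 * math.sqrt(failures + 1.0) + 20.0
    upper = _solve_mean(failures, alpha / 2.0, ceiling)
    if failures == 0:
        lower = 0.0
    else:
        lower = _solve_mean(failures - 1, 1.0 - alpha / 2.0, ceiling)
    return lower / exposure_hours, upper / exposure_hours


# Hypothetical single-site case: 3 failures in 120,000 operating hours.
low, high = failure_rate_interval(3, 120_000.0)
print(f"point estimate {3 / 120_000.0:.2e}, 90% interval {low:.2e} to {high:.2e}")
```

With three failures the upper bound comes out several times the lower bound. Pooling exposure hours across a fleet is what narrows that gap enough for the figure to survive a capital review.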

A working approach to ISO 14224

The practical pattern among operators who get value from the standard is consistent and unglamorous.

Scope by criticality. Apply the full standard to the equipment classes that matter most, typically the upper bands of the criticality framework. Lower-criticality classes can use a reduced failure mode list. Trying to code everything to the same depth is the surest way to code nothing well.
Write boundary statements in plain language. For each in-scope equipment class, define what is included and what is not. Pumps include the seal and bearing housing but not the upstream isolation valve. Compressors include the lube oil console but not the inlet filter. Publish this where planners and technicians can find it inside the CMMS, not on a shared drive.
Shorten the failure mode list per class. Take the normative list, keep the modes that have ever been observed or are credibly possible, and remove the rest from the user-facing dropdown. A class-specific list of eight to twelve modes outperforms a generic list of forty every time.
Capture at notification. Make failure mode, symptom and effect mandatory at the notification step, with sensible defaults that the planner can refine. The technician closes the loop with the cause and the action taken. Two-step capture splits the work between the people who know each part.
Close the loop quarterly. Reliability engineering publishes a short note per equipment class: top failure modes, trend versus last quarter, any PM or spares change driven by the data. Coders see their work used. Coding quality stops decaying.
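The shortened list and the capture-at-notification rule can be enforced together. The sketch below assumes a notification arrives as a plain dictionary and checks it against a class-specific list. The field names, the pump list and the messages are hypothetical; a real list would be trimmed from the standard's failure modes for that class.

```python
# Hypothetical short list for one in-scope class.
ALLOWED_MODES = {
    "pump": {
        "external leakage",
        "abnormal instrument reading",
        "spurious operation",
        "structural deficiency",
    },
}

# Fields the person raising the notification must supply.
REQUIRED_AT_NOTIFICATION = ("failure_mode", "symptom", "effect")


def notification_problems(notification):
    """Return a list of reasons the notification cannot be accepted yet."""
    problems = []
    for field in REQUIRED_AT_NOTIFICATION:
        value = notification.get(field) or ""
        if not str(value).strip():
            problems.append(f"missing {field}")

    modes = ALLOWED_MODES.get(notification.get("equipment_class"))
    mode = notification.get("failure_mode")
    if modes is None:
        problems.append("equipment class has no published failure mode list")
    elif mode and mode not in modes:
        problems.append("failure mode not on the class list")
    return problems


accepted = {
    "equipment_class": "pump",
    "failure_mode": "external leakage",
    "symptom": "drip at seal",
    "effect": "pump isolated",
}
rejected = {"equipment_class": "pump", "failure_mode": "noise", "symptom": ""}

print(notification_problems(accepted))
print(notification_problems(rejected))
```

Cause and action taken are deliberately absent from the required fields: in the two-step pattern they belong to the technician at close-out, not to the person raising the notification.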

These steps depend on the same operational ownership described in the wider piece on asset data governance. ISO 14224 without named stewards behaves exactly like every other standard without named stewards.

Where the data starts to pay back

Once the data is trustworthy at the equipment-class level, four conversations get easier.

PM optimisation. Failure mode distributions show which preventive tasks are catching real failures and which are not. PM intervals stop being copied from the manufacturer manual and start being defended from local evidence.
Bad-actor analysis. A short list of tags accounts for a disproportionate share of unplanned outage hours at most plants. ISO 14224-coded data makes that list visible and ranks it by cost and risk rather than by complaint volume.
Spares and refurbishment cases. Mean time between failures (MTBF) by failure mode is what underwrites a change to min-max stock levels or a refurbishment programme. Without it, spares budgets are negotiated from anecdote.
External benchmarking. Operators in oil and gas can compare against OREDA. Operators in other sectors can at least compare across their own fleet on a like-for-like basis, which is impossible without a common boundary and code set.
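Two of the calculations behind these conversations are simple once the records are consistent. The sketch below computes mean time between failures per failure mode and ranks tags by unplanned downtime. The records, tags and operating hours are invented, and ranking by downtime hours alone is a simplification of the cost and risk ranking described above.

```python
from collections import defaultdict


def mtbf_by_mode(failures, operating_hours):
    """Operating hours divided by failure count, per failure mode."""
    counts = defaultdict(int)
    for f in failures:
        counts[f["failure_mode"]] += 1
    return {mode: operating_hours / n for mode, n in counts.items()}


def bad_actors(failures, top_n=3):
    """Tags ranked by total unplanned downtime hours, highest first."""
    hours = defaultdict(float)
    for f in failures:
        hours[f["tag"]] += f["downtime_hours"]
    ranked = sorted(hours.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


# Hypothetical corrective records for one pump class.
failures = [
    {"tag": "P-101A", "failure_mode": "external leakage", "downtime_hours": 30.0},
    {"tag": "P-101A", "failure_mode": "external leakage", "downtime_hours": 18.0},
    {"tag": "P-204B", "failure_mode": "spurious operation", "downtime_hours": 4.0},
    {"tag": "P-307", "failure_mode": "external leakage", "downtime_hours": 12.0},
]
class_operating_hours = 60_000.0  # assumed cumulative hours for the class

print(mtbf_by_mode(failures, class_operating_hours))
print(bad_actors(failures))
```

Both results are only as good as the boundary and code set behind them: if "external leakage" means different things at different sites, the MTBF figure averages over that disagreement.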

These are also the inputs that platforms such as Maximo Health and Maximo Predict need to behave reliably. A predictive model trained on inconsistent failure codes will produce inconsistent predictions. The standard is, in effect, the data contract that AI-driven reliability tooling depends on.

What good looks like a year in

A programme that has taken ISO 14224 seriously for a year tends to look the same across sectors. In-scope equipment classes have published boundary statements and short failure mode lists. Notifications carry symptom and effect before they reach the planner. Reliability engineering publishes quarterly notes that name the bad actors and the changes made in response. Coding compliance is measured. External benchmarks appear in capital papers without being challenged on definitions.
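Measuring coding compliance does not need much machinery. As a rough sketch, the share of closed work orders that carry a specific failure mode from the class list can serve as the headline number. The field names, the catch-all entries and the sample data are assumptions.

```python
CATCH_ALL = {"other", "unknown", ""}  # assumed catch-all entries


def coding_compliance(work_orders, allowed_modes):
    """Fraction of closed work orders with a specific, listed failure mode.

    Returns None when there are no closed work orders to measure.
    """
    closed = [w for w in work_orders if w["status"] == "closed"]
    if not closed:
        return None
    good = [
        w for w in closed
        if (w.get("failure_mode") or "") not in CATCH_ALL
        and w.get("failure_mode") in allowed_modes
    ]
    return len(good) / len(closed)


# Hypothetical class list and work orders.
pump_modes = {"external leakage", "spurious operation", "structural deficiency"}
work_orders = [
    {"status": "closed", "failure_mode": "external leakage"},
    {"status": "closed", "failure_mode": "spurious operation"},
    {"status": "closed", "failure_mode": "other"},
    {"status": "closed", "failure_mode": "worn out"},  # free text, not on the list
    {"status": "open", "failure_mode": None},
]

print(coding_compliance(work_orders, pump_modes))  # 0.5
```

Tracked per equipment class and published in the quarterly note, a figure like this gives coders the feedback signal whose absence lets coding quality decay.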

That is the test. If the failure data is changing maintenance decisions, the programme is working. If it is not, the binder is being audited and the plant is still being run on memory.