If you could reduce Mean Time To Repair (MTTR) to a fraction of what it is now, would you? Of course you would. Anyone would slash MTTR if they could, and plenty of companies promise just that. Before we look at MTTR reduction, let’s first put a stake in the ground around the process steps that consume the time behind the MTTR measure. These are generally considered to be: Detect, Isolate, Analyze and Remediate. We will discuss Detect here, in Part 1 of 4 related posts.
Traditionally, detection means trawling through log-files after something has gone wrong, in the hope of isolating one or more codes and then researching what they mean. All of this takes time, and it happens after the fact, when the MTTR clock is already ticking.
The ConsoleWorks Platform takes a different approach, one that can greatly reduce MTTR across a wide variety of issues. The capability is called Intelligent Event Modules (IEMs). The base technology under IEMs is ConsoleWorks’ ability to define patterns that represent important events a hardware or software asset can generate. These events are embedded by the manufacturer in all hardware and software in an IT/OT infrastructure. If we set up these codes in ConsoleWorks IEMs, we can identify these events.
The IEMs encode the patterns of the manufacturer’s predefined events, along with each event’s severity rating and a human-readable description. ConsoleWorks scans log-file changes in real time (log-files are tailed so that new information streams directly to ConsoleWorks) to detect any events that have occurred.
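To make the idea concrete, here is a minimal Python sketch of pattern-based detection on a tailed log-file. It is not the ConsoleWorks implementation or its IEM format: the `EventPattern` structure, the two example patterns, and the `detect` and `follow` helpers are all hypothetical names made up for illustration. It assumes each event can be recognized by a regular expression applied to a single log line.

```python
import os
import re
import time
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass(frozen=True)
class EventPattern:
    """One event definition: a pattern plus severity and description."""
    name: str
    regex: "re.Pattern[str]"
    severity: str
    description: str


# Hypothetical example patterns; a real module would carry the
# manufacturer's own event codes and messages.
PATTERNS: List[EventPattern] = [
    EventPattern(
        name="DISK_FAIL",
        regex=re.compile(r"I/O error, dev (\w+)"),
        severity="critical",
        description="A block device reported a read/write failure.",
    ),
    EventPattern(
        name="LINK_DOWN",
        regex=re.compile(r"link is down", re.IGNORECASE),
        severity="warning",
        description="A network interface lost its link.",
    ),
]


@dataclass(frozen=True)
class DetectedEvent:
    pattern: EventPattern
    line: str


def detect(lines: Iterable[str],
           patterns: List[EventPattern] = PATTERNS) -> Iterator[DetectedEvent]:
    """Yield an event for each line that matches a known pattern."""
    for line in lines:
        for pattern in patterns:
            if pattern.regex.search(line):
                yield DetectedEvent(pattern, line.rstrip("\n"))
                break  # report the first matching pattern only


def follow(path: str, poll_seconds: float = 0.2) -> Iterator[str]:
    """Tail a file: start at the end and yield lines as they are appended."""
    with open(path, "r", errors="replace") as handle:
        handle.seek(0, os.SEEK_END)
        while True:
            line = handle.readline()
            if line:
                yield line
            else:
                time.sleep(poll_seconds)  # nothing new yet; wait and retry
```

Under these assumptions, `for event in detect(follow("/var/log/syslog")): ...` would surface each matching line, with its severity and description, shortly after it is written. The path is only an example, and a production tailer would also need to handle log rotation and partially written lines.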
This automated process greatly reduces MTTR by detecting events (Detect) proactively, in milliseconds, rather than waiting for a failure to occur and then trawling logs to find out what might have happened.
So at this point our MTTR impact is as follows: Detect ≈ 0 (milliseconds), Isolate = ?, Analyze = ? and Remediate = ?