You can’t Incident Command an email thread

Copied on February 28, 2024 from https://medium.com/@matt_97344/you-cant-incident-command-an-email-thread-9b46ba35f298 with permission. Authored by Matt Linton

For many years I’ve been advocating that the Incident Response community adopt the formal incident management methods outlined in NIMS “Incident Command System” (ICS) and I’m thrilled to say that in recent years that adoption has been spreading and providing real benefits to IR.

However, along with the success of any framework come the inevitable “Bad practice” antipatterns. One of these is the general misunderstanding of what an “Incident Commander” is — which spreads mostly through scope creep of a successful framework (ICS) into situations it’s less well-suited for (non-urgent issues and task management).

If you want to stop reading here, the TLDR is: You cannot Incident Command an email thread. Either your issue is urgent and requires ICS, or it’s not and needs some other process.

First, a few fundamentals:

For something to be an Incident, a few conditions are necessary: the event must be unexpected, the need to resolve it must be urgent, and the solution must not be readily apparent to the people who have observed the issue.
The purpose of an Incident Management framework is to resolve such things as quickly and correctly as possible.
ICS (where ‘Incident Commander’ comes from¹) is meant to provide a framework to establish excellence in the “Three C’s” of IR: Command, Control, and Communications — with that third C usually being the hardest one to get right.

When you’re using ICS to manage an urgent issue, your needs for all three of the C’s are likely to be fast-moving and dynamic, and to require a lot of interaction. The closer to “realtime” those interactions are, the better! I find in-person or video coordination rooms to be the most effective, with real-time chats (IRC, Slack, Google Chat, etc.) a less effective but still functional substitute. High-latency or high-loss mediums (email, Polycom phone bridges, etc.) are typically both ineffective and exhausting.

The point being: If it’s an incident, you likely need fast and dynamic action among responders and rapidly evolving communications to key stakeholders.

Now, onto the antipattern: I’ve noticed in recent years that some teams who have adopted ICS (i.e., the term “Incident Commander” and some of the ICS fundamentals like span of control) have turned it into a process rather than a framework, and use ICS routinely for all their casework. Such teams may have “Incident Commander” as their job title, or as the default description of their oncall responder. Or they might use “IC” or “Incident Commander” to describe the owner of a ticket, whatever the priority level of that ticket.

Teams who engage in this antipattern are likely to find themselves describing their role as “Incident Commander” but, recognizing that the case they’re running isn’t urgent or a crisis, managing it as though it isn’t one. That can manifest as long-running email threads where the issue is being discussed but not managed. Or it can manifest as chats in which discussion about an issue is taking place cooperatively but no one is assertively leading: checking often on statuses, assigning objectives, managing a tree of responsible persons, etc.

In short, if you are describing yourself as the “Incident Commander” but you aren’t actively and frequently engaging in Commanding, Controlling, and Communicating — you’re misusing the framework!

Why care though, if things are working OK? Well, there are a few long-term negative consequences to this mistake.

First, use of the term “Incident Commander” to describe a role is intentional and should already be conveying some key information: that there is an urgent issue going on, that the person named as the IC is leading a team to fix it, and that things may be moving rapidly. The first few times you declare an Incident and then treat the situation as not-an-incident, you’ll get cooperation, but that cooperation will slowly deteriorate until people no longer take the role seriously. They are getting the wrong signal about criticality, and will become less able to tell critical from noncritical work based on that signal.

Second, the expectation your company should set is that when an Incident Commander reaches out to any other engineer for help, that help is necessary to resolve something very urgent which could not otherwise be suitably resolved. As with the first consequence, the assumption that an IC will only be present for urgent things will dissolve, and it will become harder to get cooperation from the engineers who are truly needed for urgent things.

In both cases, misuse of the framework for non-urgent things causes a loss of credibility which leads to downstream consequences that weaken your incident response framework. As partner teams lose confidence that your calling upon them is meaningfully urgent, they will begin to ignore or down-prioritize you.

My strong recommendation for incident response teams is to take the signaling inherent in Incident Command seriously and maintain the following framework hygiene:

“Incident Commander” is a role, appointed only when a task is really an incident and the three C’s are a critical component of resolving it. Incident Commander is not a job title!
If things are not truly urgent, then you don’t need and shouldn’t have an Incident Commander. Use “Project lead” or “Ticket assignee” or “Oncaller” or “Responder” or any other term that conveys the role’s actual level of seriousness.
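
For teams whose tooling assigns role labels automatically, the two rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Event` fields, the `is_incident` and `lead_role` names, and the default fallback label are all hypothetical, not part of ICS or of any particular ticketing system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical description of a case; field names are illustrative only."""
    unexpected: bool
    urgent: bool
    solution_apparent: bool


def is_incident(event: Event) -> bool:
    # All three conditions from the fundamentals must hold at once.
    return event.unexpected and event.urgent and not event.solution_apparent


def lead_role(event: Event, fallback: str = "Ticket assignee") -> str:
    # Reserve "Incident Commander" for real incidents; anything else gets
    # a label that matches its actual seriousness.
    return "Incident Commander" if is_incident(event) else fallback
```

The point of keeping the check this strict is that the label itself carries the urgency signal: a case that fails any one of the three conditions gets the fallback term.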

Remember, the purpose of an ICS framework is to provide exactly as much structure as needed to resolve an urgent matter in an organized manner — but no more than that!

¹ The UK uses similar “commander” terminology in its “Gold, Silver, Bronze” command hierarchy, but I believe most of the tech industry teams that have adopted Incident Commander terms have taken them from NIMS ICS.
