About Burnout in Cybersecurity

Earlier this year, Johan Berggren and I presented at Black Hat EU on the topic of responder burnout. I had a wonderful time presenting and there is a recording of our talk available, but for those who prefer to consume things like this in writing, I wanted to follow it up with this blog post.

Johan and I are not psychologists, nor are we trained in therapy, so please don’t take this as clinical advice. Our observations about burnout and its causes come from long careers in response and an enduring, intense focus on the issue within our own teams. Additionally, Johan draws on many conversations with his wife, an emergency response (ER) nurse, and I draw on my almost 30-year history as a volunteer Fire & Rescue responder.

In the cybersecurity industry, if you pay attention to the social spaces, you will find many references to happy dumpster fires and the “this is fine” dog. People often talk of dropping out of tech to start farming, or of living in the wilderness. These memes and the numerous escape fantasies regularly professed in informal spaces have a common driver: burnout.

While many teams perform operational work and these stressors may resonate broadly with them, our talk and this post focus specifically on Digital Forensics and Incident Response (DFIR), which has some particularly distinctive stressors.

Situational vs Chronic Burnout

When we examine the various causes of burnout, they seem to fall more or less into two categories.

First there is situational burnout - which we define as a short-term form of burnout caused by a specific situation. For DFIR responders this may be caused by a case that hits particularly close to home or is exceptionally bad. Many Digital Forensic Analysts experience this in their first child sexual abuse material (CSAM) case, for example. In a more corporate environment, situational burnout can often affect people who don’t respond to incidents on a regular basis and are pulled into a high-stress situation by the needs of the incident. Sysadmins, PR and comms professionals, and other engineers from the company might be pressed into being part of an incident where the stress, ambiguity and fast pace are highly unusual and uncomfortable.

One burnout factor that's fairly specific to situational burnout is inexperience. While experienced responders build up (hopefully positive) coping mechanisms over time for the stress of the job, a person who has never experienced the situation before is much more vulnerable to situational burnout. The overwhelming stressor exists, but the coping mechanisms and resilience that would soften its impact haven't been built up yet.

Then there’s chronic burnout - the sort of creeping drain on motivation and energy that comes from dealing with problems for a sustained period of time with no visible or expected end. Chronic burnout tends to grow the most in the team members who are on the hook for response as a long-term commitment. It is particularly frustrating to reach the end of a long, tiring, complicated case and know for sure that another case is just around the corner, waiting to drop or already sitting in the queue.

Factors which can contribute to Burnout


Our talk covered a few very common factors which we identified as contributing to burnout. This probably isn’t an exhaustive list, but it’s a good starting point that applies pretty broadly.


Unclear Mission & Expectations: Security is a broad, very busy, and rapidly-changing industry. A team which doesn’t have a specific and clear mission to anchor to is at risk of constant scope creep and changing focus. This lack of certainty in their role is highly stressful to responders. Just as they need a clear mission, responders need clear expectations set for what “good ops work” looks like. Standards written down in advance can help ensure everyone trusts that their work will be assessed fairly and consistently regardless of what other stress is occurring around them.

Lack of Control: One of the bigger stressors I’ve seen in practice in ops teams is that we cannot control what cases come in, when they arrive, or how critical they will be determined to be. This lack of control over our workload, combined with an inability to say no to taking on an important case, can leave us feeling helpless and disempowered over our fate.

Opaque Management: When so much of our job is unpredictable, responders place a lot of value on predictable work conditions. I suspect this is because people have only so much tolerance for change, and when the work itself is constantly changing it becomes even more important to have consistency elsewhere. This predictability can be thrown into chaos when work conditions (shift hours, priorities, this quarter’s projects, etc) are changed without notice or explanation. While leaders can’t always ensure organizational changes come with plenty of notice and buy-in, we can ensure that we’re transparent about what is happening, why, and what it means for the team and give everyone as much notice and explanation as possible.

Resource Starvation: Every ops team everywhere is likely familiar with this one: it often seems we don’t have enough people or resources to do all the work expected of us. A management team unaccustomed to performing operational work might expect everyone’s time to “add up to 100%” and may assume a fixed split of time between ops and other work. When ops load becomes higher than expected, it is too easy for an organization to still expect the other work to be completed even though the time commitment now adds up to more than 100%.

Lack of agency / autonomy: Most of the people who get into DFIR are highly skilled and capable. The more senior ones likely reached those senior levels through correct exercise of judgment in complicated circumstances. One factor I’ve noticed contributing to burnout among these staff is a perceived lack of autonomy. This feeling arises when their decisions are constantly criticized, or when other organizational partners exercise control over DFIR work without understanding the work itself. Another risk factor is when a drive to professionalize turns into a practice of proceduralizing. For more on this distinction, see “Professionalize, don’t proceduralize, your ops team”.

Overwhelming Scope: A system can be so large or complex that no individual can understand all of its workings. If it has not been broken down into understandable components with a responsible team for each, it becomes a significant stress factor for oncallers, who are accountable for responding to emergencies but struggle to debug or keep up with the system itself. As a parallel example, when wildfires in California grow so large that no single agency can control them, this is often dealt with by implementing Incident Command System (ICS) “Unified Command” and separating a large complex of fires into independently-managed incidents while a single “Area Command” coordinates them all. In practice, this takes an incident so large that it is unmanageable and breaks it into manageable components.

Signs & Symptoms of Burnout


Signs and symptoms of burnout in your staff will likely vary with each individual, but here are a few common threads we think you should be vigilant for.

Situational burnout


Situational burnout can be hard to spot because it can happen much more quickly, with less time to notice and react. You’ll most typically see situational burnout manifest as an increasing number of simple or careless mistakes. Data isn’t checked, communications are overly casual, assumptions take the place of fact-finding, etc. Sarcasm and complaining during handoffs and communications can be a late sign of situational burnout. “Zoning out” or appearing distracted or uninterested during syncs can also signal burnout in progress.

Chronic burnout


Because it’s longer-lasting and slower-burning, we are able to highlight more signs of chronic burnout and to do so in more detail. Here are some of the signs we have seen thus far.

Urgency Fatigue: Treating everything as urgent initially can lead to treating everything as routine eventually. Burnt-out teams may overlook high-priority incidents and focus solely on reactive tasks, neglecting proactive measures like threat hunting. This can occur because the team is too exhausted to “look for more trouble” when they just got done dealing with a tough case.

Lack of Passion: Things that were once fun and exciting (novel cases, new tricks attackers are using, etc.) aren’t anymore. When team members are more likely to groan about ‘another search to perform’ rather than be interested in a novel persistence mechanism, it’s likely burnout manifesting as a lack of curiosity or interest in the job.

Team Disunity: Team members who are suffering from chronic burnout may become defensive about their work and scope out of fear of the unknown. The desire to help another team member who is struggling competes with feeling personally overloaded. This can manifest as infighting, loss of trust among team members, or failure to support one another when needed.

Cynical Spiraling: A team member who begins to feel hopeless about the circumstances of their work (an end result of many of the factors above going unchecked) may begin to express cynicism about the job, the team, the company or the whole industry. I’ve gone into much greater detail about cynicism spirals in another article, which you can read here at “Recognizing dark humor and cynicism”.

Things Leaders can do to help


Successful leadership of a response team includes committing to an active and sustained effort to control burnout. Here are a few recommendations we have - but please share your own experiences with me in the comments. I would love to build this list out further!


Provide a clear mission - and continually reinforce it

To help combat the threat of scope and mission creep commonly experienced by ops teams who get things done, ensure that your team frequently sees a clear message reinforced as to what the core mission of your team is. Then, try your best to ensure your organization adheres to this as well! In any busy and successful ops team the boundaries will shift and change with the needs of the organization but accepting this and then reorienting back to the core mission at regular intervals will help ensure that ops flexibility doesn’t turn into a muddled mission.


Lead by example

Team leads and managers should be continually setting the example for what good ops and partner engagement look like. Set high standards for the team leads and consistently reward them when those standards are met. Especially as a team manager, it can be helpful to still perform an ops rotation once in a while, or work some lower priority tickets yourself, so the team sees that you aren’t asking them to do any work you wouldn’t do yourself.

For non-operational team culture issues, be a visible example of what you want to see. In addition to ops excellence, you need to frequently reinforce your team’s cultural expectations around things like work-life balance, working hours, etc. A leader I highly respect once told me “You can tell your team something once, but they won’t believe you until they’ve seen and heard it 10 times.”

My own email signature has a link to the “Matt’s work-life-balance” doc which explains that I like to flex between work and home as the time suits me, but that doesn’t mean if I send an email at 9pm I expect to see a response before the next work day. I frequently remind people that if anything is ever urgent, I will always page them - so if I am sending chats or emails and they aren’t a page, they can wait for a response. And then, when something urgent happens, I must remember to page them so they have the experience of being paged for urgent things.


Insist on downtime

If you visit a good firehouse, you will likely notice a recreational room with sofas, a TV, and probably a game system, or at least a pool table or a basketball hoop on the outside patio. While station staff may be at the station for a full 12-hour shift, no one is expecting them to be working the full 12 hours without a break. After the tools are maintained, the chores are done, and the training is completed, whatever ‘extra’ time is uncommitted to actual calls for service should be committed to relaxing a bit and keeping stress levels lower, bonding as a team, etc.

As a DFIR lead you should advocate as much as possible for the team’s hours to not be 100% committed. An ops team experiences surges and lulls, and its project work is often much harder to plan than that of a team with a quarterly feature delivery cycle. A best practice is to reserve about 20% of your team’s hours as non-committed time. If the ops shifts stay peaceful for a whole quarter, you can over-deliver on promised projects. But if, as is the norm, there is a sudden Log4Shell or xz or some other serious issue, your team’s extra ops load won’t turn into late nights, weekends, or slipping on other important work.
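For those who like to see the arithmetic, here is a minimal Python sketch of that reserve. Everything in it is an assumption for illustration: the function names, the six-person team, the 480-hour quarter, and the 50% ops share are made up, and only the 20% reserve figure comes from the paragraph above. It simply shows that a surge is absorbed by the reserve first, then eats into project time, and only then turns into overtime.

```python
def plan_quarter(engineers, hours_per_engineer, ops_fraction, reserve_fraction=0.20):
    """Split a team's quarterly hours into ops, reserve, and project time.

    All inputs are hypothetical planning numbers, not measured data.
    """
    if not 0 <= ops_fraction <= 1 or not 0 <= reserve_fraction <= 1:
        raise ValueError("fractions must be between 0 and 1")
    total = engineers * hours_per_engineer
    ops = total * ops_fraction
    reserve = total * reserve_fraction
    projects = total - ops - reserve
    if projects < 0:
        # The "adds up to more than 100%" problem, caught at planning time.
        raise ValueError("ops plus reserve exceed 100% of available hours")
    return {"total": total, "ops": ops, "reserve": reserve, "projects": projects}


def absorb_surge(plan, surge_hours):
    """Show where unplanned ops hours land: reserve, then projects, then overtime."""
    from_reserve = min(surge_hours, plan["reserve"])
    remaining = surge_hours - from_reserve
    from_projects = min(remaining, plan["projects"])
    overtime = remaining - from_projects
    return {
        "from_reserve": from_reserve,
        "from_projects": from_projects,
        "overtime": overtime,
    }


if __name__ == "__main__":
    plan = plan_quarter(engineers=6, hours_per_engineer=480, ops_fraction=0.5)
    print(plan)
    # A surge smaller than the reserve costs no project time and no overtime.
    print(absorb_surge(plan, surge_hours=400))
    # A larger surge starts cutting into promised project work.
    print(absorb_surge(plan, surge_hours=700))
```

The useful part is less the numbers than the ordering: without a reserve, every surge hour comes straight out of project commitments or people's evenings.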


Create conditions for success when needed

During the weeks following the collapse of the World Trade Center on 9/11, the search teams looking for survivors noticed that the search dogs had begun getting depressed. They lacked energy, showed little enthusiasm for the work, and were hard to motivate. Some even lost their appetites. Eventually it was realized that the live-search dogs train for and are rewarded for finding live victims. After long periods of not finding any live victims, the dogs were demotivated. A quick-thinking team decided to have some volunteers hide in the rubble for the dogs to “find” - and be rewarded for discovering. After including this intermittent positive activity in search protocols, the dogs’ morale picked back up.

In-house DFIR Teams may go sustained periods without discovering exciting attacker tools in action or proving a case for a criminal suspect. In that in-between time the leads of the team need to ensure that everyone gets to practice key skills and celebrate catching live attackers. If you have a red team, they can come in very handy for this! Running a red team exercise for the response team to find and catch can provide great practice and a morale boost for the hunting team - provided they aren’t already in a state of resource exhaustion! As a last resort, entering a CTF together as a team could potentially bring a breath of fresh air into the environment.


Provide as much control as possible

Several of the burnout factors mentioned above can be partly remedied by ensuring your team has as much control over their own domain as possible. One thing you can do to help is to be extremely judicious about how long and detailed playbooks or response plans need to be, and ensure that you are giving your staff enough autonomy. If a playbook doesn’t make sense or situations have changed, editing and changing the playbook should be simple and free of bureaucracy. If a playbook step doesn’t make sense for the case at hand, ignoring it should be assumed valid by the responder as long as a quick comment in the case log explains why that step was ignored. During many cases, we keep a “Postmortem notes” file that everyone contributes in-the-moment notes in, so that we can potentially identify obsolete playbooks and steps that can be removed.

If you’re lucky enough to be able to develop on your own response tooling, this is a great way to provide autonomy for your team that many teams may not get. Waiting for a commercial tool to implement a feature you need or, worse, finding the feature is broken and providing wrong info is a highly frustrating thing. When you control the tooling, you can immediately fix things that bother you or begin to implement tooling that covers the gap. One of the really special things that the Google Response team has going for it is that we have focused over the years on hiring Security Engineers and expecting them to do some tool development along with the ops work. While the main goal is to have our own tooling be robust and dependable, this also provides a needed outlet for the autonomy that they crave.

Finally, it would be a huge omission not to mention that your team needs a sense of control over the direction of the team itself. While a fire engine can’t have two dozen captains, everyone should be given the opportunity to at least provide feedback and opinions on what the team is doing and how it will do so in the future, at regular intervals.


Make team resourcing clear to everyone

While you may not be empowered to obtain more staff or more funding (and I’m sure you’re trying that!), you can still move the needle by being transparent about what resources you’ve requested, what the organization’s response was, and how you intend to use the resources available. Along with a clear mission and priorities, a clear understanding of how the team is going about trying to improve the situation will return at least some sense of control to the individual team members. Believe it or not, your team may not believe you’re trying as hard as you are unless they can get a peek behind the curtain. Include both what’s going on and why, along with background context, for maximum effect.

This also applies to partner teams who bring work your way. When new investigations or assessments begin, the partner teams asking for them (perhaps management or legal) aren’t practitioners and may not understand how big a task they’ve asked for. Making the resource commitments clear to them can result in more help, or in realigning the work to fit within what you’ve got available.


Encourage breaks, hobbies, vacations, and balance

It is common in our industry to see people advocate that security engineers should be doing open-source development work while not in their day job, and should be attending every conference and every meet-up. I strongly believe that an all-consuming focus on security is unnecessary and unhealthy. The happiest and most successful responders I know may think about security often, but all have fulfilling hobbies and interests that have nothing to do with their work. For example, my good friend Ryan and I meet often to practice playing music together and in groups. While we do find opportunities to tie this back to work sometimes (see our paper on musicianship and its relation to high-performing DFIR practitioners) this is personal fulfillment and isn’t meant to be useful to security.

As a leader, push back on this silly idea that we should all be security people all the time. We are security responders, yes - but we are artists and historians, poets, musicians, bodybuilders, boxers, volunteer firefighters, and ultra-marathon runners too. Those aspects of ourselves bring us out of the stress and negativity of our profession, nourish and replenish our souls, and give us things to bond over as partners and peers at work too! I recently took a 9-week summer vacation - by far my longest break away from work ever. In addition to the new sense of energy I have returning to work, I was also able to write a debrief for my team on why everyone should do this once in a while, and show them by example that the workplace will get by just fine without you for a little bit while you tend to your own emotional needs.

Anecdotally, I have also shared with my team that I have regular sessions to talk with a therapist about stressors in my life, and have heard positive feedback from team members that they felt more likely to do the same on hearing that a senior lead has a positive opinion about doing so.


Destigmatize struggling with motivation

Many responders have reported during discussions about burnout that they feel admitting their moments of demotivation seems like admitting a weakness or defeat. Everyone struggles at various points, sometimes daily, with the natural highs and lows of working a complex and stressful job. An important thing leaders can do is to be transparent about our own struggles with motivation and to destigmatize moments of struggle. I’ve noticed after talking openly in group chats about when I am taking breaks “because I just can’t focus right now” that other members of my team opened up about similar struggles. As a group we discussed taking breaks and how difficult it is to force yourself to focus, compared to accepting the lack of focus in the moment, redirecting our energy, and coming back to the focus task a bit later.


Controlling burnout is a commitment, not a task


If you leave this article with nothing else, please leave it knowing that controlling burnout in a response team isn’t a task you do once and then call it complete. It’s a commitment to knowing the factors that contribute to burnout both as a team and as individuals, and to keeping a constant eye on those things. Intervene early, not late, and ensure you have proactive plans for how you want your team to operate sustainably in this field.

Do you have ideas on how to control burnout? Factors I’ve missed, or things you think I got wrong? I’m eager to hear them and expand on this topic more. Feel free to reach out via email, the socials, etc.

Now if you’ll excuse me, I have a cello to re-string.
