What to do after a major incident: our followup and after-action review procedures.

Followup Actions for Response Roles

In addition to any direct followup items generated from an incident, each of our response roles has a few standard followup tasks. These are generally lightweight actions that ensure we organize information and follow up with customers appropriately.

Steps for On-Call Person

  1. Create the post-mortem page from the template, and assign an owner to the post-mortem for the incident.

    • Add tasks for any cleanup actions. (If you are not free, assign someone to handle these tasks.)
    • Contact the customer if post-tasks require service restarts or changes to customer systems.
    • Review the Slack communication and save any relevant information.
  2. Send an internal email to our group (support@mnxsolutions.com) saying that we had an incident. Include a link to the post-mortem, or a description of what happened, and list the post-tasks that are needed.

    • Others should provide feedback/help as needed.
  3. Occasionally check on the progress of the post-mortem to ensure that it is completed within the desired time frame.

  4. Follow up with the customer to confirm that the post-tasks have fixed the issue and that everything is all clear.
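
As an illustration of step 2, the internal notification could be assembled along these lines. This is only a minimal sketch, not part of our tooling: the function name `build_incident_email`, its parameters, and the subject format are hypothetical, and sending the message (for example via `smtplib`) is left out because the mail setup is not described here.

```python
from email.message import EmailMessage

# Group address taken from step 2 above.
SUPPORT_GROUP = "support@mnxsolutions.com"


def build_incident_email(summary, post_tasks, postmortem_url=None, description=None):
    """Build (but do not send) the internal incident notification.

    Hypothetical helper: step 2 asks for a link to the post-mortem OR a
    description of what happened, plus the post-tasks that are needed.
    """
    if not postmortem_url and not description:
        raise ValueError("provide a post-mortem link or a description")

    msg = EmailMessage()
    msg["To"] = SUPPORT_GROUP
    msg["Subject"] = f"Incident: {summary}"  # subject format is an assumption

    lines = ["We had an incident.", ""]
    if postmortem_url:
        lines.append(f"Post-mortem: {postmortem_url}")
    if description:
        lines.append(f"What happened: {description}")
    lines += ["", "Post-tasks needed:"]
    lines += [f"- {task}" for task in post_tasks]

    msg.set_content("\n".join(lines))
    return msg
```

Requiring either the link or the description mirrors the wording of step 2, so a notification cannot go out without telling the group what happened.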

Reviewing the Incident

It's important that we review the incident in detail to see exactly what went wrong, why it went wrong, and what we can do to make sure it doesn't happen again. These reviews go by many names: after-action review, incident review, followup review, etc. We use the term post-mortem.

You can read all about our post-mortem process, which goes over this in more detail.

Reviewing the Process

As well as reviewing the incident, it's important to review our process. Did we handle the incident well, or are there things we could have done better?

This review isn't very formal yet. It typically involves a few of the incident commanders getting together to discuss how we might have done things differently, and whether there are any tweaks we can make to our incident response process.

If you're interested in joining these meetings, just let one of the incident commanders know and we'll be sure to invite you.