
What to do during an incident

Ian Nice edited this page Sep 5, 2024 · 41 revisions

Security incidents and data breaches

Follow the normal process below, and also follow the GDS guidance. Specifically:

  1. If the incident is related to cyber security then you should report this to the Cyber Security team as soon as possible.

  2. If the incident involves a data breach it should be reported to the GDS Privacy team at gds-privacy-office@digital.cabinet-office.gov.uk.

  3. If the incident involves a data breach it should be reported to the Cabinet Office Data Protection Officer at dpo@cabinetoffice.gov.uk. This doesn't need to be done until we have established what data has been lost. Check with the GDS Privacy team first, as they might do this on our behalf.

  4. If the incident involves spamming or phishing from our platform inform NCSC at report@phishing.gov.uk after you have gathered information on who and what was sent.

You may need to do just one of these or you may need to do all of them depending on the incident.

We also suggest you include the GDS Information Assurance team (gds-information-assurance@digital.cabinet-office.gov.uk) in any of these situations.

Normal incidents

It's better to assume something is an incident and start documenting it early.

  1. Nominate an incident lead and a comms lead and write their names in Slack:

    • The incident lead should prioritise technical investigation and fixes.
    • The comms lead should be responsible for communicating the incident to both our users and internal stakeholders.
    • The comms lead should ensure events are well documented in an incident report.
    • The comms lead may decide to delegate note taking in the incident report to another team member if they have too much to do.
    • If an incident is severe or high profile, ask a product manager or designer to help with public comms.
  2. Comms lead to begin note taking in a Google Doc and share it in #govuk-notify-incident.

  3. Join the standing incident Meet and remind people you're in there:

    • Only the comms lead and the incident lead need to be in the Meet.
    • Other people can listen in for info, or join if the incident lead or comms lead asks for help. Having a standing Meet allows people to join easily and find where the discussion is happening.
  4. Comms lead to update Statuspage if necessary.

  5. Once the incident investigation is in a good state and we understand the impact, the comms lead should notify GDS stakeholders about the incident. Assume the readers have minimal knowledge of Notify, so make sure your email explains the user impact in plain English. Comms to our users should take priority over comms to GDS stakeholders.

After the incident

After the incident, the incident lead should:

  • Schedule a meeting to review the report.
  • Ask a tech lead to update the DSP Monthly Incident Review document. Update the "incidents to review" table for the next meeting with the team name, date of incident, incident priority, link to any incident actions and a one-line description of the incident (with a link to the incident report).

Escalate to someone senior for help

If you are struggling to resolve the incident and want to call in some support, you can use our team contact details. It is better to escalate and find out you didn't need it in the end than not to escalate at all.

If you need a Senior Civil Servant (someone in the Digital Service Platforms senior management team) to help with a P1 incident, there is a Senior Civil Servant (SCS) PagerDuty rota. You can read more about what they would be expecting if called.

  • In PagerDuty, select "New Incident"
  • Enter "GOV.UK Notify" in the "Impacted Service" field
  • Assign to "DSP SCS escalation" (or "GaaP SCS escalation" if you can't find the DSP one)
  • Fill in other fields as needed, and select "Create Incident"
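
If you ever need to script this rather than click through the UI, the same incident could be expressed as a request body for the PagerDuty REST API (a POST to /incidents). The following is a minimal sketch only, not something this team is known to use. The two IDs are hypothetical placeholders, the function name is our own, and the code only builds the payload without sending anything. Check the PagerDuty API documentation before relying on it.

```python
import json

# Hypothetical placeholder IDs (assumptions): look up the real IDs for the
# "GOV.UK Notify" service and the "DSP SCS escalation" policy in PagerDuty.
NOTIFY_SERVICE_ID = "PXXXXXX"
SCS_ESCALATION_POLICY_ID = "PYYYYYY"


def build_scs_escalation_payload(
    title,
    details,
    service_id=NOTIFY_SERVICE_ID,
    escalation_policy_id=SCS_ESCALATION_POLICY_ID,
):
    """Build (but do not send) a request body for creating a PagerDuty incident."""
    return {
        "incident": {
            "type": "incident",
            "title": title,
            # Corresponds to the "Impacted Service" field in the UI.
            "service": {"id": service_id, "type": "service_reference"},
            # Corresponds to the "Assign to" field in the UI.
            "escalation_policy": {
                "id": escalation_policy_id,
                "type": "escalation_policy_reference",
            },
            "body": {"type": "incident_body", "details": details},
        }
    }


if __name__ == "__main__":
    payload = build_scs_escalation_payload(
        title="P1: API is unavailable",
        details="Incident lead and comms lead are in the standing incident Meet.",
    )
    print(json.dumps(payload, indent=2))
```

Sending the request would also need an API token and a `From` header containing the email address of a valid PagerDuty user, both of which are left out here on purpose.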

Severity of incidents

This is a non-exhaustive list of incidents of different severities that might happen.

Response time is the time to open your laptop, post something in Slack to say that you are looking at this, and start investigating.

| Severity | Description | Response time |
|----------|-------------|---------------|
| P1 | API is unavailable | 30 minutes |
| P1 | www.notifications.service.gov.uk is unavailable | 30 minutes |
| P1 | Text message or email sending is unavailable | 30 minutes |
| P2 | Letter sending is unavailable | 30 minutes |
| P2 | Delivery receipts are unavailable | 30 minutes |
| P2 | Service callbacks are unavailable | 30 minutes |
| P2 | Inbound text messages are unavailable | 30 minutes |
| P2 | Sending documents by email is unavailable | 30 minutes |
| P2 | Downloading documents sent by email is unavailable | 30 minutes |
| P2 | Severe delays to text message or email sending | 30 minutes |
| P3 | Severe delays to letter sending | Next working day |
| P3 | Other minor degraded service | Next working day |
| P4 | Incident with no user impact (such as Concourse unavailable) | Next working day |
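
The response times in the table above depend only on the priority, so they can be captured as a small lookup, for example in a triage script or chat bot. This is an illustrative sketch: the names `RESPONSE_TIME_BY_PRIORITY`, `response_time_for` and `needs_immediate_response` are our own and not part of any Notify tooling.

```python
# Response targets copied from the severity table above.
RESPONSE_TIME_BY_PRIORITY = {
    "P1": "30 minutes",
    "P2": "30 minutes",
    "P3": "Next working day",
    "P4": "Next working day",
}


def response_time_for(priority: str) -> str:
    """Return the response target for a priority such as 'P1' or 'p3'."""
    key = priority.strip().upper()
    if key not in RESPONSE_TIME_BY_PRIORITY:
        raise ValueError(f"Unknown priority: {priority!r}")
    return RESPONSE_TIME_BY_PRIORITY[key]


def needs_immediate_response(priority: str) -> bool:
    """True when someone should start investigating within 30 minutes."""
    return response_time_for(priority) == "30 minutes"


if __name__ == "__main__":
    for p in ("P1", "P2", "P3", "P4"):
        print(p, response_time_for(p), needs_immediate_response(p))
```

An unknown priority raises an error rather than falling back to a default, so a typo cannot silently downgrade an incident.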

Guidance and links
