Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Address changes to PagerDuty integration #5980

Closed
2 of 5 tasks
dms1981 opened this issue Jan 15, 2024 · 17 comments
Closed
2 of 5 tasks

Address changes to PagerDuty integration #5980

dms1981 opened this issue Jan 15, 2024 · 17 comments
Assignees
Labels
alerting technical debt This issue is either technical debt or an issue that will lead to technical debt as time goes by.

Comments

@dms1981
Copy link
Contributor

dms1981 commented Jan 15, 2024

User Story

As a Modernisation Platform Engineer
I need to look at how we use PagerDuty integration keys with SNS / other AWS code
So that we're not caught out by PagerDuty's deprecation of Rulesets and Event Rules at the end of 2024

Value / Purpose

From PagerDuty's documentation, it's recommended that users switch from Rulesets / Event Rules to Event Orchestration.

We can see that the Terraform PagerDuty provider is currently warning us about the deprecated use of the integration_key argument which I believe is linked to this PagerDuty provider change..

We should take steps, if needed, to update our PagerDuty integration to make use of Event Orchestration before this argument is removed from the PagerDuty provider / any integration with PagerDuty itself is removed

Useful Contacts

No response

Additional Information

No response

Proposal / Unknowns

No response

Definition of Done

  • Necessary changes assessed
  • Changes implemented
  • User docs have been updated
  • Another team member has reviewed
  • Tests are green
@dms1981 dms1981 added needs refining technical debt This issue is either technical debt or an issue that will lead to technical debt as time goes by. alerting labels Jan 15, 2024
Copy link
Contributor

This issue is stale because it has been open 90 days with no activity.

@github-actions github-actions bot added the Stale label Apr 18, 2024
@SteveLinden
Copy link
Contributor

SteveLinden commented Aug 27, 2024

Initial checks indicate that this is used in hundreds of location (pagerduty_service_integration) just within modernisation-platform repo

@SteveLinden SteveLinden self-assigned this Aug 27, 2024
@github-actions github-actions bot removed the Stale label Aug 28, 2024
@dms1981
Copy link
Contributor Author

dms1981 commented Sep 2, 2024

Moved to blocked until Steve returns

@SteveLinden
Copy link
Contributor

Amended the pagerduty code in modernisation-platform to pick set up sprinkler for testing purposes. Unfortunately this has also created many other items in pagerduty so we may need to consider removing those that are not needed later.

The plan is to test changes against sprinkler to see if the proposed changes will work as required. This will, hopefully, stop other systems being amended/hit with the same changes.

@SteveLinden
Copy link
Contributor

It looks like the manual run, to add new items to pagerduty worked, or there were no errors send to pagerduty.

I have also amended the PR so that it contains the correct location for Sprinkler. This was also run manually so it will point to #modernisation-platform-low-priority-alarms.

Testing will not commence on the changes required to correct the code to point at event orchestration.

@SteveLinden
Copy link
Contributor

SteveLinden commented Sep 12, 2024

A different approach was taken
Sprinkler was added in the current vise and this was completed through PR #7909 which was successful

Added orchestration using PR #7912 which, again, was successful.

Now I am looking to see if the settings in aws.tf (e.g. corporate-staff-rostering-preproduction = pagerduty_service_integration.integrations["corporate-staff-rostering-preproduction"].integration_key) can be amended to also use pagerduty_event_orchestration. At the moment I am struggling to get sprinkler set up "as-is" in here.

@SteveLinden
Copy link
Contributor

Checked in chatgpt and gemini and neither suggested the integration was being removed.

@SteveLinden
Copy link
Contributor

A change has been made to the code to set the PagerDuty code to point to orchestration from the original for cdpt_chaps_cloudwatch. This is being moved under #7976 to test the change works and does not cause issues. The aws.tf code mentioned above will be amended as I go along (later)

@SteveLinden
Copy link
Contributor

The latest version was pushed but Dave noticed some items were missing compared to main. Had a look and copied those back in. Hopefully the PR (#8044 ) is correct now.

@SteveLinden
Copy link
Contributor

Various issues were popping up with the name I had used in 8048. This was reverted and the new code put in. It was simply the "name" needed to be something not used elsewhere and, as the one I was using was used in many places, we got an error.

It was amended to
name = "cdpt_ifs_cloudwatch orchestration integration"

@SteveLinden
Copy link
Contributor

Re-ran the code for just cdpt again, removing the changes I'd made for delius non prod.
PR was #8158

@ep-93
Copy link
Contributor

ep-93 commented Oct 9, 2024

Talked to Steve to workout a handover. Change has been made to one team, @markgov was part of this. Sprinkler and cdpt have been done which can be found here -

https://github.com/ministryofjustice/modernisation-platform/blob/ab4aad22e88fce9233b90db3fa699c3d76608559/terraform/pagerduty/aws.tf#L67C47-L67C88

Delius is next, the code needs to be uncommented. https://github.com/ministryofjustice/modernisation-platform/pull/8158/files

Will catch up with Mark about this more tomorrow.

Alistair Curtis is testing CDPT IFS, need to catch up with him to see if its what is expected before deploying to other teams.

Last contact from Alistair

I’ve checked PD, and there are no alerts for IFS. The system is Integrated Fraud System, used by a few people on fraud investigation teams mainly in LAA & HMCTS. The PD alert was working in August, it was triggered by someone running a large report and the LB timing out. Previous to that, my test route would trigger it by returning a 500. cdpt-chaps has the same setup, but doesn’t have the test route, so it’s difficult to test. - I could add one, but I’m in the thick of some other work at the moment.

@ep-93 ep-93 self-assigned this Oct 11, 2024
@ep-93
Copy link
Contributor

ep-93 commented Oct 11, 2024

Changes had broken a teams non prod alarms, I fixed this but we need to investigate

@ep-93
Copy link
Contributor

ep-93 commented Oct 17, 2024

Integrations modernisation-platform-orchestration

See instructions in how to integrate with any system. The integration key must be part of the API v2 payload.

API v2 payload = Changes to the payload :(

@ep-93
Copy link
Contributor

ep-93 commented Oct 17, 2024

We need to edit this module - https://github.com/ministryofjustice/modernisation-platform-terraform-pagerduty-integration/tree/main

Instead of associating SNS topics with PagerDuty services, we need to associate them with orchestration integrations I believe.

@ep-93
Copy link
Contributor

ep-93 commented Oct 21, 2024

Ive discussed with Dave Elliot and David Sibley.

https://support.pagerduty.com/main/docs/amazon-cloudwatch-integration-guide#integrate-with-rulesets

Although this worried us PagerDuty/terraform-provider-pagerduty#775

Above, you can see the deprecation is about integration with rulesets, rather than service integration in what we do.

I cant see any deprecation warnings here - https://github.com/ministryofjustice/modernisation-platform/actions/workflows/terraform-pagerduty.yml

So have been advised to close the ticket, if our code was going to be deprecated, we would see it in that workflows output.

@ep-93 ep-93 closed this as completed Oct 21, 2024
@github-project-automation github-project-automation bot moved this from In Progress to Done in Modernisation Platform Oct 21, 2024
@ep-93 ep-93 reopened this Oct 21, 2024
@github-project-automation github-project-automation bot moved this from Done to To Do in Modernisation Platform Oct 21, 2024
@mikereiddigital
Copy link
Contributor

Have reviewed. Moving to done.

@mikereiddigital mikereiddigital closed this as completed by moving to Done in Modernisation Platform Oct 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
alerting technical debt This issue is either technical debt or an issue that will lead to technical debt as time goes by.
Projects
Status: Done
Development

No branches or pull requests

5 participants