Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failing calls to create services dependancies #321

Closed
msuterski opened this issue Apr 15, 2021 · 7 comments
Closed

Failing calls to create services dependancies #321

msuterski opened this issue Apr 15, 2021 · 7 comments

Comments

@msuterski
Copy link

We're having a very inconsistent results with calls to create dependancies between services using AssociateServiceDependencies

The order of execution is:

  1. Create a technical service
  2. Check if business service exists
  3. Check if just created technical service exists
  4. Create a dependency between them.

At least half of the calls to create the dependency result in Not Found (code: 2100) error with no indication of what is actually missing. The *http.Response object is always nil. The 2100 code says nothing useful.

I'm not sure if the problem is with underlying REST API or with how the package processes the responses. Given the *http.Response object returned from AssociateServiceDependencies is nil, it makes it difficult to debug what's going on.

I wonder if there is a way to expose more details from the package about what is going on. I did not see any debug flags that would let us get more insight into the actual responses form the REST API.

Simplified code snippet of what is happening:

    svc, err := s.api.CreateService(payload)
    if err != nil {
        return nil, fmt.Errorf("%v", err)
    }
    _, _, err := s.api.GetBusinessService(businessServiceID)
    if err != nil {
        return fmt.Errorf("%v", err)
    }

    service, err := s.api.GetService(technicalServiceID, &pdAPI.GetServiceOptions{})
    if err != nil {
        return fmt.Errorf("%v", err)
    }
    
    payload := buildAssociationPayload(businessServiceID, technicalServiceID)
    _, resp, err := s.api.AssociateServiceDependencies(&payload)
    if err != nil {
        return fmt.Errorf("%v", err)
    }

@theckman
Copy link
Collaborator

@msuterski which version of Go and this library are you using? That'll help me hone-in on the code to be looking at.

The inconsistency of it, and with it being a NotFound error from the API, sounds like it's not at our layer but at the API. I can take a look to see if there's anything this library may be doing with that call path that looks suspect, but by and large this package is just a pass-through to the REST API.

@msuterski
Copy link
Author

@theckman thanks for a quick response.

We're using a fork that this PR was based on #304 (until v1.4 gets released) and golang:1.16.

Would it be helpful if we opened a support ticket with PD about it?

@theckman
Copy link
Collaborator

theckman commented Apr 18, 2021

@msuterski so I have a few learnings and suggestions.

The *http.Response being nil makes sense, but is unfortunate. The internals of the library never intended for the *http.Response to be exposed, but at some point that was added to some methods with the assumption it was. I plan to remove this completely in a future release, while providing other debug mechanisms so that it can work reliably.

The other is that looking at the code, I cannot see how any values could be transposed or mixed up in the call to PagerDuty's API. So this makes me think that it's either something in your code (as you generate the payloads), or something in their API (consistency?) that's making the item be not found.

Would you be willing to try the latest commit on master? In between v1.3.0 and now I've added some enhancements to the error handling / parsing, and so maybe the error will now contain a more helpful message. You can also get at the parsed JSON error from PagerDuty, so if there are multiple errors you can see each. We're very close to v1.4.0, and so I'd say that it's safe to use (and if you find an issue, it'd be present in v1.4.0). If that error doesn't give you anything more helpful, it might be a good idea to ping PagerDuty support.

You can use the errors.As() function in the standard library, with a value of this type, to get at those extra details. Here is an example of using errors.As: https://golang.org/pkg/errors/#example_As

Let me know what you end up finding / doing, and I'll keep this issue open until I hear from you.

@theckman
Copy link
Collaborator

theckman commented May 8, 2021

@msuterski wanted to check in and see if you'd made any progress on this.

@msuterski
Copy link
Author

@theckman we added backoff and retries to our code and the failures are less frequent. I'll updated to the latest release in upcoming weeks and let you know if we still see retries being triggered.

@theckman
Copy link
Collaborator

@msuterski thought I'd check in and see how it was going. Based on your last update, that makes it sound like an eventual consistency thing on PagerDuty's side. But please correct me if I misinterpreted your last update.

@theckman
Copy link
Collaborator

@msuterski I'm going to close this issue at this time. Feel free to open a new one if you do experience more issues and believe it to be somehow related to this library.

If you do continue to experience that specific problem, it may be good to reach out to PagerDuty support directly at support@pagerduty.com.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants