Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(aws-sdk): avoid repeating MessageAttributeNames in sqs receiveMessage #1044

Merged
merged 12 commits into from
Jun 10, 2022

Conversation

JamieDanielson
Copy link
Member

Which problem is this PR solving?

If duplicate requests are storing the same value multiple times, this will address the problem by passing the MessageAttributeNames through a Set to de-duplicate them.

It's unclear why these duplicates are happening and adding the same value multiple times, so that is still an issue to be looked into. It is not solved by this PR. This PR is also not attempting to measure the size of the attributes to determine whether the size is under the required 256KB.

Short description of the changes

  • Use a set to ensure unique values when concatenating propagation fields in the AWS SQS instrumentation.

Checklist

  • Ran npm run test-all-versions for the edited package(s) on the latest commit if applicable.

Co-authored-by: Kent Quirk <kentquirk@users.noreply.github.com>
Co-authored-by: Purvi Kanal <pkanal@users.noreply.github.com>
Co-authored-by: Mike Goldsmith <mikegoldsmith@users.noreply.github.com>
@codecov
Copy link

codecov bot commented May 31, 2022

Codecov Report

Merging #1044 (f3e4e41) into main (23d98b1) will increase coverage by 0.04%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main    #1044      +/-   ##
==========================================
+ Coverage   95.91%   95.95%   +0.04%     
==========================================
  Files          13       24      +11     
  Lines         856     1410     +554     
  Branches      178      295     +117     
==========================================
+ Hits          821     1353     +532     
- Misses         35       57      +22     
Impacted Files Coverage Δ
...entation-aws-sdk/src/services/MessageAttributes.ts 91.89% <100.00%> (ø)
...emetry-instrumentation-aws-sdk/src/services/sqs.ts 87.87% <100.00%> (ø)
...opentelemetry-instrumentation-aws-sdk/src/enums.ts 100.00% <0.00%> (ø)
...etry-instrumentation-aws-sdk/src/services/index.ts 100.00% <0.00%> (ø)
...ntation-aws-sdk/src/services/ServicesExtensions.ts 100.00% <0.00%> (ø)
...try-instrumentation-aws-sdk/src/services/lambda.ts 97.77% <0.00%> (ø)
...tapackages/auto-instrumentations-node/src/utils.ts 97.95% <0.00%> (ø)
...entelemetry-instrumentation-aws-sdk/src/aws-sdk.ts 97.44% <0.00%> (ø)
... and 3 more

Copy link
Member

@blumamir blumamir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix
Can you please add a test for this case?

@JamieDanielson
Copy link
Member Author

Thanks for the fix Can you please add a test for this case?

We're having some trouble figuring out how to test MessageAttributeNames, and part of that may be a misunderstanding in how it works.

Our understanding at the moment is that MessageAttributeNames is set on ReceiveMessage to indicate the specific message attribute names to receive (and thus filter other message attributes). However, we cannot find a place in the code that is actually filtering the attribute names that are omitted from this.

In sqs.test.ts, we tried setting the MessageAttributeNames on .receiveMessage, but still see the message attributes coming through in the response that should be filtered out. We can tell when logging out MessageAttributeNames that we are de-duping, but are unclear where to access this in the response so we can test it.

To help us move forward, we'd love some help answering the following questions:

  1. Since we can't access MessageAttributeNames in the response, what would be a good test to see if MessageAttributeNames are being de-duped correctly?
  2. Where does MessageAttributeNames get used to filter the message attributes?

JamieDanielson and others added 2 commits June 3, 2022 18:05
Co-authored-by: Purvi Kanal <pkanal@users.noreply.github.com>
@blumamir
Copy link
Member

blumamir commented Jun 5, 2022

Thank you for doing the research

Our understanding at the moment is that MessageAttributeNames is set on ReceiveMessage to indicate the specific message attribute names to receive (and thus filter other message attributes).

If I understand SQS correctly, this filtering is done in the server itself, so the request contains which message attributes to include in the response and then the server copies them to "receiveMessages" response.

However, we cannot find a place in the code that is actually filtering the attribute names that are omitted from this.

The tests for this instrumentation use a mock to return responses and validate the request parameters. This is done, for example, here. I think a proper test might be to assert that the request that is "sent" to the mock server does not repeat the propagators fields in MessageAttributeNames.

In sqs.test.ts, we tried setting the MessageAttributeNames on .receiveMessage, but still see the message attributes coming through in the response that should be filtered out. We can tell when logging out MessageAttributeNames that we are de-duping, but are unclear where to access this in the response so we can test it.

I don't know how an actual SQS server is implemented to behave if multiple names are sent in the request, it could repeat them in response or just return each one of them once. The original issue if I understood it correctly is due to the request itself being invalid due to the size (These attribute names are just keep piling up forever until the limit is hit). The response in the test is mocked and will not follow the semantics of AWS.

To help us move forward, we'd love some help answering the following questions:

  1. Since we can't access MessageAttributeNames in the response, what would be a good test to see if MessageAttributeNames are being de-duped correctly?
  2. Where does MessageAttributeNames get used to filter the message attributes?

I can think of 2 possible ways to test it:

  1. Add assert on the "request" as it arrives to the mock server. If you do that, please make sure that a failed "expect" in the callback will catch correctly by mocha. I can see it is already "expecting" in the mock, but we had bugs like this in the past so might be worth making sure.
  2. Extract this new functionality into a function in ./src/services/MessageAttributes.ts. Then you can just invoke the function directly with any interesting input and expect the result to contain each attribute name only once. There are already similar tests here. We can add a new "describe" for a function like addPropagationMessageAttributeNames(origMessageAttributeNames: string[]) which will implement this logic:
return Array.from(
            new Set([
              ...(origMessageAttributeNames ?? []),
              ...propagation.fields(),
            ])
          );

@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Jun 6, 2022

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: JamieDanielson / name: Jamie Danielson (b0fed52, 80e50df, b747f84)
  • ✅ login: MikeGoldsmith / name: Mike Goldsmith (7155304)
  • ✅ login: pkanal / name: Purvi Kanal (d7db9a3)

… tests

Co-authored-by: Jamie Danielson <JamieDanielson@users.noreply.github.com>
Copy link
Member

@blumamir blumamir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for adding the test
Added 2 suggestions

Co-authored-by: Purvi Kanal <pkanal@users.noreply.github.com>
@marshally
Copy link

I just wanted to leave a quick note indicating that we have tested this patch and I can confirm it resolves the issue described in #1038

Thank you @JamieDanielson, we will look forward to this getting this merged and released.

Copy link
Member

@blumamir blumamir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for addressing all comments 💯
Left a small optional nit about the function name

@blumamir blumamir changed the title fix(instrumentation-aws-sdk): dedupe prop fields fix(aws-sdk): avoid repeating MessageAttributeNames in sqs receiveMessage Jun 9, 2022
@blumamir
Copy link
Member

blumamir commented Jun 9, 2022

@JamieDanielson @pkanal , I think this is ready to be merged.
I can't merge while "This branch is out-of-date with the base branch", so please rebase this PR or "allow edits from maintainers", so we can rebase before merging.

Thanks again for working on this fix.

@blumamir blumamir merged commit 4b4ded6 into open-telemetry:main Jun 10, 2022
@dyladan dyladan mentioned this pull request Jun 10, 2022
@MikeGoldsmith MikeGoldsmith deleted the aws-sdk-sqs branch June 10, 2022 20:36
marshally pushed a commit to RapidAPI/opentelemetry-js-contrib that referenced this pull request Jun 13, 2022
…sage (open-telemetry#1044)

* fix(instrumentation-aws-sdk): dedupe prop fields

Co-authored-by: Kent Quirk <kentquirk@users.noreply.github.com>
Co-authored-by: Purvi Kanal <pkanal@users.noreply.github.com>
Co-authored-by: Mike Goldsmith <mikegoldsmith@users.noreply.github.com>

* fix linting

* add test for message attributes

Co-authored-by: Purvi Kanal <pkanal@users.noreply.github.com>

* test both messages

* deduplicate message attribute names in seperate function and add unit tests

Co-authored-by: Jamie Danielson <JamieDanielson@users.noreply.github.com>

* remove line to force ci checks

* appease linter

Co-authored-by: Purvi Kanal <pkanal@users.noreply.github.com>

* appease linter

* move logic into dedupe function and add unit tests

* rename helper function to be more descriptive

Co-authored-by: Kent Quirk <kentquirk@users.noreply.github.com>
Co-authored-by: Purvi Kanal <pkanal@users.noreply.github.com>
Co-authored-by: Mike Goldsmith <mikegoldsmith@users.noreply.github.com>
Co-authored-by: Mike Goldsmth <goldsmith.mike@gmail.com>
Co-authored-by: Purvi Kanal <purvikanal@honeycomb.io>
Co-authored-by: Jamie Danielson <JamieDanielson@users.noreply.github.com>
Co-authored-by: Purvi Kanal <kanal.purvi@gmail.com>
@dyladan dyladan mentioned this pull request Jan 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SQS ReceiveMessages crashes with 413 REQUEST ENTITY TOO LARGE
7 participants