Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update argument parsing to remove dependency on yargs and make more robust #1019

Merged

Conversation

AndreKurait
Copy link
Member

@AndreKurait AndreKurait commented Sep 26, 2024

Description

Simplify CDK ExtraArg processing to remove yargs and make more robust.

Fixed places where parameter values which may have spaces weren't wrapped in quotes.

Removed cdk dependency on yargs

Fixed an issue where List arguments would split on , incorrectly e.g. --setHeader name "value1,value2"
Fixed an issue where ExtraArgs wouldn't correctly handle multi arity commands e.g. --setHeader name value

Fixed a bug in how user supplied plaintext password was being supplied to the replayer

  • Category: Bug Fix
  • Why these changes are required? Currently --setHeader in extraArgs is broken along with special characters in the values.
  • What is the old behavior before changes and new behavior after changes? CaptureProxy could crash on startup with otherwise valid arguments.

Issues Resolved

https://opensearch.atlassian.net/browse/MIGRATIONS-2025

Is this a backport? If so, please add backport PR # and/or commits #

Testing

Unit testing, AWS Testing

Check List

  • New functionality includes testing
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

…obust

Signed-off-by: Andre Kurait <akurait@amazon.com>
// Extract command prefix
const commandPrefix = baseCommand.substring(0, baseCommand.indexOf('--')).trim();
const baseArgs = baseCommand.substring(baseCommand.indexOf('--'));
export function parseArgsToDict(argString: string | undefined): Record<string, string[]> {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: We're not using the values (only keys) as of now (except the unit tests). I wanted to keep this since it's written, working, and tested

let replayerCommand = `/runJavaWithClasspath.sh org.opensearch.migrations.replay.TrafficReplayer ${osClusterEndpoint} --insecure --kafka-traffic-brokers ${brokerEndpoints} --kafka-traffic-topic logging-traffic-topic --kafka-traffic-group-id ${groupId}`
let command = `/runJavaWithClasspath.sh org.opensearch.migrations.replay.TrafficReplayer ${osClusterEndpoint}`
const extraArgsDict = parseArgsToDict(props.extraArgs)
command = appendArgIfNotInExtraArgs(command, extraArgsDict, "--insecure")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: In a followup, we need to figure out how a customer might exclude this, most likely through a cluster setting in cdk.context.json

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like that's the way to go - we exclude kafkaConnection in a similar manner in the code above.

Signed-off-by: Andre Kurait <akurait@amazon.com>
Copy link

codecov bot commented Sep 26, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 80.30%. Comparing base (3ec8dcd) to head (8d106a9).
Report is 12 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #1019      +/-   ##
============================================
+ Coverage     80.20%   80.30%   +0.10%     
- Complexity     2722     2727       +5     
============================================
  Files           365      366       +1     
  Lines         13603    13605       +2     
  Branches        941      941              
============================================
+ Hits          10910    10926      +16     
+ Misses         2118     2104      -14     
  Partials        575      575              
Flag Coverage Δ
gradle-test 78.34% <100.00%> (+0.12%) ⬆️
python-test 89.91% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

}
// Split based on '--' at the start of the string or preceded by whitespace, use non-capturing groups to include -- in parts
const parts = argString.split(/(?=\s--|^--)/).filter(Boolean);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worst case scenario here is that you could misparse something like

--setHeader from-client \"Search Agent --alpine --classic\"

and you'd then drop any arguments that already existed that were named '--alpine' or '--classic', right?

I'd like to see the map eventually supported as just a map of keys and values (possibly the empty string) - and all of it is just concatenated together w/ spaces between. That way, it's clear what keys are being used and what to strike.

Given that the code was even more problematic before, I'm fine w/ this being a follow up task.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be a change in the customer cdk.context.json to have extra args be a map and not a string?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's the worst case, something like --setHeader from-client \"Search Agent --destinationUri \"
An existing --classic in your case wouldn't be dropped because of the " character after

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct - to accept the value as a dictionary/map and not just as a string (I'd support both using instanceof). For later though.

let replayerCommand = `/runJavaWithClasspath.sh org.opensearch.migrations.replay.TrafficReplayer ${osClusterEndpoint} --insecure --kafka-traffic-brokers ${brokerEndpoints} --kafka-traffic-topic logging-traffic-topic --kafka-traffic-group-id ${groupId}`
let command = `/runJavaWithClasspath.sh org.opensearch.migrations.replay.TrafficReplayer ${osClusterEndpoint}`
const extraArgsDict = parseArgsToDict(props.extraArgs)
command = appendArgIfNotInExtraArgs(command, extraArgsDict, "--insecure")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like that's the way to go - we exclude kafkaConnection in a similar manner in the code above.

Comment on lines +94 to +96
if (props.clusterAuthDetails.basicAuth.password) {
console.warn("Password passed in plain text, this is insecure and will leave" +
"your password exposed.")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did we support this before? I don't see it in the old patch & don't want to add it shortly before we cut a major release.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's two places today we create plaintext secrets, as well as RFS supports it directly as an inline arg.

The experience without this PR/fix is that we concat the argument --auth-header-user-and-secret username undefined and the replayer fails on startup.

If you'd like instead we can throw an error here saying "Plaintext password in cdk.context.json with replayer is not supported" and have it be a build time exception.

https://github.com/search?q=repo%3Aopensearch-project%2Fopensearch-migrations+new+Secret%28this&type=code

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we have a jira to stop accepting plaintext passwords and to stop accepting them on the command line. With that, I'm fine moving forward. As it is, I consider these major security risks and we shouldn't let the user store secrets in so many places unencrypted.

As long as the secret is in a task definition, I suppose it doesn't matter if we put it here too - we just need to reverse this too when we lock down the other code

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Signed-off-by: Andre Kurait <akurait@amazon.com>
Copy link
Collaborator

@gregschohn gregschohn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you get a chance, can you please unresolve the comments that require future work (I've permalinked them from the ticket, so I don't think it's a big deal either way).

@AndreKurait AndreKurait merged commit 00f7ce3 into opensearch-project:main Sep 27, 2024
14 checks passed
@AndreKurait AndreKurait deleted the FixSetHeaderExtraArgs branch September 27, 2024 23:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants