Skip to content
This repository has been archived by the owner on Dec 13, 2023. It is now read-only.

Fix rerunWorkflow places synchronous system tasks in the queue #2494

Merged
merged 3 commits into from
Oct 6, 2021

Conversation

jxu-nflx
Copy link
Contributor

Pull Request type

  • Bugfix
  • Feature
  • Refactoring (no functional changes, no api changes)
  • Build related changes
  • Other (please describe):

Changes in this PR

When rerun from task, if it's a synchronous system task, don't put it in the queue, start directly instead.

Describe the new behavior from this PR, and why it's needed
Issue #MWI-3456: rerunWorkflow places synchronous system tasks in the queue

Alternatives considered

Describe alternative implementation you have considered

@jxu-nflx jxu-nflx requested a review from aravindanr September 28, 2021 22:28
@github-actions
Copy link
Contributor

github-actions bot commented Sep 28, 2021

Unit Test Results

   154 files  ±0     154 suites  ±0   8m 17s ⏱️ ±0s
1 132 tests ±0  1 132 ✔️ ±0  0 💤 ±0  0 ±0 

Results for commit ef6b65d. ± Comparison against base commit ef6b65d.

♻️ This comment has been updated with latest results.

@@ -58,7 +59,7 @@ public JsonJqTransform(ObjectMapper objectMapper) {
@Override
public void start(Workflow workflow, Task task, WorkflowExecutor executor) {
final Map<String, Object> taskInput = task.getInputData();
final Map<String, Object> taskOutput = task.getOutputData();
final Map<String, Object> taskOutput = task.getOutputData() != null ? task.getOutputData(): new HashMap<>();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getOutputData() does not return null. Can you please explain the need for a null check here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The outputData field is reset to null here: https://github.com/Netflix/conductor/blob/main/core/src/main/java/com/netflix/conductor/core/execution/WorkflowExecutor.java#L1642 I guess we could update that line to just clear the map instead of setting it null, but feel there might be other cases that I might neglected if I changed that. So I added the null check for outputData in the system task instead. And that's the only sync system task that need to add the null check

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes sense to clear the outputData or set it to an empty HashMap during rerun. If you look at the Task, the outputData field is defaulted to an empty map. So it's natural for the rest of the code to expect a non-null value from Task.getOutputData(). DoWhile also expects a non-null output data and would fail if null is returned.

Also, the fewer nulls we introduce, the less chance of an unexpected NullPointerException.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ya updated it to clear the outputData instead of setting to null

@@ -1324,6 +1340,59 @@ public void testRerunWorkflowWithTaskId() {
assertEquals(new HashSet<>(), workflow.getFailedReferenceTaskNames());
}

@Test
public void testRerunWorkflowWithSyncSystemTaskId() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏼

tasks.size() == 1
tasks[0].status == Task.Status.FAILED
tasks[0].taskType == 'JSON_JQ_TRANSFORM'
tasks[0].reasonForIncompletion as String == "Cannot index string with string \"array\""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The as String may not be necessary. Also, using single quotes ' ensures that Groovy uses a String and not a GString.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, updated!

tasks.size() == 1
tasks[0].status == Task.Status.COMPLETED
tasks[0].taskType == 'JSON_JQ_TRANSFORM'
tasks[0].outputData as String == "[result:[out:[a, b, c, d]], resultList:[[out:[a, b, c, d]]]]"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated as well, the as String is necessary for outputData though as it's a map, so kept it, but used single quotes for the match string instead

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed that it's a map here. In that case, it makes sense to assert the keys of the map rather than converting it to s String and asserting that. The assertion will fail, if the toString() logic in Map changes, its unlikely, but possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can rely on Java's implementation of toSTring() of the Map. It's kind of cumbersome to write a loop through the map to assert each key/value pair. What do you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verifying the contents of a Map using its toString() is not ideal. It's also harder to understand and refactor. The goal of the test is not to verify if jackson-jq is doing its work, so you could

  1. write a simpler jq expression that is easy to verify or
  2. remove the outputData assertion altogether or
  3. assert that output.result != null and output.resultList != null

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense, didn't realize the outputData result is JsonNode. yea, the purpose of the test isn't for those kind of checks. Refactored to only do basic assertions.

- clear task output on rerun instead of set it to null
- refactor outputData checks for JsonJQTransformSpec
@aravindanr aravindanr merged commit ef6b65d into main Oct 6, 2021
@aravindanr aravindanr deleted the jxu-fix-rerunWithSynSystemTask branch October 6, 2021 01:10
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants