Context distiller puts out no output when one resource fails #531
@flowerncsu After our discussion, I think there's a simple solution that could work for this scenario. One caveat is that I don't know how the beaker solution will be affected, because of the tooling created to clean up resources if a topology fails.

The solution would be to modify linchpin.conf to include a flag called something like `ignore_return_code = True` (set to False by default). For now, this flag would only be used with the Context Distiller, and would filter out any RunDB data which has a return_code != 0. In this way, only successful instances (aws_ec2 and os_server, specifically) would be distilled. Additional testing may be required for beaker, for the reasons stated above.

What do you think? Would you be willing to test out the beaker bits once they're written and verify that they work the way you need?
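A minimal sketch of how the proposed flag might behave, assuming the Context Distiller iterates over RunDB records that each carry a `return_code` field. The record layout, field names, and `distill_records` helper below are hypothetical, not LinchPin's actual RunDB or distiller API:

```python
# Hypothetical sketch of the proposed 'ignore_return_code' behaviour.
# The record structure and function name are illustrative only; they are
# not LinchPin's actual API.

# Flag as it might appear in linchpin.conf (assumed section name):
#   [lp]
#   ignore_return_code = True
IGNORE_RETURN_CODE = True


def distill_records(rundb_records, ignore_return_code=IGNORE_RETURN_CODE):
    """Return only the records the distiller should emit.

    When ignore_return_code is True, records whose provisioning run failed
    (return_code != 0) are filtered out, so linchpin.distilled contains
    data only for successfully provisioned resources.
    """
    if not ignore_return_code:
        # Behaviour described in this issue: any failure means nothing is distilled.
        if any(rec.get("return_code", 1) != 0 for rec in rundb_records):
            return []
        return rundb_records
    # Proposed behaviour: keep only the successful records.
    return [rec for rec in rundb_records if rec.get("return_code", 1) == 0]


if __name__ == "__main__":
    sample = [
        {"target": "topology-0", "resource_type": "aws_ec2", "return_code": 0},
        {"target": "fail_aws", "resource_type": "aws_ec2", "return_code": 2},
    ]
    print(distill_records(sample))  # only the topology-0 record survives
```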
It seems like getting info for whatever context exists should really be the default behavior. If a flag is needed, I would think it should work in the opposite direction (i.e., a flag to not distill output if there are any failures); I can't think of a reason someone would want to discard potential output just because pieces of it are unavailable.

Also, I'd like to discuss the beaker issue more. I definitely would not have expected the system to tear down successful builds just because an unsuccessful build happened. If I encountered that behavior, I would very definitely think it was a bug, and if that behavior is needed, it should also be a flag that can be set in the configuration, rather than the default.
@flowerncsu, in terms of the return_code discussion and whether we should output the data even if other targets fail, I don't disagree with you. I do have some concerns about how and where the line is drawn. One scenario I considered: what if two or more of the same resources are being provisioned and one of them fails? LinchPin currently fails that job, even if one of the two is successful, because it's done within one ansible run. Another scenario would be two different resources in a resource group within a single target. Or, even more involved, two separate resource groups, each with a single resource, within a single target. In each case, only one resource fails. If we can address these scenarios by saying that if anything in an individual target fails, the whole target fails, this may be rectified simply enough. If not, and the desire is for any successful resource data within any target to be recorded, it's a bit more challenging and will require more research and time.

For beaker resources, it's very similar to the above issue. I found the issue requesting the fix: it's #394, filed by @jaypoulz. I didn't implement the code to enable the recall of instances; that credit goes to @seandst with #468. It may be useful to point out that testing of these features may reveal something useful, but I have not been able to test more than what is in the current bkr-new.yml topology file. It may additionally be useful to have a discussion with them to see if there's a way to modify the behavior with options in the topology; I am not sure without some digging whether that exists. I'd like to hear your opinion on this issue.
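One way to frame the "where is the line drawn" question is per-target aggregation: a target counts as successful only if every resource in every one of its resource groups succeeded, and only such targets are distilled. A hedged sketch of that policy, with a data layout invented purely for illustration (not LinchPin's internal schema):

```python
# Hypothetical per-target aggregation: a target succeeds only if every
# resource in every one of its resource groups returned 0. The nested
# list layout below is illustrative, not LinchPin's actual RunDB schema.

def successful_targets(results):
    """results maps target name -> list of resource groups,
    each group being a list of per-resource return codes."""
    ok = []
    for target, groups in results.items():
        codes = [code for group in groups for code in group]
        if codes and all(code == 0 for code in codes):
            ok.append(target)
    return ok


if __name__ == "__main__":
    results = {
        # one group with two identical resources, one of which fails
        "dup-resource-target": [[0, 2]],
        # two groups with one resource each, both succeed
        "two-group-target": [[0], [0]],
    }
    print(successful_targets(results))  # -> ['two-group-target']
```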
It's sufficient for my current needs to simply include context for any successful target, and ignore (or mark as failed) any failed target. However, it may be worth further discussion to see whether mixed-state (partially passing) targets need to be tracked.
Two PinFiles were tested: … and …

`topology-0` successfully creates an aws instance. `fail_aws` is: …

When using the PinFile which includes `fail_aws`, the `linchpin.distilled` file is not created, regardless of whether there was a previous `linchpin.distilled`. When using the PinFile without `fail_aws`, `linchpin.distilled` is created if there is not one, and overwritten if there is (which is the desired behavior).