Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat!: rework package to facilitate worker agent integ tests #6

Merged
merged 5 commits into from
Sep 5, 2023

Conversation

jericht
Copy link
Contributor

@jericht jericht commented Aug 31, 2023

What was the problem/requirement? (What/Why)

While working on writing integration tests for deadline-cloud-worker-agent, I realized there could be some improvements to our test library to make it easier to write tests

What was the solution? (How)

  • Rework the entire package. See detailed changes below.
  • Also fixed the add-copyright-headers.sh script and made the copyright header test run by default

What is the impact of this change?

It's a little easier to use this package for integration tests (specifically for Worker agent)

How was this change tested?

Was this change documented?

Yes, added to README.md and an example configuration file

Is this a breaking change?

Yes

Changes

  • cf_templates/: Replaced by code in cloudformation/
  • cloudformation/: Models CloudFormation templates and handles deployment
    • cfn.py: Core model for CloudFormation resources
    • job_attachments_bootstrap_stack.py: Replacement for old CFN template under cf_templates/
    • worker_bootstrap_stack.py: Stack deploying bootstrap resources for Worker agent deployment
  • deadline/: Models Deadline Cloud resources. Currently, this includes:
    • client.py: The existing shim layer we had for the boto client to handle API changes
    • stubs.py: The existing stub for the boto client that we use in some unit tests
    • resources.py: Models Deadline Cloud resources (e.g. Farm, Queue, Fleet, Job, etc.)
    • worker.py: Handles deploying a Worker agent.
      • There are now two types of Worker deployments:
        • The EC2 implementation is similar to our original fixture but has been changed to use the Worker agent installer to do most of the configuration
        • Added a Docker container implementation so integ tests can be run on a local Docker container
          • Note that this mode does not support installing the systemd service, because that's not recommended in Docker containers. See Worker agent PR for addition of --no-install-service option to its installer
      • Added ability to copy over files from host machine to Worker agent environment (EC2 instance or Docker container). Currently, this provides a generic way to facilitate:
        • Using a dev build of the Worker agent
        • Using a local Deadline Cloud model file
  • fixtures.py: The number of fixtures has been reduced since the functionality has moved into objects that are returned by the fixtures. Currently, we have:
    • deadline_client: Our shim layer boto client that is configurable with DEADLINE_ENDPOINT env var
    • codeartifact: Stores info about the CodeArtifact repo to pull Python deps from based on env vars. Currently only used for Worker.
    • service_model_s3_object: Allows specifying the model file to use which is stored in S3 via DEADLINE_SERVICE_MODEL_S3_URI. Currently only used for Worker.
    • bootstrap_resources: Deploys the CloudFormation stack for bootstrapping Worker resources. Only used for Worker.
    • deadline_resources: Deploys an opinionated set of Deadline Cloud resources geared towards the Worker.
    • worker: Deploys a Worker.
    • deploy_job_attachment_resources: The existing fixture used by Job Attachments integ tests
  • job_attachment_manager: Changed to use new deadline/ and cloudformation/ code.
    • I tried to keep the public interface as similar as I could to the existing interface to make it easy to upgrade the Job Attachments test to this new version. Hopefully it's just a few renames
  • models.py: Contains common dataclasses. Open to suggestions for the module name.
  • pytest_hooks.py: Defines pytest hooks to enhance live logging for integ tests
  • deadline_manager.py: Replaced by code in deadline/
  • constants.py:
    • Constants were moved closer to where they are needed and removed the need to share them
    • Env var configurations were moved to fixtures.py
  • util.py: Changed to contain utility functions used across the package
  • test/unit: Updated/added unit tests. Added moto as a test dependency to make testing easier.
    • Given the size of this PR, I recommend quickly glazing over the unit tests

@jericht jericht marked this pull request as ready for review August 31, 2023 04:09
@jericht jericht requested a review from a team as a code owner August 31, 2023 04:09
@marofke marofke self-requested a review August 31, 2023 15:16
@moorec-aws moorec-aws self-requested a review August 31, 2023 15:58
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just want to comment on some thoughts/ideas that I had while modifying these fixtures.

Currently, these fixtures are highly opinionated towards Worker agent integration tests. e.g. the deadline_resources fixture always deploys a Farm + Queue + Fleet setup. While we could reuse this fixture for JobAttachments integ tests, it doesn't feel right to be deploying a Fleet + QFA when we don't need it for those tests.

Similarly, to maintain backwards compatibility with JA integ tests, I've kept the existing deploy_job_attachment_resources fixture and the related JobAttachmentManager class. These are also specifically for Job Attachments.

So.. that got me wondering if we actually want all these fixtures to be defined in this package. Currently, basically all fixtures aside from deploy_job_attachment_resources will only be used by Worker tests, while deploy_job_attachment_resources will only be used by Job Attachments. Maybe it makes more sense to move these fixtures into their respective packages instead, because they are so specific.

The fixtures that I think do make sense to keep here need to be common/generic enough that multiple packages can reuse them in their own fixture implementations.

e.g. when we get around to adding deadline-cloud integration tests, it probably makes sense to provide a default Farm and Queue fixture so we can test out job submission.. and building on top of that, the worker fixture might also make sense to keep in here for when we eventually do full E2E tests in the future (DCC -> deadline-cloud CLI -> service -> Worker agent),

Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, we can keep it as is since it's pretty early in the game to be making long-term organizational decisions like this, but just wanted to mention it

Copy link
Contributor

@ddneilson ddneilson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow. This is a lot of code, and very well done. You've basically created our own CDK-lite

It definitely needs more commenting, and a DEVELOPER.md to help navigate it. We can do those as fast-follows.

@@ -1,30 +1,101 @@
## Deadline Test Scaffolding
# Deadline Cloud Test Fixtures
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For folk to be able to use this it would be ideal to have the minimal IAM Policy that one needs to use the fixtures (i.e. deploy the Cfn stacks) in the README.

Figuring that out can be some serious whack-a-mole, so let's not hold up this PR to gather that, but we will need to figure out that minimal policy when integrating this in to our integration test runners.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, I've got a backlog ticket to address this

self.worker_role = Role(
self,
"WorkerRole",
role_name="DeadlineScaffoldingWorkerRole",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A hard-coded name like this will mean that an account can only ever have a single instance of the integration test scaffold running/deployed at a time. What do you think about adding a random or time element to the name to make it possible to have multiple going at once?

Ditto for other roles.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. In it's current state, this infrastructure is only designed to be deployed once across all tests (the CfnStack.deploy method will try to update an existing stack name, and our fixtures are session scoped). I think for now, this is sufficient since it should cover most of our test cases for the Worker agent.

For the test cases where we need a misconfigured AWS resource like an IAM role, which would be the value of having multiple stack deployments, we can either define another stack with the misconfigured resources, or change this one to be configurable.

I've posted a comment that's somewhat related to this: #6 (comment). I think that when we do need the misconfigured AWS resources stack, it should belong in the Worker agent repo, since other repos most likely don't care about a Worker that doesn't work

"Effect": "Allow",
"Action": [
"deadline:CreateWorker",
"deadline:GetWorkerIamCredentials",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetWorkerIamCredentials does not exist. This can be removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, removed

src/deadline_test_fixtures/example_config.sh Show resolved Hide resolved
src/deadline_test_fixtures/fixtures.py Show resolved Hide resolved
@jericht jericht merged commit b50b7d2 into mainline Sep 5, 2023
5 checks passed
@moorec-aws moorec-aws deleted the jericht/rework branch June 3, 2024 17:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants