Task param space for files in a folder #48

Open
ddneilson opened this issue Sep 19, 2024 · 1 comment

@ddneilson
Contributor

Copied over from aws-deadline/deadline-cloud#447
Original submitter: https://github.com/Mathieson


For my project, each task needs to process a subset of the strings in a larger string array. This is not currently supported, but I've found a workaround that I am pretty happy with, and official support to streamline it would be fantastic. Here is what I've done.

I've created a folder dedicated to hosting many text files. Each text file contains a slice of the list of strings I want processed. For example...

  • my_job_bundle
    • my_task_values
      • 0.txt
      • 1.txt
      • 2.txt

In my template.yaml, I've defined a top-level job parameter, MyTaskCount, which determines how many of these files get processed.

  - name: MyTaskCount
    type: INT

My parameter space defines a simple range...

    parameterSpace:
      taskParameterDefinitions:
        - name: TaskFileIndex
          type: INT
          range: "1-{{Param.MyTaskCount}}"

Then, in my script, I use Task.Param.TaskFileIndex to pull the appropriate file from that directory for the task, read its contents, and continue processing as normal.
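
As a rough illustration of that lookup step, here is a minimal Python sketch. The function name `read_task_values` and the `first_index` argument are hypothetical and not part of the original script. Because the range above starts at 1 while the example files start at 0.txt, the sketch assumes the lowest index maps to 0.txt; adjust `first_index` if your numbering differs.

```python
from pathlib import Path


def read_task_values(task_dir: Path, task_file_index: int, first_index: int = 1) -> list[str]:
    """Return the non-empty lines of the text file assigned to one task.

    Assumption: files are named 0.txt, 1.txt, ... and the task parameter
    range starts at `first_index`, so that value maps to 0.txt.
    """
    file_number = task_file_index - first_index
    task_file = Path(task_dir) / f"{file_number}.txt"
    lines = task_file.read_text(encoding="utf-8").splitlines()
    # Skip blank lines so a trailing newline does not produce an empty value.
    return [line.strip() for line in lines if line.strip()]


# Hypothetical use inside the step script, where the index would come from
# {{Task.Param.TaskFileIndex}} passed on the command line:
#   values = read_task_values(Path("my_task_values"), int(sys.argv[1]))
```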

I currently use Deadline Cloud with the Python submitter, so this workflow allows me to generate those task files dynamically in my submission script. This is extra awesome because it means I can enlist the help of more-itertools to create my ranges incredibly easily. As for a local OpenJD workflow, it is also easy to generate these files and keep them in my working area.
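
Generating the files at submission time might look something like the sketch below. It is not the original submission script: `write_task_files` and `chunked` are hypothetical names, and the stdlib helper is a stand-in for `more_itertools.chunked` so the example has no third-party dependency.

```python
from itertools import islice
from pathlib import Path
from typing import Iterable, Iterator


def chunked(items: Iterable[str], size: int) -> Iterator[list[str]]:
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while chunk := list(islice(iterator, size)):
        yield chunk


def write_task_files(values: Iterable[str], task_dir: Path, chunk_size: int) -> int:
    """Write one numbered text file per chunk and return the number of files."""
    task_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    for count, chunk in enumerate(chunked(values, chunk_size), start=1):
        # Files are numbered from 0 to match the 0.txt, 1.txt, ... layout.
        path = task_dir / f"{count - 1}.txt"
        path.write_text("\n".join(chunk) + "\n", encoding="utf-8")
    return count
```

The return value is the number of files written, which is what would be passed as MyTaskCount when submitting the job.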

My hope is that this approach will become streamlined through official support, and we might be able to specify something such as...

    parameterSpace:
      taskParameterDefinitions:
        - name: TaskFile
          type: DirectoryContents
          range: "{{Param.MyTaskDirectory}}"

It would give you a path for each file found within the specified directory, or even...

    parameterSpace:
      taskParameterDefinitions:
        - name: Whatever
          type: TaskFileContents
        - name: Another
          type: TaskFileContents
      combination: "(Whatever, Another)"

It could go so far as to give you the values found within said file and implicitly determine the file via a predictable folder structure matching up with the step name and taskParameter name. For example...

  • my_job_bundle
    • task_files
      • step_name
        • Whatever
          • 0.txt
          • 1.txt
          • 2.txt
        • Another
          • 0.txt
          • 1.txt
          • 2.txt
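
To make the folder convention concrete, here is a small sketch of how a task could resolve its files from the step name, the task parameter name, and a shared index. Nothing here is an existing OpenJD feature; `task_file_for` and `load_associated_values` are hypothetical helpers that assume the layout shown above.

```python
from pathlib import Path


def task_file_for(bundle_dir: Path, step_name: str, param_name: str, index: int) -> Path:
    """Build the conventional path task_files/<step>/<param>/<index>.txt."""
    return Path(bundle_dir) / "task_files" / step_name / param_name / f"{index}.txt"


def load_associated_values(
    bundle_dir: Path, step_name: str, param_names: list[str], index: int
) -> dict[str, list[str]]:
    """Read the same-numbered file for every parameter.

    Using one shared index for all parameters mirrors pairing the
    parameters up one-to-one rather than taking their cross product.
    """
    values = {}
    for name in param_names:
        path = task_file_for(bundle_dir, step_name, name, index)
        values[name] = path.read_text(encoding="utf-8").splitlines()
    return values
```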

Interested in hearing others' thoughts. Thanks!

@ddneilson
Contributor Author

This is a cool idea

It ties in with a "meta template" idea that I've been pondering. A constraint with the actual job template is that there can't be anything in it that defines job structure based on the contents of something local to the submitting workstation. The reason is that the job template is what you submit to a render management system, and it's the backend server of that system that needs to process the template to determine job structure; that backend doesn't have access to the submitting workstation's local environment.

So, the idea of a "meta template" is something that looks like a job template but can have directives in it to pull data from local sources. Then it needs to be "baked" into a job template to be submitted.
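
As a sketch of what "baking" could mean in practice, the function below expands the DirectoryContents type proposed earlier in this thread into an explicit list of paths on the submitting workstation. Everything here is hypothetical: `bake_template` is an illustrative name, the meta template is assumed to be already parsed into a dict, and only a plain `{{Param.Name}}` substitution is handled. It also assumes a PATH task parameter with a list-valued range is an acceptable output, and it ignores how the listed files would reach the workers.

```python
import copy
from pathlib import Path


def bake_template(meta_template: dict, job_params: dict[str, str]) -> dict:
    """Return a copy of the template with directory listings made explicit."""
    baked = copy.deepcopy(meta_template)  # leave the meta template untouched
    for step in baked.get("steps", []):
        space = step.get("parameterSpace", {})
        for definition in space.get("taskParameterDefinitions", []):
            if definition.get("type") != "DirectoryContents":
                continue
            # Resolve {{Param.Name}} references using the submitted values.
            directory = definition["range"]
            for name, value in job_params.items():
                directory = directory.replace("{{Param." + name + "}}", value)
            # Read the local directory now, at bake time, so the backend
            # never needs access to the workstation's file system.
            files = sorted(p for p in Path(directory).iterdir() if p.is_file())
            definition["type"] = "PATH"
            definition["range"] = [str(p) for p in files]
    return baked
```

The design point is that all local file system access happens before submission, so the result contains only literal values the backend can process on its own.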
