Implement extra resources for test actions #13996

scele · 2021-09-15T09:49:41Z

Add support for user-specified resource types in the resource manager.
This generalizes the CPU, RAM and "test count" resource support for
other resource types such as the number of GPUs, available GPU memory,
the number of embedded devices connected to the host, etc.

The available amount of extra resources can be specified using the new
--local_extra_resources=<resourcename>=<amount> command line flag, which
is analogous to the existing --local_cpu_resources and
--local_ram_resources flags.

Tests can then declare the amount of extra resources they need by using
a resources:<resourcename>:<amount> tag.
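
As a rough illustration of the tag format described above, the following sketch collects resources:<resourcename>:<amount> tags into a map. It is not the parsing code from this PR: the class name ExtraResourceTags and its parse method are hypothetical, and it assumes that fractional amounts and zero are acceptable.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not the parsing code from this PR. */
final class ExtraResourceTags {
  private static final String PREFIX = "resources:";

  /** Collects every "resources:<name>:<amount>" tag into a name -> amount map. */
  static Map<String, Float> parse(List<String> tags) {
    Map<String, Float> result = new LinkedHashMap<>();
    for (String tag : tags) {
      if (!tag.startsWith(PREFIX)) {
        continue; // Unrelated tag, for example "manual".
      }
      String rest = tag.substring(PREFIX.length());
      int sep = rest.lastIndexOf(':');
      if (sep <= 0 || sep == rest.length() - 1) {
        throw new IllegalArgumentException("Malformed tag: " + tag);
      }
      float amount;
      try {
        amount = Float.parseFloat(rest.substring(sep + 1));
      } catch (NumberFormatException e) {
        throw new IllegalArgumentException("Amount is not a number: " + tag, e);
      }
      // Assumption: zero is allowed, negative and NaN amounts are not.
      if (!(amount >= 0)) {
        throw new IllegalArgumentException("Amount can't be negative: " + tag);
      }
      result.put(rest.substring(0, sep), amount);
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(parse(List.of("manual", "resources:gpu:2", "resources:gpu_memory:0.5")));
  }
}
```

A test tagged this way would only be scheduled while enough of each named resource remains, given a matching --local_extra_resources=<resourcename>=<amount> value on the command line.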

scele · 2021-09-15T09:49:51Z

Picking up #11963.

wilwell

Generally the code looks good!
But I've suggested some changes to the constructors and naming.

wilwell · 2021-09-16T12:02:02Z

src/main/java/com/google/devtools/build/lib/actions/ExecutionRequirements.java

+  /** How many extra resources an action requires for execution. */
+  public static final ParseableRequirement RESOURCES =
+      ParseableRequirement.create(
+          "resources:<str>:<int>",


Why are we using int instead of float? In ResourceSet we use ImmutableMap<String, Float> extraResourceUsage for this purpose. Also, resource amounts can be fractional, e.g. cpu.

Floats feel like they'd be more flexible; I can make the change in #16785.

wilwell · 2021-09-16T12:03:08Z

src/main/java/com/google/devtools/build/lib/actions/ExecutionRequirements.java

+            }
+
+            if (value < 1) {
+              return "can't be zero or negative";


Why can't some of these values be zero?

Unlike CPUs (which are required to orchestrate tests), GPU- or embedded-target-based tests will likely sit in test suites alongside unit tests that don't require the newly defined resources. This should be changed to accept values of 0.


wilwell · 2021-09-16T12:06:31Z

src/main/java/com/google/devtools/build/lib/actions/ResourceSet.java

@@ -43,12 +45,24 @@
  /** The number of CPUs, or fractions thereof. */
  private final double cpuUsage;

+  /** Map of extra resources mapping name of the resource to a value. */
+  private final ImmutableMap<String, Float> extraResourceUsage;


Could you please mention what types of resources this map can hold?

wilwell · 2021-09-16T12:12:19Z

src/main/java/com/google/devtools/build/lib/analysis/test/TestTargetProperties.java

-  private static final ResourceSet MEDIUM_RESOURCES = ResourceSet.create(100, 1, 1);
-  private static final ResourceSet LARGE_RESOURCES = ResourceSet.create(300, 1, 1);
-  private static final ResourceSet ENORMOUS_RESOURCES = ResourceSet.create(800, 1, 1);
+  private static final ResourceSet SMALL_RESOURCES = ResourceSet.create(20, 1, null, 1);


We can omit the null argument here. We have a constructor that takes only 3 fields.

wilwell · 2021-09-16T12:16:37Z

src/main/java/com/google/devtools/build/lib/actions/ResourceManager.java

@@ -422,6 +477,6 @@ synchronized int getWaitCount() {

  @VisibleForTesting
  synchronized boolean isAvailable(double ram, double cpu, int localTestCount) {
-    return areResourcesAvailable(ResourceSet.create(ram, cpu, localTestCount));
+    return areResourcesAvailable(ResourceSet.create(ram, cpu, null, localTestCount));


You can omit the null here as well.

wilwell · 2021-09-20T13:14:15Z

src/main/java/com/google/devtools/build/lib/actions/ResourceSet.java

+    this(memoryMb, cpuUsage, null, localTestCount);
+  }
+
+  private ResourceSet(double memoryMb, double cpuUsage, ImmutableMap<String, Float> extraResourceUsage, int localTestCount) {
    this.memoryMb = memoryMb;
    this.cpuUsage = cpuUsage;
+    if (extraResourceUsage == null) {
+        this.extraResourceUsage = ImmutableMap.of();
+    } else {
+        this.extraResourceUsage = extraResourceUsage;
+    }


To me these null checks look overcomplicated, and it's better to avoid nulls in code. So I suggest marking extraResourceUsage in the second constructor as @NonNull and using ImmutableMap.of() in the first constructor.
Also, it's better to use the 3-parameter constructor in other code where possible.

That seems to work with a couple of caveats:

I had to add new dependencies in

Using the @NonNull annotation in ResourceSet.create resulted in //src/test/shell/bazel:bazel_bootstrap_distfile_tar_test test failures with the error: No generator for: (@io.reactivex.rxjava3.annotations.NonNull :: com.google.common.collect.ImmutableMap<java.lang.String,java.lang.Float>) (see CI job here). I removed @NonNull from that function and kept it only in the constructor. Do you think that's OK?
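
The null-free constructor chaining suggested in this thread could look roughly like the sketch below. It is a simplified stand-in rather than the real ResourceSet: the class name SimpleResourceSet is hypothetical, java.util.Map replaces Guava's ImmutableMap so the example has no dependencies, and Objects.requireNonNull stands in for the @NonNull annotation.

```java
import java.util.Map;
import java.util.Objects;

/** Simplified, hypothetical stand-in for ResourceSet. */
final class SimpleResourceSet {
  private final double memoryMb;
  private final double cpuUsage;
  private final Map<String, Float> extraResourceUsage;
  private final int localTestCount;

  /** Three-argument form: "no extra resources" is an empty map, never null. */
  SimpleResourceSet(double memoryMb, double cpuUsage, int localTestCount) {
    this(memoryMb, cpuUsage, Map.of(), localTestCount);
  }

  /** Four-argument form: rejects null up front instead of silently replacing it. */
  SimpleResourceSet(
      double memoryMb, double cpuUsage, Map<String, Float> extraResourceUsage, int localTestCount) {
    this.memoryMb = memoryMb;
    this.cpuUsage = cpuUsage;
    this.extraResourceUsage =
        Map.copyOf(Objects.requireNonNull(extraResourceUsage, "extraResourceUsage"));
    this.localTestCount = localTestCount;
  }

  Map<String, Float> getExtraResourceUsage() {
    return extraResourceUsage;
  }

  /** With null ruled out, the emptiness check needs no null branch. */
  boolean isEmpty() {
    return memoryMb == 0 && cpuUsage == 0 && extraResourceUsage.isEmpty() && localTestCount == 0;
  }

  public static void main(String[] args) {
    System.out.println(new SimpleResourceSet(100, 1, 1).getExtraResourceUsage());
    System.out.println(new SimpleResourceSet(0, 0, Map.of("gpu", 1f), 0).isEmpty());
  }
}
```

Callers that have no extra resources use the three-argument form, so null never has to appear at a call site.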

wilwell · 2021-09-20T13:14:43Z

src/main/java/com/google/devtools/build/lib/actions/ResourceSet.java

-      double memoryMb, double cpuUsage, int localTestCount) {
-    if (memoryMb == 0 && cpuUsage == 0 && localTestCount == 0) {
+      double memoryMb, double cpuUsage, ImmutableMap<String, Float> extraResourceUsage, int localTestCount) {
+    if (memoryMb == 0 && cpuUsage == 0 && (extraResourceUsage == null || extraResourceUsage.size() == 0) && localTestCount == 0) {


Remove the null check.

wilwell · 2021-09-20T13:17:22Z

src/main/java/com/google/devtools/build/lib/exec/ExecutionOptions.java

+      help =
+          "Set the number of extra resources available to Bazel. "
+            + "Takes in a string-float pair. Can be used multiple times to specify multiple "
+            + "types of extra resources. Bazel will limit concurrently running test actions "


You mentioned test actions here. If this parameter is only for tests, it would be better to include "test" in its name, e.g. local_test_extra_resources. The same applies to the name of this field in the classes.


wilwell · 2021-09-20T13:20:17Z

src/main/java/com/google/devtools/build/lib/actions/ResourceManager.java

+    for (Map.Entry<String, Float> resource : resources.getExtraResourceUsage().entrySet()) {
+      String key = (String)resource.getKey();
+      float used = (float)usedExtraResources.getOrDefault(key, 0f);
+      float requested = resource.getValue();
+      float available = (float)availableResources.getExtraResourceUsage().getOrDefault(key, 0f);
+      float epsilon = 0.0001f;  // Account for possible rounding errors.
+      if (requested != 0.0 && used != 0.0 && requested + used > available + epsilon) {
+        extraResourcesIsAvailable = false;
+        break;
+      }
+    }


Maybe it's better to move this code into a separate function to make it more readable.
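
One possible shape for such a helper is sketched below. The class and method names are hypothetical, and plain java.util.Map arguments replace the ResourceManager fields; the condition itself is copied from the diff rather than redesigned.

```java
import java.util.Map;

/** Hypothetical extraction of the loop above into a helper method. */
final class ExtraResourceCheck {
  // Tolerance for float rounding errors; same value as in the diff.
  private static final float EPSILON = 0.0001f;

  static boolean areExtraResourcesAvailable(
      Map<String, Float> requested, Map<String, Float> used, Map<String, Float> available) {
    for (Map.Entry<String, Float> resource : requested.entrySet()) {
      String key = resource.getKey();
      float wanted = resource.getValue();
      float inUse = used.getOrDefault(key, 0f);
      float total = available.getOrDefault(key, 0f);
      // Same condition as the diff: a request is refused only when some of the
      // resource is already in use and adding the request would exceed the total.
      if (wanted != 0.0 && inUse != 0.0 && wanted + inUse > total + EPSILON) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    Map<String, Float> available = Map.of("gpu", 2f);
    System.out.println(areExtraResourcesAvailable(Map.of("gpu", 1f), Map.of("gpu", 1f), available));
    System.out.println(areExtraResourcesAvailable(Map.of("gpu", 2f), Map.of("gpu", 1f), available));
  }
}
```

Because the condition is kept as written, a request larger than the configured total is still granted when none of that resource is currently in use.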

scele

Thanks for the review @wilwell! Replied to some of the comments, will address the rest of them in the next patch I push.

scele · 2021-09-21T10:33:38Z

src/main/java/com/google/devtools/build/lib/actions/ExecutionRequirements.java

+  /** How many extra resources an action requires for execution. */
+  public static final ParseableRequirement RESOURCES =
+      ParseableRequirement.create(
+          "resources:<str>:<int>",


This is just to match the cpu:<int> code above. Looks like there is some inconsistency around what can be a float and what can be an int:

Execution requirements: src/main/java/com/google/devtools/build/lib/actions/ExecutionRequirements.java parses int.

Test tags: src/main/java/com/google/devtools/build/lib/analysis/test/TestTargetProperties.java parses Float.

Command line flags: src/main/java/com/google/devtools/build/lib/util/ResourceConverter.java parses int or a Float in the expression multiplier:

  /** Description of the accepted inputs to the converter. */
  public static final String FLAG_SYNTAX =
      "an integer, or a keyword (\"auto\", \"HOST_CPUS\", \"HOST_RAM\"), optionally followed by "
          + "an operation ([-|*]<float>) eg. \"auto\", \"HOST_CPUS*.5\"";

In its current form, this change matches the existing code: integers in execution requirements, floats in test tags, integers on the command line, and floats in the resource manager code. But I'm open to suggestions if we want to use slightly different conventions than those used for CPU resources.

scele · 2021-09-21T10:34:48Z

src/main/java/com/google/devtools/build/lib/actions/ExecutionRequirements.java

+            }
+
+            if (value < 1) {
+              return "can't be zero or negative";


This is also just to match the cpu:<int> parsing logic above.

scele · 2021-09-21T10:38:01Z

src/main/java/com/google/devtools/build/lib/exec/ExecutionOptions.java

+      help =
+          "Set the number of extra resources available to Bazel. "
+            + "Takes in a string-float pair. Can be used multiple times to specify multiple "
+            + "types of extra resources. Bazel will limit concurrently running test actions "


Hmm, I guess the implementation should not be restricted to tests, although I have not verified that it works for non-test actions as well. Let me investigate this a bit more. It would be nice to match the behavior and naming of --local_cpu_resources and --local_ram_resources, which also apply to both test and build actions.


joeljeske · 2022-03-14T14:32:56Z

@scele thanks for your effort in getting this ready! I would be very excited to see this land; I see a lot of broad value in having full support for declaring resources needed for tests.

Do you plan to pick this back up and get it green?

wilwell

Please sync with the latest changes from the repo. These files have changed a lot over the last month.

wilwell · 2022-06-28T11:26:42Z

src/main/java/com/google/devtools/build/lib/actions/ResourceManager.java

@@ -137,6 +141,10 @@ public static ResourceManager instance() {
  // definition in the ResourceSet class.
  private double usedRam;

+  // Used amount of extra resources. Corresponds to the extra resource
+  // definition in the ResourceSet class.
+  private Map<String, Float> usedExtraResources;


Could you please sync with the latest changes from the repository? This file has changed significantly over the last month.

wilwell · 2022-06-28T11:30:35Z

src/main/java/com/google/devtools/build/lib/exec/ExecutionOptions.java

+      help =
+          "Set the number of extra resources available to Bazel. "
+            + "Takes in a string-float pair. Can be used multiple times to specify multiple "
+            + "types of extra resources. Bazel will limit concurrently running test actions "


Yeah, these resources apply not only to test actions. Could you then please remove "test" from the description, or mention tests only as an example?

wilwell · 2022-06-28T11:31:56Z

src/main/java/com/google/devtools/build/lib/actions/BUILD

@@ -124,6 +124,7 @@ java_library(
        "//third_party:flogger",
        "//third_party:guava",
        "//third_party:jsr305",
+        "//third_party:rxjava3",


Why do we need this library?

wilwell · 2022-06-28T11:34:07Z

src/main/java/com/google/devtools/build/lib/actions/ExecutionRequirements.java

+            }
+
+            if (value < 1) {
+              return "can't be zero or negative";


Yeah, I agree that we could accept zero values
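
With zero accepted, the amount check could look roughly like this. It is a sketch, not the validator from the PR: the class name ResourceAmountValidator is hypothetical, the "error message, or null when valid" convention is assumed from the snippet above, and parsing as a float rather than an int is an assumption as well.

```java
/** Hypothetical validator for the amount part of a resources:<str>:<amount> requirement. */
final class ResourceAmountValidator {

  /** Returns an error message, or null if the amount is acceptable. */
  static String validate(String amount) {
    float value;
    try {
      value = Float.parseFloat(amount);
    } catch (NumberFormatException e) {
      return "can't be parsed as a number";
    }
    if (Float.isNaN(value) || Float.isInfinite(value)) {
      return "must be a finite number";
    }
    if (value < 0) {
      return "can't be negative"; // Zero is allowed, unlike in the cpu:<int> check.
    }
    return null;
  }

  public static void main(String[] args) {
    System.out.println(validate("0"));
    System.out.println(validate("-1"));
  }
}
```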


wilwell · 2022-06-28T11:38:20Z

src/main/java/com/google/devtools/build/lib/analysis/test/TestTargetProperties.java

+    return ImmutableMap.copyOf(extraResources);
+  }
+
+  public ResourceSet getLocalResourceUsage(Label label, boolean usingLocalTestJobs)


Could you please rearrange the methods to make the diff between the two versions clearer?

sgowroji · 2022-07-04T06:25:43Z

Hello @scele, can you please resolve the conflicts and address the comments above for the requested changes? Thank you!

sgowroji · 2022-07-11T09:28:52Z

We are marking this PR as stale because it has not had any recent activity for many days. It will be closed in 7 days if no further activity occurs. Thank you.

scele · 2022-07-15T06:10:35Z

Yeah, sorry, I'm on vacation now and can't really commit to any finite date to push this to the finish line. So closing this PR for now.

Addressed comments on bazelbuild#13996 Signed-off-by: Drew Macrae <drewmacrae@google.com>

Addressed comments on bazelbuild#13996 Fixed issues in tests and built and tested with lowRISC/opentitan#16436 Signed-off-by: Drew Macrae <drewmacrae@google.com>

This recreates a [closed PR](#13996) to implement extra resources which we're hoping to use in lowRISC/opentitan#16436 Fixes:#16817 Closes #16785. PiperOrigin-RevId: 498557024 Change-Id: I60d8f8f4a4a02748147cabb4cd60a2a9b95a2c68


google-cla bot added the cla: yes label Sep 15, 2021

meisterT requested a review from wilwell September 15, 2021 13:46

wilwell suggested changes Sep 20, 2021


scele commented Sep 21, 2021


scele force-pushed the extra_resources branch 2 times, most recently from 736c7a1 to 4478fca Compare September 22, 2021 13:41

scele force-pushed the extra_resources branch from 4478fca to 195fcd2 Compare September 22, 2021 14:37

philwo force-pushed the master branch from 26cb401 to 168b89b Compare December 2, 2021 18:07

sgowroji mentioned this pull request Apr 22, 2022

Implement extra resources for test actions #11963

Closed

wilwell suggested changes Jun 28, 2022


sgowroji added the team-Local-Exec, awaiting-review, and awaiting-user-response labels and removed the awaiting-review label Jun 28, 2022

scele closed this Jul 15, 2022

drewmacrae mentioned this pull request Nov 17, 2022

Extra resources #16785

Closed

drewmacrae pushed a commit to drewmacrae/bazel that referenced this pull request Nov 17, 2022

fixup Implement extra resources for test actions

7ab9df6

Addressed comments on bazelbuild#13996 Signed-off-by: Drew Macrae <drewmacrae@google.com>

drewmacrae mentioned this pull request Jan 17, 2023

Extra resources #17229

Merged

drewmacrae · Jun 21, 2022 (edited)
