Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Infer Storage Classes by "specializing" SPIR-V modules "generic" over them. #414

Merged
merged 16 commits into from
Feb 22, 2021

Conversation

eddyb
Copy link
Contributor

@eddyb eddyb commented Feb 10, 2021

How It Works

  • OpTypePointers are always emitted by rustc_codegen_spirv with a placeholder Storage Class
    • the SPIR-V modules don't have to/can't pass validation with those placeholders in, but it has to be one of the existing Storage Classes, because we use the SPIR-V binary representation on-diisk and rspirv::dr in-memory
    • for now I've chosen Generic, which may be confusing, and also it has different semantics than what we're doing, but as long as we don't need the real Generic Storage Class, we're good
      • to alleviate some of that confusion, I'll use Placeholder below (maybe Unknown would be better?)
    • this applies to the pointer types of interface (Input/Output/PushConstant, etc.) variables used by entry-points: thankfully, OpVariable has a redundant Storage Class, so we can set that to the correct value, but keep the pointer type Generic
    • similarly, spirv_std::storage_class::* types have no effect anymore on SPIR-V types, they're only used to pick the (Input/Output/PushConstant, etc.) Storage Class of interface OpVariables
      • Investigate inferring storage class #300 has more drastic changes, but I've avoided doing that at the same time as implementing the rest of this PR
      • at this point they could all just implement Deref and be indistinguishable from plain references!
  • any top-level (module-scoped) SPIR-V instruction that mentions the placeholder Storage Class is considered "generic" over it
    • e.g. %13 = OpTypePointer Placeholder %12 is treated as if it were %13<$0> = OpTypePointer $0 %12, where $0 denotes the first (and only, in this example) generic parameter of the instruction
    • this is propagated by making other instructions using "generic" ones, "generic" themselves in turn (as much as possible)
      • e.g. %14 = OpTypeStruct %13 %13 %13 becomes %14<$0, $1, $2> = OpTypeStruct %13<$0> %13<$1> %13<$2> - this would come from something like (&T, &T, &T): same pointee type, but other than that the pointers are independent
    • this includes OpFunctions, but they are exactly as "generic" as their OpTypeFunction (i.e. signature), ignoring the body (see below for what role instructions in the function body, eventually play)
  • inference is performed over OpConstant*s, global OpVariables, and function bodies
    • the rules for what types (or Storage Classes) involved in an instruction must be equal, are provided by spirv_type_constraints, which was introduced in inline asm!: support writing _ in lieu of return types, for basic inference. #376, and augmented in the first 3 commits of this PR
    • Storage Class "inference variables" are generated wherever a generic definition is used, and each function has a single pool of inference variables that can all interact with eachother (across all of the instructions in the function's body)
    • these inference variables are tracked using simple union-find (this is pretty standard for Hindley–Milner inference and friends - I could've used rustc's own ena library, but we don't need its features, so it's just extra complexity)
    • example of inference taking place for one instruction in a function body (taken from a sky-shader's entrypoint):
       -> %721 = OpFunctionCall %6 %714<?0, ?1> %717<?2 = Input> %718<?3 = Output>
          found Match [%6 = %6, [%138<?0>, %76<?1>] = [%138<?2 = Input>, %76<?3 = Output>]]
       <- %721 = OpFunctionCall %6 %714<?0 = Input, ?1 = Output> %717<?2 = Input> %718<?3 = Output>
      • ?0, ?1, ?2, ?3 are inference variables, created because of how generic the OpFunctionCall arguments are
      • -> is before, and <- is after, inference - the reason ?2 and ?3 are known right away is because %717 and %718 are (interface) global OpVariables that have already been inferred to always need Input, and Output, respectively
      • the found Match [...] line shows the type/Storage Class equalities coming from applying the spirv_type_constraints patterns, that will then be acted upon to either make some inference variables "aliases" of others (via union-find), or outright assign a known Storage Class to them
      • after inference is performed on the call instruction, you can see ?0 and ?1 are now also known, but more importantly, %714 (the callee) now has concrete (Storage Class) "generic arguments" for its "generic parameters"
    • inference results are used to relate the function body to the "generic parameters" of the global/function, but also constrain the parameters themselves - such as a function that always takes/returns a pointer with an Input Storage Class
    • on function bodies, the inference follows the function call graph, inferring callees before callers, in order to propagate constraints "upwards" on it (i.e. towards the entry-points, which are effectively the only "roots" of the call graph)
      • currently recursion isn't fully accounted for (so inference may not be complete), but supporting it explicitly would only be a matter of repeating inference on the callgraph until fixpoint (i.e. when new constraints on parameters are no longer found) - the main reason I haven't done so already is that doing so naively is wasteful (and I have no testcases)
  • after everything in the module has been parameterized and inferred, we can collect all necessary concrete instances
    • that is, all "generic arguments" used for every "generic" definition - like "monomorphization" in C++ or Rust
    • in my first example that might be %13<Input> or %14<Function, PushConstant, Function>, and in the realistic (sky-shader) example, we have %714<Input, Output> (as well as %717<Input> and %718<Output>)
    • every instance of a "generic" function is in turn scanned for instances of other functions/constants/global variables it needs, until all the necessary instances for the whole module (starting from the entry-points) have been found
  • the last step is to "expand" all "generic" globals and functions to one copy per needed concrete instance
    • what to tweak in every copy (in order to specialize it to a specific instance) is recorded by the earlier inference step in a set of Replacements (i.e. operand positions and what to replace the operand at that position with)
    • to continue with the first example, it could result in something like this:
      ; %13<Function>
      %1001 = OpTypePointer Function %12
      ; %13<PushConstant>
      %1002 = OpTypePointer PushConstant %12
      ; %14<Function, PushConstant, Function>
      %1003 = OpTypeStruct %1001 %1002 %1001

Pre-Merge Checklist

  • renumber IDs inside functions

  • look for and/or address remaining // TODO(eddyb) comments in the code

  • implement matching/inference support for OpAccessChain & friends (using TyPat::IndexComposite)

    • example of it in action (note %2717<?1, ?2>.0 = %2544<?6, ?7> which results in ?6 = ?1 and ?7 = ?2)
        %2722 = OpFunctionParameter %2718<?0, ?1, ?2>
         -> %3051 = OpAccessChain %2545<?5, ?6, ?7> %2722 %325
            found Match [?0 = ?5, %2717<?1, ?2>, %2717<?1, ?2>.0 = %2544<?6, ?7>]
         <- %3051 = OpAccessChain %2545<?5 = ?0, ?6 = ?1, ?7 = ?2> %2722 %325
  • take advantage of Add a % for Operand::IdRef Display impl gfx-rs/rspirv#184 to simplify some debug logging

  • add Deref impls to spirv_std::storage_class::* or replace them entirely (see Investigate inferring storage class #300)

  • test the rust-gpu-shadertoys examples with this PR, as they may stress it further than the in-tree ones, and attempt to remove the need to copy const matrices to local variables (in order to call .transpose())

    • note regarding the struck out part: I forgot we can't actually support constant memory yet, without extra work
  • test with no DCE pass before the specializer pass

    • mouse-shader, with DCE before specializer:
    Pass Debug Release
    (post-Use [profile.release.build-override] to optimize build dependencies. #437)
    link_dce 0.185s 0.009s
    specialize_generic_storage_class 0.111s 0.005s
    link_inline 0.204s 0.007s
    link_dce 0.088s 0.003s
    Total 0.588s 0.024s
    • and without the extra DCE:
    Pass Debug Release
    (post-Use [profile.release.build-override] to optimize build dependencies. #437)
    specialize_generic_storage_class 1.694s 0.086s
    link_inline 0.545s 0.020s
    link_dce 0.164s 0.008s
    Total 2.403s 0.114s
    • it's not too slow in release mode, and it's better for testing to not have the early DCE in there, so I removed it
  • figure out if I want to keep the commit history (and therefore convince someone to not merge this PR with squashing)


Closes #300.

@eddyb
Copy link
Contributor Author

eddyb commented Feb 13, 2021

I think I just realized I'm not doing Result ID renumbering inside expanded functions, a bit surprised that didn't blow up in my face yet - I'm guessing the inliner is resilient to that kind of issue, since it just does its own renumbering.

@XAMPPRocky XAMPPRocky added the s: waiting on author PRs that blocked on the author implementing feedback. label Feb 18, 2021
@eddyb eddyb marked this pull request as ready for review February 20, 2021 03:50
@eddyb eddyb requested review from fu5ha and VZout as code owners February 20, 2021 03:50
@eddyb eddyb marked this pull request as draft February 20, 2021 03:57
@eddyb eddyb marked this pull request as ready for review February 20, 2021 04:03
@eddyb eddyb enabled auto-merge (rebase) February 20, 2021 04:04
@@ -11,6 +11,7 @@ repository = "https://github.com/EmbarkStudios/rust-gpu"

[lib]
crate-type = ["dylib"]
test = false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does this do? there are a few tests in rustc_codegen_spirv/linker/test.rs, does this disable those?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I... this is an accident, sorry! I was trying to debug reducing my RLS waiting time (it didn't work sigh) and accidentally included it in a commit apparently (wouldn't have happened if I was still using git gui oh well).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
s: waiting on author PRs that blocked on the author implementing feedback.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Investigate inferring storage class
3 participants