-
Notifications
You must be signed in to change notification settings - Fork 750
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Emit a warning for floating-point size changes after implicit conversions #6323
Emit a warning for floating-point size changes after implicit conversions #6323
Conversation
@intel/dpcpp-cfe-reviewers Hi, this WIP PR addresses the ask of #5783 I am looking for input for the following cases as well.
I did learn that in OpenCL, double precision is supported by the device, for OpenCL C 3.0 or if the {opencl_c_fp64} feature macro is present. But, It is not clear from the SYCL spec or Clang code base if device code supports variadic calls or builtins. Can either of you or SYCL language design folks please clarify if cases 2 and 3 are valid for device code and if yes, should we diagnose them as well? Thank you! |
@srividya-sundaram I think you will need to support cases 2 and 3 as well. From your PR description it sounds like we want to diagnose double precision arithmetic in device code due to performance issues. I would assume the same performance issues exist for builtins returning double, etc. @premanandrao could you please take a look as well? |
Thanks for the confirmation. I have now included cases for default argument promotion and builtins returning double in device code. I see this new addtion of warning is causing a lot of test failures : https://github.com/intel/llvm/runs/7055091095?check_suite_focus=true Is it ok to fix all those tests? I want to make sure the wordings are okay before I fix them. |
Do we want these warnings generated by default? I think it should be opt-in |
Can you clarify what you mean by opt-in? Also, my question was if we were okay with the exact wordings of the diagnostic text : |
The issue asks for an opt-in warning. From issue description -
Opt-in means you use a flag to enable/disable the warning. By default, the warning should be disabled in this case. Warning should only be generated if corresponding flag is passed.
I commented another possibility in an earlier comment. |
Why do you think they should be opt-in as opposed to opt-out? |
Ah, I saw your later comment quoting the issue description, thanks! I don't understand why they would like that to be opt-in rather than opt-out, but whatever. (My intuition is that users would want to know when their floating-point calculations are going to magically change their precision to be different than what the language would usually provide, but it turns out a diagnostic isn't strictly required for conformance because [basic.types]p12 only requires that double have at least the precision of float. |
Because this doesn't go against the standard as far as I know. Its a performance issue which users may or may not care about. Considering we haven't generated a warning for this up till now and this PR is a result of someone requesting an opt-in warning, I figured it should be opt-in. |
Isn't this the case with the OpenCL flag? For CFE, we are just emitting a warning about performance issues without implicitly casting double-precision floating-point literals to single-precision literals. Even then, wouldn't users not want to know about this potential performance loss in GPU's? |
Ah, I thought we were doing the same thing as OpenCL there, sorry that I was wrong.
I was thinking this was not a performance issue but a correctness issue. But as @elizabethandrews points out, it's only about performance and so off-by-default is somewhat more reasonable. FWIW, I'm skeptical that off-by-default warnings are worth the maintenance burdens. Community experience was that only a very small percentage of people ever opt in to an off-by-default warning (I recall hearing 1-5%, vaguely). So my personal feelings are: if this isn't important enough to warn people about by default, it probably belongs in a separate tool like a profiler, linter, etc. |
I believe the submitter mentioned this can be detected using profiler, but that a warning diagnostic may also be useful. I do not know how to decide whether implementing the warning is worth it or not, since I do not know how community statistics translate over to DPCPP users who care about performance. The issue comments mention a related CUDA flag |
I am not familiar with the usage metrics of opt-in flags in Clang and I understand that the GitHub user only asked for a compile time, opt-in diagnostic and that we dont emit a warning if not explicitly requested by the user. However, I had a few thoughts: In most cases, kernels are intended to be run on GPU's. Case 1: Now chances are that they may never notice this mistake. The device code when run on GPU might have reduced performance. A warning here might give a clue I guess. Case 2: The user might intentionally use double arithmetic in device code but is unaware of this option/flag to invoke. This will also result in reduced performance in GPU. Ofcourse they can debug and eventually pass on the flag but just wondering if we must care about these usecases? |
To add to these great questions and lines of thinking, to me I think it boils down to: how often will the diagnostic fire in a situation where the user goes "oops, this was a mistake". If the answer is "almost always", it should be an on-by-default frontend warning, and if the answer is "rarely" or "uncommonly", it should not be in the frontend at all. My intuition is that people writing code for GPUs will typically expect that the larger the datatype, the worse the performance on the GPU. Thus, I'd expect most folks to find such a warning to be rather low-value. However, I don't know if my intuition matches reality. That said, I think that literals are an interesting case because it is easy to forget one (as @srividya-sundaram pointed out). It might be reasonable to have an on-by-default diagnostic for double literals and maybe a related off-by-default diagnostic for long double literals and direct use of the types. But I'd still wonder how much existing code would trigger the warning and what the false positive rate is for it. |
This seems like a very reasonable suggestion. |
@bader Do you suggest that a warning is given for implicit conversions to a higher precision when this flag is not passed? I imagine it could be useful if it's enabled by a dedicated flag instead of completely by default... |
The original request is to add a compiler option to enable warnings about the use of double precision data types in "device code", which are hard to spot with the human eye - #5783. |
I think the warnings generated for precision change already exist in clang and this PR adds a warning (and a flag to control this warning) when there is a size change but no change in precision, and some logic to handle which warning is fired when there is both a precision change and size change |
Do be honest, I'm not sure if this approach helps with the problem reported in #5783.
According to my understanding there is no precision change or size change here, but I suppose clang will emit double values for this code. Right? The PR title says that clang will emit a warning here because double precision arithmetic is used. |
Here's a summary of what this PR does: Prior to this PR, we had the following 2 warnings for floating-point precision change:
Both the above warnings were turned off by default and only enabled when the corresponding warning flags were passed. Example:
This PR identifies a floating-point size change after an implicit cast and triggers a warning when the Emitting a warning when the size of the target floating-point type is less than the size of the source floating-point type seems to address the concerns in the original report about telling the user about unexpected conversions that will impact performance. For this example:
There is no floating-point precision or size changes and no warning is emitted. |
@srividya-sundaram, thanks for the summary and changing the PR title. I am good with your changes. |
The report says "Add an optional diagnostic for the use of double-precision ops in kernels", so I don't think this PR addresses it completely. I think code example provided in the issue doesn't demonstrate all possible use DPC++ is supposed to detect. @al42and, am I right? I suggest we remove this paragraph from the description:
And use the first half of the summary from this comment: #6323 (comment). |
@bader I have updated the PR description. |
I guess the question is whether the original request is to diagnose any occurrence of In any case, I think this PR has value as-is and can be merged. It may also be worthwhile to look into enabling these flags by default on SPIR target as @bader suggested above. |
I think what we have here (diagnosing any floating-point conversion that isn't covered by one of the precision warnings) has enough utility to be worth supporting. FWIW, my feeling is that if the request is truly "tell me if I spelled |
You're right. I just provided two examples to illustrate the idea.
My original suggestion was to diagnose all uses. But detecting implicit size change should find the accidental usage that I had in mind, so ok. Also, no strong feelings on my side about opt-out vs opt-in. |
I agree, but I suppose clang-tidy doesn't support SYCL device code outlining and therefore might require significant efforts to make this check device-specific. Right? |
clang-tidy runs over an AST so I'm not certain there's significant efforts involved. That said, SYCL's design and the fact that no one has put in efforts to make it work from clang-tidy means there is some work involved, to be sure. |
@bader Can this PR be merged? The CUDA LLVM test failures seem unrelated to the changes in this PR. |
I guess approval from @AaronBallman is required to merge this PR because changes were requested. @AaronBallman could you please approve if changes look good to you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Fixes #5783 |
We customized the behavior in #6323 , so we are seeing unexpected warnings in new HLSL tests. This is to disable these customize warnings.
Summary of what this PR does:
Prior to this PR, we had the following 2 warnings for floating-point precision change:
Both the above warnings were turned off by default and only enabled when the corresponding warning flags were passed.
Also, both the warnings were only specific to Clang and were not enabled for SYCL (meaning device code).
If the floating point values remained the same on both the source and target semantics after the cast, no warning was emitted.
Example:
This PR identifies a floating-point size change after an implicit cast and triggers a warning when the -Wimplicit-float-size-conversion flag is passed. It is turned off by default.
This warning is enabled for both Clang and SYCL(i.e both host and device code)
The only other case this addresses is, if there is a precision loss and
-Wimplicit-float-size-conversion
is passed but-Wimplicit-float-conversion
is not, we make sure to emit at least a size warning.