LLVM and SPIRV-LLVM-Translator pulldown (WW49) #2836

Merged
merged 567 commits into from
Dec 1, 2020

@vmaksimo vmaksimo commented Nov 30, 2020

zygoloid and others added 30 commits November 24, 2020 16:25
Previously we only considered using a substitution for a template-name
after already having mangled its prefix, so we'd produce nonsense
manglings like NS3_S4_IiEE where we should simply produce NS4_IiEE.

This is not ABI-compatible with previous Clang versions, and the old
behavior is restored by -fclang-abi-compat=11.0 or earlier.
Enables overriding earlier --lto-whole-program-visibility.

Variant of D91583 while discussing alternate ways to identify and
handle the --export-dynamic case.

Differential Revision: https://reviews.llvm.org/D92060
This CL adds the ability to request different parallelization strategies
for the generated code. Every "parallel" loop is a candidate, and is converted
to a parallel op if it is an actual for-loop (not a while) and the strategy
allows dense/sparse outer/inner parallelization.

This will connect directly with the work of @ezhulenev on parallel loops.

Still TBD: vectorization strategy

Reviewed By: penpornk

Differential Revision: https://reviews.llvm.org/D91978
…ess identifier naming checks

The idea of suppressing naming checks for variables is to support code bases that allow short variable names such as 'x' and 'i' without prefixes/suffixes or casing styles. This was originally proposed as a 'ShortSizeThreshold' option, but it has been made more generic: a regex suppresses identifier naming checks for the names that match it.
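
As a rough illustration of the suppression idea, the sketch below skips the naming check for any identifier that fully matches a user-supplied regex. The helper name `isNamingCheckSuppressed` is hypothetical, and whole-name matching with `std::regex` is an assumption made for this sketch; it is not the check's actual implementation.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of clang-tidy): returns true when the
// identifier matches the ignore regex and the naming check should be skipped.
// Assumption: the regex must match the whole identifier.
bool isNamingCheckSuppressed(const std::string &Identifier,
                             const std::string &IgnoredRegexp) {
  if (IgnoredRegexp.empty())
    return false; // No regex configured: every identifier is checked.
  return std::regex_match(Identifier, std::regex(IgnoredRegexp));
}
```

With a regex such as `[a-z]`, single-letter names like `i` and `x` are skipped while longer names are still checked.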

Reviewed By: njames93, aaron.ballman

Differential Revision: https://reviews.llvm.org/D90282
MSVC seems to think this `friend class TrailingObjects;` declaration is
declaring a TrailingObjects class instead of naming the injected base
class. Remove `class` so it does the right thing.
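
A minimal sketch of the pattern, using hypothetical stand-in types (`TrailingBase`, `Payload`) rather than LLVM's real `TrailingObjects` template: the derived class befriends its CRTP base through the injected class name, without the `class` keyword.

```cpp
#include <cstddef>

// Hypothetical CRTP base standing in for a TrailingObjects-style helper.
template <typename Derived> class TrailingBase {
public:
  // Calls a private member of the derived class, so it needs friendship.
  std::size_t count() const {
    return static_cast<const Derived *>(this)->numTrailing();
  }
};

class Payload : public TrailingBase<Payload> {
  // Names the injected class name of the base, i.e. TrailingBase<Payload>.
  // Writing `friend class TrailingBase;` is the form MSVC reportedly
  // treats as declaring a new, unrelated class.
  friend TrailingBase;

  std::size_t numTrailing() const { return 3; }
};
```

The plain `friend TrailingBase;` form is a C++11 simple friend declaration, so no new class can be introduced by it.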
Based on D91043 by Luís Marques. Thanks Luís!

Differential Revision: https://reviews.llvm.org/D91043
…ouble on Power8

Currently we use GPRs to pass fp128 arguments and return values on Power8,
which is incorrect; VSRs should be used. This happens because fp128 is
marked as illegal, which makes LLVM try to emulate it with i128 on
Power8. So we need to mark it as legal.

Reviewed By: Nemanjai

Differential Revision: https://reviews.llvm.org/D91527
Typically branch_weights are i32, not i64.
This fixes entry_counts_cold.ll under NPM.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D90539
Some older code - and code copied from older code - still directly tested against the singleton result of SE::getCouldNotCompute. Using the isa<SCEVCouldNotCompute> form is both shorter and more readable.
Also ensure the -cc1 argument is actually part of the clang -cc1 command
line rather than some unrelated command line.
Tests that pass -mlinker-version=old version and that then don't
expect new flags to be passed need to explicitly request the system
linker now.
This patch is the initial patch for support of the AIX extended vector ABI.  The extended ABI treats vector registers V20-V31 as non-volatile and we add them as callee saved registers in this patch.

Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D88676
…oesNodeExist` helper

`SimplifySetCC` invokes `getNodeIfExists` without passing the `Flags` argument, so `getNodeIfExists` uses a default `SDNodeFlags` to intersect the original flags. As a consequence, flags like `nsw` are dropped. Added a new helper function `doesNodeExist` to check if a node exists without modifying its flags.
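
The toy model below illustrates why a pure existence query is safer than a lookup that intersects flags. `Flags` and `NodeTable` are hypothetical stand-ins invented for this sketch; they are not LLVM's `SDNodeFlags` or `SelectionDAG`, and the real functions have different signatures.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for per-node flags.
struct Flags {
  bool nsw = false;
  bool nuw = false;
};

// Hypothetical node table keyed by a textual node description.
class NodeTable {
  std::map<std::string, Flags> Nodes;

public:
  void add(const std::string &Key, Flags F) { Nodes[Key] = F; }

  // Lookup that intersects the stored flags with the requested ones.
  // Calling it with default (empty) flags clears flags on the stored node.
  Flags *getIfExists(const std::string &Key, Flags Requested = Flags()) {
    auto It = Nodes.find(Key);
    if (It == Nodes.end())
      return nullptr;
    It->second.nsw = It->second.nsw && Requested.nsw;
    It->second.nuw = It->second.nuw && Requested.nuw;
    return &It->second;
  }

  // Pure existence query: never touches the stored flags.
  bool exists(const std::string &Key) const { return Nodes.count(Key) != 0; }
};
```

In this model, a caller that only wants to know whether a node is present should use `exists`, since `getIfExists` with default flags silently drops `nsw`.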

Reviewed By: #powerpc, nemanjai

Differential Revision: https://reviews.llvm.org/D89938
When deciding to widen a narrow use, we may need to prove some facts
about it. The proof uses a context instruction. Currently we take the
instruction being widened as the context.

However, we can be more precise here if we take as the context the point that
dominates all users of the instruction being widened.

Differential Revision: https://reviews.llvm.org/D90456
Reviewed By: skatkov
This matches the legacy PM's EP_ModuleOptimizerEarly. Some backends use
this extension point and adding the pass somewhere else like
PipelineStartEPCallback doesn't work.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D91804
PowerPC has instructions such as ftsqrt/xstsqrtdp to do the input test for software square root.
LLVM currently tests the input against the smallest normalized value using abs + setcc. We should add a hook for
targets that have such test instructions.
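
For reference, the generic abs + compare test can be sketched in scalar C++ as follows. The function name `needsSqrtFallback` is hypothetical, and the sketch models only the comparison against the smallest normalized value, not the full set of conditions a hardware test instruction may check.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: true when |X| is below the smallest normalized
// double (zero or a denormal), so a software square-root estimate
// would need special handling for this input.
bool needsSqrtFallback(double X) {
  return std::fabs(X) < std::numeric_limits<double>::min();
}
```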

Reviewed By: Spatel, Chen Zheng, Qiu Chao Fang

Differential Revision: https://reviews.llvm.org/D80706
…ndDuringFirstIterations"

This reverts commit 7dcc889.

This patch introduced a logical error that breaks the whole logic of this analysis.
All checks we are making are supposed to be loop-independent, so that we can
safely remove the range check. The 'nw' fact is loop-dependent, so we may end up
removing the check based on facts derived from this very check.

Motivating examples will follow.
During the review of https://reviews.llvm.org/D84419, @efriedma mentioned that the gap between the realigned stack pointer and the original stack pointer should be probed too, whatever the alignment is. This patch fixes the issue for PPC64.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D88078
We have similar logic for the LLVM and GNU styles that can be deduplicated.
This will allow replacing `reportError` calls with `reportUniqueWarning`
calls in a single place.

Differential revision: https://reviews.llvm.org/D92018
This converts the VPReductionRecipe into a VPValue, like other
VPRecipes, in preparation for traversing def-use chains. It also makes
it a VPUser, now storing the used VPValues as operands.

It doesn't yet change how the VPReductionRecipes are created. They will
need to call replaceAllUsesWith from the original recipe they replace,
but that is not done yet, as VPWidenRecipes need to be created first.

Differential Revision: https://reviews.llvm.org/D88382
Similar to other patches, this makes VPWidenRecipe a VPValue. Because of
the way it interacts with the reduction code it also slightly alters the
way that VPValues are registered, removing the up front NeedDef and
using getOrAddVPValue to create them on-demand if needed instead.

Differential Revision: https://reviews.llvm.org/D88447
AVR and PPC64 bots report link errors:
(http://lab.llvm.org:8011/#/builders/112/builds/1522)
(http://lab.llvm.org:8011/#/builders/52/builds/1764)

/tmp/cclOvLx0.s: Assembler messages:
/tmp/cclOvLx0.s:9223: Error: symbol `_ZN4llvm12function_refIFvvEE11callback_fnIUlvE2_EEvl' is already defined
/tmp/cclOvLx0.s:9227: Error: symbol `.L._ZN4llvm12function_refIFvvEE11callback_fnIUlvE2_EEvl' is already defined
/tmp/cclOvLx0.s:10272: Error: symbol `_ZN4llvm12function_refIFvvEE11callback_fnIUlvE2_EEvl' is already defined
/tmp/cclOvLx0.s:10276: Error: symbol `.L._ZN4llvm12function_refIFvvEE11callback_fnIUlvE2_EEvl' is already defined
/tmp/cclOvLx0.s:10285: Error: symbol `_ZN4llvm12function_refIFvvEE11callback_fnIUlvE2_EEvl' is already defined
/tmp/cclOvLx0.s:10289: Error: symbol `.L._ZN4llvm12function_refIFvvEE11callback_fnIUlvE2_EEvl' is already defined

/tmp/ccFJYr6I.s: Assembler messages:
/tmp/ccFJYr6I.s:6284: Error: symbol `_ZN4llvm12function_refIFvvEE11callback_fnIUlvE2_EEvl' is already defined
/tmp/ccFJYr6I.s:7053: Error: symbol `_ZN4llvm12function_refIFvvEE11callback_fnIUlvE2_EEvl' is already defined
/tmp/ccFJYr6I.s:7093: Error: symbol `_ZN4llvm12function_refIFvvEE11callback_fnIUlvE2_EEvl' is already defined

I *guess* the reason might be the default lambda argument. I've removed it.
It is possible that some write resource is a variant in model A
and a sequence in model B. Such a case will trigger an assertion in the
getAllPredicates function.
Currently we never dump the `sh_offset` key,
though it is sometimes important information.

To reduce the noise, this patch implements the following logic:
1) The "Offset" key for the first section is always emitted.
2) If we can derive the offset for a next section naturally,
   then the "Offset" key is omitted.

By "naturally" I mean that section[X] offset is expected to be:
```
offsetOf(section[X]) == alignTo(section[X - 1].sh_offset + section[X - 1].sh_size, section[X].sh_addralign)
```

So, when it has the expected value, we omit it from the output.
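
The rule above can be sketched as a small standalone check. The `Section` struct and the `needsOffsetKey` helper are hypothetical names used only for this illustration, and treating an alignment of 0 as "no alignment" is an assumption of the sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical view of the section header fields the rule depends on.
struct Section {
  uint64_t Offset;    // sh_offset
  uint64_t Size;      // sh_size
  uint64_t AddrAlign; // sh_addralign
};

// Rounds Value up to a multiple of Align; Align == 0 is treated as 1.
constexpr uint64_t alignTo(uint64_t Value, uint64_t Align) {
  return Align == 0 ? Value : (Value + Align - 1) / Align * Align;
}

// Returns true when the "Offset" key has to be emitted for Sections[I].
bool needsOffsetKey(const std::vector<Section> &Sections, std::size_t I) {
  if (I == 0)
    return true; // The first section always gets an explicit offset.
  const Section &Prev = Sections[I - 1];
  uint64_t Expected = alignTo(Prev.Offset + Prev.Size, Sections[I].AddrAlign);
  return Sections[I].Offset != Expected;
}
```

For example, a section at offset 0x50 that follows a section at 0x40 with size 0x10 sits at its natural offset, so its key would be omitted.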

Differential revision: https://reviews.llvm.org/D91152
fhahn and others added 19 commits November 27, 2020 17:01
This patch updates widenGEP to manage the resulting vector values using
the VPValue of VPWidenGEP recipe.
Added two new workflows: for in-tree and out-of-tree configurations.
Each of them builds and tests the translator on every push
or pull request to the master and release branches, as well as in a
nightly build.

In-tree workflow also enables builds on Windows and macOS.
At this point LoopControlParameters can be empty only if a single
metadata llvm.loop.ivdep.enable is attached to a branch instruction.

Signed-off-by: Dmitry Sidorov <dmitry.sidorov@intel.com>
* Align with llvm.org on sret/byval parameter attribute types

With the latest community changes, the StructRet/ByVal parameter
attributes are now required to have a type. Fix the build by:
1. updating the LIT IR to match this policy;
2. adding correct attribute types in backwards translation.

Signed-off-by: Artem Gindinson <artem.gindinson@intel.com>
- Ensure correct naming of `LLVM_SPIRV_INCLUDE_TESTS` in the
  documentation;
- Clarify the usage scenario of the `LLVM_EXTERNAL_LIT` variable.

Signed-off-by: Artem Gindinson <artem.gindinson@intel.com>
The original implementation of IVDep for kernel closure parameters
(KhronosGroup/SPIRV-LLVM-Translator@f0e3545e1f)
has only resolved array parameters. When the attribute is applied
to a pointer instead, an additional load instruction gets emitted
for each access by the FE compiler. Accesses to different load
results have been treated as different arrays, while still referring
to the same memory. As a consequence, accesses to the same pointer
have been marked into different `!llvm.index.group` metadata nodes -
essentially, this issue is what the new patch aims to resolve.

Additionally, the IR for the test was bumped up to the new LLVM
version.

Signed-off-by: Artem Gindinson <artem.gindinson@intel.com>
Signed-off-by: Artem Gindinson <artem.gindinson@intel.com>
Added translation of VCFCEntry kernel attribute to execution mode
VectorComputeFastCompositeKernelINTEL
Otherwise we can't make any assumptions about LoopControl parameters
order.

Signed-off-by: Dmitry Sidorov <dmitry.sidorov@intel.com>
Signed-off-by: Alexey Sotkin <alexey.sotkin@intel.com>
Signed-off-by: Alexey Sotkin <alexey.sotkin@intel.com>
LLVM IR "store atomic" and "load atomic" instructions are now lowered
to the corresponding SPIR-V atomics.

Signed-off-by: Andrew Savonichev <andrew.savonichev@intel.com>
Signed-off-by: Alexey Sotkin <alexey.sotkin@intel.com>
Signed-off-by: Alexey Sotkin <alexey.sotkin@intel.com>
The new loop control bit NoFusionINTEL is enabled by the FPGALoopControlsINTEL
capability. If this bit is set, it indicates that the loop should not be fused
with any adjacent loop.

Signed-off-by: Dmitry Sidorov <dmitry.sidorov@intel.com>
Spec: KhronosGroup/SPIRV-Registry#85

Signed-off-by: amochalo <anastasiya.mochalova@intel.com>
The new `AtomicFAddEXT` instruction will be mapped onto
`__spirv_AtomicFAddEXT()` external calls in LLVM IR. No
additional logic is required to facilitate this - the
existing infrastructure does the replacement by default
based on the `__spirv` prefix.

The full specification can be found at
github.com/KhronosGroup/SPIRV-Registry/blob/master/extensions/EXT/SPV_EXT_shader_atomic_float_add.asciidoc

Signed-off-by: Artem Gindinson <artem.gindinson@intel.com>
@bader

bader commented Nov 30, 2020

/summary:run

@vladimirlaz vladimirlaz merged commit 20cb0da into intel:sycl Dec 1, 2020