[AutoBump] Merge with ce7c828e (Aug 30) (16) #369

Open
wants to merge 92 commits into base: bump_to_fc110202

Commits on Aug 29, 2024

  1. 025f03f
  2. [libc++][NFC] Remove __constexpr_is{nan,finite} (llvm#106205)

    They're never used in `constexpr` functions, so we can simply use
    `std::isnan` and `std::isfinite` instead.
    philnik777 authored Aug 29, 2024
    a705e8c
  3. [NFC][TableGen] Refactor IntrinsicEmitter code (llvm#106479)

    - Use formatv() and raw string literals to simplify emission code.
    - Use range based for loops and structured bindings to simplify loops.
    - Use const Pointers to Records.
    - Rename `ComputeFixedEncoding` to `ComputeTypeSignature` to reflect
      what the function actually does, and change it to return a vector.
    - Use reverse() and range based for loop to pack 8 nibbles into 32-bits.
    - Rename some variables to follow LLVM coding standards.
    - For function memory effects, print human readable effects in comment.
    jurahul authored Aug 29, 2024
    032c328
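
The nibble-packing step listed in the commit above can be illustrated with a small sketch. It is written in Python rather than the C++ used by TableGen, `pack_nibbles` is a hypothetical name, and placing the first nibble in the lowest four bits is an assumption made for illustration, not something stated in the commit message.

```python
def pack_nibbles(nibbles):
    """Pack up to 8 four-bit values into one 32-bit integer.

    Assumption: the first nibble ends up in the lowest four bits, which is
    why the loop walks the sequence in reverse and shifts left each time.
    """
    if len(nibbles) > 8:
        raise ValueError("at most 8 nibbles fit into 32 bits")
    result = 0
    for nibble in reversed(nibbles):
        if not 0 <= nibble <= 0xF:
            raise ValueError(f"not a 4-bit value: {nibble}")
        result = (result << 4) | nibble
    return result


print(hex(pack_nibbles([0x1, 0x2, 0x3])))  # prints 0x321
```

Iterating in reverse keeps the loop body to a single shift-and-or, with no per-element shift amount to track.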
  4. AArch64: Add tests for atomicrmw fp operations (llvm#103701)

    There were only codegen tests for the fadd vector case,
    so round out the test coverage for the scalar cases
    and all the other operations.
    arsenm authored Aug 29, 2024
    4ee2ad2
  5. [Support] Delete FormatVariadicTest Validate sub-test (llvm#106570)

    - The subtest, if enabled correctly, will fail with assert in Debug
      builds and validation is disabled in Release builds.
    - Hence deleting the test to fix test failures in CI.
    jurahul authored Aug 29, 2024
    5048fab
  6. AArch64: Use consistent atomicrmw expansion for FP operations (llvm#1…

    …03702)
    
    Use LLSC or cmpxchg in the same cases as for the unsupported
    integer operations. This required some fixups to the LLSC
    implementation to deal with the fp128 case.
    
    The comment about floating-point exceptions was wrong,
    because floating-point exceptions are not really exceptions at all.
    arsenm authored Aug 29, 2024
    26c3a84
  7. b5a1b45
  8. [RISCV] Don't promote f16 FNEG/FABS with Zfhmin/Zhinxmin. (llvm#106474)

    fneg/fabs are not supposed to canonicalize nans. Promoting to f32 will
    go through an fp_extend which will canonicalize. The generic Promote
    handler needs to be removed from LegalizeDAG.
    
    We need to use integer bit manip to clear the bit instead.
    
    Unfortunately, this is going through the stack due to i16 not being a
    legal type. Fixing that will require custom legalization or some other
    generic SelectionDAG change.
    topperc authored Aug 29, 2024
    a9ffb71
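
The point about integer bit manipulation in the commit above can be shown on raw IEEE 754 binary16 bit patterns. This is only a Python sketch of the idea, not the RISC-V lowering itself; the helper names are hypothetical, and the NaN pattern `0x7D01` is an arbitrary signalling NaN chosen for the example.

```python
import struct

SIGN_BIT = 0x8000  # bit 15 of an IEEE 754 binary16 value


def f16_bits(value):
    """Return the 16-bit pattern of a Python float rounded to binary16."""
    return struct.unpack("<H", struct.pack("<e", value))[0]


def fneg_bits(bits):
    """Negate by flipping only the sign bit; exponent and mantissa stay as-is."""
    return (bits ^ SIGN_BIT) & 0xFFFF


def fabs_bits(bits):
    """Take the absolute value by clearing only the sign bit."""
    return bits & ~SIGN_BIT & 0xFFFF


one = f16_bits(1.0)                # 0x3c00
print(hex(fneg_bits(one)))         # prints 0xbc00
print(hex(fabs_bits(0xBC00)))      # prints 0x3c00

# A signalling NaN with a payload: the payload and the quiet bit survive,
# which a round trip through a wider float type is not guaranteed to do.
snan = 0x7D01
print(hex(fneg_bits(snan)))        # prints 0xfd01
```

Because only bit 15 changes, a NaN input leaves with exactly the same payload it arrived with, which is the property the commit message says promotion to f32 would lose.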
  9. AArch64: Delete tests of fp128 atomicrmw fmin/fmax

    These are getting different output on some build hosts for some reason.
    The stack offsets of temporaries are different.
    arsenm committed Aug 29, 2024
    e05c224
  10. [mlir][scf] Allow unrolling loops with integer-typed IV. (llvm#106164)

    SCF loops now can operate on integer-typed IV, thus I'm changing the
    loop unroller correspondingly.
    htyu authored Aug 29, 2024
    c08c6a7
  11. [NFC][Support] Eliminate ',' at end of MemoryEffects print (llvm#106545)

    - Eliminate comma at end of a MemoryEffects print.
    - Added basic unit test to validate that.
    jurahul authored Aug 29, 2024
    115b876
  12. [LoopVectorize][X86] amdlibm-calls.ll - add 2/4/8/16 vector widths te…

    …st checks for fallback to llvm intrinsics
    
    Check for cases where there isn't an amdlibm call but the math call is still vectorised
    RKSimon committed Aug 29, 2024
    81acc84
  13. a777a93
  14. [LTO] Introduce a new class ImportIDTable (llvm#106503)

    The new class implements a deduplication table to convert import list
    elements:
    
      {SourceModule, GUID, Definition/Declaration}
    
    into 32-bit integers, and vice versa.  This patch adds a unit test but
    does not add a use yet.
    
    To be precise, the deduplication table holds {SourceModule, GUID}
    pairs.  We use the bottom one bit of the 32-bit integers to indicate
    whether we have a definition or declaration.
    
    A subsequent patch will collapse the import list hierarchy --
    FunctionsToImportTy holding many instances of FunctionsToImportTy --
    down to DenseSet<uint32_t> with each element indexing into the
    deduplication table above.  This will address multiple sources of
    space inefficiency.
    kazutakahirata authored Aug 29, 2024
    bd6531b
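
The deduplication scheme described in the commit above can be sketched as follows. This is a Python model of the idea only, not the C++ class from the patch: the method names are hypothetical, and treating a set low bit as "declaration" (rather than "definition") is an assumption made for the example.

```python
class ImportIDTable:
    """Toy model: maps {SourceModule, GUID} pairs to indices and encodes
    definition/declaration in the lowest bit of the resulting ID."""

    def __init__(self):
        self._index = {}  # (module, guid) -> position in self._pairs
        self._pairs = []  # position -> (module, guid)

    def get_id(self, module, guid, is_declaration):
        key = (module, guid)
        index = self._index.get(key)
        if index is None:
            index = len(self._pairs)
            self._index[key] = index
            self._pairs.append(key)
        import_id = (index << 1) | int(is_declaration)
        if import_id > 0xFFFFFFFF:
            raise OverflowError("ID no longer fits into 32 bits")
        return import_id

    def lookup(self, import_id):
        module, guid = self._pairs[import_id >> 1]
        return module, guid, bool(import_id & 1)


table = ImportIDTable()
definition = table.get_id("a.o", 42, is_declaration=False)
declaration = table.get_id("a.o", 42, is_declaration=True)
print(definition, declaration)    # prints 0 1
print(table.lookup(declaration))  # prints ('a.o', 42, True)
```

Both IDs share one table entry, so a set of imports can be stored as plain integers while the module name and GUID are kept only once.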
  15. [RISCV][TTI] Model cost for insert/extract into illegal types (llvm#1…

    …06440)
    
    We'd previously just deferred to the base implementation, but that more
    or less always returns 1. This underestimates the cost of the
    insert/extract, biases the SLP vectorizer towards forming illegally
    typed vectors, and underestimates the cost of scalarized operations
    (like unaligned scatter/gather).
    preames authored Aug 29, 2024
    59f05b6
  16. [AArch64] Make apple-m4 armv8.7-a again (from armv9.2-a). (llvm#106312)

    This is a partial revert of c66e1d6.  Even though that
    allowed us to declare v9.2-a support without picking up SVE2
    in both the backend and the driver, the frontend itself still
    enabled SVE via the arch version's default extensions.
    
    Avoid that by reverting back to v8.7-a while we look into
    longer-term solutions.
    ahmedbougacha authored Aug 29, 2024
    e5e38dd
  17. [ExtendLifetimes] Implement llvm.fake.use to extend variable lifetimes (

    llvm#86149)
    
    This patch is part of a set of patches that add an `-fextend-lifetimes`
    flag to clang, which extends the lifetimes of local variables and
    parameters for improved debuggability. In addition to that flag, the
    patch series adds a pragma to selectively disable `-fextend-lifetimes`,
    and an `-fextend-this-ptr` flag which functions as `-fextend-lifetimes`
    for this pointers only. All changes and tests in these patches were
    written by Wolfgang Pieb (@wolfy1961), while Stephen Tozer (@SLTozer)
    has handled review and merging. The extend lifetimes flag is intended to
    eventually be set on by `-Og`, as discussed in the RFC
    here:
    
    https://discourse.llvm.org/t/rfc-redefine-og-o1-and-add-a-new-level-of-og/72850
    
    This patch implements a new intrinsic instruction in LLVM,
    `llvm.fake.use` in IR and `FAKE_USE` in MIR, that takes a single operand
    and has no effect other than "using" its operand, to ensure that its
    operand remains live until after the fake use. This patch does not emit
    fake uses anywhere; the next patch in this sequence causes them to be
    emitted from the clang frontend, such that for each variable (or this) a
    fake.use operand is inserted at the end of that variable's scope, using
    that variable's value. This patch covers everything post-frontend, which
    is largely just the basic plumbing for a new intrinsic/instruction,
    along with a few steps to preserve the fake uses through optimizations
    (such as moving them ahead of a tail call or translating them through
    SROA).
    
    Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>
    SLTozer authored Aug 29, 2024
    3d08ade
  18. [VP] Remove VP_PROPERTY_REDUCTION and VP_PROPERTY_CMP [nfc] (llvm#105551

    )
    
    These lists are quite static and several of the parameters are actually
    constant across all users. Heavy use of macros is undesirable, and not
    idiomatic in LLVM, so let's just use the naive switch cases.
    
    I'll probably continue with removing the other property macros. These
    two just happened to be the two I actually had to figure out for an
    unrelated change.
    preames authored Aug 29, 2024
    74b4ec1
  19. Revert "[Analysis] Guard logf128 cst folding"

    This reverts commit 42d3ccc which
    caused a test failure.
    RoboTux committed Aug 29, 2024
    eed135f
  20. [llvm-lit] Print environment variables when using env without subcomm…

    …and (llvm#98414)
    
    This patch addresses an issue with lit's internal shell: when `env` is
    used without any arguments, it fails with exit code 127 because `env`
    requires a subcommand. The patch fixes this by making the command
    print the environment variables even when no arguments are
    provided.
    
    The error occurred when running the command
    `LIT_USE_INTERNAL_SHELL=1 ninja check-llvm`.
    
    fixes: llvm#102383
    This is part of the test cleanups proposed in the RFC: [[RFC] Enabling
    the Lit Internal Shell by
    Default](https://discourse.llvm.org/t/rfc-enabling-the-lit-internal-shell-by-default/80179)
    Harini0924 authored Aug 29, 2024
    1783924
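
The behaviour described in the commit above, printing the environment when `env` has no subcommand, can be modelled roughly as below. This is not lit's actual implementation: `env_output` is a hypothetical helper, and the sorted `NAME=VALUE` output format is an assumption made so the example is deterministic.

```python
def env_output(args, environment):
    """Model `env [NAME=VALUE]... [command...]` for the no-subcommand case.

    Returns the text that would be printed when no command follows the
    assignments, or None when a command is present and would be run instead.
    """
    env = dict(environment)  # never mutate the caller's mapping
    position = 0
    while position < len(args) and "=" in args[position]:
        name, value = args[position].split("=", 1)
        env[name] = value
        position += 1
    if args[position:]:
        return None  # a subcommand was given; running it is out of scope here
    return "".join(f"{name}={value}\n" for name, value in sorted(env.items()))


print(env_output(["FOO=1"], {"BAR": "2"}), end="")  # prints BAR=2 then FOO=1
```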
  21. [HLSL] Apply NoRecurse attrib to all HLSL functions (llvm#105907)

    Previously, functions named "main" got the NoRecurse attribute
    consistent with the behavior of C++, which HLSL largely follows.
    However, standard recursion is not allowed in HLSL, so all functions
    should really have this attribute. This doesn't prevent recursion, but
    rather signals that these functions aren't expected to recurse.
    
    Practically, this was done so that entry point functions named "main"
    would all have the same attributes as otherwise identical entry
    points with other names.
    
    This required small changes to the `this` assignment tests because they no
    longer generate so many attribute sets since more of them match.
    
    related to llvm#105244
    but done to simplify testing for llvm#89806
    pow2clk authored Aug 29, 2024
    2dc3b50
  22. [DXIL][test] Fix a few tests now that HLSL functions are internalized (

    …llvm#106437)
    
    These tests have been failing since db279c7 "[HLSL] Change default
    linkage of HLSL functions to internal (llvm#95331)". This presumably went
    unnoticed because they're not run by default since they rely on an
    external tool (dxil-dis).
    bogner authored Aug 29, 2024
    ecd65e6
  23. [VP] Kill VP_PROPERTY_(MEMOP,CASTOP) and simplify _CONSTRAINEDFP [nfc] (

    llvm#105574)
    
    These lists are quite static. Heavy use of macros is undesirable, and
    not idiomatic in LLVM, so let's just use the naive switch cases.
    
    Note that the first two fields in the CONSTRAINEDFP property were
    utterly unused (aside from a C++ test).
    
    In the same vein as llvm#105551.
    
    Once both changes have landed, we'll be left with _BINARYOP which needs
    a bit of additional untangling, and the actual opcode mappings.
    preames authored Aug 29, 2024
    2ad782f
  24. [lldb] Add armv7a and armv8a ArchSpecs (llvm#106433)

    armv7a and armv8a are common names for the application subarch for arm.
    
    These names in particular are used in ChromeOS, Android, and a few other
    known applications. In ChromeOS, we encountered a bug where armv7a arch
    was not recognised and segfaulted when starting an executable on an
    arm32 device.
    
    Google Issue Tracker:
    https://issuetracker.google.com/361414339
    ajordanr-google authored Aug 29, 2024
    0a00d32
  25. Revert "[Support] Validate number of arguments passed to formatv()" (l…

    …lvm#106589)
    
    Reverts llvm#105745
    
    Some bots are broken apparently.
    joker-eph authored Aug 29, 2024
    ed37b5f
  26. libcxx: [NFC] relax error expectation for clang diagnostics (llvm#106591

    )
    
    This is a split-off from llvm#96023, where this change has already been
    reviewed by libcxx maintainers.
    
    This will prevent that PR from triggering libcxx-ci from now on.
    mizvekov authored Aug 29, 2024
    67ffd14
  27. Revert "Revert "[Support] Validate number of arguments passed to form…

    …atv()"" (llvm#106592)
    
    Reverts llvm#106589
    The fix for bot failures caused by the reverted commit was committed
    already, so this revert is not needed.
    jurahul authored Aug 29, 2024
    9ce4af5
  28. [ExtendLifetimes][NFC] Add explicit triple to new fake-use tests

    Several tests for the new fake use intrinsic are failing on NVPTX
    buildbots due to relying on behaviour for their expected triple;
    this commit adds that triple to each of them to prevent failures.
    
    Fixes commit 3d08ade (llvm#86149).
    
    Example buildbot failures:
    https://lab.llvm.org/buildbot/#/builders/160/builds/4175
    https://lab.llvm.org/buildbot/#/builders/180/builds/4173
    SLTozer committed Aug 29, 2024
    9a58b12
  29. [SLP] Extract isIdentityOrder to common routine [probably NFC] (llvm#…

    …106582)
    
    This isn't quite just code motion as the four different versions we had
    of this routine differed in whether they ignored the "size" marker used
    to represent undef. I doubt this matters in practice, but it is a
    functional change.
    
    ---------
    
    Co-authored-by: Alexey Bataev <a.bataev@gmx.com>
    preames and alexey-bataev authored Aug 29, 2024
    4bc7c74
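
The functional difference mentioned in the commit above, whether the "size" marker for undef is ignored, can be made concrete with a short sketch. It is a Python model, not the SLP vectorizer's C++ routine; the function name and the `ignore_size_marker` parameter are hypothetical.

```python
def is_identity_order(order, ignore_size_marker=True):
    """Return True if order[i] == i for every position.

    When ignore_size_marker is set, an entry equal to len(order) is treated
    as an "undef" placeholder and does not break the identity property.
    """
    size = len(order)
    for position, value in enumerate(order):
        if value == position:
            continue
        if ignore_size_marker and value == size:
            continue
        return False
    return True


print(is_identity_order([0, 1, 2]))                            # prints True
print(is_identity_order([0, 3, 2]))                            # prints True
print(is_identity_order([0, 3, 2], ignore_size_marker=False))  # prints False
```

The second and third calls differ only in how the marker is treated, which is the kind of divergence between the former copies that the commit describes.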
  30. [DirectX] add enum for PSV resource type/kind/flag. (llvm#106227)

    Add ResourceType, ResourceKind and ResourceFlag enum class for PSV
    resource.
    
    This is for llvm#103275
    python3kgae authored Aug 29, 2024
    fd0dbc7
  31. 1ace91f
  32. [flang][cuda] Avoid generating cuf.data_transfer in OpenACC region (l…

    …lvm#106435)
    
    `cuf.data_transfer` will be converted to calls to the CUDA runtime
    API, which are not supported in device code. Assignments in an OpenACC
    region will be handled by the OpenACC code gen, so we avoid generating
    data transfers for them.
    clementval authored Aug 29, 2024
    0a41c8e
  33. [NFC] [DSE] Refactor DSE (llvm#100956)

    Refactor DSE with MemoryDefWrapper and MemoryLocationWrapper.
    
    Normally, one MemoryDef accesses one MemoryLocation. With "initializes"
    attribute, one MemoryDef (like call instruction) could initialize
    multiple MemoryLocations.
    
    Refactor DSE as a preparation to apply "initializes" attribute in DSE in
    a follow-up PR
    (llvm@58dd8a4).
    haopliu authored Aug 29, 2024
    6421dcc
  34. [RISCV][SLP] Test for <3 x Ty> reductions which require reordering

    These tests show a vectorizable reduction where the order of the
    reduction has been adjusted so that profitable vectorization requires
    a reordering of the computation.   We currently have no reordering
    in SLP for non-power-of-two vectors, so this doesn't work.
    
    Note that due to reassociation performed in the standard pipeline,
    this is actually the canonical form for a reduction reaching SLP.
    preames committed Aug 29, 2024
    22ba351
  35. AMDGPU: Use pattern to select instruction for intrinsic llvm.fptrunc.…

    …round (llvm#105761)
    
    Use GCNPat instead of Custom Lowering to select instructions for
    intrinsic llvm.fptrunc.round. "SupportedRoundMode : TImmLeaf" is used as
    a predicate to select only when the rounding mode is supported.
    "as_hw_round_mode : SDNodeXForm" is developed to translate the round
    modes to the corresponding ones that hardware recognizes.
    changpeng authored Aug 29, 2024
    26b0bef
  36. c1248c9
  37. [CodeGen] Allow mixed scalar type constraints for inline asm (llvm#65465

    )
    
    GCC supports code like "asm volatile ("" : "=r" (i) : "0" (f))" where i
    is integer type and f is floating point type. Currently this code
    produces an error with Clang. The change allows mixed scalar types
    between input and output constraints.
    
    Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>
    dfszabo and arsenm authored Aug 29, 2024
    e9eaf19
  38. [NFC][Sema] Move Sema::AssignmentAction into its own scoped enum (l…

    …lvm#106453)
    
    The primary motivation behind this is to allow the enum type to be
    referred to earlier in the Sema.h file which is needed for llvm#106321.
    
    It was requested in llvm#106321 that a scoped enum be used (rather than
    moving the enum declaration earlier in the Sema class declaration).
    Unfortunately doing this creates a lot of churn as all use sites of the
    enum constants had to be changed. Apologies to all downstream forks in
    advance.
    
    Note the AA_ prefix has been dropped from the enum value names as they
    are now redundant.
    delcypher authored Aug 29, 2024
    ff04c5b
  39. a0441ce
  40. [libc] Implement locale variants for 'stdlib.h' functions (llvm#105718)

    Summary:
    This provides the `_l` variants for the `stdlib.h` functions. These are
    just copies of the same entrypoint and don't do anything with the locale
    information.
    jhuber6 authored Aug 29, 2024
    a871051
  41. [libc] Add support for 'string.h' locale variants (llvm#105719)

    Summary:
    This adds the locale variants of the string functions. As previously,
    these do not use the locale information at all and simply copy the
    non-locale version which expects the "C" locale.
    jhuber6 authored Aug 29, 2024
    5c019bd
  42. ba5e8fc
  43. [AMDGPU][True16][MC] add true16/fake16 flag to gfx12 dasm tests (llvm…

    …#106469)
    
    add true16/fake16 flag to gfx12 dasm tests including vop1, vop1_dpp,
    vop3_from_vop1 and vop3_from_vop1_dpp. This is a test only change.
    broxigarchen authored Aug 29, 2024
    74938ab
  44. [RISCV] Add coverage for <3 x float> reduction with neutral start

    We can do slightly better on the neutral value when we have nsz.
    preames committed Aug 29, 2024
    59762a0
  45. d5c292d
  46. [SLP]Correctly decide if the non-power-of-2 number of stores can be v…

    …ectorized.
    
    Need to consider the maximum type size in the graph before attempting
    to vectorize a non-power-of-2 number of elements, which may be
    less than MinVF.
    alexey-bataev committed Aug 29, 2024
    aeedab7
  47. [HWASan] remove incorrectly inferred attributes (llvm#106565)

    assume all functions used in a HWASan module potentially touch shadow
    memory (and short granules).
    fmayer authored Aug 29, 2024
    f08f9cd
  48. [analyzer] Fix nullptr dereference for symbols from pointer invalidat…

    …ion (llvm#106568)
    
    As reported in
    llvm#105648 (comment)
    commit 08ad8dc
    introduced a nullptr dereference in the case when store contains a
    binding to a symbol that has no origin region associated with it, such
    as the symbol generated when a pointer is passed to an opaque function.
    necto authored Aug 29, 2024
    0141a3c
  49. 66927fb
  50. [VPlan] Use skipCostComputation when pre-computing induction costs.

    This ensures we skip any instructions identified to be ignored by the
    legacy cost model as well. Fixes a divergence between legacy and
    VPlan-based cost model.
    
    Fixes llvm#106417.
    fhahn committed Aug 29, 2024
    c490658
  51. 1f0d545
  52. [NFC][Clang] Avoid potential null pointer dereferences in Sema::AddIn…

    …itializerToDecl(). (llvm#106235)
    
    Control flow analysis performed by a static analysis tool revealed the
    potential for null pointer dereferences to occur in conjunction with the
    `Init` parameter in `Sema::AddInitializerToDecl()`. On entry to the
    function, `Init` is required to be non-null as there are multiple
    potential branches that unconditionally dereference it. However, there
    were two places where `Init` is compared to null thus implying that
    `Init` is expected to be null in some cases. These checks appear to be
    purely defensive checks and thus unnecessary. Further, there were
    several cases where code checked `Result`, a variable of type
    `ExprResult`, for an invalid value, but did not check for a valid but
    null value and then proceeded to unconditionally dereference the
    potential null result. This change elides the unnecessary defensive
    checks and changes some checks for an invalid result to instead branch
    on an unusable result (either an invalid result or a valid but null
    result).
    tahonermann authored Aug 29, 2024
    049b60c
  53. 593526f
  54. [GlobalISel] Add bail outs for scalable vectors to some combines. (ll…

    …vm#106496)
    
    These combines call getNumElements() which isn't valid for scalable
    vectors.
    topperc authored Aug 29, 2024
    4ca817d
  55. 1827086
  56. [ExtendLifetimes][NFC] Add explicit triple to remaining fake-use tests

    One of the tests for the new fake use intrinsic is failing on darwin
    buildbots due to relying on behaviour for their expected triple; this
    commit adds explicit triples to the few remaining fake-use tests that
    didn't have them.
    
    Fixes commit 3d08ade (llvm#86149).
    
    Buildbot failures:
    https://lab.llvm.org/buildbot/#/builders/23/builds/2505
    SLTozer committed Aug 29, 2024
    412e3e3
  57. [clang] mangle placeholder for deduced type as a template-prefix (llv…

    …m#106335)
    
    As agreed on itanium-cxx-abi/cxx-abi#109 these
    placeholders should be mangled as a `template-prefix` production.
    
    ```
        <template-prefix> ::= <template unqualified-name>           # global template
                          ::= <prefix> <template unqualified-name>  # nested template
                          ::= <template-param>                      # template template parameter
                          ::= <substitution>
    ```
    
    Previous to this patch, the template template parameter case was not
    handled, and template template parameters were incorrectly being handled
    as unqualified-names.
    
    Before llvm#95202, DeducedTemplateType was not canonicalized correctly, so
    that template template parameter declarations were retained
    uncanonicalized.
    
    After llvm#95202, they are correctly canonicalized, but this now leads to
    these TTPs being anonymous entities, where the mangling implementation
    correctly doesn't expect an anonymous declaration of this kind, leading
    to a crash.
    
    Fixes llvm#106182.
    mizvekov authored Aug 29, 2024
    7284e0f
  58. [clang][test] Rewrote test using command substitution to work with li…

    …t internal shell syntax (llvm#105902)
    
    This patch rewrites a test that uses command substitution `$()` and the
    `stat` command, which are not supported by lit's internal shell. Instead
    of using this syntax to perform the file size comparison done in this
    test, a Python script is used instead to perform the same operation.
    
    Fixes llvm#102384.
    connieyzhu authored Aug 29, 2024
    4caf019
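
A file size comparison of the kind the commit above moves into Python might look like the sketch below. The script in the actual patch is not shown on this page, so everything here is an assumption: the function names, the exit codes, and in particular the direction of the comparison (first file strictly smaller than the second).

```python
import os
import sys


def first_is_smaller(path_a, path_b):
    """Compare on-disk sizes without relying on `stat` or `$()` in the shell."""
    return os.path.getsize(path_a) < os.path.getsize(path_b)


def main(argv):
    """Return 0 if the first file is smaller, 1 if not, 2 on bad usage."""
    if len(argv) != 3:
        print(f"usage: {argv[0]} <smaller-file> <larger-file>", file=sys.stderr)
        return 2
    return 0 if first_is_smaller(argv[1], argv[2]) else 1
```

A test could then call `sys.exit(main(sys.argv))` from a small wrapper script, so the check depends only on the Python interpreter and not on shell features.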
  59. [LegalizeVectorOps][PowerPC] Use xor to expand fneg. (llvm#106595)

    This preserves the semantics of fneg and matches what we do in
    LegalizeDAG.
    
    I kept the legal FSUB check to force unrolling for some targets that
    don't have FSUB but have XOR. On Aarch64, using xor broke some tests that
    expected to see a (v1f64 (fma (insertvector_elt (f64 (fneg
    (extractvectorelt X)))))) pattern.
    topperc authored Aug 29, 2024
    aa91d90
  60. e51fc36
  61. [SLP]Fix PR106626: try several attempts for lookup values, if not fo…

    …und.
    
    If the value is used in Scalar several times, the first attempt to find
    its position in the node (if ReuseShuffleIndices and ReorderIndices not
    empty) may fail. In this case, we need to find another copy of the same
    value and try again.
    Fixes llvm#106626
    alexey-bataev committed Aug 29, 2024
    cc943a6
  62. [flang][Driver] Add support for -mllvm -print-pipeline-passes

    The behavior deliberately mimics that of clang. Ideally, -print-pipeline-passes
    should be a first-class driver option. Notes to this effect have been added in 
    the appropriate places in both flang and clang.
    
    ---------
    
    Co-authored-by: Tarun Prabhu <tarun.prabhu@gmail.com>
    tarunprabhu and Tarun Prabhu authored Aug 29, 2024
    a1441ca
  63. [NVPTX] fixup incorrect rounding mode for int to float conversion (ll…

    …vm#106600)
    
    `uitofp` and `sitofp` instructions use the default rounding mode which
    is defined as round-to-nearest.
    AlexMaclean authored Aug 29, 2024
    dac1f7b
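
What round-to-nearest means for an integer-to-float conversion, as referenced in the commit above, can be seen just past 2^24, where a 32-bit float can no longer represent every integer. This Python sketch is unrelated to the NVPTX backend code; `int_to_f32` is a hypothetical helper, and it assumes the tie-breaking rule is the IEEE 754 default (ties to even).

```python
import struct


def int_to_f32(n):
    """Convert an integer to the nearest binary32 value (ties to even).

    Restricted to |n| < 2**53 so that the intermediate Python float (a
    binary64) holds n exactly and only one rounding step takes place.
    """
    if abs(n) >= 2**53:
        raise ValueError("outside the range this sketch handles exactly")
    return struct.unpack("<f", struct.pack("<f", float(n)))[0]


print(int_to_f32(16777216))  # prints 16777216.0 (2**24, exactly representable)
print(int_to_f32(16777217))  # prints 16777216.0 (tie, rounds to the even neighbour)
print(int_to_f32(16777219))  # prints 16777220.0 (tie, rounds to the even neighbour)
```

A truncating conversion would instead give 16777218.0 for the last input, so the two rounding modes are distinguishable with small test values like these.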
  64. [compiler-rt] Work around incompatible Windows definitions of (S)SIZE_T

    The interceptor types are supposed to match size_t (and the non-Windows
    ssize_t) exactly, but on 32-bit Windows `size_t` uses `unsigned int`
    whereas `SIZE_T` is `unsigned long`. The current definition results in
    `uptr` not matching `uintptr_t` since we otherwise get typedef
    redefinition errors. Work around this by using a #define instead of
    a typedef when defining SIZE_T.
    
    It would probably be cleaner to stop using these uppercase types, but
    that is a rather invasive change and this one is the minimal change to
    allow uptr to match uintptr_t on Windows.
    
    To ensure this compiles on Windows, we also remove the interceptor.h
    defines of uptr (that do not always match __sanitizer::uptr) and rely
    on __sanitizer::uptr instead. The interceptor types most likely predate
    those other types so clean up the unnecessary definition while here.
    
    This also reverts commit 18e06e3 and
    commit bb27dd8.
    
    Reviewed By: mstorsjo, vitalybuka
    
    Pull Request: llvm#106311
    arichardson authored Aug 29, 2024
    ec68dc1
  65. [compiler-rt] Remove duplicates of sanitizer_common functions

    These functions in interception_win.cpp already exist in
    sanitizer_common. Use those instead.
    
    Reviewed By: mstorsjo
    
    Pull Request: llvm#106488
    arichardson authored Aug 29, 2024
    9df92cb
  66. [RISCV] Separate ActiveElementsAffectResult into VL and Mask flags (l…

    …lvm#106517)
    
    In llvm#106110 we had to mark v[f]slide1down.vx as
    ActiveElementsAffectResult since the elements in the body depend on VL.
    However it doesn't depend on the mask, so this was overly conservative
    and broke the vmerge peephole.
    
    We can recover this by splitting up ActiveElementsAffectResult into VL
    and Mask bits, so we can more accurately model v[f]slide1down.vx and
    re-enable the peephole.
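
The idea of splitting one conservative flag into two independent bits can be sketched as follows. This is only a model in Python, not the RISC-V backend's TableGen or C++ code; the class name, the table, and the `hypothetical.mask_sensitive` entry are assumptions made for illustration.

```python
from enum import Flag, auto


class ActiveElements(Flag):
    """Which 'active element' inputs can change the values an instruction produces."""

    NONE = 0
    VL = auto()    # result elements depend on the vector length
    MASK = auto()  # result elements depend on the mask


# Hypothetical per-instruction table. Only the vslide1down.vx entry reflects
# the commit message; the second entry is a made-up placeholder.
FLAGS = {
    "vslide1down.vx": ActiveElements.VL,
    "hypothetical.mask_sensitive": ActiveElements.VL | ActiveElements.MASK,
}


def can_fold_mask_into(instr: str) -> bool:
    """A mask-folding peephole only needs the result to be independent of the mask."""
    return ActiveElements.MASK not in FLAGS.get(instr, ActiveElements.NONE)
```

With a single combined flag, `vslide1down.vx` would have been rejected by such a check; with separate bits it passes because only `VL` is set.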
    lukel97 authored Aug 29, 2024
    dbbfc95

Commits on Aug 30, 2024

  1. [SandboxIR] Implement SandboxIR Type (llvm#106294)

    This patch implements sandboxir::Type, a thin wrapper of llvm::Type.
    This is designed very similarly to sandboxir::Value. Context owns all
    sandboxir::Type objects and maintains a map between llvm::Type and
    sandboxir::Type.
    
    There are a couple of reasons for migrating from llvm::Type to
    sandboxir::Type:
    - Creating an llvm::Type from within SandboxIR-only code doesn't work
    well because it requires you to pass llvm::Context to functions like
    llvm::Type::getInt32Ty(C), but you wouldn't normally have access to
    llvm::Context C. In unit tests this is not such a big deal because you
    have access to both, but it will become an issue in SandboxIR-only code.
    - Not being able to get the sandboxir::Context from llvm::Type results
    in awkward sandboxir APIs with additional sandboxir::Context arguments.
    - llvm::Type::getContext() can basically give you access to the whole
    LLVM IR, which we should try to avoid.
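
The ownership scheme described above (a context that owns the wrappers and keeps a map from the wrapped type to its wrapper) can be sketched in a few lines. This is a Python model, not the SandboxIR C++ API; `LLVMType`, `SandboxType`, and `Context.get_type` are hypothetical stand-ins.

```python
class LLVMType:
    """Stand-in for llvm::Type (hypothetical, for illustration only)."""

    def __init__(self, name: str):
        self.name = name


class SandboxType:
    """Thin wrapper that remembers which sandbox context owns it."""

    def __init__(self, llvm_type: LLVMType, ctx: "Context"):
        self._llvm_type = llvm_type
        self._ctx = ctx

    def get_context(self) -> "Context":
        # Returns the sandbox context, never the underlying LLVM context.
        return self._ctx


class Context:
    """Owns every wrapper and maps each wrapped type to exactly one wrapper."""

    def __init__(self):
        self._types = {}  # LLVMType (keyed by identity) -> SandboxType

    def get_type(self, llvm_type: LLVMType) -> SandboxType:
        wrapper = self._types.get(llvm_type)
        if wrapper is None:
            wrapper = SandboxType(llvm_type, self)
            self._types[llvm_type] = wrapper
        return wrapper
```

Because the wrapper can hand back its owning context, APIs that take a wrapped type would not need a separate context argument.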
    vporpo authored Aug 30, 2024
    034f2b3
  2. Revert "[compiler-rt] Remove duplicates of sanitizer_common functions"

    This works for MinGW, but the MSVC linker apparently doesn't pull in
    those symbols. Reverting for now since I won't be able to reproduce it today.
    
    https://lab.llvm.org/buildbot/#/builders/107/builds/2337
    
    This reverts commit 9df92cb.
    arichardson committed Aug 30, 2024
    46fe36a
  3. [NVPTX] Fix crash caused by ComputePTXValueVTs (llvm#104524)

    When [lowering return
    values](https://github.com/llvm/llvm-project/blob/99a10f1fe8a7e4b0fdb4c6dd5e7f24f87e0d3695/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp#L3422)
    from LLVM IR to SelectionDAG, we check that [the number of values
    `SelectionDAG` tells us to return is equal to the number of values that
    `ComputePTXValueVTs()` tells us to
    return](https://github.com/llvm/llvm-project/blob/99a10f1fe8a7e4b0fdb4c6dd5e7f24f87e0d3695/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp#L3441).
    However, this check can fail on valid IR. For example:
    
    ```
    define <6 x half> @foo() {
      ret <6 x half> zeroinitializer
    }
    ```
    
    `ComputePTXValueVTs()` tells us to return ***3*** `v2f16` values, while
    `SelectionDAG` tells us to return ***6*** `f16` values. Thus, the
    compiler will crash.
    
    `ComputePTXValueVTs()` [supports all `half` element vectors with an even
    number of
    elements](https://github.com/llvm/llvm-project/blob/99a10f1fe8a7e4b0fdb4c6dd5e7f24f87e0d3695/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp#L213).
    Whereas `SelectionDAG` [only supports power-of-2 sized
    vectors](https://github.com/llvm/llvm-project/blob/4e078e3797098daa40d254447c499bcf61415308/llvm/lib/CodeGen/TargetLoweringBase.cpp#L1580).
    This is the root of the discrepancy.
    
    Assuming that the developers who added the code to
    `ComputePTXValueVTs()` overlooked this, I've restricted
    `ComputePTXValueVTs()` to compute the same number of return values as
    `SelectionDAG`, instead of extending `SelectionDAG` to support
    non-power-of-2 sized vectors.
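
The mismatch in the number of returned values can be modelled with a small sketch. The three functions below are simplified assumptions about how each side splits a `<N x half>` return value, written in Python for illustration; they are not the actual NVPTX or SelectionDAG logic.

```python
def is_power_of_two(n: int) -> bool:
    return n > 0 and (n & (n - 1)) == 0


def old_compute_ptx_value_vts(num_elts: int) -> list:
    """Assumed old behavior: pack any even-length half vector into v2f16 pairs."""
    if num_elts % 2 == 0:
        return ["v2f16"] * (num_elts // 2)
    return ["f16"] * num_elts


def selection_dag_value_vts(num_elts: int) -> list:
    """Assumed SelectionDAG behavior: only power-of-2 vectors stay packed."""
    if num_elts >= 2 and is_power_of_two(num_elts):
        return ["v2f16"] * (num_elts // 2)
    return ["f16"] * num_elts


def new_compute_ptx_value_vts(num_elts: int) -> list:
    """Assumed fixed behavior: agree with SelectionDAG on the value count."""
    return selection_dag_value_vts(num_elts)


# <6 x half>: the old computation yields 3 values, SelectionDAG yields 6.
old_count = len(old_compute_ptx_value_vts(6))
dag_count = len(selection_dag_value_vts(6))
```

Under these assumptions the two counts disagree only for even, non-power-of-2 lengths such as 6, which matches the crashing example.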
    justinfargnoli authored Aug 30, 2024
    cdaebf6
  4. Reapply "[nfc][mlgo] Incrementally update DominatorTreeAnalysis in Fu…

    …nctionPropertiesAnalysis (llvm#104867) (llvm#106309)
    
    Reverts c992690.
    
    The problem is that if there is a sequence "{delete A->B} {delete A->B}
    {insert A->B}" the net result is "{delete A->B}", which is not what we
    want.
    
    Duplicate successors may happen in cases like switch statements (as
    shown in the unit test).
    
    The second problem was that in `invoke` cases, some edges we speculate may get deleted don't, but are also not reachable from the inlined call site's basic block. We just need to check which edges are actually not present anymore.
    
    The fix is to sanitize the list of deletes, just like we do for inserts.
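
A minimal sketch of what sanitizing the update lists could look like, assuming edges are `(from, to)` pairs and that the set of edges present after the change is known. The function name and data shapes are hypothetical; this is not the FunctionPropertiesAnalysis code.

```python
def sanitize_updates(candidate_deletes, candidate_inserts, current_edges):
    """Deduplicate updates and drop the ones that contradict the current CFG.

    candidate_deletes / candidate_inserts: lists of (src, dst) edges that were
    speculated to change. current_edges: set of edges that exist right now.
    """
    deletes, seen = [], set()
    for edge in candidate_deletes:
        # Keep a delete only once, and only if the edge is really gone.
        if edge in seen or edge in current_edges:
            continue
        seen.add(edge)
        deletes.append(edge)

    inserts, seen = [], set()
    for edge in candidate_inserts:
        # Keep an insert only once, and only if the edge really exists.
        if edge in seen or edge not in current_edges:
            continue
        seen.add(edge)
        inserts.append(edge)
    return deletes, inserts


# Duplicate successors (as in a switch) produce the problematic sequence
# "delete A->B, delete A->B, insert A->B" while A->B is still present.
deletes, inserts = sanitize_updates(
    [("A", "B"), ("A", "B"), ("A", "C")],
    [("A", "B")],
    {("A", "B")},
)
```

Here the duplicated delete of `A->B` is dropped because the edge is still present, so only the delete of `A->C` survives.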
    mtrofin authored Aug 30, 2024
    1991aa6
  5. 7579787
  6. [RISCV][TTI] Add legality check of vector of address for gather/scatt…

    …er. (llvm#106481)
    
    This patch adds a legality check to `isLegalMaskedGatherScatter()` that
    checks whether the target machine supports vectors of addresses.
    ElvisWang123 authored Aug 30, 2024
    e29c5f3
  7. Reapply "[HWASan] remove incorrectly inferred attributes" (llvm#106622)…

    … (llvm#106624)
    
    This reverts commit 66927fb.
    
    Fixed clang tests
    fmayer authored Aug 30, 2024
    12b0257
  8. [C++20] [Modules] Skip checking ODR for merged context in GMF

    Solve clangd/clangd#2094
    
    Because clangd enables PCH automatically, the previous mechanism to skip
    the ODR check in the GMF may be invalid. This patch fixes this for one case.
    ChuanqiXu9 committed Aug 30, 2024
    ca2351d
  9. ddaf2e2
  10. [NVPTX][AA] Traverse use-def chain to find non-generic addrspace (llv…

    …m#106477)
    
    Address space information may be encoded anywhere along the use-def
    chain. Take advantage of this by traversing the chain until we find a
    non-generic addrspace.
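
The traversal can be sketched as a bounded walk up a chain of values. The `Value` class, the single-operand chain, and the depth limit are assumptions made for this Python illustration; the numbering (0 for generic, 3 for shared) follows the NVPTX address space convention.

```python
GENERIC = 0  # NVPTX generic address space
SHARED = 3   # NVPTX shared address space


class Value:
    """Hypothetical pointer value with one defining operand (e.g. a cast or GEP)."""

    def __init__(self, addrspace: int = GENERIC, operand: "Value" = None):
        self.addrspace = addrspace
        self.operand = operand


def find_addrspace(value: Value, max_depth: int = 8) -> int:
    """Walk the use-def chain until a non-generic address space is found."""
    depth = 0
    while value is not None and depth < max_depth:
        if value.addrspace != GENERIC:
            return value.addrspace
        value = value.operand
        depth += 1
    return GENERIC  # nothing more specific is known


# A shared-memory pointer that was cast to generic and then offset twice.
base = Value(addrspace=SHARED)
derived = Value(operand=Value(operand=Value(operand=base)))
```

Looking only at `derived` itself would report the generic address space; following the chain recovers the shared address space of `base`.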
    AlexMaclean authored Aug 30, 2024
    e004566
  11. [HLSL] Add HLSLAttributedResourceType (llvm#106181)

    Introducing `HLSLAttributedResourceType` - a new type that is similar to
    `AttributedType` but with additional data specific to HLSL resources.
    `AttributedType` currently only stores an attribute kind and no
    additional data from the type attribute parameters. This does not really
    work for HLSL resources since their type attributes contain non-boolean
    values that need to be retained as well.
    
    For example:
    
    ```
    template <typename T> class RWBuffer {
      __hlsl_resource_t  [[hlsl::resource_class(uav)]] [[hlsl::is_rov]] handle;
    };
    ```
    
    The data `HLSLAttributedResourceType` needs to eventually store are:
    - resource class (SRV, UAV, CBuffer, Sampler)
    - texture dimension(1-3)
    - flags is_rov, is_array, is_feedback and is_multisample
    - contained type
    
    All of these values except contained type will be stored in
    `HLSLAttributedResourceType::Attributes` struct and accessed
    individually via the fields. There is also a `Data` alias that covers all
    of these values as an `unsigned`, which is used for hashing and the AST
    type serialization.
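
The idea of a struct of individual fields that can also be viewed as one unsigned integer can be sketched as below. This is a Python model with an invented bit layout; the field widths, their order, and the `data`/`from_data` names are assumptions, not the layout used by `HLSLAttributedResourceType::Attributes`.

```python
from dataclasses import dataclass

RESOURCE_CLASSES = ("SRV", "UAV", "CBuffer", "Sampler")


@dataclass(frozen=True)
class Attributes:
    resource_class: str          # one of RESOURCE_CLASSES
    dimension: int               # texture dimension, 1-3
    is_rov: bool = False
    is_array: bool = False
    is_feedback: bool = False
    is_multisample: bool = False

    def data(self) -> int:
        """Pack all fields into one integer (hypothetical layout)."""
        bits = RESOURCE_CLASSES.index(self.resource_class)  # bits 0-1
        bits |= self.dimension << 2                         # bits 2-3
        bits |= int(self.is_rov) << 4
        bits |= int(self.is_array) << 5
        bits |= int(self.is_feedback) << 6
        bits |= int(self.is_multisample) << 7
        return bits

    @classmethod
    def from_data(cls, bits: int) -> "Attributes":
        """Inverse of data(), as serialization would need."""
        return cls(
            resource_class=RESOURCE_CLASSES[bits & 0b11],
            dimension=(bits >> 2) & 0b11,
            is_rov=bool((bits >> 4) & 1),
            is_array=bool((bits >> 5) & 1),
            is_feedback=bool((bits >> 6) & 1),
            is_multisample=bool((bits >> 7) & 1),
        )


rov_buffer = Attributes("UAV", 2, is_rov=True)
```

A single packed value like `rov_buffer.data()` is convenient as a hash key and as the serialized form, while the named fields stay readable in the rest of the code.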
    
    During type attribute processing all HLSL type attributes will be
    validated and collected by SemaHLSL (by
    `SemaHLSL::handleResourceTypeAttr`) and in the end combined into a
    single `HLSLAttributedResourceType` instance (in
    `SemaHLSL::ProcessResourceTypeAttributes`). `SemaHLSL` will also need to
    short-term store the `TypeLoc` information for the new type that will be
    grabbed by `TypeSpecLocFiller` soon after the type is created.
    
    Part 1/2 of llvm#104861
    hekota authored Aug 30, 2024
    e00e9a3
  12. [flang][cuda] Do inline allocation/deallocation in device code (llvm#…

    …106628)
    
    ALLOCATE and DEALLOCATE statements can be inlined in device functions.
    This patch updates the condition that determines whether to inline these
    actions in lowering.
    
    This avoids runtime calls in device function code and can speed up
    execution.
    
    Also move `isCudaDeviceContext` from `Bridge.cpp` so it can be used
    elsewhere.
    clementval authored Aug 30, 2024
    d4c519e
  13. 24e791b
  14. [flang][fir] allow fir.convert from and to !llvm.ptr type (llvm#106590)

    Allow some interaction between the LLVM and FIR dialects by allowing
    conversion between FIR memory types and the llvm.ptr type.
    This is meant to help experimentation where the FIR and LLVM dialects
    coexist, and is useful to deal with cases where an LLVM type makes it
    early into the MLIR produced by flang, like when inserting LLVM stack
    intrinsics here:
    https://github.com/llvm/llvm-project/blob/0a00d32c5f88fce89006dcde6e235bc77d7b495e/flang/lib/Optimizer/Transforms/StackReclaim.cpp#L57
    jeanPerier authored Aug 30, 2024
    b65fc7e
  15. [flang][acc] allow and ignore DIR between ACC and loops (llvm#106522)

    The current pattern was failing OpenACC semantics in acc parse tree
    canonicalization:
    
    ```
    !$acc loop
    !$dir vector aligned
    do i=1,n
    ...
    ```
    
    Fix it by moving the directive before the OpenACC construct node.
    
    Note that I think it could make sense to propagate the $dir info to the
    acc.loop; at least with classic flang, the $dir seems to make a
    difference. This is not done here since few directives are supported
    anyway.
    jeanPerier authored Aug 30, 2024
    a527248
  16. [NFC] [clangd] [Modules] Extract ModuleFile class and IsModuleFileUpT…

    …oDate function
    
    This patch extracts the ModuleFile class from StandalonePrerequisiteModules
    so that we can reuse it further. It also adds an IsModuleFileUpToDate
    function, which is used to implement
    StandalonePrerequisiteModules::CanReuse. Both changes aim to ease
    future improvements to the support of modules in clangd, and both
    should be NFC.
    ChuanqiXu9 committed Aug 30, 2024
    448d8fa
  17. [compiler-rt][AArch64][Android] Use getauxval on Android. (llvm#102979)

    __getauxval is a libgcc function that doesn't exist on Android.
    Also use getauxval on Linux, since it is already used in other places in compiler-rt.
    DanielKristofKiss authored Aug 30, 2024
    cd634f5
  18. [NFC] [clangd] [Modules] Change the argument type of IsModuleFileUpTo…

    …Date to reference
    
    It is better to use references instead of pointers as the argument type
    of IsModuleFileUpToDate, since the PrerequisiteModules is always
    expected to exist.
    ChuanqiXu9 committed Aug 30, 2024
    d68059b
  19. [flang][openacc] parse and ignore non-standard shortloop clause (llvm…

    …#106564)
    
    shortloop is a non-standard OpenACC extension
    (https://docs.nvidia.com/hpc-sdk/pgi-compilers/2015/pgirn157.pdf) that
    can be found on loop directives.
    
    The f18 parser was choking when seeing it. Since it can be found in existing
    apps and is mainly an optimization hint, parse it on loop directives and
    ignore it with a warning.
    
    For the record, here is the meaning of shortloop according to the manual linked above:
    
    "If the shortloop clause appears on a loop directive with the vector clause, it tells the compiler that the
    loop trip count is less than or equal to the number of vector lanes created for that loop. This means the
    value of the vector() clause on the loop directive in a kernels region, or the value of the
    vector_length() clause on the parallel directive in a parallel region will be greater than or
    equal to the loop trip count. This allows the compiler to generate more efficient code for the loop"
    jeanPerier authored Aug 30, 2024
    8ca6401
  20. [NFC] Add explicit #include llvm-config.h where its macros are used. (l…

    …lvm#106621)
    
    Without these explicit includes, removing other headers that implicitly
    include llvm-config.h may have non-trivial side effects.
    dfukalov authored Aug 30, 2024
    89e6a28
  21. [IPSCCP] Infer nonnull return attribute (llvm#106553)

    Similarly to the existing range attribute inference, also infer the
    nonnull attribute on function return values.
    
    I think in practice FunctionAttrs will handle nearly all cases; the main
    exception, I think, is cases involving branch conditions. But as we
    already have the information here, we may as well materialize it.
    nikic authored Aug 30, 2024
    d6ad551
  22. e0fa2f1
  23. [AArch64][SelectionDAG] Vector splitting and promotion for histogram …

    …intrinsic (llvm#103037)
    
    Adds support for wider-than-legal vector types for the histogram
    intrinsic (llvm.experimental.vector.histogram.add) by splitting the
    vector. Also adds integer promotion for the Inc operand.
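
Why splitting is a valid way to handle a wider-than-legal vector can be seen from a small model of the histogram operation. The sketch below is plain Python, with bucket indices standing in for the vector of pointers; the function names are hypothetical and this is not the SelectionDAG legalization code.

```python
def histogram_add(buckets, indices, inc, mask):
    """Model of a masked histogram update: bump one bucket per active lane."""
    for idx, active in zip(indices, mask):
        if active:
            buckets[idx] += inc


def histogram_add_split(buckets, indices, inc, mask):
    """Same update, performed as two half-width operations."""
    half = len(indices) // 2
    histogram_add(buckets, indices[:half], inc, mask[:half])
    histogram_add(buckets, indices[half:], inc, mask[half:])


indices = [0, 1, 1, 3, 1, 0, 2, 2]
mask = [True, True, False, True, True, True, True, False]

whole = [0, 0, 0, 0]
histogram_add(whole, indices, 2, mask)

split = [0, 0, 0, 0]
histogram_add_split(split, indices, 2, mask)
```

Each lane contributes an independent increment, so processing the low and high halves one after the other leaves the buckets in the same state as the full-width operation.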
    DevM-uk authored Aug 30, 2024
    1693d8e
  24. [EdgeBundles] Correct MBB label name in output graph when `-view-edge…

    …-bundles`. NFC. (llvm#106661)
    
    With `-view-edge-bundles`, before the change, the dot file output looks
    like
    ```dot
    digraph {
            "%bb.0" [ shape=box ]
            0 -> "%bb.0"
            "%bb.0" -> 1
            "%bb.0" -> "%bb.1" [ color=lightgray ]
            "%bb.0" -> "%bb.6" [ color=lightgray ]
            "%bb.1" [ shape=box ]
            1 -> "%bb.1"
            "%bb.1" -> 1
            "%bb.1" -> "%bb.2" [ color=lightgray ]
            "%bb.1" -> "%bb.6" [ color=lightgray ]
            "%bb.2" [ shape=box ]
            1 -> "%bb.2"
            "%bb.2" -> 1
            "%bb.2" -> "%bb.3" [ color=lightgray ]
            "%bb.3" [ shape=box ]
            1 -> "%bb.3"
            "%bb.3" -> 2
            "%bb.3" -> "%bb.4" [ color=lightgray ]
            "%bb.4" [ shape=box ]
            2 -> "%bb.4"
            "%bb.4" -> 2
            "%bb.4" -> "%bb.4" [ color=lightgray ]
            "%bb.4" -> "%bb.5" [ color=lightgray ]
            "%bb.5" [ shape=box ]
            2 -> "%bb.5"
            "%bb.5" -> 1
            "%bb.5" -> "%bb.6" [ color=lightgray ]
            "%bb.5" -> "%bb.3" [ color=lightgray ]
            "%bb.6" [ shape=box ]
            1 -> "%bb.6"
            "%bb.6" -> 3
    }
    ```
    However, the graph output by graphviz is
    
    ![t](https://github.com/user-attachments/assets/24056c0a-3ba9-49c3-a5da-269f3140e619)
    The node name corresponding to the MBB is incorrect.
    After the change, the node name is consistent with MBB's name.
    
    ![s](https://github.com/user-attachments/assets/38c649d1-7222-4de1-971c-56f7721ab64c)
    bzEq authored Aug 30, 2024
    d2b8969
  25. [lldb][AArch64] Do not crash if NT_ARM_TLS is missing (llvm#106478)

    [D156118](https://reviews.llvm.org/D156118) states that this note is
    always present, but it is better to check it explicitly, as otherwise
    `lldb` may crash when trying to read registers.
    igorkudrin authored Aug 30, 2024
    ce7c828

Commits on Sep 25, 2024

  1. 27fa658