[SYCL][Graph] Update doc for UR PR moving reset commands to a dedicated cmd-list #357

Closed
wants to merge 744 commits into from
This pull request is big! We’re only showing the most recent 250 commits.

Commits on Feb 13, 2024

  1. [mlir][nvgpu] Make phaseParity of mbarrier.try_wait i1 (#81460)

    Currently, the `phaseParity` argument of `nvgpu.mbarrier.try_wait.parity` is
    an index. This can cause a problem if it is passed any value other than
    0 or 1, because the PTX instruction only accepts an even or odd phase. This
    PR makes the phaseParity argument i1 to avoid misuse.
    
    Here is the information from PTX doc:
    
    ```
    The .parity variant of the instructions test for the completion of the phase indicated 
    by the operand phaseParity, which is the integer parity of either the current phase or 
    the immediately preceding phase of the mbarrier object. An even phase has integer 
    parity 0 and an odd phase has integer parity of 1. So the valid values of phaseParity 
    operand are 0 and 1.
    ```
    See for more information:
    
    https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-mbarrier-test-wait-mbarrier-try-wait
    grypp authored Feb 13, 2024
    0a600c3
  2. 4588525
  3. [clang][dataflow] Add Environment::initializeFieldsWithValues(). (#81239)

    This function will be useful when we change the behavior of record-type
    prvalues so that they directly initialize the associated result object.
    See also the comment here for more details:

    https://github.com/llvm/llvm-project/blob/9e73656af524a2c592978aec91de67316c5ce69f/clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h#L354

    As part of this patch, we document and assert that synthetic fields may
    not have reference type.

    There is no practical use case for this: A `StorageLocation` may not
    have reference type, and a synthetic field of the corresponding
    non-reference type can serve the same purpose.
    martinboehme authored Feb 13, 2024
    270f2c5
  4. 5b01522
  5. [HWASAN] Update dbg.assign intrinsics in HWAsan pass (#79864)

    llvm.dbg.assign intrinsics have 2 {value, expression} pairs; fix hwasan to
    update the second expression.
    
    Fixes #76545. This is #78606 rebased and with the addition of DPValue handling.
    Note the addition of --try-experimental-debuginfo-iterators in the tests and
    some shuffling of code in MemoryTaggingSupport.cpp.
    OCHyams authored Feb 13, 2024
    d860ea9
  6. [InstCombine] Don't add fcmp instructions to strictfp functions (#81498)

    The strictfp attribute has the requirement that "LLVM will not introduce
    any new floating-point instructions that may trap". The llvm.is.fpclass
    intrinsic is documented as "The function never raises floating-point
    exceptions", and the fcmp instruction may raise one, so we can't
    transform the former into the latter in functions with the strictfp
    attribute.
    ostannard authored Feb 13, 2024
    44706bd
  7. Revert "[CVP] Check whether the default case is reachable (#79993)" (#81585)
    
    This reverts commit a034e65.
    
    Some protobuf users reported that this patch caused a significant
    compile-time regression because `TailDuplicator` works poorly with a
    specific pattern.
    
    We will reland it once the codegen issue is fixed.
    dtcxzyw authored Feb 13, 2024
    ca61e6a
  8. [clang-tidy] ignore local variable with [maybe_unused] attribute in bugprone-unused-local-non-trivial-variable (#81563)

    HerrCai0907 authored Feb 13, 2024
    ebe77cc
  9. 8c6e96d
  10. f506192
  11. [AMDGPU][NFC] Get rid of some operand decoders defined using macros. (#81482)
    
    Use templates instead.
    
    Part of <llvm/llvm-project#62629>.
    kosarev authored Feb 13, 2024
    4c93109
  12. [lld] Add target support for SystemZ (s390x) (#75643)

    This patch adds full support for linking SystemZ (ELF s390x) object
    files. Support should be generally complete:
    - All relocation types are supported.
    - Full shared library support (DYNAMIC, GOT, PLT, ifunc).
    - Relaxation of TLS and GOT relocations where appropriate.
    - Platform-specific test cases.
    
    In addition to new platform code and the obvious changes, there were a
    few additional changes to common code:
    
    - Add three new RelExpr members (R_GOTPLT_OFF, R_GOTPLT_PC, and
    R_PLT_GOTREL) needed to support certain s390x relocations. I chose not
    to use a platform-specific name since nothing in the definition of these
    relocs is actually platform-specific; it is well possible that other
    platforms will need the same.
    
    - A couple of tweaks to TLS relocation handling, as the particular
    semantics of the s390x versions differ slightly. See comments in the
    code.
    
    This was tested by building and testing >1500 Fedora packages, with only
    a handful of failures; as these also have issues when building with LLD
    on other architectures, they seem unrelated.
    
    Co-authored-by: Tulio Magno Quites Machado Filho <tuliom@redhat.com>
    uweigand and tuliom authored Feb 13, 2024
    fe3406e
  13. [flang][Driver] Add -masm option to flang (#81490)

    The motivation here was a suggestion over in Compiler Explorer. You can
    use `-mllvm` already to do this but since gfortran supports `-masm`, I
    figured I'd try to add it.
    
    This is done by flang expanding `-masm` into `-mllvm x86-asm-syntax=`,
    then passing that to fc1. Which then collects all the `-mllvm` options
    and forwards them on.
    
    The code to expand it comes from clang `Clang::AddX86TargetArgs` (there
    are some other places doing the same thing too). However I've removed
    the `-inline-asm` that clang adds, as fortran doesn't have inline
    assembly.
    
    So `-masm` for flang purely changes the style of assembly output.
    
    ```
    $ ./bin/flang-new /tmp/test.f90 -o - -S -target x86_64-linux-gnu
    <...>
            pushq   %rbp
    $ ./bin/flang-new /tmp/test.f90 -o - -S -target x86_64-linux-gnu -masm=att
    <...>
            pushq   %rbp
    $ ./bin/flang-new /tmp/test.f90 -o - -S -target x86_64-linux-gnu -masm=intel
    <...>
            push    rbp
    ```
    
    The test is adapted from `clang/test/Driver/masm.c` by removing the
    clang-cl related lines and changing the 32 bit triples to 64 bit triples
    since flang doesn't support 32 bit targets.
    DavidSpickett authored Feb 13, 2024
    9ca1a15
  14. [dataflow] CXXOperatorCallExpr equal operator might not be a glvalue (#80991)

    Although in a normal implementation the assumption is reasonable, it
    seems that some esoteric implementations do not return a T&. This
    should be handled correctly and the values be propagated.
    
    ---------
    
    Co-authored-by: martinboehme <mboehme@google.com>
    paulsemel and martinboehme authored Feb 13, 2024
    a8fb0dc
  15. [mlir][VectorOps] Add conversion of 1-D vector.interleave ops to LLVM (#80966)
    
    The 1-D case directly maps to LLVM intrinsics. The n-D case will be
    handled by unrolling to 1-D first (in a later patch).
    
    Depends on: #80965
    MacDue authored Feb 13, 2024
    79ce2c9
  16. [gn build] Port fe3406e

    llvmgnsyncbot committed Feb 13, 2024
    e678e6e
  17. [ADT] Allow std::next to work on BitVector's set_bits_iterator (#80830)

    Without this I would hit errors with libstdc++-12 like:
    
    /usr/include/c++/12/bits/stl_iterator_base_funcs.h:230:5: note:
    candidate template ignored: substitution failure [with _InputIterator =
    llvm::const_set_bits_iterator_impl<llvm::BitVector>]: argument may not
    have 'void' type
        next(_InputIterator __x, typename
        ^
    jayfoad authored Feb 13, 2024
    8456e0c
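The error quoted in the commit above (`argument may not have 'void' type`) comes from `std::next` forming its second parameter from the iterator's `difference_type`. The commit message does not show the fix itself, so the following is only a minimal sketch under that assumption. It uses a hypothetical `SetBitsIterator` over a single 64-bit word (not LLVM's `BitVector` iterator) whose member types are enough for `std::next` to compile.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>

// Hypothetical iterator over the indices of set bits in one 64-bit word.
// It only illustrates the member types std::iterator_traits looks for.
class SetBitsIterator {
public:
  using iterator_category = std::forward_iterator_tag;
  using value_type = unsigned;
  using difference_type = std::ptrdiff_t; // must not be void for std::next
  using pointer = const unsigned *;
  using reference = unsigned;

  // Position 64 acts as the end iterator.
  SetBitsIterator(std::uint64_t Bits, unsigned Pos) : Bits(Bits), Pos(Pos) {
    skipToSetBit();
  }

  unsigned operator*() const { return Pos; }

  SetBitsIterator &operator++() {
    ++Pos;
    skipToSetBit();
    return *this;
  }

  SetBitsIterator operator++(int) {
    SetBitsIterator Old = *this;
    ++*this;
    return Old;
  }

  bool operator==(const SetBitsIterator &Other) const {
    return Pos == Other.Pos;
  }
  bool operator!=(const SetBitsIterator &Other) const {
    return Pos != Other.Pos;
  }

private:
  // Advance Pos until it names a set bit or reaches the end position.
  void skipToSetBit() {
    while (Pos < 64 && ((Bits >> Pos) & 1) == 0)
      ++Pos;
  }

  std::uint64_t Bits;
  unsigned Pos;
};
```

With these member types in place, `std::next(It)` and `std::next(It, 2)` resolve normally, because `std::iterator_traits<SetBitsIterator>::difference_type` is a usable integer type.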
  18. [mlir][openmp] - Add the depend clause to omp.target and related offloading directives (#81081)

    This patch adds support for the depend clause in a number of OpenMP
    directives/constructs related to offloading. Specifically, it adds the
    handling of the depend clause when it is used with the following
    constructs:
    
    - target
    - target enter data
    - target update data
    - target exit data
    bhandarkar-pranav authored Feb 13, 2024
    55d6643
  19. e79ad7b
  20. [RemoveDIs][ValueMapper] Remap DIAssignIDs in DPValues (#81595)

    Fix crash raised in comments for 5c9f768
    OCHyams authored Feb 13, 2024
    97088b2
  21. [mlir][linalg] Document ops not supported by the vectoriser (nfc) (#81500)
    
    Adds a test to help document Linalg Ops that are currently not supported
    by the vectoriser (i.e. the logic to vectorise these is missing). The
    list is not exhaustive.
    banach-space authored Feb 13, 2024
    bfc0b7c
  22. [mlir][vector] ND vectors linearization pass (#81159)

    Common backends (LLVM, SPIR-V) only support 1D vectors. The LLVM
    conversion handles ND vectors (N >= 2) as `array<array<... vector>>`, and
    the SPIR-V conversion doesn't handle them at all at the moment. Sometimes
    it's preferable to treat multidim vectors as linearized 1D. Add a pass to
    do this. Only constants and simple elementwise ops are supported for now.

    @krzysz00 I've extracted your result type conversion code from
    LegalizeToF32 and moved it to a common place.

    Also, add a ConversionPattern class operating on traits.
    Hardcode84 authored Feb 13, 2024
    35ef399
  23. 990896a
  24. [RISCV] Fix assertion in lowerEXTRACT_SUBVECTOR

    This fixes a crash when lowering an extract_subvector like:
    
    t0:v1i64 = extract_subvector t1:v2i64, 1
    
    Whilst we never need a vslidedown with M1 on scalable vector types, we might
    need to do it for v1i64/v1f64, since the smallest container type for it is
    nxv1i64/nxv1f64.
    
    The lowering code is still correct for this case, but the assertion was too
    strict. The actual invariant we're relying on is that ContainerSubVecVT's LMUL
    <= M1, not < M1. Hence why we handled v2i32 fine, because its container type
    was nxv1i32 and MF2.
    lukel97 committed Feb 13, 2024
    208edf7
  25. [clang][Interp] Handle CXXUuidofExprs

    Allocate storage and initialize it with the given APValue contents.
    tbaederr committed Feb 13, 2024
    9b718c0
  26. [SystemZ][z/OS][libcxx] mark aligned allocation tests XFAIL on z/OS (#80735)

    z/OS doesn't support aligned allocation, so mark these testcases as
    unsupported.
    
    Continuation of https://reviews.llvm.org/D102798
    abhina-sree authored Feb 13, 2024
    a70077e
  27. [MC/DC] Refactor: Make MCDCParams as std::variant (#81227)

    Introduce `mcdc::DecisionParameters` and `mcdc::BranchParameters` and make
    sure they are not initialized as zero.
    
    FIXME: Could we make `CoverageMappingRegion` as a smart tagged union?
    chapuni authored Feb 13, 2024
    a17a3e9
  28. [TableGen] Use vectors instead of sets for testing intersection. NFC. (#81602)

    In a few places we test whether sets (i.e. sorted ranges) intersect by
    computing the set_intersection and then testing whether it is empty. For
    this purpose it should be more efficient to use a std::vector instead of
    a std::set to hold the result of the set_intersection, since insertion
    is simpler.
    jayfoad authored Feb 13, 2024
    880afa1
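The pattern described in the commit above (compute the intersection of two sorted ranges, then test whether it is empty) can be sketched as follows. The helper name `sortedSetsIntersect` and the use of `int` elements are assumptions for illustration; this is not TableGen's code.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <vector>

// Hypothetical helper: returns true if the two sorted sets share an element.
// The intersection is collected into a std::vector, where appending is a
// simple push_back, rather than into another std::set.
bool sortedSetsIntersect(const std::set<int> &A, const std::set<int> &B) {
  std::vector<int> Common;
  std::set_intersection(A.begin(), A.end(), B.begin(), B.end(),
                        std::back_inserter(Common));
  return !Common.empty();
}
```

Only the emptiness of the result is inspected, so the container holding it needs cheap insertion and nothing else.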
  29. [clang][Interp] Handle Requires- and ConceptSpecializationExprs

    Just emit their satisfaction state, which is what the current
    interpreter does as well.
    tbaederr committed Feb 13, 2024
    bb60c06
  30. [OpenACC] Implement AST for OpenACC Compute Constructs (#81188)

    'serial', 'parallel', and 'kernel' constructs are all considered
    'Compute' constructs. This patch creates the AST type, plus the required
    infrastructure for such a type, plus some base types that will be useful
    in the future for breaking this up.
    
    The only difference between the three is the 'kind' (plus some minor
    clause legalization rules, but those can be differentiated easily
    enough), so rather than representing them as separate AST nodes, it
    seems to make sense to make them the same.
    
    Additionally, no clause AST functionality is being implemented yet, as
    that fits better in a separate patch, and this is enough to get the
    'naked' constructs implemented.
    
    This is otherwise an 'NFC' patch, as it doesn't alter execution at all,
    so there aren't any tests.  I did this to break up the review workload
    and to get feedback on the layout.
    erichkeane authored Feb 13, 2024
    f655778
  31. [gn build] Port f655778

    llvmgnsyncbot committed Feb 13, 2024
    af56bea
  32. 742ec3a
  33. [Object][COFF][NFC] Make writeImportLibrary NativeExports argument optional. (#81600)

    It's not interesting for the majority of downstream users.
    cjacek authored Feb 13, 2024
    4612208
  34. Reapply "[DebugInfo][RemoveDIs] Turn on non-instrinsic debug-info by default"
    
    This reapplies commit bdde5f9 by undoing the revert bc66e0c.
    
    The previous reapplication 5c9f768 was reverted due to a crash
    (reproducer in comments for 5c9f768) which was fixed in #81595.
    
    As noted in the original commit, this commit may break downstream tests.
    If this commit is breaking your downstream tests, please see comment 12 in
    [0], which documents the kind of variation in tests we'd expect to see from
    this change and what to do about it.
    
    [0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939
    OCHyams committed Feb 13, 2024
    d759618
  35. [TableGen] Use std::move instead of swap. NFC. (#81606)

    Historically TableGen has used `A.swap(B)` to move containers without
    the expense of copying them. Perhaps this predated rvalue references. In
    any case `A = std::move(B)` seems like a more direct way to implement
    this when only A is required after the operation.
    jayfoad authored Feb 13, 2024
    f7cddf8
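The two idioms contrasted in the commit above can be put side by side in a small sketch. The function names `takeBySwap` and `takeByMove` are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Older idiom: swap with an empty container so that no elements are copied.
std::vector<int> takeBySwap(std::vector<int> &B) {
  std::vector<int> A;
  A.swap(B); // A now holds B's old contents; B holds A's (empty) contents.
  return A;
}

// More direct idiom when only A is needed afterwards.
std::vector<int> takeByMove(std::vector<int> &B) {
  std::vector<int> A;
  A = std::move(B); // B is left in a valid but unspecified state.
  return A;
}
```

The difference matters only for the source container: after `swap` it is guaranteed to hold the destination's previous contents, while after `std::move` its contents should not be relied on.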
  36. Fix warning by removing unused variable (#81604)

    Apparently, some compilers [correctly] warn that the variable that was
    created prior to this change is unused.
    
    This removes the variable.
    Leporacanthicus authored Feb 13, 2024
    d1f510c
  37. 5e5e51e
  38. [GitHub][workflows] Ask reviewers to merge PRs when author cannot (#81142)

    This uses
    https://pygithub.readthedocs.io/en/stable/github_objects/Repository.html?highlight=get_collaborator_permission#github.Repository.Repository.get_collaborator_permission
    which calls
    https://docs.github.com/en/rest/collaborators/collaborators?apiVersion=2022-11-28#get-repository-permissions-for-a-user
    and returns the top level "permission" key.

    This is less detailed than the user/permissions key but should be fine
    for this use case.
    
    When a review is submitted we check:
    * If it's an approval.
    * Whether we have already left a merge on behalf comment (by looking for
    a hidden HTML comment).
    * Whether the author has permissions to merge their own PR. 
    * Whether the reviewer has permissions to merge.
    
    If needed we leave a comment tagging the reviewer. If the reviewer also
    doesn't have merge permission, then it asks them to find someone else
    who does.
    DavidSpickett authored Feb 13, 2024
    38c706e
  39. [ARM] __ARM_ARCH macro definition fix (#81493)

    This patch changes how the macro __ARM_ARCH is defined to match its
    definition in the ACLE. In ACLE 5.4.1, __ARM_ARCH is defined as equal to
    the major architecture version for ISAs up to and including v8. From
    v8.1 onwards, its definition is changed to include minor versions, such
    that for an architecture vX.Y, __ARM_ARCH = X*100 + Y. Before this
    patch, LLVM defined __ARM_ARCH using only the major architecture version
    for all architecture versions. This patch adds functionality to define
    __ARM_ARCH correctly for architectures greater than or equal to v8.1.
    jwestwood921 authored Feb 13, 2024
    89c1bf1
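The `__ARM_ARCH` rule stated in the commit above (major version only up to and including v8, then X*100 + Y from v8.1 onwards) can be written as a small function. This is a sketch of the rule as the commit message words it, using a hypothetical helper `armArchMacroValue`; it is not the code LLVM uses. How a later `.0` release is treated is not stated in the message, so the sketch simply applies the formula literally there.

```cpp
#include <cassert>

// Hypothetical helper: value of __ARM_ARCH for architecture vMajor.Minor,
// following the rule quoted in the commit message.
constexpr unsigned armArchMacroValue(unsigned Major, unsigned Minor) {
  // Up to and including v8 (that is, v8.0): major version only.
  // From v8.1 onwards: Major * 100 + Minor.
  return (Major < 8 || (Major == 8 && Minor == 0)) ? Major
                                                   : Major * 100 + Minor;
}

static_assert(armArchMacroValue(7, 0) == 7, "v7 uses the major version");
static_assert(armArchMacroValue(8, 0) == 8, "v8 uses the major version");
static_assert(armArchMacroValue(8, 1) == 801, "v8.1 includes the minor");
static_assert(armArchMacroValue(9, 2) == 902, "v9.2 includes the minor");
```

Code that tests the macro therefore needs to compare against values such as `801` rather than `8` when it cares about a specific minor version.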
  40. [DAGCombine] Fix multi-use miscompile in load combine (#81586)

    The load combine replaces a number of original loads with one new load
    and also replaces the output chains of the original loads with the
    output chain of the new load. This is incorrect if the original load is
    retained (due to multi-use), as it may get incorrectly reordered.
    
    Fix this by using makeEquivalentMemoryOrdering() instead, which will
    create a TokenFactor with both chains.
    
    Fixes llvm/llvm-project#80911.
    nikic authored Feb 13, 2024
    25b9ed6
  41. 4ad9f5b
  42. 192c23b
  43. 485ebbf
  44. [NFC][LLVM][AsmWriter] Extract logic to write out ConstantFP from WriteConstantInternal.

    This makes it easier to extend the code to support vector types.
    paulwalker-arm committed Feb 13, 2024
    4f13f35
  45. [Flang] Add __powerpc__ macro to set c_intmax_t to c_int64_t rather than c_int128_t as PowerPC only supports up to c_int64_t. (#81222)
    
    PowerPC only supports up to `c_int64_t`. Add macro `__powerpc__` and
    preprocess it for setting `c_intmax_t` in `iso_c_binding` intrinsic
    module.
    DanielCChen authored Feb 13, 2024
    987258f
  46. [clang][Driver][HLSL] Fix formatting of clang-dxc options group title

    Some extra `<>` and a missing full stop.
    DavidSpickett committed Feb 13, 2024
    381a00d
  47. [LLVM] Add __builtin_readsteadycounter intrinsic and builtin for realtime clocks (#81331)

    Summary:
    This patch adds a new intrinsic and builtin function mirroring the
    existing `__builtin_readcyclecounter`. The difference is that this
    implementation targets a separate counter that some targets have, which
    returns a fixed frequency clock that can be used to determine elapsed
    time. This differs from the cycle counter, which often has a
    variable frequency.
    
    This patch only adds support for the NVPTX and AMDGPU targets.
    
    This is done as a new and separate builtin rather than an argument to
    `readcyclecounter` to avoid needing to change existing code and to make
    the separation more explicit.
    jhuber6 authored Feb 13, 2024
    11fcae6
  48. [TableGen] Do not speculatively grow RegUnitSets. NFC.

    This seems to be a trick to avoid copying a RegUnitSet, but it can be
    done more simply using std::move.
    jayfoad committed Feb 13, 2024
    1f90af1
  49. [DirectX][NFC] Change specification of overload types and attribute in DXIL.td (#81184)

    - Specify overload types of DXIL Operation as list of types instead of a
    string.
    - Add supported DXIL type record definitions to `DXIL.td` leveraging
    `LLVMType` to avoid duplicate definitions.
    - Spell out DXIL Operation Attribute specification string.
    - Make corresponding changes to process the records in DXILEmitter.cpp
    bharadwajy authored Feb 13, 2024
    8ba4ff3
  50. 1d84792
  51. Merge from 'sycl' to 'sycl-web'

    iclsrc committed Feb 13, 2024
    a23c262
  52. [lldb-dap][NFC] Add Breakpoint struct to share common logic. (#80753)

    This adds a layer between `SourceBreakpoint`/`FunctionBreakpoint` and
    `BreakpointBase` to have better separation and encapsulation so we are
    not directly operating on `SBBreakpoint`.

    I basically moved the `SBBreakpoint` and the methods that require it
    from `BreakpointBase` to `Breakpoint`. This makes it easier to add support
    for data watchpoints by sharing the logic inside `BreakpointBase`.
    ZequanWu authored Feb 13, 2024
    d58c128
  53. [clang][docs] Fix warning in LanguageExtensions

    build-llvm/tools/clang/docs/LanguageExtensions.rst:2768: WARNING: Title underline too short.
    DavidSpickett committed Feb 13, 2024
    7a5c1a4
  54. [mlir][nfc] Add tests for linalg.mmt4d (#81422)

    linalg.mmt4d was added a while back (https://reviews.llvm.org/D105244),
    but there are virtually no tests in-tree. In the spirit of documenting
    through test, this PR adds a few basic examples.
    banach-space authored Feb 13, 2024
    7a47113
  55. [libc] Rework the RPC interface to accept runtime wave sizes (#80914)

    Summary:
    The RPC interface needs to handle an entire warp or wavefront at once.
    This is currently done by using a compile time constant indicating the
    size of the buffer, which right now defaults to some value on the client
    (GPU) side. However, there are currently attempts to move the `libc`
    library to a single IR build. This is problematic as the size of the
    wavefronts changes between ISAs on AMDGPU. The builtin
    `__builtin_amdgcn_wavefrontsize()` will return the appropriate value,
    but it is only known at runtime now.
    
    In order to support this, this patch restructures the packet. Now
    instead of having an array of arrays, we simply have a large array of
    buffers and slice it according to the runtime value if we don't know it
    ahead of time. This also somewhat has the advantage of making the buffer
    contiguous within a page now that the header has been moved out of it.
    jhuber6 authored Feb 13, 2024
    f879ac0
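The restructuring described in the commit above (one large array of buffers, sliced by a lane size that is only known at runtime) might look roughly like the sketch below. The `Buffer` payload, the `PacketStorage` class, and its methods are all hypothetical names invented for illustration; the actual `libc` RPC types are not shown in the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical fixed-size payload carried for a single lane.
struct Buffer {
  std::uint64_t Data[8];
};

// Hypothetical storage: one flat, contiguous array of buffers for all ports.
// The lane size (warp or wavefront width) is a constructor argument rather
// than a compile-time constant.
class PacketStorage {
public:
  PacketStorage(std::size_t PortCount, std::size_t LaneSize)
      : LaneSize(LaneSize), Buffers(PortCount * LaneSize) {}

  // Returns the first buffer of the slice belonging to the given port.
  // The slice covers LaneSize consecutive buffers.
  Buffer *slice(std::size_t Port) {
    assert((Port + 1) * LaneSize <= Buffers.size() && "port out of range");
    return Buffers.data() + Port * LaneSize;
  }

  std::size_t laneSize() const { return LaneSize; }

private:
  std::size_t LaneSize;
  std::vector<Buffer> Buffers;
};
```

The same class then serves a 32-wide and a 64-wide target without recompilation, since only the stride between slices changes.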
  56. [flang][cuda] Lower launch_bounds values (#81537)

    This PR adds a new attribute to carry over the information from
    `launch_bounds`. The new attribute `CUDALaunchBoundsAttr` holds 2 to 3
    integer attributes and is added to the `func.func` operation.
    clementval authored Feb 13, 2024
    d79c3c5
  57. [libc] Round up time for GPU nanosleep implementation (#81630)

    Summary:
    The GPU `nanosleep` tests would occasionally fail. This was due to the
    fact that we used integer division to determine how many ticks we had to
    sleep for. This would then truncate, leaving us with a value just
    slightly below the requested value. This would then occasionally leave
    us with a return value of `-1`. This patch just changes the code to
    round up by 1 so we always sleep for at least the requested value.
    jhuber6 authored Feb 13, 2024
    1dacfd1
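The truncation problem described in the commit above is easy to reproduce with plain integer arithmetic. The sketch below uses two hypothetical helpers and an assumed tick frequency; note that it rounds up with ceiling division, a closely related technique chosen here for clarity, which is not necessarily the exact adjustment the patch makes.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t NanosPerSecond = 1000000000ULL;

// Hypothetical helper: truncating conversion from nanoseconds to clock ticks.
// It can return fewer ticks than needed to cover the requested time.
constexpr std::uint64_t ticksTruncated(std::uint64_t Nanos,
                                       std::uint64_t FreqHz) {
  return Nanos * FreqHz / NanosPerSecond;
}

// Hypothetical helper: ceiling division, so the returned tick count always
// covers at least the requested time. Overflow of Nanos * FreqHz is ignored
// in this sketch.
constexpr std::uint64_t ticksRoundedUp(std::uint64_t Nanos,
                                       std::uint64_t FreqHz) {
  return (Nanos * FreqHz + NanosPerSecond - 1) / NanosPerSecond;
}
```

For example, at an assumed 100 MHz clock one tick lasts 10 ns, so a 15 ns request truncates to 1 tick (10 ns, too short) but rounds up to 2 ticks (20 ns).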
  58. e847abc
  59. a7cebad
  60. [IRGen][AArch64][RISCV] Generalize bitcast between i1 predicate vector and i8 fixed vector. (#76548)
    
    Instead of only handling vscale x 16 x i1 predicate vectors, handle any
    scalable i1 vector where the known minimum is divisible by 8.
    
    This is used on RISC-V where we have multiple sizes of predicate
    types.
    topperc authored Feb 13, 2024
    9be7b0a
  61. [clang] Remove #undef alloca workaround (#81534)

    Added in 26670dc to work around intel#4885.
    
    Windows CI and a local Windows build are happy with this change, so it
    seems like this has been properly fixed at some point. If this does
    break somebody, this can be easily reverted. (Also, Linux does the same
    `#define alloca` in system headers, so I'm not sure why it'd be
    different on Windows)
    
    This is tech debt that caused breakages, see comments on #71709.
    aeubanks authored Feb 13, 2024
    742a06f
  62. 9838c85
  63. [RISCV] Enable the TypePromotion pass from AArch64/ARM.

    This pass looks for unsigned icmps that have illegal types and tries
    to widen the use/def graph to improve the placement of the zero
    extends that type legalization would need to insert.
    
    I've explicitly disabled it for i32 by adding a check for
    isSExtCheaperThanZExt to the pass.
    
    The generated code isn't perfect, but my data shows a net
    dynamic instruction count improvement on spec2017 for both base and
    Zba+Zbb+Zbs.
    topperc committed Feb 13, 2024
    7d40ea8
  64. [flang][cuda] Lower cluster_dims values (#81636)

    This PR adds a new attribute to carry over the information from
    `cluster_dims`. The new attribute `CUDAClusterDimsAttr` holds 3 integer
    attributes and is added to `func.func` operation.
    clementval authored Feb 13, 2024
    5e3c7e3
  65. 502a88b
  66. [libc] Remove remaining GPU architecture dependent instructions (#81612)

    Summary:
    Recent patches have added solutions to the remaining sources of
    divergence. This patch simply removes the last occurrences of things like
    `has_builtin`, `ifdef` or builtins with feature requirements. The one
    exception here is `nanosleep`, but I made changes in the
    `__nvvm_reflect` pass to make usage like this actually work at O0.
    
    Depends on llvm/llvm-project#81331
    jhuber6 authored Feb 13, 2024
    63198e0
  67. Merge from 'main' to 'sycl-web' (110 commits)

      CONFLICT (content): Merge conflict in clang/include/clang/Serialization/ASTBitCodes.h
    jyu2-git committed Feb 13, 2024
    6eae6b9
  68. [mlir][ROCDL] Add synchronization primitives (#80888)

    This PR adds two LLVM intrinsics to MLIR:
    - llvm.amdgcn.s.setprio which sets the priority of a wave for the GPU
    scheduler
    - llvm.amdgcn.sched.barrier which sets a software barrier so that the
    scheduler cannot move instructions around
    giuseros authored Feb 13, 2024
    16140ff
  69. [libc] Remove leftover target dependent intrinsic

    Summary:
    I forgot to remove these because I thought I did it already. This caused
    the build to fail when actually linked.
    jhuber6 committed Feb 13, 2024
    c830c12
  70. [NFC][InstrProf]Factor out getCanonicalName to compute the canonical …

    …name given a pgo name. (#81547)
    
    - Also update the `InstrProf::addFuncWithName` to call the newly added
    `getCanonicalName`.
    minglotus-6 authored Feb 13, 2024
    2422e96
  71. [InstCombine] Extend (lshr/shl (shl/lshr -1, x), x) -> `(lshr/shl -…

    …1, x)` for multi-use
    
    We previously did this iff the inner `(shl/lshr -1, x)` was
    one-use. No instructions are added even if the inner `(shl/lshr -1,
    x)` is multi-use and this canonicalization both makes the resulting
    instruction easier to analyze and shrinks its dependency chain.
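The fold can be checked exhaustively at a small bit width. The sketch below models `shl` and `lshr` on 8-bit values in Python and confirms both directions of the identity for every in-range shift amount. The width and the helper names are assumptions chosen for illustration; they are not part of the patch.

```python
WIDTH = 8                    # assumed bit width, small enough for an exhaustive check
MASK = (1 << WIDTH) - 1      # the all-ones value, i.e. -1 at this width


def shl(value, amount):
    """Logical shift left, truncated to WIDTH bits."""
    return (value << amount) & MASK


def lshr(value, amount):
    """Logical shift right on an unsigned WIDTH-bit value."""
    return (value & MASK) >> amount


def fold_holds(amount):
    """True if both shift pairs fold to a single shift of -1."""
    lshr_of_shl = lshr(shl(MASK, amount), amount) == lshr(MASK, amount)
    shl_of_lshr = shl(lshr(MASK, amount), amount) == shl(MASK, amount)
    return lshr_of_shl and shl_of_lshr


# Check every shift amount that is in range for this width.
all_ok = all(fold_holds(x) for x in range(WIDTH))
```

The check says nothing about use counts; it only shows that the single shift computes the same value, which is why the fold adds no instructions.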
    
    Closes #81576
    goldsteinn committed Feb 13, 2024
    79ce933
  72. Revert "[clang] Remove #undef alloca workaround" (#81649)

    Reverts llvm/llvm-project#81534
    
    llvm/llvm-project#81534 breaks building (Fuchsia) Clang toolchain on
    Windows.
    
    Log:
    https://logs.chromium.org/logs/fuchsia/buildbucket/cr-buildbucket/8756186536543250705/+/u/clang/install/stdout
    Builder:
    https://ci.chromium.org/ui/p/fuchsia/builders/toolchain.ci/clang-windows-x64/b8756186536543250705/overview
    
    ```
    FAILED: tools/clang/tools/extra/clang-include-fixer/tool/CMakeFiles/clang-include-fixer.dir/ClangIncludeFixer.cpp.obj 
    C:\b\s\w\ir\x\w\cipd\bin\clang-cl.exe  /nologo -TP -DCLANG_REPOSITORY_STRING=\"https://llvm.googlesource.com/llvm-project\" -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_GLIBCXX_ASSERTIONS -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -IC:\b\s\w\ir\x\w\llvm_build\tools\clang\tools\extra\clang-include-fixer\tool -IC:\b\s\w\ir\x\w\llvm-llvm-project\clang-tools-extra\clang-include-fixer\tool -IC:\b\s\w\ir\x\w\llvm-llvm-project\clang\include -IC:\b\s\w\ir\x\w\llvm_build\tools\clang\include -IC:\b\s\w\ir\x\w\recipe_cleanup\tensorflow-venv\store\python_venv-q9i5kpsp0iun0ktmqgab125ti8\contents\Lib\site-packages\tensorflow\include -IC:\b\s\w\ir\x\w\llvm_build\include -IC:\b\s\w\ir\x\w\llvm-llvm-project\llvm\include -IC:\b\s\w\ir\x\w\llvm-llvm-project\clang-tools-extra\clang-include-fixer\tool\.. -imsvcC:\b\s\w\ir\x\w\zlib_install_target\include -imsvcC:\b\s\w\ir\x\w\zstd_install\include /DWIN32 /D_WINDOWS   /Zc:inline /Zc:__cplusplus /Oi /Brepro /bigobj /permissive- /W4  -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported /Gw -no-canonical-prefixes /O2 /Ob2  -std:c++17 -MT  /EHs-c- /GR- -UNDEBUG /showIncludes /Fotools\clang\tools\extra\clang-include-fixer\tool\CMakeFiles\clang-include-fixer.dir\ClangIncludeFixer.cpp.obj /Fdtools\clang\tools\extra\clang-include-fixer\tool\CMakeFiles\clang-include-fixer.dir\ -c -- C:\b\s\w\ir\x\w\llvm-llvm-project\clang-tools-extra\clang-include-fixer\tool\ClangIncludeFixer.cpp
    In file included from C:\b\s\w\ir\x\w\llvm-llvm-project\clang-tools-extra\clang-include-fixer\tool\ClangIncludeFixer.cpp:11:
    In file included from C:\b\s\w\ir\x\w\llvm-llvm-project\clang-tools-extra\clang-include-fixer\tool\..\IncludeFixer.h:15:
    In file included from C:\b\s\w\ir\x\w\llvm-llvm-project\clang\include\clang/Sema/ExternalSemaSource.h:15:
    In file included from C:\b\s\w\ir\x\w\llvm-llvm-project\clang\include\clang/AST/ExternalASTSource.h:18:
    In file included from C:\b\s\w\ir\x\w\llvm-llvm-project\clang\include\clang/AST/DeclBase.h:18:
    In file included from C:\b\s\w\ir\x\w\llvm-llvm-project\clang\include\clang/AST/DeclarationName.h:18:
    In file included from C:\b\s\w\ir\x\w\llvm-llvm-project\clang\include\clang/Basic/IdentifierTable.h:18:
    In file included from C:\b\s\w\ir\x\w\llvm-llvm-project\clang\include\clang/Basic/Builtins.h:63:
    C:\b\s\w\ir\x\w\llvm_build\tools\clang\include\clang/Basic/Builtins.inc(151,1): error: redefinition of enumerator 'BI_alloca'
      151 | LANGBUILTIN(_alloca, "v*z", "n", ALL_MS_LANGUAGES)
          | ^
    C:\b\s\w\ir\x\w\llvm_build\tools\clang\include\clang/Basic/Builtins.inc(15,54): note: expanded from macro 'LANGBUILTIN'
       15 | #  define LANGBUILTIN(ID, TYPE, ATTRS, BUILTIN_LANG) BUILTIN(ID, TYPE, ATTRS)
          |                                                      ^
    C:\b\s\w\ir\x\w\llvm-llvm-project\clang\include\clang/Basic/Builtins.h(62,34): note: expanded from macro 'BUILTIN'
       62 | #define BUILTIN(ID, TYPE, ATTRS) BI##ID,
          |                                  ^
    <scratch space>(72,1): note: expanded from here
       72 | BI_alloca
          | ^
    C:\b\s\w\ir\x\w\llvm_build\tools\clang\include\clang/Basic/Builtins.inc(150,1): note: previous definition is here
      150 | LIBBUILTIN(alloca, "v*z", "fn", STDLIB_H, ALL_GNU_LANGUAGES)
          | ^
    C:\b\s\w\ir\x\w\llvm_build\tools\clang\include\clang/Basic/Builtins.inc(11,61): note: expanded from macro 'LIBBUILTIN'
       11 | #  define LIBBUILTIN(ID, TYPE, ATTRS, HEADER, BUILTIN_LANG) BUILTIN(ID, TYPE, ATTRS)
          |                                                             ^
    C:\b\s\w\ir\x\w\llvm-llvm-project\clang\include\clang/Basic/Builtins.h(62,34): note: expanded from macro 'BUILTIN'
       62 | #define BUILTIN(ID, TYPE, ATTRS) BI##ID,
          |                                  ^
    <scratch space>(71,1): note: expanded from here
       71 | BI_alloca
          | ^
    ```
    Prabhuk authored Feb 13, 2024
    f79f58d
  73. [StatepointLowering] Use Constant instead of TargetConstant for undef…

    … value (#81635)
    
    Prevents isel errors when trying to lower gc relocate of undef value
    (which turns into CopyToReg of TargetConstant). Such relocates may occur
    after DCE (e.g. after GVN removes some dead blocks) if there are no
    passes like instcombine scheduled afterwards to clean them up.
    
    Fixes #80294
    
    ---------
    
    Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
    danilaml and arsenm authored Feb 13, 2024
    e20462a
  74. InstCombine: Enable SimplifyDemandedUseFPClass and remove flag (#81108)

    This completes the unrevert of ef38833.
    arsenm authored Feb 13, 2024
    9dd2c59
  75. [libc++][modules] Re-add build dir CMakeLists.txt. (#81370)

    This CMakeLists.txt is used to build modules without build system
    support. This was removed in d06ae33.
    The documentation on how to use modules relies on it.
    
    Made some minor changes to make it work with the std.compat module using
    the std module.
    
    Note the CMakeLists.txt in the build dir should be removed once build
    system support is generally available.
    mordante authored Feb 13, 2024
    fc0e9c8
  76. Don't count all the frames just to skip the current inlined ones. (#8…

    …0918)
    
    The algorithm to find the DW_OP_entry_value requires you to find the
    nearest non-inlined frame. It did that by counting the number of stack
    frames so that it could use that as a loop stopper.
    
    That is unnecessary and inefficient. Unnecessary because GetFrameAtIndex
    will return a null frame when you step past the oldest frame, so you
    already have the "got to the end" signal without counting all the stack
    frames.
    And counting all the stack frames can be expensive.
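The loop shape described here can be sketched in Python. `Frame` and `get_frame_at_index` are hypothetical stand-ins, not lldb API; the only property borrowed from the text is that the lookup returns a null value once the index steps past the oldest frame.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Frame:
    name: str
    is_inlined: bool


def get_frame_at_index(frames: List[Frame], index: int) -> Optional[Frame]:
    """Hypothetical stand-in: returns None past the oldest frame."""
    return frames[index] if 0 <= index < len(frames) else None


def nearest_non_inlined(frames: List[Frame], start: int = 0) -> Optional[Frame]:
    """Walk outward from `start` until a non-inlined frame or the end."""
    index = start
    while True:
        frame = get_frame_at_index(frames, index)
        if frame is None:          # stepped past the oldest frame
            return None
        if not frame.is_inlined:
            return frame
        index += 1


stack = [Frame("inl_a", True), Frame("inl_b", True), Frame("caller", False)]
found = nearest_non_inlined(stack)
```

No frame count is computed up front: the `None` result is the loop stopper.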
    jimingham authored Feb 13, 2024
    a04c636
  77. Add the ability to define a Python based command that uses CommandObj…

    …ectParsed (#70734)
    
    This allows you to specify options and arguments and their definitions
    and then have lldb handle the completions, help, etc. in the same way
    that lldb does for its parsed commands internally.
    
    This feature has some design considerations as well as the code, so I've
    also set up an RFC, but I did this one first and will put the RFC
    address in here once I've pushed it...
    
    Note, the lldb "ParsedCommand interface" doesn't actually do all the
    work that it should. For instance, saying the type of an option that has
    a completer doesn't automatically hook up the completer, and ditto for
    argument values. We also do almost no work to verify that the arguments
    match their definition, or do auto-completion for them. This patch
    allows you to make a command that's bug-for-bug compatible with built-in
    ones, but I didn't want to stall it on getting the auto-command checking
    to work all the way correctly.
    
    As an overall design note, my primary goal here was to make an interface
    that worked well in the script language. For that I needed, for
    instance, to have a property-based way to get all the option values that
    were specified. It was much more convenient to do that by making a
    fairly bare-bones C interface to define the options and arguments of a
    command, and set their values, and then wrap that in a Python class
    (installed along with the other bits of the lldb python module) which
    you can then derive from to make your new command. This approach will
    also make it easier to experiment.
    
    See the file test_commands.py in the test case for examples of how this
    works.
    jimingham authored Feb 13, 2024
    a69ecb2
  78. [mlir][flang][openmp] Rework wsloop reduction operations (#80019)

    This patch reworks the way that wsloop reduction operations function to
    better match the expected semantics from the OpenMP specification,
    following the rework of parallel reductions.
    
    The new semantics create a private reduction variable as a block
    argument which should be used normally for all operations on that
    variable in the region; this private variable is then combined with the
    others into the shared variable. This way no special omp.reduction
    operations are needed inside the region. These block arguments follow
    the loop control block arguments.
    
    ---------
    
    Co-authored-by: Kiran Chandramohan <kiran.chandramohan@arm.com>
    DavidTruby and kiranchandramohan authored Feb 13, 2024
    be9f8ff
  79. 3985eda
  80. [SeparateConstOffsetFromGEP] Reorder trivial GEP chains to separate c…

    …onstants (#73056)
    
    In this case, a trivial GEP chain has the form:
    
    ```
    %ptr = getelementptr sameType, %base, constant
    %val = getelementptr sameType, %ptr, %variable
    ```
    
    That is, a one-index GEP consumes another (of the same basis and result
    type) one-index GEP, where the inner GEP uses a constant index and the
    outer GEP uses a variable index. For chains of this type, it is trivial
    to reorder them (by simply swapping the indexes). The result of doing so
    is better AddrMode matching for users of the ultimate ptr produced by
    GEP chain.
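Why swapping the indexes is safe can be seen from the address arithmetic alone. The sketch below models a one-index GEP as `base + index * element_size`; it ignores `inbounds` and overflow semantics, and the element size and index values are made-up assumptions.

```python
def gep(base_addr, index, elem_size):
    """Address produced by a one-index GEP over elements of elem_size bytes."""
    return base_addr + index * elem_size


ELEM_SIZE = 4                                  # assumed element size in bytes
base, constant, variable = 0x1000, 3, 10       # illustrative values only

# Original chain: constant index first, then the variable index.
original = gep(gep(base, constant, ELEM_SIZE), variable, ELEM_SIZE)

# Reordered chain: variable index first, constant index last.
reordered = gep(gep(base, variable, ELEM_SIZE), constant, ELEM_SIZE)
```

Both chains produce the same address; after reordering, the constant sits on the outer GEP, where it is available as an immediate offset.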
    
    Future patches can extend this to support non-trivial GEP chains (e.g.
    those with different basis types and/or multiple indices).
    jrbyrnes authored Feb 13, 2024
    1b65742
  81. [Clang][Sema] Diagnose friend declarations with enum elaborated-type-…

    …specifier in all language modes (#80171)
    
    According to [dcl.type.elab] p4:
    > If an _elaborated-type-specifier_ appears with the `friend` specifier
    as an entire _member-declaration_, the _member-declaration_ shall have
    one of the following forms:
    >     `friend` _class-key_ _nested-name-specifier_(opt) _identifier_ `;`
    >     `friend` _class-key_ _simple-template-id_ `;`
    > `friend` _class-key_ _nested-name-specifier_ `template`(opt)
    _simple-template-id_ `;`
    
    Notably absent from this list is the `enum` form of an
    _elaborated-type-specifier_ "`enum` _nested-name-specifier_(opt)
    _identifier_", which appears to be intentional per the resolution of
    CWG2363.
    
    Most major implementations accept these declarations, so the diagnostic
    is a pedantic warning across all C++ versions.
    
    In addition to the trivial cases previously diagnosed in C++98, we now
    diagnose cases where the _elaborated-type-specifier_ has a dependent
    _nested-name-specifier_:
    ```
    template<typename T>
    struct A
    {
        enum class E;
    };
    
    struct B
    {
        template<typename T>
        friend enum A<T>::E; // pedantic warning: elaborated enumeration type cannot be a friend
    };
    
    template<typename T>
    struct C
    {
        friend enum T::E;  // pedantic warning: elaborated enumeration type cannot be a friend
    };
    ```
    sdkrystian authored Feb 13, 2024
    3a48630
  82. 2772692
  83. [mlir] Fix a warning

    This patch fixes:
    
      mlir/lib/Target/LLVMIR/AttrKindDetail.h:65:1: error: unused function
      'getAttrNameToKindMapping' [-Werror,-Wunused-function]
    kazutakahirata committed Feb 13, 2024
    f5cc961
  84. [SeparateConstOffsetFromGEP] Fix test after 1b65742

    Change-Id: I7ced7774c80997d21969ab7886fc30c0c1e1cc81
    jrbyrnes committed Feb 13, 2024
    ec0aa16
  85. Merge from 'sycl' to 'sycl-web' (4 commits)

    iclsrc committed Feb 13, 2024
    ca8cb53
  86. [OpenMP][AIX]Define struct kmp_base_tas_lock with the order of two me…

    …mbers swapped for big-endian (#79188)
    
    The direct lock data structure has bit `0` (the least significant bit)
    of the first 32-bit word set to `1` to indicate it is a direct lock. On
    the other hand, the first word (in 32-bit mode) or first two words (in
    64-bit mode) of an indirect lock are the address of the entry allocated
    from the indirect lock table. The runtime checks bit `0` of the first
    32-bit word to tell if this is a direct or an indirect lock. This works
    fine for 32-bit and 64-bit little-endian because its memory layout of a
    64-bit address is (`low word`, `high word`). However, this causes
    problems for big-endian where the memory layout of a 64-bit address is
    (`high word`, `low word`). If an address of the indirect lock table
    entry is something like `0x110035300`, i.e., (`0x1`, `0x10035300`), it
    is treated as a direct lock. This patch defines `struct
    kmp_base_tas_lock` with the ordering of the two 32-bit members flipped
    for big-endian PPC64 so that when checking/setting tags in member
    `poll`, the second word (the low word) is used. This patch also changes
    places where `poll` is not already explicitly specified for
    checking/setting tags.
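The misclassification can be reproduced with the example address from the text. The Python sketch below lays out the 64-bit address in memory in each byte order and applies the bit `0` check to the first 32-bit word; the helper names are assumptions for illustration and do not come from the OpenMP runtime.

```python
import struct

ADDRESS = 0x110035300  # example indirect-lock table entry address from the text


def first_word(address, byte_order):
    """First 32-bit word in memory of a 64-bit address; byte_order is '<' or '>'."""
    raw = struct.pack(byte_order + "Q", address)
    return struct.unpack(byte_order + "I", raw[:4])[0]


def looks_direct(address, byte_order):
    """Apply the tag check: bit 0 of the first 32-bit word."""
    return first_word(address, byte_order) & 1 == 1


little = first_word(ADDRESS, "<")   # low word comes first in memory
big = first_word(ADDRESS, ">")      # high word comes first in memory
```

On little-endian the first word is the low word `0x10035300`, whose bit `0` is clear. On big-endian the first word is the high word `0x1`, so the address is mistaken for a direct lock unless the check uses the second word.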
    xingxue-ibm authored Feb 13, 2024
    ac97562
  87. [Sparc] limit MaxAtomicSizeInBitsSupported to 32 for 32-bit Sparc. (#…

    …81655)
    
    When in 32-bit mode, the backend doesn't currently implement 64-bit
    atomics, even though the hardware is capable if you have specified a V9
    CPU. Thus, limit the width to 32-bit, for now, leaving behind a TODO.
    
    This fixes a regression triggered by PR #73176.
    jyknight authored Feb 13, 2024
    c1a99b2
  88. [TypePromotion] Remove an unreachable 'return false'. NFC

    The if and the else above this both return so this is unreachable.
    Delete it and remove the else after return.
    topperc committed Feb 13, 2024
    d0a1bf8
  89. [libc] Allow BigInt class to use base word types other than uint64_t.…

    … (#81634)
    
    This will allow DyadicFloat class to replace NormalFloat class.
    lntue authored Feb 13, 2024
    4e00551
  90. Temporarily disable the TestAddParsedCommand.py while I figure out

    why it's crashing on the x86_64 Debian Linux worker.
    jimingham committed Feb 13, 2024
    f0b271e
  91. [mlir][sparse] add assemble test for Batched-CSR and CSR-Dense (#81660)

    These are formats supported by PyTorch sparse, so good to make sure that
    our assemble instructions work on these.
    aartbik authored Feb 13, 2024
    2400f70
  92. [DWARFDump] Make --verify handle all sections by default (#81559)

    The current behavior of --verify is that it only verifies debug_info,
    debug_abbrev and debug_names. This seems fairly arbitrary and might have
    been unintentional, as originally the absence of any section flags
    implied "all".
    
    This patch changes the behavior so that the verifier now verifies
    everything by default. It revealed two tests that had potentially
    invalid DWARF:
    
    1. dwarfdump-str-offsets.s is adding padding between two
    debug_str_offset contributions. The standard does not explicitly allow
    this behavior. See issue
    llvm/llvm-project#81558
    
    2. dwarf5-macro.test uses a checked-in binary that has invalid
    debug_str_offsets. One of its entries points to the _middle_ of the
    string section:
    
    error: .debug_str_offsets: contribution 0x0: index 0x4: invalid string
    offset *0x18 == 0x455D, is neither zero nor immediately following a null
    character
    
    If we look at the closest offset to 0x455D in debug_str:
    
    ```
    0x0000454e: "__SLONG32_TYPE int"
    ```
    
    0x455D points to "int".
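The rule quoted in the error message (an offset must be zero or immediately follow a null character) can be sketched in Python. The section contents below are an assumption: a fake `debug_str` padded with null bytes so that the one entry from the dump starts at `0x454e`. The function name is hypothetical and is not the verifier's.

```python
def is_valid_str_offset(str_section: bytes, offset: int) -> bool:
    """An offset is valid if it is zero or the previous byte is a null."""
    if offset == 0:
        return True
    return 0 < offset < len(str_section) and str_section[offset - 1] == 0


ENTRY_START = 0x454E                       # start of the entry shown in the dump
entry = b"__SLONG32_TYPE int\x00"
# Assumed padding: null bytes up to the entry, so the byte before it is a terminator.
section = b"\x00" * ENTRY_START + entry

bad_offset = 0x455D
# Read the null-terminated string that the bad offset lands on.
pointed_at = section[bad_offset:section.index(b"\x00", bad_offset)]
```

`0x455D` is 15 bytes past the start of the entry, which is where `int` begins; the byte before it is a space rather than a null, so the check rejects it.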
    felipepiovezan authored Feb 13, 2024
    5296149
  93. [lldb][DWARFIndex] Use IDX_parent to implement GetFullyQualifiedType …

    …query (#79932)
    
    This commit changes DebugNamesDWARFIndex so that it now overrides
    `GetFullyQualifiedType` and attempts to use DW_IDX_parent, when
    available, to speed up such queries. When this type of information is
    not available, the base-class implementation is used.
    
    With this commit, we now achieve the 4x speedups reported in [1].
    
    [1]:
    https://discourse.llvm.org/t/rfc-improve-dwarf-5-debug-names-type-lookup-parsing-speed/74151/38
    felipepiovezan authored Feb 13, 2024
    91f4a84
  94. [DebugInfo][RemoveDIs] Convert back to intrinsic form for ThinLTO

    As explained on discourse [0] (comment 12), to get the non-intrinsic form
    of debug-info records enabled and testing, we're only using it inside of
    the pass manager in LLVM right now. Things like the textual IR writer and
    bitcode writing _passes_ are instrumented to convert back to
    intrinsic-form when writing a module out, but it turns out we missed the
    ThinLTO bitcode writing pass. That causes uh, all variable location
    debug-info to be dropped in ThinLTO mode (oops).
    
    This patch adds that conversion; it should be low risk as it's identical to
    what happens in all the other passes. However should this commit turn out
    to cause trouble, please instead revert d759618 or whichever is the
    most recent commit to set UseNewDbgInfoFormat to default to true. That'll
    revert LLVM back to the definitely-correct behaviour.
    
    [0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939
    jmorse committed Feb 13, 2024
    fa77e1f
  95. Revert "[SeparateConstOffsetFromGEP] Reorder trivial GEP chains to se…

    …parate constants (#73056)" and follow ups
    
    "ninja check-llvm" is failing on tip of tree.
    
    This reverts commit ec0aa16.
    This reverts commit 1b65742.
    preames committed Feb 13, 2024
    99c5a66
  96. [lldb-dap] Add support for data breakpoint. (#81541)

    This implements functionality to handle `DataBreakpointInfo` request and
    `SetDataBreakpoints` request.
    
    If variablesReference is 0 or not provided, interpret name as ${number
    of bytes}@${expression} to set data breakpoint at the given expression
    because the spec
    https://microsoft.github.io/debug-adapter-protocol/specification#Requests_DataBreakpointInfo
    doesn't say how the client could specify the number of bytes to watch.
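A client following this convention sends a name such as `4@&x`. The Python sketch below shows one way such a name could be split; `parse_watch_name` is a hypothetical helper, and splitting at the first `@` is an assumption rather than the behavior confirmed by the patch.

```python
def parse_watch_name(name: str):
    """Split '<number of bytes>@<expression>' into (size, expression)."""
    size_text, sep, expression = name.partition("@")
    if not sep or not size_text.isdigit() or not expression:
        raise ValueError(f"expected '<bytes>@<expression>', got {name!r}")
    return int(size_text), expression


size, expression = parse_watch_name("4@&x")
```

Splitting at the first `@` leaves any later `@` characters inside the expression untouched.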
    
    This is based on top of llvm/llvm-project#80753.
    ZequanWu authored Feb 13, 2024
    8c56e78
  97. Merge from 'main' to 'sycl-web' (32 commits)

      CONFLICT (content): Merge conflict in llvm/test/CodeGen/RISCV/O3-pipeline.ll
    
    Also revert 5c9f768.
    
    See: KhronosGroup/SPIRV-LLVM-Translator#2357
    intel#12698
    jyu2-git committed Feb 13, 2024
    735e88e
  98. [WebAssembly] Demote PHIs in catchswitch BB only (#81570)

    `DemoteCatchSwitchPHIOnly` option in `WinEHPrepare` pass was added in
    llvm/llvm-project@99d60e0,
    because Wasm EH uses `WinEHPrepare`, but it doesn't need to demote all
    PHIs. PHIs in `catchswitch` BBs have to be removed (= demoted) because
    `catchswitch`s are removed in ISel and `catchswitch` BBs are removed as
    well, so they can't have other instructions.
    
    Because Wasm EH doesn't use funclets, PHIs in `catchpad` or
    `cleanuppad` BBs don't need to be demoted. That was the reason the
    `DemoteCatchSwitchPHIOnly` option was added: to avoid demoting more
    instructions than necessary.
    
    The problem is that it should have been set to `true` for Wasm EH (its
    default value is `false` for WinEH). I mistakenly set it to `false`
    and wasn't aware of this for more than 5 years. This was not the end
    of the world; it just means we've been demoting more instructions than
    we should, possibly hurting code size. In practice I think it would've
    had hardly any effect on real performance, given that PHIs in
    `catchpad` or `cleanuppad` BBs are not very frequent and many
    people run other optimizers like Binaryen anyway.
    aheejin authored Feb 13, 2024
    473ef10
  99. Revert "Reapply "[DebugInfo][RemoveDIs] Turn on non-instrinsic debug-…

    …info by default""
    
    This reverts commit d759618.
    
    Causes crashes, see comments in llvm/llvm-project@d759618.
    aeubanks committed Feb 13, 2024
    fd3a0c1
  100. [libc][stdfix] Generate stdfix.h header with fixed point precision ma…

    …cros according to ISO/IEC TR 18037:2008 standard, and add fixed point type support detection. (#81255)
    
    Fixed point extension standard:
    https://standards.iso.org/ittf/PubliclyAvailableStandards/c051126_ISO_IEC_TR_18037_2008.zip
    lntue authored Feb 13, 2024
    84277fe
  101. 9f87bfe
  102. c92bf6b
  103. [lldb][test] Switch LLDB API tests from vendored unittest2 to unittes…

    …t (#79945)
    
    This removes the dependency LLDB API tests have on
    lldb/third_party/Python/module/unittest2, and instead uses the standard
    one provided by Python.
    
    This does not actually remove the vendored dep yet, nor update the docs.
    I'll do both those once this sticks.
    
    Non-trivial changes to call out:
    - expected failures (i.e. "bugnumber") don't have a reason anymore, so
    those params were removed
    - `assertItemsEqual` is now called `assertCountEqual`
    - When a test is marked xfail, our copy of unittest2 considers failures
    during teardown to be OK, but modern unittest does not. See
    TestThreadLocal.py. (Very likely could be a real bug/leak).
    - Our copy of unittest2 was patched to print all test results, even ones
    that don't happen, e.g. `(5 passes, 0 failures, 1 errors, 0 skipped,
    ...)`, but standard unittest prints a terser message that omits test
    result types that didn't happen, e.g. `OK (skipped=1)`. Our lit
    integration parses this stderr and needs to be updated w/ that
    expectation.
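Two of the points above can be seen with the standard library alone. The Python sketch below uses `assertCountEqual`, the replacement for `assertItemsEqual`, and captures the runner's final summary line when one test is skipped. The test class is made up for illustration and is not taken from the LLDB test suite.

```python
import io
import unittest


class ExampleTest(unittest.TestCase):
    def test_same_elements_any_order(self):
        # assertCountEqual replaces unittest2's assertItemsEqual.
        self.assertCountEqual([3, 1, 2], [1, 2, 3])

    @unittest.skip("demonstrates the terse summary")
    def test_skipped(self):
        pass


# Run the suite with output captured in memory instead of stderr.
stream = io.StringIO()
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTest)
result = unittest.TextTestRunner(stream=stream).run(suite)

# The last line of the output is the summary that a wrapper would parse.
summary = stream.getvalue().strip().splitlines()[-1]
```

The summary names only the result types that occurred, so any parser that expects a fixed list of counts has to tolerate missing fields.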
    
    I tested this w/ `ninja check-lldb-api` on Linux. There's a good chance
    non-Linux tests have similar quirks, but I'm not able to uncover those.
    rupprecht authored Feb 13, 2024
    5b38615
  104. [flang] Register LLVMTranslationDialectInterface for FIR. (#81668)

    Register the LLVM IR translation interface for FIR to avoid
    warnings about "Unhandled parameter attribute" after #78228.
    vzakhari authored Feb 13, 2024
    137bd78
  105. [-Wunsafe-buffer-usage] Emit fixits for array decayed to pointer (#80…

    …347)
    
    Covers cases where DeclRefExpr referring to a const-size array decays to a
    pointer and is used "as a pointer" (e.g. passed to a pointer type
    parameter).
    
    Since std::array<T, N> doesn't implicitly convert to pointer to its element
    type T* the cast needs to be done explicitly as part of the fixit
    when we retrofit std::array to code that previously worked with constant
    size array. std::array::data() method is used for the explicit
    cast.
    
    In terms of the fixit machine this covers the UPC(DRE) case for Array fixit strategy.
    The emitted fixit inserts call to std::array::data() method similarly to
    analogous fixit for Span strategy.
    jkorous-apple authored Feb 13, 2024
    e06f352
  106. [attributes][analyzer] Generalize [[clang::suppress]] to declarations…

    …. (#80371)
    
    The attribute is now allowed on an assortment of declarations, to
    suppress warnings related to declarations themselves, or all warnings in
    the lexical scope of the declaration.
    
    I don't necessarily see a reason to have a list at all, but it does look
    as if some of those more niche items aren't properly supported by the
    compiler itself so let's maintain a short safe list for now.
    
    The initial implementation raised a question whether the attribute
    should apply to lexical declaration context vs. "actual" declaration
    context. I'm using "lexical" here because it results in fewer warnings
    suppressed, which is the conservative behavior: we can always expand it
    later if we think this is wrong, without breaking any existing code. I
    also think that this is the correct behavior that we will probably never
    want to change, given that the user typically desires to keep the
    suppressions as localized as possible.
    haoNoQ authored Feb 13, 2024
    017675f
  107. [RISCV] Register fixed stack slots for callee saved registers for -ms…

    …ave-restore/Zcmp (#81392)
    
    PEI previously used fake frame indices for these callee saved registers.
    These fake frame indices are not registered with MachineFrameInfo. This
    required them to be deleted from CalleeSavedInfo after PEI to avoid
    breaking later passes. See #79535
    
    Unfortunately, removing the registers from CalleeSavedInfo pessimizes
    Interprocedural Register Allocation. The RegUsageInfoCollector pass runs
    after PEI and uses CalleeSavedInfo.
    
    This patch replaces #79535 by properly creating fixed stack objects
    through MachineFrameInfo. This changes the stack size and offsets
    returned by MachineFrameInfo which requires changes to how
    RISCVFrameLowering uses that information.
    
    In addition to the individual object for each register, I've also created
    a single large fixed object that covers the entire stack area covered by
    cm.push or the libcalls. cm.push must always push a multiple of 16 bytes
    and the save restore libcall pushes a multiple of stack align. I think
    this leaves holes in the stack where we could spill other registers, but
    it matches what we did previously. Maybe we can optimize this in the
    future.
    
    The only test changes are due to stack alignment handling after the
    callee save registers. Since we now have the fixed objects on the stack,
    the offset is non-zero when an aligned object is processed, so the offset
    gets rounded up, increasing the stack size.
    
    I suspect we might need some more updates for RVV related code. There is
    very little or maybe even no testing of RVV mixed with Zcmp and
    save-restore.
    topperc authored Feb 13, 2024
    Configuration menu
    Copy the full SHA
    0de2b26 View commit details
    Browse the repository at this point in the history
  108. [InstSimplify] Add trivial simplifications for gc.relocate intrinsic …

    …(#81639)
    
    Fold gc.relocate of undef and null to undef and null respectively.
    
    A similar transform is currently done by instcombine, but there is no
    reason to not include it here as well.
    danilaml authored Feb 13, 2024
    Configuration menu
    Copy the full SHA
    cb1a9f7 View commit details
    Browse the repository at this point in the history
  109. [gn] fix typo in 8c56e78

    The missing trailing comma confuses the sync script.
    nico committed Feb 13, 2024
    Configuration menu
    Copy the full SHA
    4bc2a4f View commit details
    Browse the repository at this point in the history
  110. Configuration menu
    Copy the full SHA
    bf3d5db View commit details
    Browse the repository at this point in the history
  111. Configuration menu
    Copy the full SHA
    a6b846a View commit details
    Browse the repository at this point in the history
  112. [gn build] Port a6b846a

    llvmgnsyncbot committed Feb 13, 2024
    Configuration menu
    Copy the full SHA
    9168a21 View commit details
    Browse the repository at this point in the history
  113. Configuration menu
    Copy the full SHA
    3122969 View commit details
    Browse the repository at this point in the history

Commits on Feb 14, 2024

  1. Used std::vector::reserve when I meant std::vector::resize.

    The Linux std has more asserts enabled by default, so it
    complained, even though this worked on Darwin...
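    
    For context, a small sketch of the difference between the two calls. The
    helper names are hypothetical and this is not the code the commit touched.
    
    ```cpp
    #include <cstddef>
    #include <vector>

    // reserve() only allocates capacity; size() stays 0, so indexing
    // v[0] afterwards is out of bounds.
    std::size_t sizeAfterReserve(std::size_t n) {
      std::vector<int> v;
      v.reserve(n);
      return v.size();
    }

    // resize() creates n value-initialized elements, so v[0..n-1] are valid.
    std::size_t sizeAfterResize(std::size_t n) {
      std::vector<int> v;
      v.resize(n);
      return v.size();
    }
    ```
    
    Indexing into a vector that was only reserved is undefined behavior; it
    can appear to work when the standard library does not check bounds.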
    jimingham committed Feb 14, 2024
    Configuration menu
    Copy the full SHA
    3647ff1 View commit details
    Browse the repository at this point in the history
  2. [RISCV] Add canonical ISA string as Module metadata in IR. (#80760)

    In an LTO build, we don't set the ELF attributes to indicate what
    extensions were compiled with. The target CPU/Attrs in
    RISCVTargetMachine do not get set for an LTO build. Each function gets a
    target-cpu/feature attribute, but this isn't usable to set ELF attributes
    since we wouldn't know what function to use. We can't just pick one since it
    might have been compiled with an attribute like target_version.
    
    This patch adds the ISA as Module metadata so we can retrieve it in the
    backend. Individual translation units can still be compiled with
    different strings so we need to collect the unique set when Modules are
    merged.
    
    The backend will need to combine the unique ISA strings to produce a
    single value for the ELF attributes. This will be done in a separate
    patch.
    topperc authored Feb 14, 2024
    Configuration menu
    Copy the full SHA
    f45b9d9 View commit details
    Browse the repository at this point in the history
  3. [X86][CodeGen] Restrict F128 lowering to GNU environment (#81664)

    Otherwise it breaks some environments like X64 Android that don't have
    f128 functions available in its libc.
    
    Followup to #79611.
    pranavk authored Feb 14, 2024
    Configuration menu
    Copy the full SHA
    21630ef View commit details
    Browse the repository at this point in the history
  4. [mlir][sparse][pybind][CAPI] remove LevelType enum from CAPI, constru…

    …… (#81682)
    
    …ct LevelType from LevelFormat and properties instead.
    
    **Rationale**
    We used to explicitly declare every possible combination between
    `LevelFormat` and `LevelProperties`, and it now becomes difficult to
    scale as more properties/level formats are going to be introduced.
    PeimingLiu authored Feb 14, 2024
    Configuration menu
    Copy the full SHA
    429919e View commit details
    Browse the repository at this point in the history
  5. Temporarily skip this test for Python 3.9.

    When the parsed command python code is run on 3.9, I get:
    
    File ".../lib/python3.9/site-packages/lldb/plugins/parsed_cmd.py", line 124, in translate_value
        return cls.translators[value_type](value)
    TypeError: 'staticmethod' object is not callable
    
    But this works correctly in Python 3.10 on macOS and Linux.  I'm guessing something
    changed between those versions, and I'll have to do something to work around the difference.
    But I'm going to skip the test on 3.9 while I figure that out.
    jimingham committed Feb 14, 2024
    Configuration menu
    Copy the full SHA
    1ec8197 View commit details
    Browse the repository at this point in the history
  6. [SeparateConstOffsetFromGEP] Reland: Reorder trivial GEP chains to se…

    …parate constants (#81671)
    
    Actually update tests w.r.t
    llvm/llvm-project@9e5a77f
    and reland llvm/llvm-project#73056
    jrbyrnes authored Feb 14, 2024
    Configuration menu
    Copy the full SHA
    7180c23 View commit details
    Browse the repository at this point in the history
  7. Merge from 'sycl' to 'sycl-web' (3 commits)

    iclsrc committed Feb 14, 2024
    Configuration menu
    Copy the full SHA
    1dbb9fd View commit details
    Browse the repository at this point in the history
  8. [AMDGPU][MLIR]Add shmem-optimization as an op using transform dialect…

    … (#81550)
    
    This PR adds functionality to use shared memory optimization as an op
    using transform dialect.
    erman-gurses authored Feb 14, 2024
    Configuration menu
    Copy the full SHA
    29d1aca View commit details
    Browse the repository at this point in the history
  9. Merge from 'main' to 'sycl-web' (33 commits)

      CONFLICT (content): Merge conflict in llvm/lib/IR/BasicBlock.cpp
    jyu2-git committed Feb 14, 2024
    Configuration menu
    Copy the full SHA
    d33f6f4 View commit details
    Browse the repository at this point in the history
  10. Move the parsed_cmd conversion def's to module level functions.

    Python3.9 does not allow you to put a reference to a class staticmethod
    in a table and call it from there.  Python3.10 and following do allow
    this, but we still support 3.9.  staticmethod was slightly cleaner,
    but this will do.
    jimingham committed Feb 14, 2024
    Configuration menu
    Copy the full SHA
    22d2f3a View commit details
    Browse the repository at this point in the history
  11. [llvm][Support] Add ExponentialBackoff helper (#81206)

    This provides a simple way to implement exponential backoff using a do
    while loop.
    
    Usage example (also see the change to LockFileManager.cpp):
    ```
    ExponentialBackoff Backoff(10s);
    do {
      if (tryToDoSomething())
        return ItWorked;
    } while (Backoff.waitForNextAttempt());
    return Timeout;
    ```
    
    Abstracting this out of `LockFileManager` as the module build daemon
    will need it.
    Bigcheese authored Feb 14, 2024
    Configuration menu
    Copy the full SHA
    edff3ff View commit details
    Browse the repository at this point in the history
  12. [gn build] Port edff3ff

    llvmgnsyncbot committed Feb 14, 2024
    Configuration menu
    Copy the full SHA
    14b0d0d View commit details
    Browse the repository at this point in the history
  13. [clang][InstallAPI] Introduce basic driver to write out tbd files (#…

    …81571)
    
    This introduces a basic outline of installapi as a clang driver option.
    It captures relevant information as cc1 args, which are common arguments
    already passed to the linker to encode into TBD file outputs. This is
    effectively an upstream for what already exists as `tapi installapi` in
    Xcode toolchains, but directly in Clang. This patch does not handle any
    AST traversing on input yet.
    
    InstallAPI is broadly an operation that takes a series of header files
    that represent a single dynamic library and generates a TBD file out of
    it which represents all the linkable symbols and necessary attributes
    for statically linking in clients. It is the linkable object in all
    Apple SDKs and when building dylibs in Xcode. `clang -installapi` also
    will support verification where it compares all the information recorded
    for the TBD files against the already built binary, to catch possible
    mismatches like when a declaration is missing a definition for an
    exported symbol.
    cyndyishida authored Feb 14, 2024
    Configuration menu
    Copy the full SHA
    09e9895 View commit details
    Browse the repository at this point in the history
  14. [SHT_LLVM_BB_ADDR_MAP][obj2yaml] Implements PGOAnalysisMap for elf2ya…

    …ml and tests. (#80924)
    
    Adds support to obj2yaml for PGO Analysis Map. Adds a test to both
    obj2yaml and yaml2obj.
    red1bluelost authored Feb 14, 2024
    Configuration menu
    Copy the full SHA
    a3f61c8 View commit details
    Browse the repository at this point in the history
  15. [InstallAPI] Add missing link to clangBasic

    Fixes CI.
    cyndyishida committed Feb 14, 2024
    Configuration menu
    Copy the full SHA
    ec5f4a4 View commit details
    Browse the repository at this point in the history
  16. [Sanitizers][ABI] Remove too strong assert in asan_abi_shim (#81696)

    Recently we enabled building the shim for arm64_32 arch. On this arch,
    sizeof(uptr) == sizeof(unsigned long) == 4 - so this assert will fail at
    runtime.
    
    Need to just remove this assert
    
    rdar://122927166
    
    Co-authored-by: Mariusz Borsa <m_borsa@apple.com>
    wrotki and Mariusz Borsa authored Feb 14, 2024
    Configuration menu
    Copy the full SHA
    3f738a4 View commit details
    Browse the repository at this point in the history
  17. Configuration menu
    Copy the full SHA
    bc08cc2 View commit details
    Browse the repository at this point in the history
  18. [RISCV] Use SelectionDAG::getVScale in lowerVPReverseExperimental. NF…

    …CI (#81694)
    
    Use a slightly more idiomatic way of getting vscale. getVScale
    performs additional constant folding, but I presume computeKnownBits
    catches these cases too.
    lukel97 authored Feb 14, 2024
    Configuration menu
    Copy the full SHA
    b9567bc View commit details
    Browse the repository at this point in the history
  19. Configuration menu
    Copy the full SHA
    69bcb69 View commit details
    Browse the repository at this point in the history
  20. Configuration menu
    Copy the full SHA
    a854982 View commit details
    Browse the repository at this point in the history
  21. Configuration menu
    Copy the full SHA
    d2f0676 View commit details
    Browse the repository at this point in the history
  22. Configuration menu
    Copy the full SHA
    153661d View commit details
    Browse the repository at this point in the history
  23. Configuration menu
    Copy the full SHA
    70ebc78 View commit details
    Browse the repository at this point in the history
  24. Revert "[clang-format][NFC] Make LangOpts global in namespace Format"

    This reverts commit 32e65b0.
    
    It seems to break some PowerPC bots.
    
    See llvm/llvm-project#81390 (comment).
    owenca committed Feb 14, 2024
    Configuration menu
    Copy the full SHA
    61c83e9 View commit details
    Browse the repository at this point in the history
  25. Configuration menu
    Copy the full SHA
    eafe98f View commit details
    Browse the repository at this point in the history
  26. Configuration menu
    Copy the full SHA
    3537ccc View commit details
    Browse the repository at this point in the history
  27. [DAGCombiner] Remove unnecessary commonAlignment from CombineExtLoad.…

    … (#81705)
    
    The getAlign function for a load returns the commonAlignment of the
    "base align" and the offset stored in the MachinePointerInfo.
    
    We're splitting a load here, so we should take the base alignment from
    the original load without any offset that may already exist in the
    original load. The new load can then maintain its own alignment using
    just the base alignment and its own offset.
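    
    To make the alignment arithmetic concrete, here is a rough sketch. The
    helper `commonAlign` is a hypothetical stand-in written for this note,
    not LLVM's own function, and it assumes the base alignment is a power
    of two.
    
    ```cpp
    #include <cstdint>

    // Hypothetical helper: the largest power of two that divides both the
    // base alignment and the offset. BaseAlign must be a power of two.
    std::uint64_t commonAlign(std::uint64_t BaseAlign, std::uint64_t Offset) {
      std::uint64_t Combined = BaseAlign | Offset;
      return Combined & (~Combined + 1); // isolate the lowest set bit
    }
    ```
    
    With a base alignment of 16 and an existing offset of 8, the combined
    alignment is 8. If a split-off load lands at total offset 16, starting
    from the base alignment gives commonAlign(16, 16) == 16, whereas starting
    from the already-reduced value would give commonAlign(8, 16) == 8.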
    
    Noticed by inspection.
    topperc authored Feb 14, 2024
    Configuration menu
    Copy the full SHA
    e625310 View commit details
    Browse the repository at this point in the history
  28. [DAGCombiner] Remove unneeded commonAlignment from reduceLoadWidth. (…

    …#81707)
    
    We already have the PtrOff factored into MachinePointerInfo. Any calls
    to getAlign on the new load will do commonAlignment with the
    MachinePointerInfo offset and the base alignment.
    topperc authored Feb 14, 2024
    Configuration menu
    Copy the full SHA
    86ce491 View commit details
    Browse the repository at this point in the history
  29. [mlir][nvvm] Introduce nvvm.barrier OP (#81487)

    This PR introduces the `nvvm.barrier` OP to the NVVM dialect.
    Currently, NVVM only supports the `nvvm.barrier0`, which synchronizes
    all threads using barrier resource 0.
    
    The new `nvvm.barrier` has two essential arguments: the barrier resource
    and the number of threads. This added flexibility allows for selective
    synchronization of threads within a CTA, aligning with the capabilities
    provided by LLVM intrinsics or the PTX model.
    
    I think we can deprecate `nvvm.barrier0` in favor of the more generic
    `nvvm.barrier`.
    
    ```
    // Equivalent to nvvm.barrier0 (or __syncthreads() in CUDA)
    nvvm.barrier
    
    // Synchronize all threads using the 3rd barrier resource.
    nvvm.barrier id = 3
    
    // Synchronize %numberOfThreads threads using the 3rd barrier resource.
    nvvm.barrier id = 3 number_of_threads = %numberOfThreads
    ```
    grypp authored Feb 14, 2024
    Configuration menu
    Copy the full SHA
    b5d694b View commit details
    Browse the repository at this point in the history
  30. [ValueTracking] Move the isSignBitCheck helper into ValueTracking. …

    …NFC. (#81704)
    
    This patch moves the `isSignBitCheck` helper into ValueTracking to reuse
    the logic in ValueTracking/InstSimplify.
    
    Addresses the comment
    llvm/llvm-project#80740 (comment).
    dtcxzyw authored Feb 14, 2024
    Configuration menu
    Copy the full SHA
    dc866ae View commit details
    Browse the repository at this point in the history
  31. [clang][analyzer] Reformat code of BoolAssignmentChecker (NFC). (#81461)

    This is only a code reformatting and rename of variables to the newer
    format.
    balazske authored Feb 14, 2024
    Configuration menu
    Copy the full SHA
    a2eb234 View commit details
    Browse the repository at this point in the history
  32. [RISCV] Remove -riscv-v-fixed-length-vector-lmul-max from tests. NFC …

    …(#78299)
    
    Some fixed vector tests in test/CodeGen/RISCV/rvv have multiple run lines
    that check various configurations of
    -riscv-v-fixed-length-vector-lmul-max. From what I understand this flag
    was introduced in the early days of fixed length vector support, but now
    that fixed vector codegen has matured I'm not sure if it's as relevant
    today.
    
    This patch proposes to remove the various lmul-max run lines from the
    tests to make them more readable, and any changes to fixed vector codegen
    easier to review.
    
    We have removed them before for the same reason, so this would take care
    of the remaining test cases: https://reviews.llvm.org/D157973#4593268
    
    (I don't have any strong motivation to remove the actual flag itself, my
    own personal motivation is just to clean up the tests)
    lukel97 authored Feb 14, 2024
    Configuration menu
    Copy the full SHA
    0fee211 View commit details
    Browse the repository at this point in the history
  33. [bazel] Port for 09e9895

    hokein committed Feb 14, 2024
    Configuration menu
    Copy the full SHA
    bd2f7bb View commit details
    Browse the repository at this point in the history
  34. clangCodeGen: Introduce MCDC::State with MCDCState.h (#81497)

    This packs:
    * `BitmapBytes`
    * `BitmapMap`
    * `CondIDMap`
    
    into `MCDC::State`.
    chapuni authored Feb 14, 2024
    Configuration menu
    Copy the full SHA
    5c8985e View commit details
    Browse the repository at this point in the history
  35. Configuration menu
    Copy the full SHA
    243f14d View commit details
    Browse the repository at this point in the history
  36. [InstSimplify][InstCombine] Remove unnecessary m_c_* matchers. (#81…

    …712)
    
    This patch removes unnecessary `m_c_*` matchers since we always
    canonicalize `commutive_op Cst, X` into `commutive_op X, Cst`.
    
    Compile-time impact:
    https://llvm-compile-time-tracker.com/compare.php?from=bfc0b7c6891896ee8e9818f22800472510093864&to=d27b058bb9acaa43d3cadbf3cd889e8f79e5c634&stat=instructions:u
    dtcxzyw authored Feb 14, 2024
    Configuration menu
    Copy the full SHA
    470c5b8 View commit details
    Browse the repository at this point in the history
  37. Configuration menu
    Copy the full SHA
    5932f3f View commit details
    Browse the repository at this point in the history
  38. Configuration menu
    Copy the full SHA
    855bac2 View commit details
    Browse the repository at this point in the history
  39. Configuration menu
    Copy the full SHA
    8f0435f View commit details
    Browse the repository at this point in the history
  40. Configuration menu
    Copy the full SHA
    17ac5b1 View commit details
    Browse the repository at this point in the history
  41. [AMDGPU] Do not test both wave sizes for DSDIR disassembly (#81719)

    There is nothing in these instruction definitions that depends on wave
    size so testing both seems like overkill. The corresponding assembler
    tests do not do it.
    jayfoad authored Feb 14, 2024
    Configuration menu
    Copy the full SHA
    cb8f910 View commit details
    Browse the repository at this point in the history
  42. [DeadStoreElimination] Optimize tautological assignments (#75744)

    If a store is dominated by a condition that ensures that the value being
    stored in a memory location is already present at that memory location,
    consider the store a noop.
    
    Fixes #63419
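    
    A source-level illustration of the pattern, as a sketch only. The
    function name is hypothetical and this is not the reproducer from the
    linked issue.
    
    ```cpp
    // The store is dominated by a check that *p already holds v, so
    // executing it cannot change the contents of memory.
    void tautological(int *p, int v) {
      if (*p == v)
        *p = v; // no-op store, a candidate for elimination
    }
    ```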
    BK1603 authored Feb 14, 2024
    Configuration menu
    Copy the full SHA
    65b5647 View commit details
    Browse the repository at this point in the history
  43. [mlir][nfc] Move Op signature to one line

    This was accidentally split with a comment
    banach-space committed Feb 14, 2024
    Configuration menu
    Copy the full SHA
    55a7ff8 View commit details
    Browse the repository at this point in the history
  44. Revert "[GitHub][workflows] Ask reviewers to merge PRs when author ca…

    …nnot (#81142)"
    
    This reverts commit 38c706e.
    
    This workflow always fails in cases where it needs to create a
    comment, due to a permissions issue, see the discussion at:
    https://discourse.llvm.org/t/rfc-fyi-pull-request-greetings-for-new-contributors/75458/20
    nikic committed Feb 14, 2024
    Configuration menu
    Copy the full SHA
    124cd11 View commit details
    Browse the repository at this point in the history
  45. [X86] Use explicit const SDValue& to avoid implicit copy in for-range…

    … across op_values(). NFC.
    
    Fixes static analysis warning.
    RKSimon committed Feb 14, 2024
    Configuration menu
    Copy the full SHA
    786537e View commit details
    Browse the repository at this point in the history
  46. [X86] Add v8i64/v16i32/v16i64 ctpop reduction test coverage

    Add test coverage for types wider than legal
    RKSimon committed Feb 14, 2024
    Configuration menu
    Copy the full SHA
    f82e080 View commit details
    Browse the repository at this point in the history
  47. [VPlan] Properly retain flags when cloning VPReplicateRecipe.

    This makes sure the correct flags are used for the clone (i.e. the ones
    present on the recipe), instead of the ones on the original IR
    instruction.
    
    At the moment, this should not change anything, as flags of replicate
    recipes should not be dropped before they are cloned. But
    that will change in a follow-up patch.
    fhahn committed Feb 14, 2024
    Configuration menu
    Copy the full SHA
    ca56966 View commit details
    Browse the repository at this point in the history
  48. Configuration menu
    Copy the full SHA
    f1b2865 View commit details
    Browse the repository at this point in the history
  49. [llvm-dlltool][NFC] Factor out parseModuleDefinition helper. (#81620)

    In preparation for ARM64EC support.
    cjacek authored Feb 14, 2024
    Configuration menu
    Copy the full SHA
    0c8b594 View commit details
    Browse the repository at this point in the history
  50. [MLIR][Python] Added a base class to all builtin floating point types…

    … (#81720)
    
    This makes it possible to
    
    * check if a given ir.Type is a floating point type via isinstance() or
    issubclass()
    * get the bitwidth of a floating point type
    
    See motivation and discussion in
    https://discourse.llvm.org/t/add-floattype-to-mlir-python-bindings/76959.
    superbobry authored Feb 14, 2024
    Configuration menu
    Copy the full SHA
    82f3cbc View commit details
    Browse the repository at this point in the history
  51. [AArch64] Add tests for fusion on Ampere1/1A/1B (#81725)

    As commented on the PR #81293, the Ampere1-family does not have test
    cases for the common fusion cases it implements. This adds the Ampere1
    targets to the relevant misched-fusion testcases:
     * addadrp
     * addr
     * aes
    ptomsich authored Feb 14, 2024
    Configuration menu
    Copy the full SHA
    6cab375 View commit details
    Browse the repository at this point in the history
  52. [VPlan] Move dropping of poison flags to VPlanTransforms. (NFC)

    Move collectPoisonGeneratingFlags from InnerLoopVectorizer to
    VPlanTransforms and also update its name. collectPoisonGeneratingFlags
    already directly drops poison-generating flags, not only collecting them.
    This means it is more appropriate to integrate it directly into the
    VPlan transform pipeline.
    
    The current implementation still calls back to legal to check if a block
    needs predication, which should be improved in the future.
    fhahn committed Feb 14, 2024
    Configuration menu
    Copy the full SHA
    debca7e View commit details
    Browse the repository at this point in the history
  53. [clang][NFC] Use "notable" for "interesting" identifiers in `Identifi…

    …erInfo` (#81542)
    
    This patch expands the notion of "interesting" in `IdentifierInfo` to
    also cover ObjC keywords and builtins, which matches the notion of
    "interesting" in the serialization layer. What was previously "interesting"
    in `IdentifierInfo` is now called "notable".
    
    Beyond clearing confusion between serialization and the rest of the
    compiler, it also resolved a naming problem: ObjC keywords, notable
    identifiers, and builtin IDs are all stored in the same bit-field. Now
    we can use "interesting" to name it and its corresponding type, instead
    of `ObjCKeywordOrInterestingOrBuiltin` abomination.
    Endilll authored Feb 14, 2024
    Configuration menu
    Copy the full SHA
    5027569 View commit details
    Browse the repository at this point in the history
  54. [clang][docs] Remove trailing whitespace

    Which is causing CI checks to fail.
    
    clang/docs/LanguageExtensions.rst:2794:takes no arguments and produces an unsigned long long result. The builtin does
    clang/docs/LanguageExtensions.rst:2795:not guarantee any particular frequency, only that it is stable. Knowledge of the
    + echo '*** Trailing whitespace has been found in Clang source files as described above ***'
    DavidSpickett committed Feb 14, 2024
    Configuration menu
    Copy the full SHA
    c5e1384 View commit details
    Browse the repository at this point in the history
  55. [ValueTracking] Compute known FPClass from signbit idiom (#80740)

    This patch improves `computeKnownFPClass` by using context-sensitive
    information from `DomConditionCache`.
    The motivation of this patch is to optimize the following case found in
    [fmt/format.h](https://github.com/fmtlib/fmt/blob/e17bc67547a66cdd378ca6a90c56b865d30d6168/include/fmt/format.h#L3555-L3566):
    ```
    define float @test(float %x, i1 %cond) {
      %i32 = bitcast float %x to i32
      %cmp = icmp slt i32 %i32, 0
      br i1 %cmp, label %if.then1, label %if.else
    
    if.then1:
      %fneg = fneg float %x
      br label %if.end
    
    if.else:
      br i1 %cond, label %if.then2, label %if.end
    
    if.then2:
      br label %if.end
    
    if.end:
      %value = phi float [ %fneg, %if.then1 ], [ %x, %if.then2 ], [ %x, %if.else ]
      %ret = call float @llvm.fabs.f32(float %value)
      ret float %ret
    }
    ```
    We can prove the sign bit of %value is always zero. Then the fabs can be
    eliminated.
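    
    A rough C++ rendering of the IR above, assuming a 32-bit IEEE-754
    `float`. The function names are hypothetical and this is not the code
    from fmt.
    
    ```cpp
    #include <cmath>
    #include <cstdint>
    #include <cstring>

    static_assert(sizeof(float) == sizeof(std::int32_t),
                  "sketch assumes a 32-bit float");

    // Tests the sign bit through an integer view of the float, mirroring
    // the bitcast followed by `icmp slt i32 ..., 0`.
    bool signBitSet(float x) {
      std::int32_t bits;
      std::memcpy(&bits, &x, sizeof bits);
      return bits < 0;
    }

    float absViaSignbit(float x, bool cond) {
      float value = x;
      if (signBitSet(x))
        value = -x; // sign bit is cleared on this path
      else if (cond)
        value = x;  // sign bit was already clear on this path
      // Every path reaches here with the sign bit clear, so fabs is redundant.
      return std::fabs(value);
    }
    ```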
    
    This pattern also exists in cpython/duckdb/oiio/openexr.
    
    Compile-time impact:
    https://llvm-compile-time-tracker.com/compare.php?from=f82e0809ba12170e2f648f8a1ac01e78ef06c958&to=041218bf5491996edd828cc15b3aec5a59ddc636&stat=instructions:u
    
    
    |stage1-O3|stage1-ReleaseThinLTO|stage1-ReleaseLTO-g|stage1-O0-g|stage2-O3|stage2-O0-g|stage2-clang|
    |--|--|--|--|--|--|--|
    |-0.00%|+0.01%|+0.00%|-0.03%|+0.00%|+0.00%|+0.02%|
    dtcxzyw authored Feb 14, 2024
    Configuration menu
    Copy the full SHA
    16a0629 View commit details
    Browse the repository at this point in the history
  56. [libc] Add user defined literals to initialize BigInt and `__uint12…

    …8_t` constants (#81267)
    
    Adds user defined literal to construct unsigned integer constants. This
    is useful when constructing constants for non native C++ types like
    `__uint128_t` or our custom `BigInt` type.
    gchatelet authored Feb 14, 2024
    Configuration menu
    Copy the full SHA
    0323235 View commit details
    Browse the repository at this point in the history
  57. [TableGen] Stop using make_pair and make_tuple. NFC. (#81730)

    These are unnecessary since C++17.
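    
    As an illustration (not code from the patch), class template argument
    deduction lets the class templates be named directly. The function names
    below are hypothetical.
    
    ```cpp
    #include <string>
    #include <tuple>
    #include <utility>

    // C++17 deduces the template arguments from the constructor arguments.
    std::pair<int, std::string> makeEntry() {
      return std::pair(1, std::string("one")); // was std::make_pair(...)
    }

    std::tuple<int, char, double> makeRecord() {
      return std::tuple(1, 'a', 2.0); // was std::make_tuple(...)
    }
    ```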
    jayfoad authored Feb 14, 2024
    Configuration menu
    Copy the full SHA
    f723260 View commit details
    Browse the repository at this point in the history
  58. [AArch64] Materialize constants via fneg. (#80641)

    This is something that is already done as a special case for copysign,
    this patch extends it to be more generally applied. If we are trying to
    materialize a negative constant (notably -0.0, 0x80000000), then there
    may be no movi encoding that creates the immediate, but a fneg(movi)
    might.
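    
    The bit-level relationship behind this can be checked with a short
    sketch, assuming a 32-bit IEEE-754 `float` and default floating-point
    semantics (no fast-math). The helper names are hypothetical.
    
    ```cpp
    #include <cstdint>
    #include <cstring>

    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "sketch assumes a 32-bit float");

    // Returns the raw bit pattern of a float.
    std::uint32_t bitsOf(float f) {
      std::uint32_t u;
      std::memcpy(&u, &f, sizeof u);
      return u;
    }

    // Negating +0.0f flips only the sign bit, turning the all-zero
    // pattern into 0x80000000.
    float negZeroViaFneg() {
      float zero = 0.0f;
      return -zero;
    }
    ```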
    
    Some of the existing patterns for RADDHN needed to be adjusted to keep
    them in line with the new immediates.
    davemgreen authored Feb 14, 2024
    Configuration menu
    Copy the full SHA
    6c84709 View commit details
    Browse the repository at this point in the history
  59. [mlir][python] expose LLVMStructType API (#81672)

    Expose the API for constructing and inspecting StructTypes from the LLVM
    dialect. Separate constructor methods are used instead of overloads for
    better readability, similarly to IntegerType.
    ftynse authored Feb 14, 2024
    Configuration menu
    Copy the full SHA
    bd8fcf7 View commit details
    Browse the repository at this point in the history
  60. [C23] Do not diagnose binary literals as an extension (#81658)

    We previously would diagnose them as a GNU extension in C mode, but they
    are now a feature of C23. The -Wgnu-binary-literal warning group no
    longer controls any diagnostics as this is no longer a GNU extension.
    The warning group is retained as a noop to help avoid "unknown warning"
    diagnostics.
    
    This also adds the companion compatibility warning which existed for C++
    but not for C.
    
    Fixes llvm/llvm-project#72017
    AaronBallman authored Feb 14, 2024
    Configuration menu
    Copy the full SHA
    8e24bc0 View commit details
    Browse the repository at this point in the history
  61. [MC/DC] Refactor: Introduce ConditionIDs as std::array<2> (#81221)

    Its 0th element corresponds to `FalseID` and 1st to `TrueID`.
    
    CoverageMappingGen.cpp: `DecisionIDPair` is replaced with `ConditionIDs`
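    
    A minimal sketch of the indexing convention, under the assumption that
    an ID can be modelled as a plain `int`. The alias `ConditionID` and the
    function `nextID` are hypothetical; the real types may differ.
    
    ```cpp
    #include <array>

    // Hypothetical stand-in for the ID type.
    using ConditionID = int;
    using ConditionIDs = std::array<ConditionID, 2>;

    // Element 0 holds the false ID and element 1 the true ID, so the
    // branch outcome selects the entry directly.
    ConditionID nextID(const ConditionIDs &IDs, bool Outcome) {
      return IDs[Outcome ? 1 : 0];
    }
    ```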
    chapuni authored Feb 14, 2024
    Configuration menu
    Copy the full SHA
    1a1fcac View commit details
    Browse the repository at this point in the history
  62. [AMDGPU] Replace '.' with '-' in generic target names (#81718)

    The dot is too confusing for tools. Output temporaries would have
    '10.3-generic', so tools could parse it as an extension; device libs and
    the associated clang driver logic are also confused by the dot.
    
    After discussions, we decided it's better to just remove the '.' from
    the target name than fix each issue one by one.
    Pierre-vh authored Feb 14, 2024
    Configuration menu
    Copy the full SHA
    43c7eb5 View commit details
    Browse the repository at this point in the history
  63. [AArch64] Initial Ampere1B scheduling model (#81341)

    The Ampere1B core is enabled with a new scheduling/pipeline model, as it
    provides significant updates over the Ampere1 core; it reduces latencies
    on many instructions, has some micro-ops reassigned between the XY and X
    units, and provides modelling for the instructions added since Ampere1
    and Ampere1A.
    
    As this is the first model implementing the CSSC instructions, we update
    the UnsupportedFeatures on all other models (that have CompleteModel
    set).
        
    Testcases are added under llvm-mca: these showed the FullFP16 feature
    missing, so we are adding it in as part of this commit.
    
    This *adds tests and additional fixes* compared to the reverted #81338.
    ptomsich authored Feb 14, 2024
    Configuration menu
    Copy the full SHA
    dd1897c View commit details
    Browse the repository at this point in the history
  64. [gn] port 09e9895 (InstallAPI)

    nico committed Feb 14, 2024
    2d7fdfa View commit details
  65. [RemoveDIs] Replicate dbg intrinsic movement pattern in SelectOptimiz…

    …e (#81737)
    
    Fix crash mentioned in comments on
    d759618.
    
    The assertion being hit was complaining that we had dangling DPValues;
    the DPValues attached to the terminator of StartBlock become dangling
    after the terminator is erased, and they're never "flushed" back onto
    the new terminator once it's added. Doing that makes the crash go away,
    but doesn't replicate existing dbg.* behaviour. See the comment in the
    patch.
    
    This change both fixes the crash (because there are now no DPValues left
    on the terminator to dangle) and replicates existing behaviour (moves
    those DPValues down to the new block).
    OCHyams authored Feb 14, 2024
    a50bd0d View commit details
  66. [clang][Interp][NFC] Add missing special cases for implicit functions

    We have this special case in getSource() and getRange(), but we
    were missing it in getExpr() and getLocation().
    tbaederr committed Feb 14, 2024
    b37bd78 View commit details
  67. [mlir] update bazel for bd8fcf7

    ftynse committed Feb 14, 2024
    232cf94 View commit details
  68. [AMDGPU] Refactor export instruction definitions. NFC. (#81738)

    Using multiclasses for the Real instruction definitions has a couple of
    benefits:
    - It avoids repeating information that was already specified when
      defining the corresponding pseudo, like the row and done bits.
    - It allows commoning up the Real definitions for architectures which
      are mostly the same, like GFX11 and GFX12.
    jayfoad authored Feb 14, 2024
    9c06b07 View commit details
  69. [NFC] Add API documentation and annotations (#78635)

    This change adds SM 6.2 availability annotation to 16-bit APIs (16-bit
    types require SM 6.2), and adds Doxygen API documentation.
    llvm-beanz authored Feb 14, 2024
    457c179 View commit details
  70. [bazel][mlir] Fix after 232cf94

    chsigg authored Feb 14, 2024
    995c906 View commit details
  71. [mlir][Transforms][NFC] Improve listener layering in dialect conversi…

    …on (#81236)
    
    Context: Conversion patterns provide a `ConversionPatternRewriter` to
    modify the IR. `ConversionPatternRewriter` provides the public API. Most
    function calls are forwarded/handled by `ConversionPatternRewriterImpl`.
    The dialect conversion uses the listener infrastructure to get notified
    about op/block insertions.
    
    In the current design, `ConversionPatternRewriter` inherits from both
    `PatternRewriter` and `Listener`. The conversion rewriter registers
    itself as a listener. This is problematic because listener functions
    such as `notifyOperationInserted` are now part of the public API and can
    be called from conversion patterns; that would bring the dialect
    conversion into an inconsistent state.
    
    With this commit, `ConversionPatternRewriter` no longer inherits from
    `Listener`. Instead `ConversionPatternRewriterImpl` inherits from
    `Listener`. This removes the problematic public API and also simplifies
    the code a bit: block/op insertion notifications were previously
    forwarded to the `ConversionPatternRewriterImpl`. This is no longer
    needed.
    matthias-springer authored Feb 14, 2024
    ea2d938 View commit details
  72. [LoopVectorize] Fix divide-by-zero bug (#80836) (#81721)

    When attempting to use the estimated trip count to refine the costs of
    the runtime memory checks we should also check for sane trip counts to
    prevent divide-by-zero faults on some platforms.
    
    Fixes #80836
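
    The kind of guard described above can be sketched as follows. This is
    not the LoopVectorize code; `amortizedCheckCost` and its parameters are
    hypothetical names chosen for the example.

    ```cpp
    #include <cstdint>
    #include <optional>

    // Hypothetical helper: amortize the cost of runtime memory checks over
    // the estimated trip count, and bail out when the estimate is missing
    // or zero instead of dividing by it.
    inline std::optional<std::uint64_t>
    amortizedCheckCost(std::uint64_t CheckCost,
                       std::optional<std::uint64_t> EstimatedTripCount) {
      if (!EstimatedTripCount || *EstimatedTripCount == 0)
        return std::nullopt; // No sane trip count: skip the refinement.
      return CheckCost / *EstimatedTripCount;
    }
    ```

    Returning an empty optional lets the caller fall back to the unrefined
    cost rather than faulting.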
    david-arm authored Feb 14, 2024
    1c10821 View commit details
  73. [mlir][Transforms][NFC] Modularize block actions (#81237)

    Throughout the rewrite process, the dialect conversion maintains a list
    of "block actions" that can be rolled back upon failure. This commit
    encapsulates the existing block actions into separate classes, making it
    easier to add additional actions in the future.
    
    This commit also renames "block actions" to "IR rewrites". In a
    subsequent commit, an "operation rewrite" class that allows rolling back
    movements of single operations is added. This is to support
    `moveOpBefore` in the dialect conversion.
    
    Rewrites have two methods: `commit()` commits an action. It can no
    longer be rolled back afterwards. `rollback()` undoes a rewrite. It can
    no longer be committed afterwards.
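
    The commit/rollback contract can be illustrated with a small sketch.
    The class names below (`Rewrite`, `AppendRewrite`) and the
    `rollbackAll` helper are illustrative assumptions and do not mirror
    the classes in the dialect conversion.

    ```cpp
    #include <memory>
    #include <vector>

    // Hypothetical base class for a reversible change.
    class Rewrite {
    public:
      virtual ~Rewrite() = default;
      virtual void commit() {}     // Finalize: no rollback afterwards.
      virtual void rollback() = 0; // Undo: no commit afterwards.
    };

    // Records that a value was appended so the append can be undone.
    class AppendRewrite : public Rewrite {
      std::vector<int> &Target;

    public:
      AppendRewrite(std::vector<int> &T, int V) : Target(T) {
        Target.push_back(V);
      }
      void rollback() override { Target.pop_back(); }
    };

    // Undo all recorded rewrites in reverse order, then forget them.
    inline void rollbackAll(std::vector<std::unique_ptr<Rewrite>> &Rewrites) {
      for (auto It = Rewrites.rbegin(); It != Rewrites.rend(); ++It)
        (*It)->rollback();
      Rewrites.clear();
    }
    ```

    Undoing in reverse order matters because later rewrites may depend on
    the state produced by earlier ones.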
    matthias-springer authored Feb 14, 2024
    8faefe3 View commit details
  74. Reapply "[DebugInfo][RemoveDIs] Turn on non-instrinsic debug-info by …

    …default"
    
    This reapplies commit bdde5f9 by undoing the revert fd3a0c1.
    
    The previous reapplication d759618 was reverted due to a crash
    (reproducer in comments for d759618) which was fixed in #81737.
    
    As noted in the original commit, this commit may break downstream tests.
    If this commit is breaking your downstream tests, please see comment 12 in
    [0], which documents the kind of variation in tests we'd expect to see from
    this change and what to do about it.
    
    [0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939
    OCHyams committed Feb 14, 2024
    a93a4ec View commit details
  75. 2347a47 View commit details
  76. [mlir][Transforms] Support moveOpBefore/After in dialect conversi…

    …on (#81240)
    
    Add a new rewrite class for "operation movements". This rewrite class
    can roll back `moveOpBefore` and `moveOpAfter`.
    
    `RewriterBase::moveOpBefore` and `RewriterBase::moveOpAfter` are no
    longer virtual. (The dialect conversion can gather all required
    information for rollbacks from listener notifications.)
    matthias-springer authored Feb 14, 2024
    8f4cd2c View commit details
  77. [libc][__support][bit] remove compiler has builtin checks (#81679)

    We only support building llvmlibc with modern compilers.
    https://libc.llvm.org/compiler_support.html#minimum-supported-versions
    
    All versions of these compilers support these builtins; GCC does not
    support the short variants.
    nickdesaulniers authored Feb 14, 2024
    4efbf52 View commit details
  78. [libc][__support][bit] simplify FLZ (#81678)

    `countl_zero(~x)` *is* `countl_one(x)`
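
    The identity can be checked exhaustively for a small type. The sketch
    below uses hand-rolled helpers (`countlZero`, `countlOne`, both
    hypothetical names) as stand-ins for the C++20 `std::countl_zero` and
    `std::countl_one`, so that it also builds with older standards.

    ```cpp
    #include <cstdint>

    // Count leading zero bits of an 8-bit value.
    inline int countlZero(std::uint8_t X) {
      int N = 0;
      for (int Bit = 7; Bit >= 0 && !((X >> Bit) & 1); --Bit)
        ++N;
      return N;
    }

    // Count leading one bits of an 8-bit value.
    inline int countlOne(std::uint8_t X) {
      int N = 0;
      for (int Bit = 7; Bit >= 0 && ((X >> Bit) & 1); --Bit)
        ++N;
      return N;
    }

    // Verify countlZero(~x) == countlOne(x) for every 8-bit value.
    inline bool identityHolds() {
      for (int V = 0; V <= 0xFF; ++V) {
        auto X = static_cast<std::uint8_t>(V);
        if (countlZero(static_cast<std::uint8_t>(~X)) != countlOne(X))
          return false;
      }
      return true;
    }
    ```

    The cast after `~X` is needed because integer promotion widens the
    operand to `int` before the complement is taken.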
    nickdesaulniers authored Feb 14, 2024
    0f6f5bf View commit details
  79. 7c4c274 View commit details
  80. 6059671 View commit details
  81. [lld/ELF] Avoid unnecessary TPOFF relocations in GOT for -pie (#81739)

    With the new SystemZ port we noticed that -pie executables generated
    from files containing R_390_TLS_IEENT relocations will have unnecessary
    relocations in their GOT:
    
                            9e8d8: R_390_TLS_TPOFF  *ABS*+0x18
    
    This is caused by the config->isPic condition in addTpOffsetGotEntry:
    
     static void addTpOffsetGotEntry(Symbol &sym) {
       in.got->addEntry(sym);
       uint64_t off = sym.getGotOffset();
       if (!sym.isPreemptible && !config->isPic) {
         in.got->addConstant({R_TPREL, target->symbolicRel, off, 0, &sym});
         return;
       }
    
    It is correct that we need to retain a TPOFF relocation if the target
    symbol is preemptible or if we're building a shared library. But when
    building a -pie executable, those values are fixed at link time and
    there's no need for any remaining dynamic relocation.
    
    Note that the equivalent MIPS-specific code in MipsGotSection::build
    checks for config->shared instead of config->isPic; we should use the
    same check here. (Note also that on many other platforms we're not even
    using addTpOffsetGotEntry in this case as an IE->LE relaxation is
    applied before; we don't have this type of relaxation on SystemZ.)
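
    The reasoning above reduces to a two-input decision. A minimal sketch,
    using a hypothetical predicate `needsDynamicTpOff` that is not part of
    the lld source:

    ```cpp
    // Hypothetical predicate: decide whether a GOT entry for an
    // initial-exec TLS symbol still needs a dynamic TPOFF relocation.
    inline bool needsDynamicTpOff(bool IsPreemptible, bool IsShared) {
      // Preemptible symbols and shared libraries cannot fix the offset at
      // link time. A -pie executable is position independent but not
      // shared, so its offset is a link-time constant.
      return IsPreemptible || IsShared;
    }
    ```

    Testing "shared" rather than "PIC" is what separates the -pie case
    from the shared-library case.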
    uweigand authored Feb 14, 2024
    6f90773 View commit details
  82. 411554a View commit details
  83. [polly][ScheduleOptimizer] Use IslMaxOperationsGuard helper instead o…

    …f explicit restoration (#79303)
    
    To fix the long compile-time issue in the schedule optimizer, patch
    #77280 sets an upper cap on max ISL operations. When bailing out because
    the ISL quota was hit, error handling behavior was restored manually.
    This commit replaces the restoration code with the
    IslMaxOperationsGuard helper and also removes a redundant early return.
    kartcq authored Feb 14, 2024
    0f33c54 View commit details
  84. 78d401b View commit details
  85. [Clang][CodeGen] Loose the cast check when emitting builtins (#81669)

    This patch loosens the cast check (`canLosslesslyBitCastTo`) and leaves
    it to the one inside `CreateBitCast`. The stricter check seems too
    conservative for the use case here.
    shiltian authored Feb 14, 2024
    630f82e View commit details
  86. [lldb] Fix the flakey Concurrent tests on macOS (#81710)

    The concurrent tests all do a pthread_join at the end, and
    concurrent_base.py stops after that pthread_join and sanity checks that
    only 1 thread is running. On macOS, after pthread_join() has completed,
    there can be an extra thread still running which is completing the
    details of that task asynchronously; this causes testsuite failures.
    When this happens, we see the second thread is in
    
    ```
    frame #0: 0x0000000180ce7700 libsystem_kernel.dylib`__ulock_wake + 8
    frame #1: 0x0000000180d25ad4 libsystem_pthread.dylib`_pthread_joiner_wake + 52
    frame #2: 0x0000000180d23c18 libsystem_pthread.dylib`_pthread_terminate + 384
    frame #3: 0x0000000180d23a98 libsystem_pthread.dylib`_pthread_terminate_invoke + 92
    frame #4: 0x0000000180d26740 libsystem_pthread.dylib`_pthread_exit + 112
    frame #5: 0x0000000180d26040 libsystem_pthread.dylib`_pthread_start + 148
    ```
    
    None of the functions from the test file are present on this
    thread.
    
    In this patch, instead of counting the number of threads, I iterate over
    the threads looking for functions from our test file (by name) and only
    count threads that have at least one of them.
    
    It's a lower frequency failure than the darwin kernel bug causing an
    extra step instruction mach exception when hardware
    breakpoint/watchpoints are used, but once I fixed that, this came up as
    the next most common failure for these tests.
    
    rdar://110555062
    jasonmolenda authored Feb 14, 2024
    dbc40b3 View commit details
  87. 1ddc541 View commit details
  88. 8383bf2 View commit details
  89. 89dc313 View commit details
  90. bf4480d View commit details
  91. d99d258 View commit details
  92. [RISCV] Split long build_vector sequences to reduce critical path (#8…

    …1312)
    
    If we have a long chain of vslide1down instructions to build e.g. a <16
    x i8> from scalar, we end up with a critical path going through the
    entire chain. We can instead build two halves, and then combine them
    with a vselect. This costs one additional temporary register, but
    reduces the critical path by roughly half.
    
    To avoid needing to change VL, we fill each half with undefs for the
    elements which will come from the other half. The vselect will at worst
    become a vmerge, but is often folded back into the final instruction of
    the sequence building the lower half.
    
    A couple notes on the heuristic here:
    * This is restricted to LMUL1 to avoid quadratic costing reasoning.
    * This only splits once. In future work, we can explore recursive
    splitting here, but I'm a bit worried about register pressure and thus
    decided to be conservative. It also happens to be "enough" at the
    default zvl of 128.
    * "8" is picked somewhat arbitrarily as being "long". In practice, our
    build_vector codegen for 2 defined elements in a VL=4 vector appears to
    need some work. 4 defined elements in a VL=8 vector seems to generally
    produce reasonable results.
    * Halves may not be an optimal split point. I went down the rabbit hole
    of trying to find the optimal one, and decided it wasn't worth the
    effort to start with.
    
    ---------
    
    Co-authored-by: Luke Lau <luke_lau@icloud.com>
    preames and lukel97 authored Feb 14, 2024
    275eeda View commit details
  93. [lldb][NFCI] Remove CommandObjectProcessHandle::VerifyCommandOptionVa…

    …lue (#79901)
    
    I was refactoring something else but ran into this function. It was
    somewhat confusing to read through and understand, but it boils down to
    two steps:
    - First we try `OptionArgParser::ToBoolean`. If that works, then we're
    good to go.
    - Second, we try `llvm::to_integer` to see if it's an integer. If it
    parses to 0 or 1, we're good.
    - Failing either of the steps above means we cannot parse it into a
    bool.
    
    Instead of having an integer out param and a bool return value, the
    interface is better served with an optional<bool> -- Either it parses
    into true or false, or you get back nothing (nullopt).
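
    The resulting interface can be sketched with the standard library
    alone. The function name `parseBool` and the accepted textual
    spellings are assumptions for this example; the actual change relies
    on `OptionArgParser::ToBoolean` and `llvm::to_integer`.

    ```cpp
    #include <charconv>
    #include <optional>
    #include <string_view>
    #include <system_error>

    // Hypothetical stand-in for the two-step parse described above.
    inline std::optional<bool> parseBool(std::string_view Text) {
      // Step 1: accept textual spellings (an assumed, illustrative set).
      if (Text == "true" || Text == "yes" || Text == "on")
        return true;
      if (Text == "false" || Text == "no" || Text == "off")
        return false;
      // Step 2: accept an integer, but only 0 or 1.
      int Value = 0;
      const char *End = Text.data() + Text.size();
      auto [Ptr, Ec] = std::from_chars(Text.data(), End, Value);
      if (Ec == std::errc() && Ptr == End && (Value == 0 || Value == 1))
        return Value == 1;
      return std::nullopt; // Neither step could parse the text.
    }
    ```

    A caller then checks a single return value instead of combining a
    bool result with an integer out parameter.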
    bulbazord authored Feb 14, 2024
    307cd88 View commit details
  94. 16e7d68 View commit details
  95. [clang][CodeGen] Shift relink option implementation away from module …

    …cloning (#81693)
    
    We recently implemented a new option allowing relinking of bitcode
    modules via the "-mllvm -relink-builtin-bitcode-postop"
    option.
    
    This implementation relied on llvm::CloneModule() in order to pass
    copies to modules and preserve the original modules for later relinking.
    However, cloning modules has been found to be prohibitively expensive,
    significantly increasing compilation time for large bitcode libraries.
    
    In this patch, we shift the relink option implementation to instead link
    the original modules initially, and reload modules from the file system
    if relinking is requested. This approach results in significantly
    reduced overhead.
    
    We accomplish this by creating a new ReloadModules() routine that can be
    called from a BackendConsumer class, to mimic the behavior of
    ASTConsumer's loadLinkModules(), but without access to the
    CompilerInstance.
    
    Because loading the bitcodes from the filesystem requires access to the
    FileManager class, we also forward a reference to the CompilerInstance
    class to the BackendConsumer. This mirrors what is already done for
    several CompilerInstance members, such as TargetOptions and
    CodeGenOptions.
    
    Finally, we needed to add a const specifier to the
    FileManager::getBufferForFile() routine to allow it to be called using
    the const reference returned from CompilerInstance::getFileManager()
    lamb-j authored Feb 14, 2024
    6d4ffbd View commit details
  96. Merge from 'sycl' to 'sycl-web'

    iclsrc committed Feb 14, 2024
    0728f6d View commit details

Commits on Feb 15, 2024

  1. [SYCL][Graph] Add node and graph queries for mixed usage (intel#12366)

    This PR adds queries to both nodes and modifiable graphs which enable
    better mixed usage of both the explicit and record & replay APIs in a
    single program.
    
    It also reworks how subgraphs are handled: previously nodes were merged
    into the modifiable graph, but this would pose a problem for users
    querying the graph since they would not see a single subgraph node, and
    this merging behaviour was an implementation detail. This has been
    changed so that now subgraph nodes are only merged in the executable
    graph, and are stored as a single node of type `subgraph` in the
    modifiable graph.
    
    As a consequence of this change all nodes are now also copied when
    making the executable graph, where previously they were not.
    
    - Reworked how subgraphs are handled
    - Add graph and node queries to the SYCL-Graph spec
    - Implement graph and node queries
    - New node_type enum
    - Explicit nodes now also have associated events (fixes mixed usage
    issue)
    - New tests for queries
    - Update ABI symbols
    Bensuo authored Feb 15, 2024
    5337a8a View commit details
  2. [SYCL][Fusion] Set IsNewDbgInfoFormat when creating new functions (i…

    …ntel#12712)
    
    Set `IsNewDbgInfoFormat` to the default value for functions created in
    the SYCL Kernel Fusion pipeline. This prepares `sycl-fusion` for
    migration to the new debug info format.
    
    ---------
    
    Signed-off-by: Victor Perez <victor.perez@codeplay.com>
    victor-eds authored Feb 15, 2024
    f910a4c View commit details
  3. [UR] bump tag to f11823e1 (intel#12721)

    oneapi-src/unified-runtime#1343
    
    ---------
    
    Co-authored-by: Kenneth Benzie (Benie) <k.benzie@codeplay.com>
    igchor and kbenzie authored Feb 15, 2024
    62a0010 View commit details
  4. [SYCL][Matrix] Add joint matrix query for CUDA and HIP backends (inte…

    …l#12075)
    
    This PR adds joint matrix query for CUDA and HIP backends as described
    in
    [sycl/doc/extensions/experimental/sycl_ext_matrix/sycl_ext_oneapi_matrix.asciidoc](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_matrix/sycl_ext_oneapi_matrix.asciidoc#query-interface)
    
    ---------
    
    Co-authored-by: Konrad Kusiak <konradk@login01.chn>
    konradkusiak97 and Konrad Kusiak authored Feb 15, 2024
    00eebe1 View commit details
  5. [CUDA][HIP][TEST-E2E] Include the necessary environment paths during …

    …the test-e2e build for CUDA and HIP backends. (intel#12606)
    
    Include the necessary environment paths during the test-e2e build for
    `CUDA` and `HIP` backends.
    The absence of the added path leads to the inability to locate libdevice
    for specific architectures, resulting in a failure.
    Below is the reported error when expected `CUDA_PATH` is missing
    `
    clang++: error: cannot find libdevice for `sm_50`; provide path to
    different `CUDA` installation via '--cuda-path', or pass '-nocudalib' to
    build without linking with libdevice
    `
    mmoadeli authored Feb 15, 2024
    6b8792c View commit details
  6. [SYCL] fix for syclcompat test on Windows (intel#12696)

    The `-shared` flag is a clang/Linux option. On Windows we need to be
    cognizant of possibly using an MSVC-compatible driver (e.g. icx), which
    needs `/clang` passthrough when using non-MSVC options.
    cperkinsintel authored Feb 15, 2024
    3f445cf View commit details
  7. [SYCL] Revert friend changes to assignment and incr/decr for swizzles (

    …intel#12682)
    
    This commit does a partial revert of
    intel#12396. This is to avoid an issue
    where the new friend operators wouldn't accept the arguments as l-value
    references.
    
    ---------
    
    Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
    steffenlarsen authored Feb 15, 2024
    6194f3c View commit details
  8. [ESIMD] Fix atomic_update() implementation for N=16 and N=32 on Gen12 (

    …intel#12722)
    
    atomic_update() for USM and ACC N=16,32 were lowered to SVM/DWORD atomic
    intrinsics even though the HW instructions on Gen12 supported only
    N up to 8 for USM and up to 16 for ACC.
        
    The GPU had a legalization pass for N that split longer vectors into
    smaller ones available in HW.
    That GPU optimization/legalization works incorrectly for USM, as it
    splits longer vectors assuming the instruction is available for N=16 in
    the case of USM, which is not correct.
        
    The patch here implements splitting of N=16 and N=32 cases for
    atomic_update(usm, ...) to N=8 vectors until GPU fixes the legalization
    for USM atomic_update.
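
    The splitting described above can be sketched as a plain chunking
    loop. This is not the ESIMD implementation; `splitIntoChunks` and its
    parameters are hypothetical names for the example.

    ```cpp
    #include <algorithm>
    #include <cstddef>
    #include <utility>
    #include <vector>

    // Hypothetical helper: list the (offset, width) chunks that cover an
    // N-element operation when only MaxWidth elements fit per instruction.
    inline std::vector<std::pair<std::size_t, std::size_t>>
    splitIntoChunks(std::size_t N, std::size_t MaxWidth = 8) {
      std::vector<std::pair<std::size_t, std::size_t>> Chunks;
      if (MaxWidth == 0)
        return Chunks; // Guard against a non-terminating loop.
      for (std::size_t Offset = 0; Offset < N; Offset += MaxWidth)
        Chunks.emplace_back(Offset, std::min(MaxWidth, N - Offset));
      return Chunks;
    }
    ```

    Under these assumptions, N=16 produces two chunks of 8 and N=32
    produces four.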
    
    Signed-off-by: Klochkov, Vyacheslav N <vyacheslav.n.klochkov@intel.com>
    v-klochkov authored Feb 15, 2024
    44a74d0 View commit details
  9. Merge from 'main' to 'sycl-web' (107 commits)

    1> Add code in CodeGenAction.cpp
      Basic change add new field "const FileManager &FileMgr"
      Add new function ReloadModules
      Code change in function LinkInModules.
    
    2> revert "[DebugInfo][RemoveDIs] Turn on non-instrinsic debug-info by
       default".
    
      CONFLICT (modify/delete): clang/lib/CodeGen/BackendConsumer.h deleted in HEAD and modified in 6d4ffbd.  Version 6d4ffbd of clang/lib/CodeGen/BackendConsumer.h left in tree.
      CONFLICT (content): Merge conflict in clang/lib/CodeGen/CodeGenAction.cpp
      CONFLICT (modify/delete): clang/lib/CodeGen/LinkInModulesPass.cpp deleted in HEAD and modified in 6d4ffbd.  Version 6d4ffbd of clang/lib/CodeGen/LinkInModulesPass.cpp left in tree.
    jyu2-git committed Feb 15, 2024
    4565039 View commit details
  10. a617aad View commit details
  11. Remove internal values for SPV_INTEL_cache_controls (intel#2346)

    The Headers for this extension were published so we should use them
    instead:
    KhronosGroup/SPIRV-Headers@a8af2ce
    
    Original commit:
    KhronosGroup/SPIRV-LLVM-Translator@95d70a9ab4077ed
    vmaksimo authored and sys-ce-bb committed Feb 15, 2024
    6695b8a View commit details
  12. Fix SPIR-V consumption of DebugInfoNone for debug types (intel#2341)

    OpenCL and NonSemantic DebugInfo specifications are flexible in allowing any debug information to be replaced with DebugInfoNone, so various SPIR-V producers follow that and generate it for base types of several debug instructions, leaving SPIR-V consumers to handle this. By default the translator replaces missing debug info with tag: null, which is correct in most cases.
    
    Yet there are situations where this is allowed by neither LLVM nor DWARF; for example, for DW_TAG_array_type the DWARF spec states that the DW_AT_type attribute is mandatory. For such cases a new transNonNullDebugType wrapper function was added to the translator, generating "DIBasicType(tag: DW_TAG_unspecified_type, name: "SPIRV unknown type")" where DebugInfoNone was used as the type.
    
    This function doesn't replace all calls to transDebugInst<DIType>, as there are cases where we can generate a null type; for example, DWARF doesn't require it for DW_TAG_typedef, hence I'm not changing the translation flow in this case. Additionally, while DWARF requires the type attribute for DW_TAG_pointer_type, LLVM does not, hence I'm not changing the translation flow in this case either.
    
    Signed-off-by: Sidorov, Dmitry <dmitry.sidorov@intel.com>
    
    Original commit:
    KhronosGroup/SPIRV-LLVM-Translator@ec023805a0ce26f
    MrSidims authored and sys-ce-bb committed Feb 15, 2024
    64cefa5 View commit details
  13. Fix DebugTypeVector test (intel#2347)

    It should have tested DebugInfoNone base type
    
    Signed-off-by: Sidorov, Dmitry <dmitry.sidorov@intel.com>
    
    Original commit:
    KhronosGroup/SPIRV-LLVM-Translator@e0aef72fee42e0a
    MrSidims authored and sys-ce-bb committed Feb 15, 2024
    ae1d570 View commit details
  14. Map to unordered_map for SPIRVIdToEntryMap (intel#2348)

    Small fix but yields around 30% speedup for translation
    SPIR-V to IR.
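
    The gain comes from the container's lookup cost: `std::map` is an
    ordered tree with logarithmic lookups, while `std::unordered_map` is a
    hash table with constant-time lookups on average, at the price of
    losing ordered iteration. A minimal sketch, where the `Id` and
    `IdToEntryMap` aliases and the `findEntry` helper are assumptions made
    for this example:

    ```cpp
    #include <string>
    #include <unordered_map>

    // Hypothetical stand-ins: the real table maps SPIR-V ids to entries.
    using Id = unsigned;
    using IdToEntryMap = std::unordered_map<Id, std::string>;

    // Look up an id; returns nullptr when the id is unknown.
    inline const std::string *findEntry(const IdToEntryMap &Map, Id Key) {
      auto It = Map.find(Key);
      return It == Map.end() ? nullptr : &It->second;
    }
    ```

    The swap is only safe when no caller depends on iterating the table
    in key order.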
    
    Original commit:
    KhronosGroup/SPIRV-LLVM-Translator@513b9578d310282
    bwlodarcz authored and sys-ce-bb committed Feb 15, 2024
    f7b658f View commit details
  15. Fix BufferLocationINTEL decoration translation (intel#2335)

    There was an assumption, that ptr.annotation encoding buffer_location
    should be used by load or store instructions. But there is no such
    restriction in the specification.
    
    Signed-off-by: Sidorov, Dmitry <dmitry.sidorov@intel.com>
    
    Original commit:
    KhronosGroup/SPIRV-LLVM-Translator@7a37ea920f730e0
    MrSidims authored and sys-ce-bb committed Feb 15, 2024
    d18a70f View commit details
  16. Prepare for non-instrinsic debug info (intel#2362)

    For now just convert BB with convertFromNewDbgValues, will
    figure out something smarter a bit later.
    
    I've updated several tests with dbg.declare intrinsic
    adding --experimental-debuginfo-iterators=1 to check if it works.
    
    Signed-off-by: Sidorov, Dmitry <dmitry.sidorov@intel.com>
    
    Original commit:
    KhronosGroup/SPIRV-LLVM-Translator@0e87aefecf7c500
    MrSidims authored and sys-ce-bb committed Feb 15, 2024
    272ba9e View commit details
  17. Fix allowed types for OpConstantNull (intel#2361)

    The SPIR-V Specification allows `OpConstantNull` types to be scalar or
    vector booleans, integers, or floats.  Update an assert for this and
    add a SPIR-V -> LLVM IR test.
    
    Original commit:
    KhronosGroup/SPIRV-LLVM-Translator@9ec969c1c379bde
    svenvh authored and sys-ce-bb committed Feb 15, 2024
    55a143b View commit details
  18. Map FPFastMathModeINTEL on SPV_INTEL_fp_fast_math_mode (intel#2360)

    Signed-off-by: Sidorov, Dmitry <dmitry.sidorov@intel.com>
    
    Original commit:
    KhronosGroup/SPIRV-LLVM-Translator@262395da9234fe4
    MrSidims authored and sys-ce-bb committed Feb 15, 2024
    339c1c6 View commit details
  19. [SYCL] Fix malloc shared by throwing when usm_shared_allocations not …

    …supported (intel#12700)
    
    Final PR in the series of intel#12636.
    Refer to it for a description.
    After a discussion with @AlexeySachkov we've decided it's best not to
    rewrite USM and syclcompat tests with buffers/accessors. For USM, the
    reason is obvious and for syclcompat you can reach out to Alexey.
    Therefore, these tests are handled using if statements or by requiring
    the aspect to be supported.
    Once this PR is merged, malloc_shared will throw if the
    usm_shared_allocations aspect is not supported, which conforms to
    the spec.
    lbushi25 authored Feb 15, 2024
    1bec982 View commit details
  20. 90bcc32 View commit details
  21. 1c223e1 View commit details

Commits on Feb 16, 2024

  1. Bump cryptography from 41.0.6 to 42.0.0 in llvm/utils/git/requirement…

    …s.txt (intel#12714)
    
    Bumps [cryptography](https://github.com/pyca/cryptography) from 41.0.6
    to 42.0.0 to resolve identified security vulnerability in 3rd party
    dependency.
    
    Refer to [cryptography's
    changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst).
    lucyli-ca authored Feb 16, 2024
    746ed9f View commit details
  2. 0c74e16 View commit details
  3. [SYCL] add overlooked default context test . (intel#12728)

    Despite having a unit test for the default context, I realized there is
    not one to affirm the new default configuration.
    cperkinsintel authored Feb 16, 2024
    79d775e
  4. aa015f3
  5. [SYCL][Graph] Clean-up E2E Tests (intel#12685)

    Some clean-up for SYCL-Graph E2E tests:
    * Remove redundant `Event` variables that are initialized over loop
    iterations but never used.
    * Remove all instances of the no immediate command-list property, and
    use environment variable instead to test both paths.
    * Always use FileCheck leak checking rather than `CHECK-NOT: Leak`.
    * Remove unnecessary threading code from `Inputs/basic_usm.cpp`
    EwanC authored Feb 16, 2024
    d747667
  6. [ESIMD] Enable -fsycl-esimd-force-stateless-mem by default (intel#9452)

    Signed-off-by: Vyacheslav N Klochkov <vyacheslav.n.klochkov@intel.com>
    v-klochkov authored Feb 16, 2024
    f316273
  7. 27c9546
  8. 76bbf93
  9. 1a98c4c

Commits on Feb 19, 2024

  1. [SYCL][Graph] Avoid unnecessary inter-partition dependencies (intel#12680)
    
    Improves management of inter-partition dependencies, so that only
    required dependencies are added.
    As removing these dependencies can result in multiple execution paths,
    we have added a map to track all events returned from submitted
    partitions.
    All these events are linked to the main event returned to the user.
    Adds tests.
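
The event tracking described in this commit message can be pictured with a small sketch. Everything here is hypothetical and written for illustration only: `Event`, `PartitionId`, and `submitPartitions` are not names from the SYCL runtime, and the real implementation is not shown in this log. The sketch only illustrates the idea of keeping one event per submitted partition in a map and linking each of them to a single main event.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <vector>

// Hypothetical event type: an event is finished once it and every event
// linked to it have completed.
struct Event {
  bool Done = false;
  std::vector<std::shared_ptr<Event>> Linked;

  bool finished() const {
    if (!Done)
      return false;
    for (const auto &E : Linked)
      if (!E->finished())
        return false;
    return true;
  }
};

using PartitionId = int; // assumption: partitions are identified by an index

// Records one event per submitted partition in PartitionEvents and returns a
// main event that is linked to all of them.
std::shared_ptr<Event>
submitPartitions(const std::vector<PartitionId> &Partitions,
                 std::map<PartitionId, std::shared_ptr<Event>> &PartitionEvents) {
  auto Main = std::make_shared<Event>();
  Main->Done = true; // the main event has no work of its own in this sketch
  for (PartitionId Id : Partitions) {
    auto E = std::make_shared<Event>();
    PartitionEvents[Id] = E;
    Main->Linked.push_back(E);
  }
  return Main;
}
```

With this shape, waiting on the main event covers every execution path, even when partitions no longer depend on one another.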
    mfrancepillois authored Feb 19, 2024
    54a67eb
  2. [SYCL][Bindless] Fix Grad flag (intel#12729)

    Grad flag was set to 0x3 (meaning Lod + Bias) instead of 0x4.
    See
    https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#Image_Operands
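
To see why 0x3 reads as "Lod + Bias", it helps to treat the image operands as a bitmask. The sketch below lists only the three bit values relevant to this fix, as given in the SPIR-V Image Operands table; the enum and the `hasGrad` helper are illustrative names, not code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Image Operands bit values as listed in the SPIR-V specification
// (only the three relevant to this fix are shown).
enum ImageOperand : std::uint32_t {
  Bias = 0x1,
  Lod = 0x2,
  Grad = 0x4,
};

// 0x3 is not a distinct operand: it is the union of Bias and Lod.
static_assert((Bias | Lod) == 0x3, "0x3 decodes as Bias + Lod");
static_assert((Grad & (Bias | Lod)) == 0, "Grad shares no bits with Bias or Lod");

// Illustrative helper: returns true when the mask requests explicit gradients.
constexpr bool hasGrad(std::uint32_t Mask) { return (Mask & Grad) != 0; }
```

A mask of 0x3 therefore never sets the Grad bit, which is consistent with the mismatch the commit describes.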
    
    Signed-off-by: Victor Lomuller <victor@codeplay.com>
    Naghasan authored Feb 19, 2024
    f614781
  3. UR fix for MaxRegsPerBlock check in setKernelParams (intel#12549)

    Bring the fix for MaxRegsPerBlock check from
    oneapi-src/unified-runtime#1299 to `intel/llvm`.
    No changes needed other than updating the UR repo hash.
    
    ---------
    
    Co-authored-by: Kenneth Benzie (Benie) <k.benzie@codeplay.com>
    rafbiels and kbenzie authored Feb 19, 2024
    5310b20
  4. Fix a leak in pi_unified_runtime.cpp. (intel#12589)

    `LoaderConfig` is created and stored in a local pointer but never
    released after use, so it leaks.
    This patch releases the `LoaderConfig` when finished using it.
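
One common way to rule out this kind of leak is to tie the release call to scope exit. The sketch below is an alternative illustration, not the change made in this patch: `createLoaderConfig`, `releaseLoaderConfig`, and the `LoaderConfig` struct are hypothetical stand-ins for a C-style create/release pair and do not reproduce the Unified Runtime signatures.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for a C-style create/release API.
struct LoaderConfig {
  int Id;
};
static int LiveConfigs = 0; // counts handles that have not been released

LoaderConfig *createLoaderConfig() {
  ++LiveConfigs;
  return new LoaderConfig{0};
}

void releaseLoaderConfig(LoaderConfig *Config) {
  --LiveConfigs;
  delete Config;
}

// Deleter that calls the release function when the owning pointer goes away.
struct LoaderConfigDeleter {
  void operator()(LoaderConfig *Config) const { releaseLoaderConfig(Config); }
};
using LoaderConfigPtr = std::unique_ptr<LoaderConfig, LoaderConfigDeleter>;

// Returns the number of live handles observed while the config is in use.
int initialize() {
  LoaderConfigPtr Config(createLoaderConfig());
  // ... pass Config.get() to whatever needs the raw handle ...
  return LiveConfigs;
} // Config is released here, on every return path
```

An explicit release call at the end of the function achieves the same result as long as every exit path reaches it.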
    yingcong-wu authored Feb 19, 2024
    d697024
  5. [NFC][SYCL] Move a helper to its single legacy use (intel#12740)

    The old builtins implementation is going to be removed in the next
    ABI-breaking window, and that helper is only used there.
    aelovikov-intel authored Feb 19, 2024
    8293a5c

Commits on Feb 20, 2024

  1. b36cdd1
  2. Bump cryptography from 42.0.0 to 42.0.2 in /llvm/utils/git (intel#12746)

    Bumps [cryptography](https://github.com/pyca/cryptography) from 42.0.0
    to 42.0.2.
    Changelog (sourced from [cryptography's
    changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)):

    42.0.2 - 2024-01-30
    * Updated Windows, macOS, and Linux wheels to be compiled with OpenSSL
    3.2.1.
    * Fixed an issue that prevented the use of Python buffer protocol
    objects in `sign` and `verify` methods on asymmetric keys.
    * Fixed an issue with incorrect keyword-argument naming with
    `cryptography.hazmat.primitives.asymmetric.ec.EllipticCurvePrivateKey.exchange`,
    `cryptography.hazmat.primitives.asymmetric.x25519.X25519PrivateKey.exchange`,
    `cryptography.hazmat.primitives.asymmetric.x448.X448PrivateKey.exchange`,
    and `cryptography.hazmat.primitives.asymmetric.dh.DHPrivateKey.exchange`.

    42.0.1 - 2024-01-24
    * Fixed an issue with incorrect keyword-argument naming with
    `cryptography.hazmat.primitives.asymmetric.ec.EllipticCurvePrivateKey.sign`.
    * Resolved compatibility issue with loading certain RSA public keys in
    `cryptography.hazmat.primitives.serialization.load_pem_public_key`.
    Commits:
    * [`2202123`](https://github.com/pyca/cryptography/commit/2202123b50de1b8788f909a3e5afe350c56ad81e)
    changelog and version bump 42.0.2 (#10268)
    * [`f7032bd`](https://github.com/pyca/cryptography/commit/f7032bdd409838f67fc2b93343f897fb5f397d80)
    bump openssl in CI (#10298) (#10299)
    * [`002e886`](https://github.com/pyca/cryptography/commit/002e886f16d8857151c09b11dc86b35f2ac9aec3)
    Fixes #10294 -- correct accidental change to exchange kwarg (#10295)
    (#10296)
    * [`92fa9f2`](https://github.com/pyca/cryptography/commit/92fa9f2f606caea5d499c825e832be5bac6f0c23)
    support bytes-like consistently across our asym sign/verify APIs (#10260)
    (#1...
    * [`6478f7e`](https://github.com/pyca/cryptography/commit/6478f7e28be54b51931277235de01b249ceabd96)
    explicitly support bytes-like for signature/data in RSA sign/verify
    (#10259) ...
    * [`4bb8596`](https://github.com/pyca/cryptography/commit/4bb8596ae02d95bb054dbcf55e8771379dbe0c19)
    fix the release script (#10233) (#10254)
    * [`337437d`](https://github.com/pyca/cryptography/commit/337437dc2e62772bde4ad5544f4b1db9ee7572d9)
    42.0.1 bump (#10252)
    * [`56255de`](https://github.com/pyca/cryptography/commit/56255de6b2d1a2d2e502b0275231ca81907f33f1)
    allow SPKI RSA keys to be parsed even if they have an incorrect
    delimiter (#1...
    * [`12f038b`](https://github.com/pyca/cryptography/commit/12f038b38af76e36efe8cef09597010c97647e8f)
    fixes #10237 -- correct EC sign parameter name (#10239) (#10240)
    * See full diff in [compare
    view](https://github.com/pyca/cryptography/compare/42.0.0...42.0.2)
    
    ---------
    
    Signed-off-by: dependabot[bot] <support@github.com>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    Co-authored-by: Alexey Bader <alexey.bader@intel.com>
    dependabot[bot] and bader authored Feb 20, 2024
    5fae0aa
  3. [ESIMD][NFC][E2E] Fix 570 compilation warnings in ESIMD E2E tests (intel#12748)
    
    Warnings fixed:
    - deprecated scatter_rgba
    - deprecated get_cl_code
    - deprecated lsc_fence
    - deprecated uchar type usage
    - deprecated get_access on HOST
    - deprecated get_pointer
    - usage of isfinite with -ffast-math
    - deprecated dpas_argument_type::s1
    - deprecated gpu_selector()
    
    Also, the memory alloc/free in histogram*.cpp tests were updated to
    make potential memory leaks easier to avoid.
    
    Signed-off-by: Klochkov, Vyacheslav N <vyacheslav.n.klochkov@intel.com>
    v-klochkov authored Feb 20, 2024
    436e687
  4. [GHA] Uplift Linux GPU RT version to 24.05.28454.6 (intel#12764)

    Scheduled drivers uplift
    
    Co-authored-by: GitHub Actions <actions@github.com>
    bb-sycl and actions-user authored Feb 20, 2024
    6863dfc
  5. [SYCL][Graph] Update doc for UR PR moving reset commands to a dedicated cmd-list
    
    Update the design doc.
    Update the UR tag.
    mfrancepillois committed Feb 20, 2024
    8e21a1d
  6. ed730fe