[AutoBump] Merge with e0e67a62 (Feb 17) (48) #592

Open
wants to merge 110 commits into
base: bump_to_5c93eb56

jorickert

No description provided.

owenca and others added 30 commits February 14, 2025 21:10
…27114)"

This reverts commit f7a2d70.

Multiple buildbot failures have been reported.  See:
llvm#127114
For some operating systems (e.g. chromiumos), terminfo is a separate
package and library from ncurses. Both are still requirements for curses
support in lldb, individually.
    
This is a rework of this original spack commit:

spack/spack@9ea2612

Instead though, this PR uses CMake to detect whether the symbol is
present and defined in the curses library, and only falls back to a separate
tinfo if not found.
    
Without this fix, LLDB cannot be built on these systems.
    
Fixes llvm#101368
The metrics script includes some logic to only look at workflows up
to the most recent workflow it has seen previously. This was broken in a
previous patch when workflow metrics began to be emitted per job. The
logic ending the metrics gathering would never trigger, so we would
continually fetch more and more workflows until OOM.
 - Replace the _ap (auto_ptr) suffix with the _up (unique_ptr) suffix
 - Move forward declaration from IOHandler.h to IOHandlerCursesGUI.h
 - Move curses namespace under lldb_private

Motivated by Alex' comment in llvm#126630.
1. Add a new `MLIR_DEPS` argument group to `flang_add_library()`, and
move MLIR-specific dependencies to that group. These dependencies are
added as usual in regular builds, and are skipped in standalone builds,
since MLIR targets are not visible there (and were already built and
installed).
2. Fix the value of `MLIR_MAIN_SRC_DIR` to refer to the current source
directory rather than the directory written into MLIR CMake files. The
latter refers to the directory used to build the MLIR package, and is no
longer valid.
3. Fix non-dylib friendly linking of `LLVMTargetParser` in `Optimizer`
unittests.

With these changes, I can successfully run Flang's regression tests.
We've documented the preferred `enable_if` style in the coding
guidelines. This updates `<optional>` to conform to them
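
For illustration, here is a minimal sketch of one common `enable_if` form, in which the constraint is a defaulted non-type template parameter. That this is the form the guidelines prefer is an assumption, and `describe` is a hypothetical function, not part of `<optional>`.

```cpp
#include <string>
#include <type_traits>

// Hypothetical overload set. The constraint lives in a defaulted non-type
// template parameter rather than in the return type or a function argument.
template <class T, std::enable_if_t<std::is_integral<T>::value, int> = 0>
std::string describe(T) {
  return "integral";
}

// The complementary constraint keeps the two overloads mutually exclusive.
template <class T, std::enable_if_t<!std::is_integral<T>::value, int> = 0>
std::string describe(T) {
  return "other";
}
```

Calling `describe(1)` selects the first overload and `describe(2.5)` selects the second.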
…5587)

This partially reverts commit 5f2389d. That commit started checking
whether <features.h> was a valid include unconditionally, however codebases
are free to have such a header on their search path, which breaks compilation.
LLVM libc now provides a more standard way of getting configuration macros
like __LLVM_LIBC__.

After this patch, we only include <features.h> when we're on Linux or
when we're compiling for GPUs.
…lvm#123924)

Apparently, trying to look up a function pointer using the C API
`mlirExecutionEngineLookup` triggers an assert instead of just
returning a nullptr on builds with asserts enabled.

The documentation itself says it returns a nullptr when no function is found, so
it is sensible not to assert in this case.
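
The contract described above (a miss yields a null pointer rather than an assertion failure) can be sketched as follows. `SymbolTable` and its `lookup` method are hypothetical stand-ins for illustration and are not the MLIR API.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for an execution engine's symbol table.
struct SymbolTable {
  std::unordered_map<std::string, void *> symbols;

  // Returns the address registered under `name`, or nullptr when the name
  // is unknown. A miss is reported to the caller and never asserted on.
  void *lookup(const std::string &name) const {
    auto it = symbols.find(name);
    return it == symbols.end() ? nullptr : it->second;
  }
};
```

A caller can then test the result against `nullptr` and handle the missing symbol itself.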
…TE(MASK),CONCAT(LO,HI)) (llvm#127199)

We already handle the simpler VPERMV3(LO,MASK,HI) fold which can reuse
the (widened) mask, this attempts to match the flipped concatenation,
and commutes the mask to handle the flip.

I've limited this to cases where we can extract the constant mask for
commutation, a more general solution would XOR the MSB of the shuffle
mask indices to commute, but this almost never constant folds away after
lowering so the benefit was minimal.
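
The commutation idea can be illustrated with a scalar model. This is a sketch under assumptions, not the backend code: `permute2` and `commuteMask` are hypothetical helpers, and the element count is assumed to be a power of two so that one bit of each index selects the source.

```cpp
#include <cstddef>
#include <vector>

// Scalar model of a two-source permute over N-element sources:
// index i < N selects first[i], index i >= N selects second[i - N].
std::vector<int> permute2(const std::vector<int> &first,
                          const std::vector<int> &mask,
                          const std::vector<int> &second) {
  const std::size_t n = first.size();
  std::vector<int> out;
  out.reserve(mask.size());
  for (int idx : mask) {
    const std::size_t i = static_cast<std::size_t>(idx);
    out.push_back(i < n ? first[i] : second[i - n]);
  }
  return out;
}

// Flip the source-select bit of every index, so the same result is
// produced after the two sources are swapped. n must be a power of two.
std::vector<int> commuteMask(std::vector<int> mask, std::size_t n) {
  for (int &idx : mask)
    idx ^= static_cast<int>(n);
  return mask;
}
```

With four-element sources, `permute2(lo, mask, hi)` and `permute2(hi, commuteMask(mask, 4), lo)` select the same elements.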
…vm#127313)

We have some benchmarks that were benchmarking very specific
functionality, namely the optimizations in vector<bool>::iterator. Call
this out in the benchmarks by renaming them appropriately. In the future
we will also increase the coverage of these benchmarks to test other
containers.
- After llvm#97727 and llvm#101652, `LowerConstantIntrinsics` and
  `ExpandVectorPredicationPass` are no longer dedicated passes.
Use getPointer/setPointer to clarify we are accessing/modifying the
current value.
This adds patterns for v8i8->i16 vaddlv and v4i16->i32 vaddlv, for both signed
and unsigned extends.
Factor out some code that splits a ConstantRange into positive and
negative components, introducing ConstantRange::splitPosNeg.
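
As a rough illustration of such a split, the sketch below uses a hypothetical closed integer interval rather than LLVM's `ConstantRange`, which is a wrapping range over `APInt`. Dropping zero from both halves is an assumption of this sketch.

```cpp
#include <algorithm>
#include <optional>
#include <utility>

// Hypothetical closed interval [lo, hi] with lo <= hi.
struct Interval {
  int lo;
  int hi;
};

// Split into the strictly positive and strictly negative parts.
// Zero is excluded from both halves here (an assumption of this sketch).
// An empty half is reported as std::nullopt.
std::pair<std::optional<Interval>, std::optional<Interval>>
splitPosNeg(Interval r) {
  std::optional<Interval> pos;
  std::optional<Interval> neg;
  if (r.hi >= 1)
    pos = Interval{std::max(r.lo, 1), r.hi};
  if (r.lo <= -1)
    neg = Interval{r.lo, std::min(r.hi, -1)};
  return {pos, neg};
}
```

For the interval [-3, 5], the positive part is [1, 5] and the negative part is [-3, -1].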
This change is no longer necessary after llvm#125842. Thanks to @nikic for
letting me know.
Move up replaceSymbolicStrideSCEV before isNoWrap. It needs to be called
after hasComputableBounds, as this may create an AddRec via PSE, which
replaceSymbolicStrideSCEV will look up.

This is in preparation for simplifying isNoWrap.
…is (llvm#127309)

Add a missing nullptr check to declProtectsThis.
…urces (llvm#126219)

`__init(const value_type*, size_type, size_type)` is part of our ABI,
but we don't actually use the function anymore in the dylib. This moves
the definition to the `src/` directory to make it clear that the code is
unused. This also allows us to remove it entirely in the unstable ABI.
topperc and others added 30 commits February 16, 2025 14:00
The mixed tensor/buffer semantics has been disallowed in llvm#80660. Closes
llvm#124090.
Use id() to get rid of some implicit conversions.
Refactor FlatLinearConstraints getSliceBounds. The method was too long
and nested. NFC.
The last use was removed in:

  commit ee97793
  Author: Richard Smith <[email protected]>
  Date:   Fri May 1 21:22:17 2015 +0000
We can use vnsrl+trunc on each source and concatenate the results
with vslideup.
    
For low LMUL it would be better to concat first, but I'm leaving
this for later.
…27405)

This used to cause certain std::range tests in libc++ to be diagnosed as
modifying a const-qualified field, because we set the IsConst flag to
true unconditionally. Check the type instead.
The last use was removed in:

  commit cbf34a5
  Author: Juan Manuel Martinez Caamaño <[email protected]>
  Date:   Fri Aug 23 14:06:17 2024 +0200
We shouldn't abort here when compiling, this is happening (and properly
diagnosed) when interpreting the bytecode.
The last use was removed in:

  commit 05e6bb4
  Author: Roger Ferrer Ibáñez <[email protected]>
  Date:   Thu May 30 14:55:32 2024 +0200
…t patterns as well as 512-bit (llvm#127392)

The 512-bit filter was to prevent AVX1/2 regressions, but most of that is now handled by canonicalizeShuffleWithOp.

Ideally we need to support smaller element widths as well.

Noticed while triaging llvm#116931
Two small stylistic improvements in code that I wrote ~a year ago:
1. fix a typo in a comment; and
2. simplify the code of `tryDividePair` by swapping the true and the
false branches.
When lowering EXTEND_VECTOR_INREG, check whether the operand is a
shuffle that is moving the top half of a vector into the lower half. If
so, we can EXTEND_HIGH the input to the shuffle instead.
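
The equivalence behind this fold can be checked with a scalar model. This is a sketch, not the lowering code: the lane counts, element types, and helper names (`moveHighToLow`, `extendLow`, `extendHigh`) are assumptions chosen for illustration.

```cpp
#include <array>
#include <cstdint>

using V8 = std::array<std::int16_t, 8>;
using V4 = std::array<std::int32_t, 4>;

// Shuffle that moves the top half of the vector into the lower half.
// The upper lanes of the result are don't-care; zero is used here.
V8 moveHighToLow(const V8 &v) {
  V8 out{};
  for (int i = 0; i < 4; ++i)
    out[i] = v[i + 4];
  return out;
}

// Sign-extend the low four lanes (models an in-register extend).
V4 extendLow(const V8 &v) {
  V4 out{};
  for (int i = 0; i < 4; ++i)
    out[i] = v[i];
  return out;
}

// Sign-extend the high four lanes directly.
V4 extendHigh(const V8 &v) {
  V4 out{};
  for (int i = 0; i < 4; ++i)
    out[i] = v[i + 4];
  return out;
}
```

In this model, `extendLow(moveHighToLow(v))` equals `extendHigh(v)` for every input, which is why the shuffle can be skipped.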
Enable the vectorizer to access interleaved memory. This means that,
when it's decided to be profitable, the memory accesses can be
vectorized instead of the value being built up by a sequence of
load_lane instructions. This will often increase the vectorization
factor of the loop, leading to significantly better performance.
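
As an illustration of what an interleaved access looks like at the source level, the loop below reads pairs stored as `[a0, b0, a1, b1, ...]`. `sumPairs` is a hypothetical kernel; whether a given compiler and target actually vectorize it this way is not claimed here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical kernel with interleaved (stride-2) loads. Each iteration
// reads two adjacent elements that belong to different logical streams.
void sumPairs(const std::vector<float> &interleaved, std::vector<float> &out) {
  const std::size_t n = interleaved.size() / 2;
  out.resize(n);
  for (std::size_t i = 0; i < n; ++i)
    out[i] = interleaved[2 * i] + interleaved[2 * i + 1];
}
```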

I run a reasonably large collection of benchmarks and most are not
affected by this change, with most performance changes <1%. But I see a
2.5% speedup for the total run time of TSVC, 1% speedup for SPEC2017
x265, 28% speedup for a ResNet workload and 95% for libyuv. This is
running V8 on an AArch64 box.
The last use was removed in:

  commit ac9e677
  Author: Yingwei Zheng <[email protected]>
  Date:   Mon Feb 26 01:53:16 2024 +0800
…usModules.rst"

This reverts commit 82dc2d4.

The fix has been reverted in f63e8ed
…llvm#109833)

This patch adds initial support for vectorizing literal struct return
values. Currently, this is limited to the case where the struct is
homogeneous (all elements have the same type) and not packed. The users
of the call also must all be `extractvalue` instructions.

The intended use case for this is vectorizing intrinsics such as:

```
declare { float, float } @llvm.sincos.f32(float %x)
```

Mapping them to structure-returning library calls such as:

```
declare { <4 x float>, <4 x float> } @Sleef_sincosf4_u10advsimd(<4 x float>)
```

Or their widened form (such as `@llvm.sincos.v4f32` in this case).

Implementing this required two main changes:

1. Supporting widening `extractvalue`
2. Adding support for vectorized struct types in LV
  * This is mostly limited to parts of the cost model and scalarization

Since the supported use case is narrow, the required changes are
relatively small.
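
For context, a source-level loop of the kind this targets might look like the sketch below. `SinCos`, `sincosOf`, and `sincosAll` are hypothetical names; whether a front end emits `@llvm.sincos` for such code, and whether the loop is then vectorized, depends on the compiler, flags, and target.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical homogeneous, non-packed struct mirroring { float, float }.
struct SinCos {
  float sinValue;
  float cosValue;
};

// Scalar helper that returns both results in one struct.
SinCos sincosOf(float x) { return {std::sin(x), std::cos(x)}; }

// Every use of the returned struct is a plain field extraction, which
// matches the "all users are extractvalue" restriction described above.
void sincosAll(const std::vector<float> &x, std::vector<float> &s,
               std::vector<float> &c) {
  s.resize(x.size());
  c.resize(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    const SinCos r = sincosOf(x[i]);
    s[i] = r.sinValue;
    c[i] = r.cosValue;
  }
}
```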