[AutoBump] Merge with 61ad0879 (Feb 14) (46) #590

Open
wants to merge 444 commits into
base: bump_to_20ae283d
from

Conversation

jorickert
No description provided.

chsigg and others added 30 commits February 13, 2025 08:08
And fix some typos in comments.

In the future, we may add more scheduling info to GenericModel.
…vm#126475)

[mlir][affine]make affine-loop-unroll a FunctionOpInterface pass

Make `affine-loop-unroll` a `FunctionOpInterface` pass. Now unrolling can
be done on gpu.func.
This is a followup to llvm#126745,
generalizing it to always use TargetFolder, including inside function
bodies.

This avoids generating non-canonical constant expressions that can be
folded away.
This is to fix compile error with explicit Clang modules like
```
../../third_party/libc++/src/include/__vector/vector_bool.h:85:11: error: default argument of '__bit_iterator' must be imported from module 'std.bit_reference_fwd' before it is required
   85 |   typedef __bit_iterator<vector, false> pointer;
      |           ^
../../third_party/libc++/src/include/__fwd/bit_reference.h:23:68: note: default argument declared here is not reachable
   23 | template <class _Cp, bool _IsConst, typename _Cp::__storage_type = 0>
      |                                                                    ^
```
This is to fix compile error with explicit Clang modules like
```
../../third_party/libc++/src/include/__filesystem/path.h:80:26: error: declaration of '__enable_if_t' must be imported from module 'std_core.type_traits.enable_if' before it is required
   80 | template <class _ECharT, __enable_if_t<__can_convert_char<_ECharT>::value, int> = 0>
      |                          ^
../../third_party/libc++/src/include/__type_traits/enable_if.h:34:1: note: declaration here is not visible
   34 | using __enable_if_t _LIBCPP_NODEBUG = typename enable_if<_Bp, _Tp>::type;
      | ^
```
…ity wrappers (llvm#126902)

The llvm versions of these functions do that, so we must do so as well.
Practically, this meant that we were unable to correctly un-simplify
the names of some types when using type units, which resulted in type
lookup errors.
Update tests that require flang-supports-f128-math after llvm#126929.
…5880)

This extends CaptureTracking to support inferring non-trivial
CaptureInfos. The focus of this patch is to only support FunctionAttrs,
other users of CaptureTracking will be updated in followups.

The key API changes here are:

* DetermineUseCaptureKind() now returns a UseCaptureInfo where the UseCC
component specifies what is captured at that Use and the ResultCC
component specifies what may be captured via the return value of the
User. Usually only one or the other will be used (corresponding to
previous MAY_CAPTURE or PASSTHROUGH results), but both may be set for
call captures.
* The CaptureTracking::captures() extension point is passed this
UseCaptureInfo as well and then can decide what to do with it by
returning an Action, which is one of: Stop: stop traversal.
ContinueIgnoringReturn: continue traversal but don't follow the
instruction return value. Continue: continue traversal and follow the
instruction return value if it has additional CaptureComponents.
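
The traversal contract described by these bullets can be modelled roughly as follows. This is an illustrative Python sketch, not the LLVM implementation: the `Use` record, the `walk` function, and the value names are hypothetical, and capture components are reduced to plain sets of strings.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Action(Enum):
    STOP = auto()                      # stop traversal
    CONTINUE_IGNORING_RETURN = auto()  # keep going, do not follow the result
    CONTINUE = auto()                  # keep going, follow the result if needed


@dataclass(frozen=True)
class Use:
    user_result: str      # hypothetical name of the value produced by the user
    use_cc: frozenset     # what is captured at this use
    result_cc: frozenset  # what may be captured via the user's result


def walk(root, uses_of, on_use):
    """Visit uses transitively, honoring the Action returned by on_use."""
    worklist, seen = [root], {root}
    while worklist:
        value = worklist.pop()
        for use in uses_of.get(value, []):
            action = on_use(use)
            if action is Action.STOP:
                return False  # traversal aborted early
            # Only follow the result if it carries additional components.
            follow = action is Action.CONTINUE and use.result_cc
            if follow and use.user_result not in seen:
                seen.add(use.user_result)
                worklist.append(use.user_result)
    return True


# Toy use graph: %p flows through a passthrough user into %q, which is stored.
uses_of = {
    "%p": [Use("%q", frozenset(), frozenset({"address", "provenance"}))],
    "%q": [Use("%store", frozenset({"address", "provenance"}), frozenset())],
}

captured = set()


def record(use):
    captured.update(use.use_cc)
    return Action.CONTINUE


finished = walk("%p", uses_of, record)
```

Returning `Action.CONTINUE_IGNORING_RETURN` from the callback instead would visit only the use of `%p` and never reach the store through `%q`.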

For now, this patch retains the (unsound) special logic for comparison
of null with a dereferenceable pointer. I'd like to switch key code to
take advantage of address/address_is_null before dropping it.

This PR mainly intends to introduce necessary API changes and basic
inference support, there are various possible improvements marked with
TODOs.
)

Heuristic was removed in 46533e6 due to being ineffective.
In 5235973, an ICE was fixed in
getMemsetStringVal where f128 wasn't handled. It was noted at the time
[1] that the code below this also looks suspect, since it assumes the
element type of VT is either an f32 or f64.

This part of getMemsetStringVal relates to memcpy operations where the
source is a copy from a zero constant. The VT in question is determined
by TargetLowering::findOptimalMemOpLowering, which in turn calls a
further TLI hook getOptimalMemOpType.

For AArch64, getOptimalMemOpType returns either a v16i8, f128, i64, i32
or Other. For Other, TargetLowering::findOptimalMemOpLowering will then
pick an integer VT. So on AArch64 at least, I don't believe the suspect
code can be reached.

For other targets, ARM and x86 are the only ones that return a FP vector
type from getOptimalMemOpType. For both targets, the only such type is
v2f64, but given f64 is already handled it should also be fine.

To defend this, I considered adding an assert as mentioned in [1], but
given getConstantFP handles vector types, I figured using this to fully
handle the FP types makes the code simpler and more robust.

For test coverage I added unreachables to both of the branches handling
FP types in this code, but found neither fired with check-llvm across
all targets.

Test coverage was added to llvm/test/CodeGen/AArch64/memcpy-f128.ll in
5235973 to guard against the ICE on f128, but at
some point it stopped hitting this code.

AArch64TargetLowering::getOptimalMemOpType was updated in
2920061, so I suspect this is when it
happened, although I haven't verified this. I did find, however, that
when the test is updated to disable NEON, getOptimalMemOpType returns an
f128 and the branch is once again hit.

For the final branch noted as suspect in [1], as far as I can tell this
has never had any test coverage, so I've added a test to the ARM backend
for this.

Fixes: llvm#20521 [1]
The `-buffer-deallocation` pass is not compatible with One-Shot
Bufferize and has been replaced with the Ownership-based Buffer
Deallocation pass about 1.5 years ago. To clean up the code base, this
commit removes the deprecated `buffer-deallocation` pass. All uses of
this deprecated pass within MLIR have already been migrated.

Note for LLVM integration: If you depend on this pass, migrate to the
Ownership-based Buffer Deallocation pass or copy the pass to your
codebase. For details, see
https://discourse.llvm.org/t/psa-bufferization-new-buffer-deallocation-pipeline/73375.
…nctions (llvm#124931)

This is a followup to llvm#122440, which changed function-relative
calculations to use the function entry point rather than the lowest
address of the function (but missed this usage). Like in llvm#116777, the
logic is changed to use file addresses instead of section offsets (as
not all parts of the function have to be in the same section).
AAPCS32 defines the fp16 and bf16 types as being passed as if they were
extended to 32 bits, with the high 16 bits being unspecified. The
extension is specified as happening as-if it was done in a register,
which means that for big endian targets, the actual value gets passed in
the higher addressed half of the stack slot, instead of the lower
addressed half as for little endian. Previously, for targets with the
fp16 extension, we were passing these types as a 16 bit stack slot,
which worked for little endian because every later stack slot would be
4-byte aligned leaving the 2 byte gap, but was incorrect for big endian.
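
The endianness difference can be illustrated with a small Python sketch. It only models the byte layout of one 4-byte stack slot; the `stack_slot` helper is hypothetical, and the unspecified high 16 bits are arbitrarily set to zero here.

```python
import struct


def stack_slot(half_bits: int, big_endian: bool) -> bytes:
    """Store a 16-bit value as a 32-bit word, as if extended in a register."""
    word = half_bits & 0xFFFF  # high 16 bits are unspecified; zero is used here
    return struct.pack(">I" if big_endian else "<I", word)


ONE = 0x3C00  # IEEE half-precision encoding of 1.0

le = stack_slot(ONE, big_endian=False)  # value in the lower-addressed half
be = stack_slot(ONE, big_endian=True)   # value in the higher-addressed half
```

On little endian the two value bytes sit at offsets 0 and 1 of the slot, which is also where a plain 16-bit store would put them. On big endian they sit at offsets 2 and 3, so a 16-bit slot at offset 0 does not match.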
This is a (very belated) reland of 0a362f1,
which I originally reverted due to flang test failures.

This marks mul constant expressions as undesirable, which means that
we will no longer create them by default, but they can still be
created explicitly.

Part of:
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179
Removes MnemonicAliases added for instructions available with
the LSUI feature (e.g. CAS -> CAST) which are not equivalent.
The aliases stt[add|clr|set]a & stt[add|clr|set]al are also removed.
Fix an issue introduced in llvm#126929: The LLVM module is moved into
the ModuleTranslator, so query the DataLayout from there.
)

This patch adds a new API to `SBType` to retrieve the value of a
template parameter given an index. We re-use
`TypeSystemClang::GetIntegralTemplateArgument` for this and thus
currently only support integral non-type template parameters. Types
like float/double are not supported yet.

rdar://144395216
…7034)

I have made previous attempts to fix this flaky test. Let's admit that I
have failed so far, and disable it until we have a permanent fix.

See the discussion at:
llvm#126913 (comment)
When bytes with negative signed char values appear in the data, make
sure to use raw bytes from the data string when preprocessing, not char
values.

Fixes llvm#102798
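
The general pitfall can be shown with a short Python sketch. It is not the patched code, only an illustration of how the same bytes read differently as signed char values versus raw bytes.

```python
import struct

data = b"\x7f\x80\xff"

# Interpreting each byte as a signed char yields negative values above 0x7f.
as_signed_char = [struct.unpack("b", bytes([b]))[0] for b in data]

# Reading the raw bytes keeps every value in the range 0..255.
as_raw_bytes = list(data)
```

Here `as_signed_char` is `[127, -128, -1]` while `as_raw_bytes` is `[127, 128, 255]`.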
…embling (llvm#126925)

We need to iterate through all the symbol context ranges returned by
SymbolContext::GetAddressRange (since llvm#126505). This also includes
a fix to print the function offsets as signed values.

I've also wanted to check that the addresses which are in the middle of
the function do *not* resolve to the function, but that's not entirely
the case right now. This appears to be a separate issue though, so I've
just left a TODO for now.
…llvm#126967)

After llvm#124287 updated several functions to return iterators rather than
Instruction *, it was no longer straightforward to pass their result to
DIBuilder. This commit updates DIBuilder methods to accept an
InsertPosition instead, so that they can be called with an iterator
(preferred) or, with a deprecation warning, an Instruction * or a
BasicBlock *. This commit also updates the existing calls to the
DIBuilder methods to pass in iterators.

As a special exception, DIBuilder::insertDeclare() keeps a separate
overload accepting a BasicBlock *InsertAtEnd. This is because despite
the name, this method does not insert at the end of the block, therefore
this cannot be handled implicitly by using InsertPosition.
Model C/C++ `errno` macro by adding a corresponding `errno`
memory location kind to the IR. Preliminary work to separate
`errno` writes from other memory accesses, to the benefit of
alias analyses and optimization correctness.

Previous discussion: https://discourse.llvm.org/t/rfc-modelling-errno-memory-effects/82972.
teresajohnson and others added 30 commits February 14, 2025 14:12
The implicit-check-not had a typo which meant it didn't fail as expected
when I tested better recursion handling. Fix that here (no change for
current head).
…#123237)

Unifies test function names so that it's easier to identify what
different cases are. Also improves consistency. The following naming
scheme has been adopted:
* `@xfer_{read|write}_{map_type}_{masked|with_mask|}_{out_of_bounds}_{scalable}`

Also updated some comments to better document the patterns that are
being exercised.
…paqueValueExpr in trivial expressions (llvm#127182)

Allow VisitArrayInitLoopExpr, VisitArrayInitIndexExpr, and
VisitOpaqueValueExpr in trivial functions and statements.
This patch changes the name of the MacOS premerge job from
permerge-checks-macos to (the presumably correct) premerge-checks-macos.
This allows a sort of "include" mechanism in the YAML files.  A
file can have a "merge_yaml_files" list of paths (relative to the
containing file's location).  These are YAML files in the same
syntax, except they cannot have their own "header" entry.  Only
the lists (types, enums, macros, functions, objects) can appear.
The main YAML file is then processed just as if each of its lists
were the (sorted) union of each YAML file's corresponding list.

This will enable maintaining a single source of truth for each
function signature and other such details, where it is necessary
to generate the same declaration in more than one header.
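
The merge semantics described above can be sketched in Python as follows. This is a simplified model, not hdrgen's code: plain dicts stand in for parsed YAML files, entries are reduced to their names, and `merge_lists` is a hypothetical helper.

```python
LIST_KEYS = ("types", "enums", "macros", "functions", "objects")


def merge_lists(main, merged_files):
    """Return a copy of `main` whose lists are the sorted union of all inputs."""
    result = dict(main)
    for key in LIST_KEYS:
        names = set(main.get(key, []))
        for other in merged_files:
            # Merged files use the same syntax but may not carry a header.
            if "header" in other:
                raise ValueError("merged files cannot have a 'header' entry")
            names.update(other.get(key, []))
        result[key] = sorted(names)
    return result


stdlib = {"header": "stdlib.h", "functions": ["abort", "exit"]}
malloc_suite = {"functions": ["malloc", "free", "calloc"]}

merged = merge_lists(stdlib, [malloc_suite])
```

After the call, `merged["functions"]` holds the sorted union of both function lists, while `merged["header"]` is still the main file's header.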
Fixing error detected in build bot in file
`RootSignature-MultipleEntryFunctions.ll`

closes: [127260](llvm#127260)

---------

Co-authored-by: joaosaffran <[email protected]>
Implements the posix-specified strftime conversions for the default
locale, along with comprehensive unit tests. This reuses a lot of design
from printf, as well as the printf writer.

Roughly based on llvm#111305, but with major rewrites.
Forgot to change a size_t to an int, which caused warnings on gcc but
not clang for some reason. Regardless, this patch fixes the issue.
…s pattern (llvm#126443)

In WebKit, it's pretty common to capture "this" and "protectedThis"
where "protectedThis" is a guardian variable of type Ref or RefPtr for
"this". Furthermore, it's common for this "protectedThis" variable to
be passed to an inner lambda by std::move. Recognize this pattern so
that we don't emit warnings for nested inner lambdas.

To recognize this pattern, we introduce a new DenseSet,
ProtectedThisDecls, in our subclass of DynamicRecursiveASTVisitor; it
contains every "protectedThis" we've recognized. This set is
populated in "hasProtectedThis", and "declProtectsThis" uses it to
determine whether a given value declaration constitutes a
"protectedThis" pattern.

Because hasProtectedThis and declProtectsThis had to be moved from the
checker class to the visitor class, it's now a responsibility of each
caller of visitLambdaExpr to check whether a given lambda captures
"this" without a "protectedThis" or not.

Finally, this PR improves the code to recognize "protectedThis" pattern
by allowing more nested CXXBindTemporaryExpr, CXXOperatorCallExpr, and
UnaryOperator expressions.
…4834)

Following up llvm#72078, on x86-64 this allows a global to be considered
small or large regardless of the code model. For example, x86-64's
medium code model by default classifies globals as small or large
depending on their size relative to -mlarge-data-threshold.
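
The classification described here can be modelled with a small Python sketch. The `is_large_global` function and its `override` parameter are hypothetical, and treating only globals strictly larger than the threshold as large is an assumption rather than a statement about the exact comparison LLVM performs.

```python
from typing import Optional


def is_large_global(size: int, threshold: int,
                    override: Optional[str] = None) -> bool:
    """Classify a global under a medium-style code model."""
    if override is not None:
        # An explicit per-global setting wins regardless of size.
        return override == "large"
    # Assumed default: compare the size against the large-data threshold.
    return size > threshold


default_small = is_large_global(100, 65535)
default_large = is_large_global(1 << 20, 65535)
forced_small = is_large_global(1 << 20, 65535, override="small")
```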

GPU compilations compile the same TU for both the host and device, but
only codegen the host or device portions of it depending on attributes.
However, we still Sema the TU, and will warn on an unknown attribute for
the device compilation since this attribute is target-specific. Since
they're intended for the host, accept but ignore this attribute for
device compilations where the host is either unknown or known to
support the attribute.

Co-authored-by: @pranavk
…127278)

This uses the new merge_yaml_files feature in hdrgen to share the
source of truth for the malloc suite of functions declared in
both stdlib.h and in malloc.h (without either header including
the other).  It also modernizes the malloc.yaml definition a bit,
including dropping the custom template malloc.h.def file in favor
of using the explicit macros list to generate the includes.
…llvm#127108)

If we have a shuffle which repeats the same pattern of elements, all of
which come from the first register in the source register group, we can
lower this to a single vrgather at m1 to perform the element
rearrangement, and reuse that for each register in the result vector
register group.
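
The precondition can be sketched as a check on the shuffle mask. This Python model is an illustration under simplifying assumptions: `repeats_first_register_pattern` is a hypothetical name, undef mask elements are not handled, and the mask is a flat list of source element indices.

```python
def repeats_first_register_pattern(mask, elems_per_reg):
    """True if every register-sized chunk of the mask is the same pattern,
    and that pattern only reads elements of the first source register."""
    if not mask or len(mask) % elems_per_reg != 0:
        return False
    first = mask[:elems_per_reg]
    # All indices must fall inside the first register of the source group.
    if any(i < 0 or i >= elems_per_reg for i in first):
        return False
    # Each later chunk must repeat the first chunk exactly.
    return all(
        mask[k:k + elems_per_reg] == first
        for k in range(0, len(mask), elems_per_reg)
    )


ok = repeats_first_register_pattern([1, 0, 3, 2, 1, 0, 3, 2], 4)
not_repeated = repeats_first_register_pattern([1, 0, 3, 2, 5, 4, 7, 6], 4)
```

When the check holds, one gather of the first chunk is enough, and its result can be reused for every register of the destination group.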
When the return type's rendering already doesn't end with an
identifier character, such as when it's `T *`, then idiomatic
syntax does not include a space before the `(` and arguments.
Support relaxation optimization for two types of code sequences.
```
From:
   pcalau12i $a0, %pc_hi20(sym)
       R_LARCH_PCALA_HI20, R_LARCH_RELAX
   addi.w/d $a0, $a0, %pc_lo12(sym)
       R_LARCH_PCALA_LO12, R_LARCH_RELAX
To:
   pcaddi $a0, %pc_lo12(sym)
       R_LARCH_PCREL20_S2
    
From:
   pcalau12i $a0, %got_pc_hi20(sym_got)
       R_LARCH_GOT_PC_HI20, R_LARCH_RELAX
   ld.w/d $a0, $a0, %got_pc_lo12(sym_got)
       R_LARCH_GOT_PC_LO12, R_LARCH_RELAX
To:
   pcaddi $a0, %got_pc_hi20(sym_got)
       R_LARCH_PCREL20_S2
```
Others:
- `loongarch-relax-pc-hi20-lo12-got-symbols.s` is inspired by
`aarch64-adrp-ldr-got-symbols.s`.
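
Both relaxed forms end in `pcaddi` with an `R_LARCH_PCREL20_S2` relocation, which encodes a signed 20-bit immediate shifted left by 2. The reachability condition that implies can be sketched in Python as below. The helper names are hypothetical, and the linker may apply further conditions that this sketch does not model.

```python
def fits_pcaddi(pc: int, target: int) -> bool:
    """True if target is reachable from pc with a 20-bit, 4-byte-scaled offset."""
    offset = target - pc
    return offset % 4 == 0 and -(1 << 21) <= offset < (1 << 21)


def pcaddi_imm(pc: int, target: int) -> int:
    """Return the 20-bit immediate field for a reachable target."""
    if not fits_pcaddi(pc, target):
        raise ValueError("target out of range for pcaddi")
    return ((target - pc) >> 2) & 0xFFFFF


forward = pcaddi_imm(0x1000, 0x1010)   # 16 bytes ahead
backward = pcaddi_imm(0x1010, 0x1000)  # 16 bytes behind, two's complement
```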

Co-authored-by: Xin Wang <[email protected]>
For the purpose of determining triviality, ignore all attributes.
…lvm#126999)

Adds a macro definition `MLIR_USE_FALLBACK_TYPE_IDS`. When this is
defined, the `MLIR_{DECLARE,DEFINE}_EXPLICIT_TYPE_ID` functions
explicitly fall back to string comparison.

This is useful for complex shared library setups
where it may be difficult to agree on a source of truth for specific
type ID resolution. As long as there is a single resolution for
`registerImplicitTypeID`, all type IDs can reference a shared
registration. This way types which are logically shared across multiple
DSOs can have the same type ID, even if their definitions are
duplicated.
…per bound in certain cases (llvm#127192)

Fix missing checks in `FlatLinearValueConstraints::getSliceBounds` for
the case where there is no lower/upper bound. Obvious bug.

Fixes: llvm#119525
Fixes: llvm#108374
…nfo (NFC)" (llvm#127277)

Originally landed in llvm#126800

This version fixes a typo in NVPTXAsmPrinter::emitFunctionParamList
where .surfref was erroneously replaced with .samplerref.
…ndTrunc. (llvm#127258)

Put the bitcast before the insert_subvector. It's more likely the insert
subvector can be combined with other nodes. The insert_subvector is only
needed sometimes, and I'm considering reusing this function which might
require pulling the insert_subvector out.
This patch makes the metrics container count the number of in-progress
and queued jobs at the job level rather than at the workflow
level. This helps us distinguish windows versus linux load and also lets
us filter out the MacOS jobs that only run in the release branch.
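
Counting at the job level could look roughly like this Python sketch. The dict layout and the `count_jobs` helper are assumptions for illustration and do not reproduce the metrics script; only the `queued` and `in_progress` status strings follow GitHub's job status values.

```python
from collections import Counter


def count_jobs(workflow_runs):
    """Count queued and in-progress jobs, keyed by (job name, status)."""
    counts = Counter()
    for run in workflow_runs:
        for job in run["jobs"]:
            if job["status"] in ("queued", "in_progress"):
                counts[(job["name"], job["status"])] += 1
    return counts


# Hypothetical sample data: two workflow runs with per-platform jobs.
runs = [
    {"jobs": [{"name": "linux", "status": "in_progress"},
              {"name": "windows", "status": "queued"}]},
    {"jobs": [{"name": "linux", "status": "queued"},
              {"name": "windows", "status": "queued"},
              {"name": "macos", "status": "completed"}]},
]

counts = count_jobs(runs)
```

Keying on the job name is what allows Windows and Linux load to be told apart, and unwanted jobs to be filtered out by name.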

Reviewers: Keenuts, lnihlen

Reviewed By: lnihlen

Pull Request: llvm#127274
This patch removes an extra heartbeat metric in the metrics python file. Before
it was performed twice, once in the main function, and once in the
get_sampled_workflow_metrics function. We only need one to keep everything
happy, and I've chosen to keep the one in get_sampled_workflow_metrics as it
seems a more appropriate place to keep it.

Reviewers: Keenuts, lnihlen

Reviewed By: lnihlen

Pull Request: llvm#127275
The method was meant to be overridden by subclasses only.
It should not be called directly by users.
Currently the metrics container is crashing reasonably often with
incomplete read/connection broken errors. Try moving the creation of the
Github Object into the main loop to see if recreating the object that
maybe handles some connection state fixes the issue.

Reviewers: Keenuts, lnihlen

Reviewed By: lnihlen

Pull Request: llvm#127276
This patch makes it so that skipped steps do not cause a job to be
considered failed. The windows premerge jobs currently skip the
build/test step if there are no projects to build/test. These show up as
failures in the dashboard even though everything executed perfectly
fine.
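
The rule can be sketched in Python as follows. The `job_failed` helper and the step dict layout are hypothetical, and the sketch assumes a completed job whose steps each carry a `conclusion` string.

```python
def job_failed(steps):
    """A job counts as failed only if some step neither succeeded nor was skipped."""
    return any(step["conclusion"] not in ("success", "skipped") for step in steps)


# Hypothetical Windows premerge job with nothing to build or test.
nothing_to_build = [
    {"name": "checkout", "conclusion": "success"},
    {"name": "build-and-test", "conclusion": "skipped"},
]
broken_build = [
    {"name": "checkout", "conclusion": "success"},
    {"name": "build-and-test", "conclusion": "failure"},
]

skipped_is_ok = not job_failed(nothing_to_build)
failure_is_failure = job_failed(broken_build)
```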

Reviewers: lnihlen, Keenuts

Reviewed By: lnihlen

Pull Request: llvm#127279
…getSingleShuffleSrc. (llvm#127250)

I think this dates to a time when we used to use a type twice as large
as necessary for the input to the vnsrl. This was changed in llvm#118509
when factor 4 and 8 were added.

The existing test for this regresses because it uses a lot of undef
elements and we previously figured out we could reduce its size and then
try the vnsrl again. We now match it before we try to reduce the width
so we miss this opportunity.

I've added a second test that doesn't have any undef elements in the
first half. Prior to this patch we used a vcompress lowering instead of
vnsrl.
… in MinGW executables" (llvm#127297)

Reverts llvm#107375

This was causing a build bot failure
(https://lab.llvm.org/buildbot/#/builders/201/builds/2954) and also
breaks building with VS2019. See
llvm#107375 (comment)
for details.
…sion utils (llvm#127164)

Fix/improve debug messages and API signatures for affine
analysis/fusion utils.

Move some warnings under LLVM_DEBUG. These weren't meant to be exposed
during compilation.

Add dump pretty methods for FlatLinearConstraints.

NFC.