[AutoBump] Merge with 61ad0879 (Feb 14) (46) #590

jorickert · 2025-06-17T08:48:12Z

No description provided.

And fix some typos in comments. In the future, we may add more scheduling info to GenericModel.

…vm#126475) [mlir][affine]make affine-loop-unroll a FunctionOpInterface pass Make `affine-loop-unroll` a `FunctionOpInterface` pass. Now unroll can be done on gpu.func.

This is a followup to llvm#126745, generalizing it to always use TargetFolder, including inside function bodies. This avoids generating non-canonical constant expressions that can be folded away.

This is to fix compile error with explicit Clang modules like
```
../../third_party/libc++/src/include/__vector/vector_bool.h:85:11: error: default argument of '__bit_iterator' must be imported from module 'std.bit_reference_fwd' before it is required
   85 |   typedef __bit_iterator<vector, false> pointer;
      |           ^
../../third_party/libc++/src/include/__fwd/bit_reference.h:23:68: note: default argument declared here is not reachable
   23 | template <class _Cp, bool _IsConst, typename _Cp::__storage_type = 0>
      |                                                                    ^
```

This is to fix compile error with explicit Clang modules like
```
../../third_party/libc++/src/include/__filesystem/path.h:80:26: error: declaration of '__enable_if_t' must be imported from module 'std_core.type_traits.enable_if' before it is required
   80 | template <class _ECharT, __enable_if_t<__can_convert_char<_ECharT>::value, int> = 0>
      |                          ^
../../third_party/libc++/src/include/__type_traits/enable_if.h:34:1: note: declaration here is not visible
   34 | using __enable_if_t _LIBCPP_NODEBUG = typename enable_if<_Bp, _Tp>::type;
      | ^
```

…ity wrappers (llvm#126902) The llvm versions of these functions do that, so we must do so as well. Practically this meant that we were unable to correctly un-simplify the names of some types when using type units, which resulted in type lookup errors.

Update tests that require flang-supports-f128-math after llvm#126929.

…5880) This extends CaptureTracking to support inferring non-trivial CaptureInfos. The focus of this patch is to only support FunctionAttrs, other users of CaptureTracking will be updated in followups.

The key API changes here are:

* DetermineUseCaptureKind() now returns a UseCaptureInfo where the UseCC component specifies what is captured at that Use and the ResultCC component specifies what may be captured via the return value of the User. Usually only one or the other will be used (corresponding to previous MAY_CAPTURE or PASSTHROUGH results), but both may be set for call captures.
* The CaptureTracking::captures() extension point is passed this UseCaptureInfo as well and then can decide what to do with it by returning an Action, which is one of:
  * Stop: stop traversal.
  * ContinueIgnoringReturn: continue traversal but don't follow the instruction return value.
  * Continue: continue traversal and follow the instruction return value if it has additional CaptureComponents.

For now, this patch retains the (unsound) special logic for comparison of null with a dereferenceable pointer. I'd like to switch key code to take advantage of address/address_is_null before dropping it.

This PR mainly intends to introduce necessary API changes and basic inference support, there are various possible improvements marked with TODOs.

) Heuristic was removed in 46533e6 due to being ineffective.

In 5235973, an ICE was fixed in getMemsetStringVal where f128 wasn't handled. It was noted at the time [1] that the code below this also looks suspect, since it assumes the element type of VT is either an f32 or f64.

This part of getMemsetStringVal relates to memcpy operations where the source is a copy from a zero constant. The VT in question is determined by TargetLowering::findOptimalMemOpLowering, which in turn calls a further TLI hook getOptimalMemOpType.

For AArch64, getOptimalMemOpType returns either a v16i8, f128, i64, i32 or Other. For Other, TargetLowering::findOptimalMemOpLowering will then pick an integer VT. So on AArch64 at least, I don't believe the suspect code can be reached.

For other targets, ARM and x86 are the only ones that return a FP vector type from getOptimalMemOpType. For both targets, the only such type is v2f64, but given f64 is already handled it should also be fine.

To defend this, I considered adding an assert as mentioned in [1], but given getConstantFP handles vector types, I figured using this to fully handle the FP types makes the code simpler and more robust.

For test coverage I added unreachables to both of the branches handling FP types in this code, but found neither fired with check-llvm across all targets. Test coverage was added to llvm/test/CodeGen/AArch64/memcpy-f128.ll in 5235973 to defend ICE on f128, but at some point it stopped hitting this code. AArch64TargetLowering::getOptimalMemOpType was updated in 2920061, so I suspect this is when it happened, although I haven't verified this. Although I did find by updating the test to disable NEON, getOptimalMemOpType returns an f128 and the branch is once again hit.

For the final branch noted as suspect in [1], as far as I can tell this has never had any test coverage, so I've added a test to the ARM backend for this.

Fixes: llvm#20521 [1]

The `-buffer-deallocation` pass is not compatible with One-Shot Bufferize and has been replaced with the Ownership-based Buffer Deallocation pass about 1.5 years ago. To clean up the code base, this commit removes the deprecated `buffer-deallocation` pass. All uses of this deprecated pass within MLIR have already been migrated. Note for LLVM integration: If you depend on this pass, migrate to the Ownership-based Buffer Deallocation pass or copy the pass to your codebase. For details, see https://discourse.llvm.org/t/psa-bufferization-new-buffer-deallocation-pipeline/73375.

…nctions (llvm#124931) This is a followup to llvm#122440, which changed function-relative calculations to use the function entry point rather than the lowest address of the function (but missed this usage). Like in llvm#116777, the logic is changed to use file addresses instead of section offsets (as not all parts of the function have to be in the same section).

AAPCS32 defines the fp16 and bf16 types as being passed as if they were extended to 32 bits, with the high 16 bits being unspecified. The extension is specified as happening as-if it was done in a register, which means that for big endian targets, the actual value gets passed in the higher addressed half of the stack slot, instead of the lower addressed half as for little endian. Previously, for targets with the fp16 extension, we were passing these types as a 16 bit stack slot, which worked for little endian because every later stack slot would be 4-byte aligned leaving the 2 byte gap, but was incorrect for big endian.
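The endianness difference described above can be pictured with a small Python sketch. This is not compiler code: it only models a 4-byte stack slot holding a half-precision bit pattern that was extended to 32 bits as if in a register. The helper name `stack_slot_bytes` is hypothetical, and the high 16 bits are set to zero purely for readability (the ABI leaves them unspecified).

```python
import struct


def stack_slot_bytes(half_bits: int, big_endian: bool) -> bytes:
    """Store a 16-bit pattern, extended to 32 bits as if in a register,
    into a 4-byte stack slot.

    Assumption: the unspecified high 16 bits are zeroed here.
    """
    fmt = ">I" if big_endian else "<I"
    return struct.pack(fmt, half_bits & 0xFFFF)


ONE = 0x3C00  # IEEE half-precision bit pattern for 1.0

little = stack_slot_bytes(ONE, big_endian=False)
big = stack_slot_bytes(ONE, big_endian=True)

# Little endian: the value occupies the lower-addressed half (bytes 0-1).
# Big endian: the value occupies the higher-addressed half (bytes 2-3).
print(little.hex(), big.hex())
```

In this model a 16-bit store at offset 0 matches the little-endian slot but not the big-endian one, which is the mismatch the commit describes.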

This is a (very belated) reland of 0a362f1, which I originally reverted due to flang test failures. This marks mul constant expressions as undesirable, which means that we will no longer create them by default, but they can still be created explicitly. Part of: https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179

Removes MnemonicAliases added for instructions available with the LSUI feature (e.g. CAS -> CAST) which are not equivalent. The aliases stt[add|clr|set]a & stt[add|clr|set]al are also removed.

…vm#125721) Signed-off-by: Victor Guerra <[email protected]>

Fix an issue introduced in llvm#126929: The LLVM module is moved into the ModuleTranslator, so query the DataLayout from there.

) This patch adds a new API to `SBType` to retrieve the value of a template parameter given an index. We re-use the `TypeSystemClang::GetIntegralTemplateArgument` for this and thus currently only supports integral non-type template parameters. Types like float/double are not supported yet. rdar://144395216

…7034) I had previous attempts for fixing this flaky test. Let's admit I failed so far, and disable this until we have a permanent fix. See the discussion at: llvm#126913 (comment)

When bytes with negative signed char values appear in the data, make sure to use raw bytes from the data string when preprocessing, not char values. Fixes llvm#102798

…embling (llvm#126925) We need to iterate through all the symbol context ranges returned by SymbolContext::GetAddressRange (since llvm#126505). This also includes a fix to print the function offsets as signed values. I also wanted to check that the addresses which are in the middle of the function do *not* resolve to the function, but that's not entirely the case right now. This appears to be a separate issue though, so I've just left a TODO for now.

…lvm#123636)

…llvm#126967) After llvm#124287 updated several functions to return iterators rather than Instruction *, it was no longer straightforward to pass their result to DIBuilder. This commit updates DIBuilder methods to accept an InsertPosition instead, so that they can be called with an iterator (preferred) or, with a deprecation warning, an Instruction * or a BasicBlock *. This commit also updates the existing calls to the DIBuilder methods to pass in iterators. As a special exception, DIBuilder::insertDeclare() keeps a separate overload accepting a BasicBlock *InsertAtEnd. This is because despite the name, this method does not insert at the end of the block, therefore this cannot be handled implicitly by using InsertPosition.

…llvm#125656)

Model C/C++ `errno` macro by adding a corresponding `errno` memory location kind to the IR. Preliminary work to separate `errno` writes from other memory accesses, to the benefit of alias analyses and optimization correctness. Previous discussion: https://discourse.llvm.org/t/rfc-modelling-errno-memory-effects/82972.

The implicit-check-not had a typo which meant it didn't fail as expected when I tested better recursion handling. Fix that here (no change for current head).

…#123237) Unifies test function names so that it's easier to identify what different cases are. Also improves consistency. The following naming scheme has been adopted: * `@xfer_{read|write}_{map_type}_{masked|with_mask|}_{out_of_bounds}_{scalable}` Also updated some comments to better document the patterns that are being exercised.

…paqueValueExpr in trivial expressions (llvm#127182) Allow VisitArrayInitLoopExpr, VisitArrayInitIndexExpr, and VisitOpaqueValueExpr in trivial functions and statements.

This patch changes the name of the MacOS premerge job from permerge-checks-macos to (the presumably correct) premerge-checks-macos.

llvm#127207)

This allows a sort of "include" mechanism in the YAML files. A file can have a "merge_yaml_files" list of paths (relative to the containing file's location). These are YAML files in the same syntax, except they cannot have their own "header" entry. Only the lists (types, enums, macros, functions, objects) can appear. The main YAML file is then processed just as if each of its lists were the (sorted) union of each YAML file's corresponding list. This will enable maintaining a single source of truth for each function signature and other such details, where it is necessary to generate the same declaration in more than one header.
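The merge rule above can be sketched in Python as follows. This is only an illustration, not the hdrgen implementation: the function name `merge_header_data` is hypothetical, the YAML files are modeled as already-parsed dicts, and entries are assumed to be dicts that are de-duplicated and sorted by a `name` key.

```python
LIST_KEYS = ("types", "enums", "macros", "functions", "objects")


def merge_header_data(main: dict, merged: list) -> dict:
    """Return a copy of `main` whose lists are the sorted union of the
    corresponding lists in `main` and in every file listed in `merged`."""
    for extra in merged:
        # Merged files use the same syntax but may not carry a header entry.
        if "header" in extra:
            raise ValueError("merged files cannot have their own 'header' entry")

    result = dict(main)
    for key in LIST_KEYS:
        by_name = {}
        for source in [main, *merged]:
            for entry in source.get(key, []):
                # Assumption: the first definition of a name wins.
                by_name.setdefault(entry["name"], entry)
        result[key] = [by_name[name] for name in sorted(by_name)]
    return result


main = {"header": "stdlib.h", "functions": [{"name": "malloc"}, {"name": "abs"}]}
shared = {"functions": [{"name": "free"}, {"name": "malloc"}]}

merged = merge_header_data(main, [shared])
names = [entry["name"] for entry in merged["functions"]]
print(names)
```

With this shape, a shared file such as the hypothetical `shared` dict can be listed by more than one main file, so each function signature is written once.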

Fixing error detected in build bot in file `RootSignature-MultipleEntryFunctions.ll` closes: [127260](llvm#127260) --------- Co-authored-by: joaosaffran <[email protected]>

Implements the posix-specified strftime conversions for the default locale, along with comprehensive unit tests. This reuses a lot of design from printf, as well as the printf writer. Roughly based on llvm#111305, but with major rewrites.

Forgot to change a size_t to an int, which caused warnings on gcc but not clang for some reason. Regardless, this patch fixes the issue.

…s pattern (llvm#126443) In WebKit, it's pretty common to capture "this" and "protectedThis" where "protectedThis" is a guardian variable of type Ref or RefPtr for "this". Furthermore, it's common for this "protectedThis" variable to be passed to an inner lambda by std::move. Recognize this pattern so that we don't emit warnings for nested inner lambdas. To recognize this pattern, we introduce a new DenseSet, ProtectedThisDecls, which contains every "protectedThis" we've recognized to our subclass of DynamicRecursiveASTVisitor. This set is now populated in "hasProtectedThis" and "declProtectsThis" uses this DenseSet to determine whether a given value declaration constitutes a "protectedThis" pattern or not. Because hasProtectedThis and declProtectsThis had to be moved from the checker class to the visitor class, it's now a responsibility of each caller of visitLambdaExpr to check whether a given lambda captures "this" without a "protectedThis" or not. Finally, this PR improves the code to recognize "protectedThis" pattern by allowing more nested CXXBindTemporaryExpr, CXXOperatorCallExpr, and UnaryOperator expressions.

@pranavk

…4834) Following up llvm#72078, on x86-64 this allows a global to be considered small or large regardless of the code model. For example, x86-64's medium code model by default classifies globals as small or large depending on their size relative to -mlarge-data-threshold. GPU compilations compile the same TU for both the host and device, but only codegen the host or device portions of it depending on attributes. However, we still Sema the TU, and will warn on an unknown attribute for the device compilation since this attribute is target-specific. Since they're intended for the host, accept but ignore this attribute for device compilations where the host is either unknown or known to support the attribute. Co-authored-by: @pranavk

…ows spurious ref edges between new functions." (llvm#127285) Reverts llvm#116285 Breaks expensive checks build, e.g. https://lab.llvm.org/buildbot/#/builders/16/builds/13821

…127278) This uses the new merge_yaml_files feature in hdrgen to share the source of truth for the malloc suite of functions declared in both stdlib.h and in malloc.h (without either header including the other). It also modernizes the malloc.yaml definition a bit, including dropping the custom template malloc.h.def file in favor of using the explicit macros list to generate the includes.

…llvm#127108) If we have a shuffle which repeats the same pattern of elements, all of which come from the first register in the source register group, we can lower this to a single vrgather at m1 to perform the element rearrangement, and reuse that for each register in the result vector register group.
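The condition described above can be modeled on plain Python lists. This is a rough sketch under stated assumptions, not the RISC-V lowering code: `elts_per_reg` stands in for the number of elements in one m1 register, both function names are hypothetical, and a list comprehension plays the role of vrgather.

```python
def repeated_first_register_pattern(mask, elts_per_reg):
    """Return the per-register pattern if `mask` repeats it for every
    destination register and only reads the first source register,
    otherwise return None."""
    if len(mask) % elts_per_reg != 0:
        return None
    pattern = mask[:elts_per_reg]
    # Every index must select an element of the first source register.
    if any(idx < 0 or idx >= elts_per_reg for idx in pattern):
        return None
    for start in range(0, len(mask), elts_per_reg):
        if mask[start:start + elts_per_reg] != pattern:
            return None
    return pattern


def lower_shuffle(source, mask, elts_per_reg):
    pattern = repeated_first_register_pattern(mask, elts_per_reg)
    if pattern is None:
        # Generic fallback: gather across the whole register group.
        return [source[i] for i in mask]
    first_reg = source[:elts_per_reg]
    gathered = [first_reg[i] for i in pattern]  # a single register-sized gather
    # Reuse the gathered register for each register of the result group.
    return gathered * (len(mask) // elts_per_reg)


source = list(range(100, 108))
mask = [1, 0, 3, 2, 1, 0, 3, 2]
result = lower_shuffle(source, mask, elts_per_reg=4)
print(result)
```

The point of the sketch is that the fast path produces the same elements as the generic gather while rearranging only one register's worth of data.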

When the return type's rendering already doesn't end with an identifier character, such as when it's `T *`, then idiomatic syntax does not include a space before the `(` and arguments.

Support relaxation optimization for two types of code sequences.
```
From:
   pcalau12i $a0, %pc_hi20(sym)
     R_LARCH_PCALA_HI20, R_LARCH_RELAX
   addi.w/d $a0, $a0, %pc_lo12(sym)
     R_LARCH_PCALA_LO12, R_LARCH_RELAX
To:
   pcaddi $a0, %pc_lo12(sym)
     R_LARCH_PCREL20_S2

From:
   pcalau12i $a0, %got_pc_hi20(sym_got)
     R_LARCH_GOT_PC_HI20, R_LARCH_RELAX
   ld.w/d $a0, $a0, %got_pc_lo12(sym_got)
     R_LARCH_GOT_PC_LO12, R_LARCH_RELAX
To:
   pcaddi $a0, %got_pc_hi20(sym_got)
     R_LARCH_PCREL20_S2
```
Others:
- `loongarch-relax-pc-hi20-lo12-got-symbols.s` is inspired by `aarch64-adrp-ldr-got-symbols.s`.

Co-authored-by: Xin Wang [[email protected]](mailto:[email protected])

For the purpose of determining triviality, ignore all attributes.

…lvm#126999) Adds a macro definition `MLIR_USE_FALLBACK_TYPE_IDS`. When this is defined, the `MLIR_{DECLARE,DEFINE}_EXPLICIT_TYPE_ID` functions explicitly fall back to string comparison. This is useful for complex shared library setups where it may be difficult to agree on a source of truth for specific type ID resolution. As long as there is a single resolution for `registerImplicitTypeID`, all type IDs can reference a shared registration. This way types which are logically shared across multiple DSOs can have the same type ID, even if their definitions are duplicated.

…per bound in certain cases (llvm#127192) Fix `FlatLinearValueConstraints::getSliceBounds` for missing checks on no lower/upper bound. Obvious bug. Fixes: llvm#119525 Fixes: llvm#108374

…nfo (NFC)" (llvm#127277) Originally landed in llvm#126800 This version fixes a typo in NVPTXAsmPrinter::emitFunctionParamList where .surfref was erroneously replaced with .samplerref.

…ndTrunc. (llvm#127258) Put the bitcast before the insert_subvector. It's more likely the insert subvector can be combined with other nodes. The insert_subvector is only needed sometimes, and I'm considering reusing this function which might require pulling the insert_subvector out.

This patch makes it so that the metrics container counts the number of in progress and queued jobs at the job level rather than at the workflow level. This helps us distinguish windows versus linux load and also lets us filter out the MacOS jobs that only run in the release branch. Reviewers: Keenuts, lnihlen Reviewed By: lnihlen Pull Request: llvm#127274
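Counting at the job level rather than the workflow level could look roughly like the Python sketch below. The data shape (a list of workflow runs, each with a `jobs` list of dicts carrying `name` and `status`) and the function name are assumptions made for illustration; this is not the metrics container's code.

```python
from collections import Counter


def count_jobs_by_status(workflow_runs):
    """Count queued and in-progress jobs per (job name, status) pair."""
    counts = Counter()
    for run in workflow_runs:
        for job in run["jobs"]:
            if job["status"] in ("queued", "in_progress"):
                counts[(job["name"], job["status"])] += 1
    return counts


runs = [
    {"jobs": [{"name": "linux", "status": "in_progress"},
              {"name": "windows", "status": "queued"}]},
    {"jobs": [{"name": "linux", "status": "queued"},
              {"name": "windows", "status": "queued"},
              {"name": "macos", "status": "completed"}]},
]
counts = count_jobs_by_status(runs)
print(counts)
```

Keying on the job name is what makes it possible to separate Windows from Linux load, or to drop a particular job from a dashboard query.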

This patch removes an extra heartbeat metric in the metrics python file. Before it was performed twice, once in the main function, and once in the get_sampled_workflow_metrics function. We only need one to keep everything happy, and I've chosen to keep the one in get_sampled_workflow_metrics as it seems a more appropriate place to keep it. Reviewers: Keenuts, lnihlen Reviewed By: lnihlen Pull Request: llvm#127275

The method was meant to be overridden by subclasses only. It should not be called directly by users.

Currently the metrics container is crashing reasonably often with incomplete read/connection broken errors. Try moving the creation of the Github Object into the main loop to see if recreating the object that maybe handles some connection state fixes the issue. Reviewers: Keenuts, lnihlen Reviewed By: lnihlen Pull Request: llvm#127276

This patch makes it so that skipped steps do not cause a job to be considered failed. The windows premerge jobs currently skip the build/test step if there are no projects to build/test. These show up as failures in the dashboard even though everything executed perfectly fine. Reviewers: lnihlen, Keenuts Reviewed By: lnihlen Pull Request: llvm#127279
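The rule can be stated as a short Python predicate. This is a sketch under assumptions: each step of a completed job is modeled as a dict with a `conclusion` string, and the function name `job_failed` is hypothetical rather than taken from the metrics script.

```python
def job_failed(steps):
    """Treat a completed job as failed only if some step neither
    succeeded nor was skipped."""
    return any(step["conclusion"] not in ("success", "skipped") for step in steps)


windows_no_projects = [
    {"name": "checkout", "conclusion": "success"},
    {"name": "build and test", "conclusion": "skipped"},
]
broken_build = [
    {"name": "checkout", "conclusion": "success"},
    {"name": "build and test", "conclusion": "failure"},
]
print(job_failed(windows_no_projects), job_failed(broken_build))
```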

…getSingleShuffleSrc. (llvm#127250) I think this dates to a time when we used to use a type twice as large as necessary for the input to the vnsrl. This was changed in llvm#118509 when factor 4 and 8 were added. The existing test for this regresses because it uses a lot of undef elements and we previously figured out we could reduce its size and then try the vnsrl again. We now match it before we try to reduce the width so we miss this opportunity. I've added a second test that doesn't have any undef elements in the first half. Prior to this patch we used a vcompress lowering instead of vnsrl.

… in MinGW executables" (llvm#127297) Reverts llvm#107375 This was causing a build bot failure (https://lab.llvm.org/buildbot/#/builders/201/builds/2954) and also breaks building with VS2019. See llvm#107375 (comment) for details.

…sion utils (llvm#127164) Fix/improve debug messages and API signatures for affine analysis/fusion utils. Move some warnings under LLVM_DEBUG. These weren't meant to be exposed during compilation. Add dump pretty methods for FlatLinearConstraints. NFC.

chsigg and others added 30 commits February 13, 2025 08:08

[llvm][bazel] Adjust to HAVE_SYS_AUXV_H > HAVE_GETAUXVAL in llvm@89d636b

ec056f5

[RISCV][NFC] Move GenericModel to standalone file (llvm#127003)

9cc8442

And fix some typos in comments. In the future, we may add more scheduling info to GenericModel.

[mlir][affine]make affine-loop-unroll a FunctionOpInterface pass. (ll…

a472147

…vm#126475) [mlir][affine]make affine-loop-unroll a FunctionOpInterface pass Make `affine-loop-unroll` a `FunctionOpInterface` pass. Now unroll can be done on gpu.func.

[MLIR][LLVMIR] Always use TargetFolder in IRBuilder (llvm#126929)

75c356c

This is a followup to llvm#126745, generalizing it to always use TargetFolder, including inside function bodies. This avoids generating non-canonical constant expressions that can be folded away.

[flang] Update f128 tests

3bf4257

Update tests that require flang-supports-f128-math after llvm#126929.

[MISched][NFC] Remove unused heuristic NextDefUse from enum (llvm#125879

df62441

) Heuristic was removed in 46533e6 due to being ineffective.

[mlir][bazel] Port a472147

cd21e0f

[LLVM][AArch64] Remove aliases of LSUI instructions (llvm#126072)

d44d806

Removes MnemonicAliases added for instructions available with the LSUI feature (e.g. CAS -> CAST) which are not equivalent. The aliases stt[add|clr|set]a & stt[add|clr|set]al are also removed.

[mlir][NFC] Add missing ) in doc for --mlir-print-local-scope (ll…

4ac1c58

…vm#125721) Signed-off-by: Victor Guerra <[email protected]>

[MLIR][LLVMIR] Fix use-after-move

9d92bea

Fix an issue introduced in llvm#126929: The LLVM module is moved into the ModuleTranslator, so query the DataLayout from there.

TableGen: Add missing consts to CodeGenSubRegIndex

4889777

[analyzer] Disable a flaky test while triaging why its flaky (llvm#12…

b88c5d6

…7034) I had previous attempts for fixing this flaky test. Let's admit I failed so far, and disable this until we have a permanent fix. See the discussion at: llvm#126913 (comment)

[clang] Fix preprocessor output from #embed (llvm#126742)

3be48a6

When bytes with negative signed char values appear in the data, make sure to use raw bytes from the data string when preprocessing, not char values. Fixes llvm#102798
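One way to picture the distinction between char values and raw bytes is the Python sketch below. It is an illustration only, not Clang's code: both function names are hypothetical, and the comma-separated integer list is an assumed stand-in for the preprocessed form of the embedded data.

```python
def as_signed_chars(data: bytes) -> list:
    """Interpret each byte the way a signed 8-bit char would read it."""
    return [b - 256 if b > 127 else b for b in data]


def render_embed(data: bytes) -> str:
    """Render the data from its raw byte values (0..255)."""
    return ", ".join(str(b) for b in data)


data = b"\xff\x00\x80"
signed_view = as_signed_chars(data)
rendered = render_embed(data)
print(signed_view, rendered)
```

Bytes above 127 are the ones that differ between the two views, which is why only data containing such bytes is affected.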

AMDGPU: Add more tests for peephole opt subregister composing

6c5a008

[LoopVectorizer][AArch64] Add support for partial reduce subtraction (l…

9c89faa

…lvm#123636)

[AMDGPU] Add a regression test for -mattr=dumpcode (llvm#116982)

0b0f3da

AMDGPU: Add baseline test for treating v_pk_mov_b32 like reg_sequence (…

eef0205

…llvm#125656)

teresajohnson and others added 30 commits February 14, 2025 14:12

[MemProf] Fix recursion tests (llvm#127270)

496fec5

The implicit-check-not had a typo which meant it didn't fail as expected when I tested better recursion handling. Fix that here (no change for current head).

[alpha.webkit.UncountedCallArgsChecker] Allow ArrayInitLoopExpr and O…

9106ee2

…paqueValueExpr in trivial expressions (llvm#127182) Allow VisitArrayInitLoopExpr, VisitArrayInitIndexExpr, and VisitOpaqueValueExpr in trivial functions and statements.

[Github][CI] Fix Typo in MacOS Job Name

50b1763

This patch changes the name of the MacOS premerge job from permerge-checks-macos to (the presumably correct) premerge-checks-macos.

[CIR] Fix extra ; warning, and replace new with emplaceBlock (NFC) (

d9b55b7

llvm#127207)

[HLSL] Fix Root signature test error (llvm#127261)

77ddffc

Fixing error detected in build bot in file `RootSignature-MultipleEntryFunctions.ll` closes: [127260](llvm#127260) --------- Co-authored-by: joaosaffran <[email protected]>

[libc] Implement strftime (llvm#122556)

398f865

Implements the posix-specified strftime conversions for the default locale, along with comprehensive unit tests. This reuses a lot of design from printf, as well as the printf writer. Roughly based on llvm#111305, but with major rewrites.

[libc] Fix implicit cast warning in strftime (llvm#127282)

60af835

Forgot to change a size_t to an int, which caused warnings on gcc but not clang for some reason. Regardless, this patch fixes the issue.

Revert "[Coroutines][LazyCallGraph] addSplitRefRecursiveFunctions all…

caaa288

…ows spurious ref edges between new functions." (llvm#127285) Reverts llvm#116285 Breaks expensive checks build, e.g. https://lab.llvm.org/buildbot/#/builders/16/builds/13821

[libc] Elide extra space in hdrgen function declarations (llvm#127287)

68a82a2

When the return type's rendering already doesn't end with an identifier character, such as when it's `T *`, then idiomatic syntax does not include a space before the `(` and arguments.
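A possible reading of this rule, written as a Python sketch: the separator that follows the return type is emitted only when the type's rendering ends in an identifier character. The function name `render_declaration` and its parameters are hypothetical, and the sketch assumes the elided space is the one placed right after the return type; it is not the hdrgen source.

```python
def render_declaration(return_type: str, name: str, arguments: list) -> str:
    """Render a C function declaration without a redundant space after
    return types such as `T *`."""
    args = ", ".join(arguments) if arguments else "void"
    last = return_type[-1]  # assumes a non-empty return type
    # A space is needed only to keep two identifier-like tokens apart.
    separator = " " if (last.isalnum() or last == "_") else ""
    return f"{return_type}{separator}{name}({args})"


plain = render_declaration("int", "abs", ["int"])
pointer = render_declaration("void *", "malloc", ["size_t"])
print(plain)
print(pointer)
```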

[WebKit Checkers] Treat attributes as trivial (llvm#127289)

e9fb239

For the purpose of determining triviality, ignore all attributes.

[MLIR][Affine] Fix getSliceBounds for missing handling of no lower/up…

9fddcea

…per bound in certain cases (llvm#127192) Fix `FlatLinearValueConstraints::getSliceBounds` for missing checks on no lower/upper bound. Obvious bug. Fixes: llvm#119525 Fixes: llvm#108374

Reland "[NVPTX] Cleanup/Refactoring in NVPTX AsmPrinter and RegisterI…

34cf04b

…nfo (NFC)" (llvm#127277) Originally landed in llvm#126800 This version fixes a typo in NVPTXAsmPrinter::emitFunctionParamList where .surfref was erroneously replaced with .samplerref.

Make llvm::telemetry::Manager::preDispatch protected. (llvm#127114)

f7a2d70

The method was meant to be overridden by subclasses only. It should not be called directly by users.

[AutoBump] Merge with 61ad087 (Feb 14)

2b453ee
