forked from llvm/llvm-project
[AutoBump] Merge with f7d03707 (Feb 18) (52) #596
Open
jorickert wants to merge 65 commits into bump_to_4cc7d60f from bump_to_f7d03707
…7627) The fixme comment turned out to be true.
…es (llvm#127625) Handle different src/mask operand ordering of X86ISD::VPERMV nodes
This commit is a broad update across libclc to use the CLC conversion builtins in CLC functions, even those with a '__clc' prefix in the generic folder. This better prepares them for an official move to the CLC library in time. The CLC conversion builtins have an additional benefit in that they support scalars, unlike the __builtin_convertvector builtin which we were using previously. This allows us to simplify some shared definitions. There is one change to the IR, in the scalar upsample(char, uchar) builtin. It now sign-extends the first argument to i16, where before it zero-extended it. This appears to be correct, and matches the vector behaviour.
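As a rough illustration of the upsample change described above, the sketch below models the scalar `upsample(char, uchar)` operation in plain C++, using the OpenCL definition `((short)hi << 8) | lo`. The function names `upsample_sext` and `upsample_zext` are hypothetical and are not libclc symbols. Under this definition, sign-extending or zero-extending the first argument yields the same 16-bit result, because the extended bits are shifted out of the i16 value.

```cpp
#include <cstdint>

// Hypothetical model of scalar upsample(char hi, uchar lo), following the
// OpenCL definition ((short)hi << 8) | lo. Here hi is sign-extended to 16 bits.
inline int16_t upsample_sext(int8_t hi, uint8_t lo) {
  // Shift in unsigned arithmetic so no negative value is ever shifted.
  uint16_t wide = static_cast<uint16_t>(static_cast<int16_t>(hi));
  return static_cast<int16_t>(static_cast<uint16_t>(wide << 8) | lo);
}

// Same operation, but hi is zero-extended (the older scalar behaviour
// mentioned in the commit message).
inline int16_t upsample_zext(int8_t hi, uint8_t lo) {
  uint16_t wide = static_cast<uint8_t>(hi);
  return static_cast<int16_t>(static_cast<uint16_t>(wide << 8) | lo);
}
```

For example, `upsample_sext(-1, 0x02)` and `upsample_zext(-1, 0x02)` both produce the bit pattern `0xFF02`.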
…orceTailAgnostic. NFC (llvm#127575) Add a policy operand to set the tail agnostic policy instead of using ForceTailAgnostic. The masked to unmasked transforms had to be updated to drop the policy operand when converting to unmasked.
llvm#124432) Use HCFGBuilder to build an initial VPlan 0, which wraps all input instructions in VPInstructions, and update tryToBuildVPlanWithVPRecipes to replace the VPInstructions with widened recipes. At the moment, widened recipes are created based on the underlying instruction of the VPInstruction. Masks are also still created based on the input IR basic blocks, and the loop CFG is flattened in the main loop processing the VPInstructions. This patch also includes support for Switch instructions in HCFGBuilder using just a VPInstruction with Instruction::Switch opcode. There are multiple follow-ups planned:
* Perform predication on the VPlan directly.
* Unify code constructing VPlan 0 to be shared by both inner and outer loop code paths.
* Construct VPlan 0 once, clone subsequent ones for VFs.
PR: llvm#124432
Details: Make LLDB's TelemetryManager a "plugin" so that vendors can supply an appropriate implementation. The rest of the LLDB code will simply call `TelemetryManager::getInstance` --------- Co-authored-by: Pavel Labath <[email protected]>
This patch adds the erfc op to the math dialect. It also does lowering of the math.erfc op to libm calls. There is also a f32 polynomial approximation for the function based on https://stackoverflow.com/questions/35966695/vectorizable-implementation-of-complementary-error-function-erfcf This is in turn based on M. M. Shepherd and J. G. Laframboise, "Chebyshev Approximation of (1+2x)exp(x^2)erfc x in 0 <= x < INF", Mathematics of Computation, Vol. 36, No. 153, January 1981, pp. 249-253. The code has a ULP error less than 3, which was tested, and MLIR test values were verified against the C implementation.
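The ULP bound mentioned above can be checked with a small harness along the following lines. This is only a sketch of one way to measure ULP error for an f32 approximation. It does not reproduce the patch's polynomial or its test setup, and `ordered_bits`, `ulp_distance`, and `max_ulp_error` are hypothetical helper names. The reference value is taken from `std::erfc` evaluated in double precision and rounded to float.

```cpp
#include <cmath>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Map a float's bit pattern to an integer that is monotonic in the float's
// value, so adjacent floats differ by exactly 1 (and -0.0f maps to 0).
inline int64_t ordered_bits(float f) {
  int32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));
  return bits < 0 ? static_cast<int64_t>(INT32_MIN) - bits : bits;
}

// Distance between two finite floats, counted in representable values.
inline int64_t ulp_distance(float a, float b) {
  return std::llabs(ordered_bits(a) - ordered_bits(b));
}

// Largest ULP distance between `approx` and the rounded double-precision
// std::erfc over `steps + 1` evenly spaced samples in [lo, hi].
inline int64_t max_ulp_error(float (*approx)(float), float lo, float hi,
                             int steps) {
  int64_t worst = 0;
  for (int i = 0; i <= steps; ++i) {
    float x = lo + (hi - lo) * static_cast<float>(i) / static_cast<float>(steps);
    float reference = static_cast<float>(std::erfc(static_cast<double>(x)));
    int64_t d = ulp_distance(approx(x), reference);
    if (d > worst)
      worst = d;
  }
  return worst;
}
```

Passing a candidate f32 implementation as `approx` and asserting that the result is below 3 over the range of interest would correspond to the bound quoted in the commit message.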
…eshold Need to check if the tree is too small before attempting to vectorize the tree to prevent hanging on small trees with phis only.
…m#114500) Implement new pseudos with the suffix _t16 for FLAT_LOAD which have VGPR_16 as the load dst. Lower the pseudos to the existing real instructions with VGPR_32 src or dst (which makes them consistent with the hardware encoding). This patch reduces VGPR usage by making hi halves of VGPRs available for other values. More 8/16-bit ld/st instructions will be supported in upcoming patches.
) This commit adds "instantiated_from" to the AST dump for EnumDecl, improving consistency with CXXRecordDecl and FunctionDecl, which also include this information. To achieve this, TextNodeDumper::VisitEnumDecl is updated with analogous lines found in TextNodeDumper::VisitFunctionDecl and TextNodeDumper::VisitCXXRecordDecl.
This patch fixes: third-party/unittest/googletest/include/gtest/gtest.h:1379:11: error: comparison of integers of different signs: 'const int' and 'const unsigned long' [-Werror,-Wsign-compare]
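For context, the diagnostic above fires whenever a signed and an unsigned integer are compared directly. The snippet below is a generic sketch of the pattern and one common way to silence it. It is not the actual change made in the patch, and `has_expected_size` is a hypothetical function.

```cpp
#include <cstddef>
#include <vector>

// Comparing an `int` against a `std::size_t` directly, e.g.
//   return expected == v.size();
// triggers -Wsign-compare (an error under -Werror).
// Checking the sign first and then casting makes the comparison well-defined
// and warning-free.
inline bool has_expected_size(int expected, const std::vector<int> &v) {
  return expected >= 0 && static_cast<std::size_t>(expected) == v.size();
}
```

In a googletest context, the usual equivalent is to make both operands of the comparison macro the same signedness, for example by writing the literal as `3u`.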
This patch adds the OMP.DeclareMapperOp to MLIR. The HLFIR/FIR lowering for Declare Mapper is available here llvm#117046.
…ion (llvm#114500)" This reverts commit f7a5f06. Fails to build with: llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp:126:37: error: no member named 'OPERAND_LAST' in 'llvm::AMDGPU::OpName' 126 | uint16_t OpName = AMDGPU::OpName::OPERAND_LAST;
Need to check if the block is reachable before comparing phis from it to avoid compiler crash when requesting node. Fixes report in llvm#110529 (comment)
…n RISCVTargetParser.h. NFC (llvm#127585) The VLMUL and policy enums originally lived in RISCVBaseInfo.h in the backend which is where everything else in the RISCVII namespace is defined. RISCVTargetParser.h is used by much more of the compiler and it doesn't really make sense to have 2 different namespaces exposed. These enums are both associated with VTYPE so using the RISCVVType namespace seems like a good home for them.
…ve (llvm#117046) This patch adds HLFIR/FIR lowering support for OpenMP Declare Mapper directive. Depends on llvm#117045.
Unifies imports to use a single insertion point, `globalInsertionOp`, for global values. Refactors insertion point setup into `setGlobalInsertionPoint`, which sets insertion point after `globalInsertionOp` or defaults to the start of the module if it is not set.
Inspired by llvm#90738 (although that is a clang codegen issue)
This patch adds the mapper field to the omp.map.info op. Depends on llvm#117046.
…clause (llvm#121001) Add Lowering support for OpenMP mapper field in mapInfoOp. NOTE: This patch only supports explicit mapper lowering. I'll add a separate PR soon which handles implicit default mapper recognition. Depends on llvm#120994.
…P DeclareMapper (llvm#121005) Add conversion support from FIR to LLVM Dialect for OMP DeclareMapper. Depends on llvm#121001
…pers (llvm#124746) This patch adds OpenMPToLLVMIRTranslation support for the OpenMP Declare Mapper directive. Since both MLIR and Clang now support custom mappers, I've changed the respective function params to no longer be optional as well. Depends on llvm#121005
The last uses of these functions were removed in: commit 58bc98c Author: Arthur Eubanks <[email protected]> Date: Fri Jul 12 10:02:50 2024 -0700
The last use was removed in: commit fa6ea7a Author: Arthur Eubanks <[email protected]> Date: Mon Mar 20 11:18:35 2023 -0700
… Pass (llvm#127646) Add cmdOption suffix consumer function in GpuModuleToBinary Pass, which can tokenize and remove a specific suffix of cmdOption.
See https://discourse.llvm.org/t/rfc-introduce-opasm-type-attr-interface-for-pretty-print-in-asmprinter/83792 for detailed introduction. This is a follow-up PR to llvm#121187, integrating OpAsmTypeInterface with AsmPrinter. There are a few conditions when OpAsmTypeInterface comes into play:
* There is no OpAsmOpInterface
* Or OpAsmOpInterface::getAsmResultName/getBlockArgumentName does not invoke `setName` (i.e. the default impl)
* All results have OpAsmTypeInterface (otherwise we cannot handle result grouping behavior)
Cc @River707 @jpienaar @ftynse for review.
See https://discourse.llvm.org/t/rfc-introduce-opasm-type-attr-interface-for-pretty-print-in-asmprinter/83792 for detailed introduction. This PR adds:
* Definition of `OpAsmAttrInterface`
* Integration of `OpAsmAttrInterface` with `AsmPrinter`
In llvm#121187 (comment) I mentioned splitting them into two PRs, but I realized that a PR with only the definition of `OpAsmAttrInterface` is hard to test, as it requires a custom Dialect with `OpAsmDialectInterface` to hook into `AsmPrinter`, so I just put them together to have an e2e test. Cc @River707 @jpienaar @ftynse for review.
This avoids dozens of regressions in a future patch. These primarily manifested as assertions where we had copies of 64-bit registers to 32-bit registers. This is testable in principle with hand written MIR, but that's a bit too much x86 for me.
On some (Linux) systems /etc/localtime is not a symlink to the time zone, but contains a copy of the binary time zone file. In these cases there is usually a file named /etc/timezone which contains the text of the current time zone name. Instead of throwing when /etc/localtime does not exist or is not a symlink, use this fallback. Fixes: llvm#105634 --------- Co-authored-by: Louis Dionne <[email protected]>
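The fallback described above might look roughly like the following sketch. It is not the libc++ implementation: `zone_name` is a hypothetical helper, the paths are passed in as parameters, and the `"zoneinfo/"` marker used to extract the zone name from the symlink target is an assumption made for illustration.

```cpp
#include <filesystem>
#include <fstream>
#include <optional>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: resolve the current time zone name.
// 1. If `localtime` is a symlink, take the part of its target after the
//    (assumed) "zoneinfo/" marker.
// 2. Otherwise fall back to the first line of `timezone_file`.
// Returns std::nullopt when neither source yields a name.
inline std::optional<std::string>
zone_name(const fs::path &localtime, const fs::path &timezone_file,
          const std::string &zoneinfo_marker = "zoneinfo/") {
  std::error_code ec;
  if (fs::is_symlink(localtime, ec)) {
    std::string target = fs::read_symlink(localtime, ec).generic_string();
    if (!ec) {
      auto pos = target.find(zoneinfo_marker);
      if (pos != std::string::npos)
        return target.substr(pos + zoneinfo_marker.size());
    }
  }
  // Fallback: a plain text file holding the zone name.
  std::ifstream in(timezone_file);
  std::string name;
  if (in && std::getline(in, name) && !name.empty())
    return name;
  return std::nullopt;
}
```

A caller on a typical Linux system would invoke it as `zone_name("/etc/localtime", "/etc/timezone")` and treat an empty result as an error.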
See https://discourse.llvm.org/t/rfc-introduce-opasm-type-attr-interface-for-pretty-print-in-asmprinter/83792 for detailed introduction. This PR should be rebased once llvm#124721 is merged. This PR adds:
* Definition of `getAlias` for `OpAsmTypeInterface`
* Integration of `OpAsmTypeInterface` with the `AsmPrinter` alias handling part
This is partly in response to https://github.com/llvm/llvm-project/pull/124721/files#r1940399862 Cc @River707 for review.
…ame as canonicalizing (llvm#127670) Fixes this crash: https://hlsl.godbolt.org/z/9aP74s4bP Which happens because the de-sugared type is the same as the canonicalized type. Check if the de-sugared type is canonical before getting the ArrayParameterType of the canonical type. Add AST test to ensure crash doesn't happen.
…lvm#127361) This patch implements dependency maintenance upon receiving the notification that an instruction gets deleted.
…7132) This patch moves the seed collection logic from the BottomUpVec pass into a new Sandbox IR Function pass. The new "seed-collection" pass collects the seeds, builds a region and runs the region pass pipeline.
llvm#126329) Fixes llvm#126162 I checked locally that it works without warning when:
- neither option is defined
- both are defined to the same value
And I checked that it warns if:
- only one is defined
- they are defined to different values
) This PR adds a few more tests to validate some error scenarios of root signature metadata representation. Closes: llvm#127280 --------- Co-authored-by: joaosaffran <[email protected]>
Move LinalgInterfaces.cpp from LinalgInterfaces to LinalgDialect target. This allows TensorDialect to use header-only RelayoutOpInterface without introducing a hidden dependency on LinalgDialect (producing a `no-allow-shlib-undefined` error if a target depends on TensorDialect but not LinalgDialect). Also reverts llvm@d64f177 because it's no longer needed.
…calls This can happen when using a LTO build of compiler-rt for ARM and the program uses 64-bit division. The 64-bit division function in compiler-rt (__aeabi_ldivmod) is written in assembly and calls the C function __divmoddi4, which works fine in non-LTO links. However, when building with LTO the call inside __aeabi_ldivmod is replaced with a jump to address zero, which then crashes the program. Building with -pie generates an error instead of a jump to address zero, and surprisingly just declaring the __aeabi_ldivmod function (but not calling it) in the input IR also avoids this issue. Reported as llvm#127284 Co-authored-by: Fangrui Song <[email protected]> Reviewed By: MaskRay Pull Request: llvm#127286
…7510) Corresponding GitHub issues will be created shortly.
) Attempting to pass a `ptr addrspace(7)` to functions that take `ptr` arguments produces undesirable `addrspacecast(addrspacecast(p8 x to p7) to p0) => addrspacecast(p8 x to p0)` folds. This results in illegal GEP operations on buffer resources, which can't be GEP'd. (However, note that, while unimplemented, addrspacecast from ptr addrspace(7) to ptr is legal - it's just an effective address computation.) To resolve this problem, and thus prevent illegal `getelementptr T, ptr addrspace(8) %x, ...` instructions from being produced, this commit extends amdgcn.make.buffer.rsrc to also be variadic in its result type, auto-upgrading old manglings. The logic for handling a make.buffer.rsrc in instruction selection remains untouched and expects the output type to be a ptr addrspace(8), as does the Clang lowering for its builtin (the pointer-to-pointer version might want a different name in clang). LowerBufferFatPointers has been updated to lower amdgcn.make.buffer.rsrc.p7.p* to amdgcn.make.buffer.rsrc.p8.p*. This will also make exposing buffer fat pointers in Clang easier, since you don't have to cast between a `__amdgcn_rsrc_t` and a pointer.
No description provided.