
[AutoBump] Merge with 43d71baa (Feb 20) (66) #610


Open
wants to merge 38 commits into
base: bump_to_11468c3b
from

Conversation

jorickert

No description provided.

Renaud-K and others added 30 commits February 20, 2025 11:50
Clarify behavior of the function when the builtin is also supported on
the main target.
Based on feedback from llvm#126324

---------

Signed-off-by: Sarnie, Nick <[email protected]>
…128069)

Summary:
These are required because we don't use the `file` interface.
Summary:
This was removed, but one use was left behind in this place, so it
errors.
Another followup fix to llvm#121215

The new cmake wouldn't define the reader at all if the target wasn't GPU
or didn't have a definition of FILE. This patch rewrites the cmake to be
more general.

As a followup, I'd like to make `use_system_file` consistent between
/test and /src. Currently in /src it includes the `COMPILE_OPTIONS` and
in /test it does not.
Allow `do concurrent` inside a CUF kernel directive to avoid the following
lowering error:
```
void {anonymous}::FirConverter::genFIR(const Fortran::parser::CUFKernelDoConstruct&): Assertion `bounds && "Expected bounds on the loop construct"' failed.
```

---------

Co-authored-by: Valentin Clement (バレンタイン クレメン) <[email protected]>
In most circumstances BSS segments are not required in the output binary
but combineOutputSegments was erroneously including them. This meant
that PIC binaries were including the BSS data as zero in the binary.

Fixes: emscripten-core/emscripten#23683
This option prints the name of the DLL that gets imported, when linking
against an import library.

This is implemented using the same strategy as GNU dlltool does; looking
for the contents of .idata$6 or .idata$7 chunks. The right section name
to check for is chosen by identifying whether the library is GNU or LLVM
style. In the case of GNU import libraries, the DLL name is in an
.idata$7 chunk. However there are also other chunks with that section
name (for entries for the IAT or ILT); identify these by looking for
whether a chunk contains relocations.

Alternatively, one could also just look for .idata$2 chunks, look for
relocations at the right offset, and locate data at the symbol that the
relocation points at (which may be in the same or in another object
file).
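
The selection heuristic described above can be sketched as follows. This is a minimal illustration, not the linker's code: the `Chunk` class, its fields, and `find_dll_name` are hypothetical names, and the use of `.idata$6` for LLVM-style libraries is inferred from the description rather than stated outright.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Chunk:
    """Hypothetical model of one section chunk in an import library member."""
    section: str
    data: bytes
    num_relocations: int


def find_dll_name(chunks: Iterable[Chunk], gnu_style: bool) -> Optional[str]:
    # GNU import libraries keep the DLL name in .idata$7; for LLVM-style
    # libraries this sketch assumes .idata$6.
    wanted = ".idata$7" if gnu_style else ".idata$6"
    for chunk in chunks:
        if chunk.section != wanted:
            continue
        # Chunks with relocations are IAT/ILT entries, not the name string.
        if chunk.num_relocations > 0:
            continue
        # The name is stored as a NUL-terminated string.
        return chunk.data.split(b"\0", 1)[0].decode("ascii")
    return None


chunks = [
    Chunk(".idata$7", b"\0\0\0\0", 1),   # entry with a relocation, skipped
    Chunk(".idata$7", b"foo.dll\0", 0),  # the name chunk
]
dll_name = find_dll_name(chunks, gnu_style=True)
print(dll_name)
```

Filtering on relocations is what separates the name chunk from the other chunks that share the section name in GNU-style libraries.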
Remove load and store instructions which do not include an immediate,
and just use the immediate variants in all cases. These variants will be
emitted exactly the same when the immediate offset is 0. Removing the
non-immediate versions allows for the removal of a lot of code and would
make any MachineIR passes simpler.
This fixes a bug where `SchedBundle::eraseFromBundle()` would not erase all
matching nodes but just the first one.
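
The difference between erasing the first match and erasing every match can be shown with a small analogy. The real code is C++; the Python functions below (`erase_first`, `erase_all`) are hypothetical stand-ins that only mirror the behavior described, not the actual `SchedBundle` implementation.

```python
def erase_first(nodes, target):
    """Mirrors the bug: only the first matching node is dropped."""
    result = list(nodes)
    if target in result:
        result.remove(target)  # list.remove stops after the first match
    return result


def erase_all(nodes, target):
    """Mirrors the fix: every matching node is dropped."""
    return [n for n in nodes if n != target]


bundle = ["n1", "n2", "n1", "n3"]
after_first = erase_first(bundle, "n1")
after_all = erase_all(bundle, "n1")
print(after_first, after_all)
```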
When assigning a bundle to a DAG Node that is already assigned to a
SchedBundle we need to remove the node from the old bundle.
This patch adds a lit test for a mul op with scalar input1 and input2 but
a rank-1 shift operand, to make sure the output is still scalar.

Signed-off-by: Tai Ly <[email protected]>
…llvm#128050)

This patch implements the check for not allowing re-scheduling of
instructions that have already been scheduled in a scheduling bundle.
Rescheduling should only happen if the instructions were temporarily
scheduled in singleton bundles during a previous call to
`trySchedule()`.
…m#128075)

This is a follow-up to llvm#127844. That PR got vectors of arbitrary rank
working, but I hadn't thought about the rank-0 case.

Signed-off-by: Benoit Jacob <[email protected]>
…o() (llvm#128070)

Sometimes we need to know the size of a symbol besides its address, so
maybe we can start using the existing `BOLTLinker::lookupSymbolInfo()`
(that returns symbol address and size) and remove
`BOLTLinker::lookupSymbol()` (that only returns symbol address). And for
both we need to check return value as it is wrapped in `std::optional<>`,
which makes the difference even smaller.
All variable declarations in the global scope that are not resources,
static or empty are implicitly added to the implicit constant buffer
`$Globals`. They are created in `hlsl_constant` address space and
collected in an implicit `HLSLBufferDecl` node that is added to the AST
at the end of the translation unit. Codegen is the same as for explicit
constant buffers.

Fixes llvm#123801
…lvm#127464)

Src0 operands wider than 32 bits on cvt_scale opcodes operating on
FP6/BF6/FP4 need to be restricted to take only VGPRs.
…8112)

Reverts llvm#125807

Reverting this change because of failing tests.
Close llvm#127963

The root cause of the problem seems to be that we didn't realize it
simply.
Dummy scoping operations are generated to keep track of scopes for
purpose of Fortran level analyses like Alias Analysis. For codegen, the
scoping info is converted to a fir.undef during pre-codegen rewrite.
Then during declare lowering, this info is no longer used - but it is
still translated to llvm.undef. I cleaned up so it is simply erased. The
generated LLVM should now no longer have a stray undef which looks off
when trying to make sense of the IR.

Co-authored-by: Razvan Lupusoru <[email protected]>
Fixes: llvm#123903

Reviewed By: DavidSpickett, SixWeining

Pull Request: llvm#124059
This adds a tosa-infer-shapes test for scalar mul op

Signed-off-by: Tai Ly <[email protected]>
The program will exit the outer loop directly if inner loop ends, so the
outer do {} while() is redundant.
Unfortunately we only have the vector versions of v2f16 minimum3
and maximum3. Widen to v2f16 so we can lower as minimum3(x, y, y).
vzakhari and others added 8 commits February 20, 2025 21:18
When CHECK-SAME checks are split across multiple lines,
the '.*' regexp for the SSA variable name may cause problems, e.g.:
```
// CHECK-LABEL: func.func @whatever(
// CHECK-SAME: %[[VAL_0:.*]]: i32,
// CHECK-SAME: %[[VAL_1:.*]]: i32,
// CHECK-SAME: %[[VAL_2:.*]]: i64)
```

This will not work for `func.func @whatever(%0: i32, %1: i32, %2: i64)`,
because VAL_0 will match to `0: i32, %1`.
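
The greedy match can be reproduced outside FileCheck. The sketch below uses Python's `re` module as a stand-in for FileCheck's regex matching, with a capture group playing the role of `VAL_0`. The tighter `[^:]*` pattern is a hypothetical workaround shown for illustration, not necessarily the change this commit makes.

```python
import re

line = "func.func @whatever(%0: i32, %1: i32, %2: i64)"

# Rough regex equivalent of `%[[VAL_0:.*]]: i32,`.
greedy = re.search(r"%(.*): i32,", line)

# Hypothetical tighter pattern: the captured name may not contain ':'.
tight = re.search(r"%([^:]*): i32,", line)

greedy_val_0 = greedy.group(1)
tight_val_0 = tight.group(1)
print(repr(greedy_val_0))  # '0: i32, %1'
print(repr(tight_val_0))   # '0'
```

Because `.*` extends as far as it can while still leaving a `: i32,` to match, it swallows the first argument's type and the second argument's name.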
We had been abusing the setOverrideObjectFlagsWithResponsibilityFlags method to
do this. Handling it explicitly ensures that flags are only modified on the
intended files, and not accidentally modified elsewhere.
This testcase depends on stable output, which isn't guaranteed when
concurrent linking is enabled (the default).
…#127984)

Giving every reg alloc class an `::Option` class makes them nicer
to construct.