@ValentijnvdBeek
Contributor

Updates the C header to use the new intrinsics based on Shuffle Vector. Most of these cases are optimized by #41, but a couple of variants still generate a very slow implementation. Specifically, extract and insert are covered for the normative case, but not for all possible variants. In particular, the intrinsics generate cases for 256_1024 and 128_512, which have more cases than the combiner can deal with.

It also adds a test for the C intrinsics, although it does so at the LLVM IR level. The code balks at generating assembly, so maybe there is an intermediate representation that we can target instead.

This implements the simple legalization that lowers G_SHUFFLE_VECTOR into extracts
of the elements based on the mask and then combines them using a G_BUILD_VECTOR.
Our architecture has a VSHUFFLE instruction which could be used to implement some
patterns more efficiently.
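
As a rough illustration of the lowering described above, here is a minimal sketch that models it on plain C++ containers rather than on the real GlobalISel API. The names `Lane` and `lowerShuffle` are hypothetical and not taken from this PR. It assumes the usual shuffle-mask convention: indices address the concatenation of the two sources, and -1 marks an undefined lane.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical model: a lane is either a value or undefined (std::nullopt).
using Lane = std::optional<int>;

// Models the generic lowering: one element extract per mask entry, with the
// extracted lanes collected in order (the role G_BUILD_VECTOR plays).
// Assumption: mask indices address src1 followed by src2; -1 means undefined.
std::vector<Lane> lowerShuffle(const std::vector<int> &src1,
                               const std::vector<int> &src2,
                               const std::vector<int> &mask) {
  assert(src1.size() == src2.size() && "sources must have the same length");
  const int n = static_cast<int>(src1.size());

  std::vector<Lane> result;
  result.reserve(mask.size());
  for (int idx : mask) {
    assert(idx >= -1 && idx < 2 * n && "mask index out of range");
    if (idx < 0)
      result.push_back(std::nullopt);   // undefined lane
    else if (idx < n)
      result.push_back(src1[idx]);      // extract from the first source
    else
      result.push_back(src2[idx - n]);  // extract from the second source
  }
  return result;
}
```

The cost of this approach is one extract per output lane, which is why a dedicated shuffle instruction could be cheaper for some masks.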
… CONCAT_VECTOR

We check for iterative shift masks, which correspond to the
CONCAT_VECTOR instruction.
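
The commit message does not spell out the exact mask shape, so the following is only one plausible reading: a mask that counts 0, 1, 2, ... across both sources simply places the first source before the second, which is a concatenation. The helper `isConcatMask` is hypothetical and operates on a plain mask vector, not on the real LLVM data structures. For simplicity it rejects masks with undefined (-1) lanes.

```cpp
#include <cstddef>
#include <vector>

// Returns true if `mask` selects every lane of the first source followed by
// every lane of the second source, in order. `srcLanes` is the lane count of
// each source. Assumption: this sequential pattern is what the combine matches.
bool isConcatMask(const std::vector<int> &mask, std::size_t srcLanes) {
  if (mask.size() != 2 * srcLanes)
    return false;
  for (std::size_t i = 0; i < mask.size(); ++i) {
    if (mask[i] != static_cast<int>(i))
      return false;  // out-of-order or undefined lane
  }
  return true;
}
```

With 4-lane sources, the mask 0..7 would match under this reading, while a mask that swaps any two lanes would fall back to the generic extract-and-build lowering.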
@ValentijnvdBeek ValentijnvdBeek force-pushed the vvandebe.shufflevector.pattern.optimization branch from 07244e7 to 4836f6a Compare August 1, 2024 16:45
@ValentijnvdBeek ValentijnvdBeek force-pushed the vvandebe.shufflevector.add.intrinsics branch from 3a34679 to d1411d4 Compare August 2, 2024 11:46
@ValentijnvdBeek ValentijnvdBeek force-pushed the vvandebe.shufflevector.pattern.optimization branch 5 times, most recently from b49d34c to f855c29 Compare August 12, 2024 12:27
@ValentijnvdBeek ValentijnvdBeek force-pushed the vvandebe.shufflevector.pattern.optimization branch 2 times, most recently from 4d6af83 to 13cc82a Compare August 15, 2024 08:52
@ValentijnvdBeek ValentijnvdBeek force-pushed the vvandebe.shufflevector.pattern.optimization branch 2 times, most recently from c38562a to ebe6489 Compare September 23, 2024 22:36

Labels

backend:aie2 vectorization Support for vector instructions
