[AIE2] Add AIE2 vector intrinsics #99
Open: ValentijnvdBeek wants to merge 10 commits into vvandebe.shufflevector.pattern.optimization from vvandebe.shufflevector.add.intrinsics
This implements the simple legalization that lowers G_SHUFFLE_VECTOR by extracting the elements selected by the mask and then combining them with a G_BUILD_VECTOR. Our architecture has a VSHUFFLE instruction, which could be used to implement some patterns more efficiently.
… CONCAT_VECTOR We check for iterative shift masks, which correspond to the CONCAT_VECTOR instruction.
…hunks of a vector
… of two vectors together
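For readers unfamiliar with the lowering described in the first commit message, the following is a minimal reference model in plain C++, not the GlobalISel code from this PR. The function names lowerShuffle and isConcatMask are hypothetical, and reading "iterative shift mask" as a mask of strictly sequential indices is an assumption. Undefined lanes (negative mask entries) are filled with 0 here purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference model of the generic lowering: every mask entry picks one element
// out of the concatenation (Src1, Src2), which mirrors the per-element extract,
// and the picked elements are gathered in order, which mirrors the build-vector.
// A negative mask entry stands for an undefined lane; this sketch writes 0 there.
std::vector<int> lowerShuffle(const std::vector<int> &Src1,
                              const std::vector<int> &Src2,
                              const std::vector<int> &Mask) {
  assert(Src1.size() == Src2.size() && "shuffle sources must have equal length");
  const int NumSrcElts = static_cast<int>(Src1.size());
  std::vector<int> Result;
  Result.reserve(Mask.size());
  for (int Idx : Mask) {
    assert(Idx < 2 * NumSrcElts && "mask index out of range");
    if (Idx < 0)
      Result.push_back(0);                       // undefined lane
    else if (Idx < NumSrcElts)
      Result.push_back(Src1[Idx]);               // element comes from Src1
    else
      Result.push_back(Src2[Idx - NumSrcElts]);  // element comes from Src2
  }
  return Result;
}

// Hypothetical check for the concatenation pattern: the mask is exactly
// 0, 1, ..., 2*NumSrcElts-1, so the shuffle only glues Src1 and Src2 together.
// Undefined lanes are conservatively treated as a mismatch in this sketch.
bool isConcatMask(const std::vector<int> &Mask, std::size_t NumSrcElts) {
  if (Mask.size() != 2 * NumSrcElts)
    return false;
  for (std::size_t I = 0; I < Mask.size(); ++I)
    if (Mask[I] != static_cast<int>(I))
      return false;
  return true;
}
```

With two-element sources {1, 2} and {3, 4}, the mask {0, 1, 2, 3} passes isConcatMask, whereas {3, 0, -1, 2} does not and would fall back to the element-by-element path modelled by lowerShuffle.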
Updates the C header to use the new intrinsics based on Shuffle Vector. Many of these cases are optimized by #41; however, a couple of variants still generate a very slow implementation. Specifically, extract and insert are covered for the standard case, but not for all possible variants. In particular, the intrinsics generate cases for 256_1024 and 128_512, which have more cases than the combiner can deal with.

It also adds a test for the C intrinsics, although it does this at the LLVM IR level. The code balks at the idea of generating assembly, so maybe there is an intermediate representation that we can target instead.
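As a rough illustration of why extract and insert map onto shuffle masks, here is a plain C++ sketch that computes the masks. It is not taken from the header in this PR: the names extractMask and insertMask are hypothetical, and the sketch assumes that the sub-vector is first widened to the destination length so that both shuffle operands have the same type. For a 256_1024 variant with 32-bit elements, that would mean 8-element chunks of a 32-element vector.

```cpp
#include <cassert>
#include <vector>

// Mask that selects chunk ChunkIdx (ChunkElts elements wide) out of a source
// vector with SrcElts elements: a run of consecutive indices.
std::vector<int> extractMask(int SrcElts, int ChunkElts, int ChunkIdx) {
  assert(ChunkElts > 0 && SrcElts % ChunkElts == 0);
  assert(ChunkIdx >= 0 && ChunkIdx < SrcElts / ChunkElts);
  std::vector<int> Mask(ChunkElts);
  for (int I = 0; I < ChunkElts; ++I)
    Mask[I] = ChunkIdx * ChunkElts + I;
  return Mask;
}

// Mask over the concatenation (Dst, WidenedSub), where WidenedSub is the
// sub-vector padded to DstElts elements (an assumption of this sketch).
// Lanes outside the chunk keep the Dst element; lanes inside the chunk take
// the matching element of the sub-vector, which sits at offset DstElts.
std::vector<int> insertMask(int DstElts, int ChunkElts, int ChunkIdx) {
  assert(ChunkElts > 0 && DstElts % ChunkElts == 0);
  assert(ChunkIdx >= 0 && ChunkIdx < DstElts / ChunkElts);
  std::vector<int> Mask(DstElts);
  for (int I = 0; I < DstElts; ++I) {
    const int Offset = I - ChunkIdx * ChunkElts;
    const bool InChunk = Offset >= 0 && Offset < ChunkElts;
    Mask[I] = InChunk ? DstElts + Offset : I;
  }
  return Mask;
}
```

Each chunk index yields a different mask, so the number of distinct patterns grows with the ratio between the vector and chunk sizes. This is one plausible reading of why these variants produce more cases than the combiner handles.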