[Bug] TVM/LLVM sets RISC-V VLEN to 128 bits instead of 256 on Banana Pi K1 #17625
Comments
Hi @JieGH, for arch-specific technical details {CPU: type, variations, features} TVM relies on LLVM itself (it queries the library directly). AFAIK, in RISC-V land the information available within LLVM is limited by interest and maintenance, with most of it coming from the SiFive folks. Let me look into this problem and come up with the best approach for TVM; here are some possibilities I have in mind:
If the LLVM query doesn't work (the CPU variation is not listed in LLVM), let's further override within TVM:
I believe {0 + 1, 2} can all be implemented together to offer users multiple ways of control.
Hi @cbalint13, thanks for the advice. I now have a method for choosing VLEN, which is to use an additional flag: specifying the zvl256b flag enables 'Zvl' (Minimum Vector Length) 256. This does have an impact on execution performance.
Any comments on this? Thanks.
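To make the effect of the zvl256b flag concrete, here is a minimal sketch of how a minimum VLEN could be read back out of an `-mattr` feature string. The helper name `min_vlen_from_mattr` is hypothetical and is not part of TVM or LLVM; the fallback of 128 bits is an assumption that mirrors the warning quoted in this issue.

```python
import re


def min_vlen_from_mattr(mattr: str, default: int = 128) -> int:
    """Return the largest zvl<N>b value enabled in an -mattr string.

    Hypothetical helper for illustration only. Falls back to `default`
    (assumed 128 bits here) when no zvl<N>b feature is enabled.
    """
    sizes = [
        int(m.group(1))
        for feat in mattr.split(",")
        # Only features enabled with a leading "+" are counted.
        if (m := re.fullmatch(r"\+zvl(\d+)b", feat.strip()))
    ]
    return max(sizes, default=default)


# Feature list from this issue, with and without the extra flag.
with_flag = min_vlen_from_mattr("+m,+a,+f,+d,+zfh,+v,+c,+zvl256b")
without_flag = min_vlen_from_mattr("+m,+a,+f,+d,+zfh,+v,+c")
```

With the extra flag the sketch reports 256 bits; without it, it falls back to the assumed 128-bit default.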
Hi @JieGH,
Yes, another way to tell LLVM the VLEN is via the canonical flags, but we also need TVM itself to be aware of this.
You have an older LLVM, and it does not know about
The flags (older LLVM) would be: llvm -device=riscv_cpu -vector-width=256 -mtriple=riscv64-linux-gnu -mcpu=generic-rv64 -mattr=+64bit,+a,+c,+d,+f,+m,+v (orcjit is already the default; vector-width informs both TVM and LLVM).
Performance also depends on how LLVM optimizes things; TVM has no highly specialized optimizations for RISC-V. TVM emits candidates/iterations as intermediate proposals (in the auto-tuning flow) and forwards them to LLVM, electing only the best-performing ones. I am not sure whether you are also trying to tune your model/function, but without a tuning process TVM likely emits an underperforming variant, even for a simple matmul operation. There should be a warning about this:
The work done in #17631 only informs TVM about VLEN intentions from the LLVM side.
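As a rough sketch, the target string above can be assembled from its parts so that the vector width and feature list are easy to vary. The function name `build_llvm_target` is hypothetical; whether a given TVM build accepts `-vector-width` depends on its version, so treat the option as taken from this comment rather than verified here.

```python
def build_llvm_target(vector_width, mattr,
                      mtriple="riscv64-linux-gnu",
                      mcpu="generic-rv64",
                      device="riscv_cpu"):
    """Assemble a TVM-style LLVM target string (hypothetical helper)."""
    parts = [
        "llvm",
        f"-device={device}",
        f"-vector-width={vector_width}",  # option as suggested in this thread
        f"-mtriple={mtriple}",
        f"-mcpu={mcpu}",
        "-mattr=" + ",".join(mattr),
    ]
    return " ".join(parts)


target_str = build_llvm_target(
    256, ["+64bit", "+a", "+c", "+d", "+f", "+m", "+v"]
)
# The resulting string would then be handed to tvm.target.Target(target_str)
# on a machine where TVM is installed.
```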
TVM/LLVM sets RISC-V VLEN to 128 bits instead of 256 on Banana Pi K1
Expected Behavior
I have a vector-based TVM design that shows poor performance against a C implementation, approximately 1/4 of its performance.
LLVM warns that VLEN is set to the default 128 bits, but the Banana Pi should use 256 bits instead.
TVM and LLVM should correctly detect and utilize a vector length (VLEN) of 256 bits on the Banana Pi K1 board.
Actual Behavior
VLEN=256
when execute
-- When checking the asm code generated from TVM flow, when querying via
csrr a4, vlenb
in assembly. Does the ASM code show e32, m2, indicating that the vector length is 128 bits?
src/target/llvm/codegen_llvm.cc:185: Warning: Set native vector bits to be 128 for riscv64
-mllvm -riscv-v-vector-bits-min=256 -mllvm -riscv-v-vector-bits-max=256
do not take effect; they are reported as illegal flags, and the error then recommends candidates:
ValueError: Error when parsing target["mllvm"]: Cannot recognize 'mllvm'. Candidates are: cl-opt, opt-level, fast-math-ninf, fast-math-arcp, fast-math-nnan, fast-math, fast-math-contract, num-cores, device, libs, tag, mtriple, host, from_device, target_device_type, fast-math-reassoc, keys, mattr, fast-math-nsz, model, mabi, mcpu, jit, mfloat-abi.
Target creation from string failed: llvm -jit=orcjit -mtriple=riscv64-linux-gnu -mcpu=generic-rv64 -mattr=+m,+a,+f,+d,+zfh,+v,+c -mllvm -riscv-v-vector-bits-min=256 -mllvm -riscv-v-vector-bits-max=256
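For reference, the relation between the `vlenb` reading and the `e32, m2` setting mentioned above can be sketched as plain arithmetic. This assumes the standard RVV definitions (`vlenb` is VLEN in bytes, and VLMAX = VLEN / SEW * LMUL); the function names are illustrative and do not come from TVM or LLVM.

```python
from fractions import Fraction


def vlen_bits_from_vlenb(vlenb: int) -> int:
    """Convert the vlenb CSR value (VLEN in bytes) to VLEN in bits."""
    return vlenb * 8


def vlmax(vlen_bits: int, sew_bits: int, lmul) -> int:
    """Elements per vector register group: VLEN / SEW * LMUL.

    `lmul` may be an int (1, 2, 4, 8) or a Fraction for fractional LMUL.
    """
    return int(Fraction(vlen_bits, sew_bits) * Fraction(lmul))


# A vlenb reading of 32 bytes corresponds to VLEN = 256 bits.
board_vlen = vlen_bits_from_vlenb(32)

# With e32, m2: elements per group at VLEN = 128 versus VLEN = 256.
elems_128 = vlmax(128, 32, 2)
elems_256 = vlmax(board_vlen, 32, 2)
```

Under these assumptions, the same `e32, m2` configuration covers twice as many elements per instruction when VLEN is 256 instead of 128.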
Could anyone advise me on how to set the correct vector length of 256 bits? @cbalint13, from your knowledge of RISC-V, could you provide some advice? Thanks all.
Environment
llvm -jit=orcjit -mtriple=riscv64-linux-gnu -mcpu=generic-rv64 -mattr=+m,+a,+f,+d,+zfh,+v,+c