
[Bug] gemma-2b for Android. OpenCL Error Code=-54: CL_INVALID_WORK_GROUP_SIZE #1844

Closed
qc903113684 opened this issue Feb 27, 2024 · 7 comments
Labels
bug Confirmed bugs

Comments

@qc903113684
Contributor

🐛 Bug

Compiled Gemma-2b for Android in q4f16_0. The model loads successfully, but chat fails with: OpenCL Error Code=-54: CL_INVALID_WORK_GROUP_SIZE. Stack trace: File "/home/chaoqin/mlcllm/3rdparty/tvm/src/runtime/opencl/opencl_module.cc", line 90
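For context: OpenCL returns -54 (CL_INVALID_WORK_GROUP_SIZE) from clEnqueueNDRangeKernel when the requested local work size exceeds what the device or kernel supports, and TVM's (e == CL_SUCCESS) check at opencl_module.cc line 90 surfaces that as the error above. A minimal standalone C sketch, using a hypothetical kernel and a deliberately oversized launch (nothing to do with MLC's actual generated kernels), that reproduces the code:

```c
// Sketch: how an OpenCL runtime produces error -54. Enqueueing with a local
// work size above the device/kernel limit makes clEnqueueNDRangeKernel fail
// with CL_INVALID_WORK_GROUP_SIZE, which TVM's CL_SUCCESS check then raises.
#define CL_TARGET_OPENCL_VERSION 120
#include <stdio.h>
#include <CL/cl.h>

int main(void) {
    cl_platform_id plat;
    cl_device_id dev;
    clGetPlatformIDs(1, &plat, NULL);
    clGetDeviceIDs(plat, CL_DEVICE_TYPE_GPU, 1, &dev, NULL);
    cl_context ctx = clCreateContext(NULL, 1, &dev, NULL, NULL, NULL);
    cl_command_queue q = clCreateCommandQueue(ctx, dev, 0, NULL);

    // Hypothetical no-op kernel, stands in for a generated model kernel.
    const char *src = "__kernel void noop(__global float *x) { x[0] = 0.0f; }";
    cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, NULL);
    clBuildProgram(prog, 1, &dev, NULL, NULL, NULL);
    cl_kernel k = clCreateKernel(prog, "noop", NULL);
    cl_mem buf = clCreateBuffer(ctx, CL_MEM_READ_WRITE, sizeof(float), NULL, NULL);
    clSetKernelArg(k, 0, sizeof(cl_mem), &buf);

    // 4096 threads per work group exceeds the limit on virtually every GPU
    // (Adreno parts typically report 1024 or less), so this returns -54.
    size_t global = 4096, local = 4096;
    cl_int e = clEnqueueNDRangeKernel(q, k, 1, NULL, &global, &local,
                                      0, NULL, NULL);
    printf("clEnqueueNDRangeKernel returned %d\n", e);  // -54 = CL_INVALID_WORK_GROUP_SIZE
    return 0;
}
```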

To Reproduce

Steps to reproduce the behavior:

  1. Compile gemma-2b with q4f16_0, targeting Android
  2. Compile the Android JAR
  3. Build the app in Android Studio

Expected behavior

Environment

  • Platform (e.g. WebGPU/Vulkan/iOS/Android/CUDA): Android
  • Operating system (e.g. Ubuntu/Windows/MacOS/...): Ubuntu
  • Device (e.g. iPhone 12 Pro, PC+RTX 3090, ...): Android Qualcomm Snapdragon 865
  • How you installed MLC-LLM (conda, source): conda
  • How you installed TVM-Unity (pip, source): pip
  • Python version (e.g. 3.10): 3.10
  • GPU driver version (if applicable): 535.86.05
  • CUDA/cuDNN version (if applicable): CUDA 11.8
  • TVM Unity Hash Tag (python -c "import tvm; print('\n'.join(f'{k}: {v}' for k, v in tvm.support.libinfo().items()))", applicable if you compile models):
    USE_NVTX: OFF
    USE_GTEST: AUTO
    SUMMARIZE: OFF
    USE_IOS_RPC: OFF
    USE_MSC: OFF
    USE_ETHOSU:
    CUDA_VERSION: NOT-FOUND
    USE_LIBBACKTRACE: AUTO
    DLPACK_PATH: 3rdparty/dlpack/include
    USE_TENSORRT_CODEGEN: OFF
    USE_THRUST: OFF
    USE_TARGET_ONNX: OFF
    USE_AOT_EXECUTOR: ON
    BUILD_DUMMY_LIBTVM: OFF
    USE_CUDNN: OFF
    USE_TENSORRT_RUNTIME: OFF
    USE_ARM_COMPUTE_LIB_GRAPH_EXECUTOR: OFF
    USE_CCACHE: AUTO
    USE_ARM_COMPUTE_LIB: OFF
    USE_CPP_RTVM:
    USE_OPENCL_GTEST: /path/to/opencl/gtest
    USE_MKL: OFF
    USE_PT_TVMDSOOP: OFF
    MLIR_VERSION: NOT-FOUND
    USE_CLML: OFF
    USE_STACKVM_RUNTIME: OFF
    USE_GRAPH_EXECUTOR_CUDA_GRAPH: OFF
    ROCM_PATH: /opt/rocm
    USE_DNNL: OFF
    USE_VITIS_AI: OFF
    USE_MLIR: OFF
    USE_RCCL: OFF
    USE_LLVM: llvm-config --ignore-libllvm --link-static
    USE_VERILATOR: OFF
    USE_TF_TVMDSOOP: OFF
    USE_THREADS: ON
    USE_MSVC_MT: OFF
    BACKTRACE_ON_SEGFAULT: OFF
    USE_GRAPH_EXECUTOR: ON
    USE_NCCL: OFF
    USE_ROCBLAS: OFF
    GIT_COMMIT_HASH: 79991133c17bb8685185e1f03cc2f688ea37c974
    USE_VULKAN: ON
    USE_RUST_EXT: OFF
    USE_CUTLASS: OFF
    USE_CPP_RPC: OFF
    USE_HEXAGON: OFF
    USE_CUSTOM_LOGGING: OFF
    USE_UMA: OFF
    USE_FALLBACK_STL_MAP: OFF
    USE_SORT: ON
    USE_RTTI: ON
    GIT_COMMIT_TIME: 2024-02-21 22:31:30 -0500
    USE_HEXAGON_SDK: /path/to/sdk
    USE_BLAS: none
    USE_ETHOSN: OFF
    USE_LIBTORCH: OFF
    USE_RANDOM: ON
    USE_CUDA: OFF
    USE_COREML: OFF
    USE_AMX: OFF
    BUILD_STATIC_RUNTIME: OFF
    USE_CMSISNN: OFF
    USE_KHRONOS_SPIRV: OFF
    USE_CLML_GRAPH_EXECUTOR: OFF
    USE_TFLITE: OFF
    USE_HEXAGON_GTEST: /path/to/hexagon/gtest
    PICOJSON_PATH: 3rdparty/picojson
    USE_OPENCL_ENABLE_HOST_PTR: OFF
    INSTALL_DEV: OFF
    USE_PROFILER: ON
    USE_NNPACK: OFF
    LLVM_VERSION: 15.0.7
    USE_MRVL: OFF
    USE_OPENCL: OFF
    COMPILER_RT_PATH: 3rdparty/compiler-rt
    RANG_PATH: 3rdparty/rang/include
    USE_SPIRV_KHR_INTEGER_DOT_PRODUCT: OFF
    USE_OPENMP: OFF
    USE_BNNS: OFF
    USE_CUBLAS: OFF
    USE_METAL: OFF
    USE_MICRO_STANDALONE_RUNTIME: OFF
    USE_HEXAGON_EXTERNAL_LIBS: OFF
    USE_ALTERNATIVE_LINKER: AUTO
    USE_BYODT_POSIT: OFF
    USE_HEXAGON_RPC: OFF
    USE_MICRO: OFF
    DMLC_PATH: 3rdparty/dmlc-core/include
    INDEX_DEFAULT_I64: ON
    USE_RELAY_DEBUG: OFF
    USE_RPC: ON
    USE_TENSORFLOW_PATH: none
    TVM_CLML_VERSION:
    USE_MIOPEN: OFF
    USE_ROCM: OFF
    USE_PAPI: OFF
    USE_CURAND: OFF
    TVM_CXX_COMPILER_PATH: /opt/rh/gcc-toolset-11/root/usr/bin/c++
    HIDE_PRIVATE_SYMBOLS: ON
  • Any other relevant information:

Additional context

  1. Gemma-2b with the same code and environment works successfully on a Qualcomm 8 Gen 2, but chat fails on the Snapdragon 865.
  2. Qwen-1.8b compiled for the Snapdragon 865 works successfully.
    I think this error is related to Gemma's implementation.
qc903113684 added the bug (Confirmed bugs) label on Feb 27, 2024
@bulutthecat

[Quotes the original issue report above in full.]

I am having this issue as well, but with all of the 7B models.
It cannot possibly be a memory issue, as 12 GB should be more than enough RAM for any of these models (and it is not an allocation or out-of-range error), so I suspect it is some form of matrix-multiplication issue in the library being used (OpenCL), where a kernel is launched with more work items than its maximum allowed work-group size.
I haven't looked at opencl_module.cc yet, but my suspicion is that some dynamic allocation is happening that is messing with a function call.

I cannot think of why this would be happening, but I might pull the source and see what I can do about it. For now my recommendation would be to try different models and see if any of them work for you; I have found that all models other than the 7B ones work for me.
It might be different on your end.
Hopefully this gets patched.
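One way to check this theory on-device is to compare the limits the driver actually reports against the launch configuration; Adreno GPUs report different work-group limits across generations, which would be consistent with the same library working on an 8 Gen 2 but failing on a Snapdragon 865. A hedged C sketch using standard OpenCL device queries (not part of MLC's code):

```c
// Sketch: query the work-group limits the OpenCL driver reports. If a
// compiled model library enqueues a kernel with a local work size above
// these limits, the enqueue fails with -54 exactly as in this issue.
#define CL_TARGET_OPENCL_VERSION 120
#include <stdio.h>
#include <CL/cl.h>

int main(void) {
    cl_platform_id plat;
    cl_device_id dev;
    clGetPlatformIDs(1, &plat, NULL);
    clGetDeviceIDs(plat, CL_DEVICE_TYPE_GPU, 1, &dev, NULL);

    size_t max_wg = 0, item_sizes[3] = {0};
    clGetDeviceInfo(dev, CL_DEVICE_MAX_WORK_GROUP_SIZE,
                    sizeof(max_wg), &max_wg, NULL);
    clGetDeviceInfo(dev, CL_DEVICE_MAX_WORK_ITEM_SIZES,
                    sizeof(item_sizes), item_sizes, NULL);

    printf("CL_DEVICE_MAX_WORK_GROUP_SIZE: %zu\n", max_wg);
    printf("CL_DEVICE_MAX_WORK_ITEM_SIZES: %zu x %zu x %zu\n",
           item_sizes[0], item_sizes[1], item_sizes[2]);
    // For a specific kernel, CL_KERNEL_WORK_GROUP_SIZE (queried via
    // clGetKernelWorkGroupInfo) can be lower still, so a launch that
    // respects the device-wide limit can still fail with -54.
    return 0;
}
```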

@CharlieFRuan
Contributor

Hi @bulutthecat @qc903113684 apologies for the inconvenience. Could you check whether #1955 was included when you ran into this issue? Or perhaps try again with the latest package? I suspect that this is fixed via #1955. Thank you!

@Kartik14
Contributor

@qc903113684 Unfortunately, I am unable to reproduce it on my end. Can you please build tvm and mlc again after fetching the latest changes and then recompile the model library?

@bulutthecat

Hi @bulutthecat @qc903113684 apologies for the inconvenience. Could you check whether #1955 was included when you ran into this issue? Or perhaps try again with the latest package? I suspect that this is fixed via #1955. Thank you!

Thanks for letting me know, I will get back to you if it works.

@qc903113684
Contributor Author

This PR may fix the problem; I haven't had time to test it yet: #1850

@CharlieFRuan
Contributor

Hi @qc903113684, #1850 is superseded by #1822, which was merged 3 weeks ago.

i.e., #1822 and #1955 could both be potential fixes for the problem described in this issue.

@sinaSPOGames

Got the same error:

MLCChat failed

Stack trace:
org.apache.tvm.Base$TVMError: InternalError: Check failed: (e == CL_SUCCESS) is false: OpenCL Error, code=-54: CL_INVALID_WORK_GROUP_SIZE
Stack trace:
File "/Users/kartik/mlc/tvm/src/runtime/opencl/opencl_module.cc", line 90

at org.apache.tvm.Base.checkCall(Base.java:173)
at org.apache.tvm.Function.invoke(Function.java:130)
at ai.mlc.mlcllm.ChatModule.prefill(ChatModule.java:54)
at ai.mlc.mlcchat.AppViewModel$ChatState$requestGenerate$1$1.invoke(AppViewModel.kt:666)
at ai.mlc.mlcchat.AppViewModel$ChatState$requestGenerate$1$1.invoke(AppViewModel.kt:666)
at ai.mlc.mlcchat.AppViewModel$ChatState.callBackend(AppViewModel.kt:548)
at ai.mlc.mlcchat.AppViewModel$ChatState.requestGenerate$lambda$4(AppViewModel.kt:666)
at ai.mlc.mlcchat.AppViewModel$ChatState.$r8$lambda$lluIrcsPALEW5nCb2tohZYadhTY(Unknown Source:0)
at ai.mlc.mlcchat.AppViewModel$ChatState$$ExternalSyntheticLambda3.run(Unknown Source:6)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:462)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
at java.lang.Thread.run(Thread.java:919)

Error message:
InternalError: Check failed: (e == CL_SUCCESS) is false: OpenCL Error, code=-54: CL_INVALID_WORK_GROUP_SIZE
Stack trace:
File "/Users/kartik/mlc/tvm/src/runtime/opencl/opencl_module.cc", line 90

tqchen closed this as completed on Aug 3, 2024