
support ascend w8a8 graph_mode #3267

Merged: 5 commits merged into InternLM:dev on Apr 1, 2025
Conversation

yao-fengchen (Collaborator) commented Mar 17, 2025

  1. Support dlinfer smooth_quant.
  2. Support Ascend w8a8 graph mode.
  3. Move the per_channel_quant kernel from the cuda folder to the default folder so it can be shared with CUDA.
  4. Modify FunctionDispatcher so that load_func can find the dlinfer folder.
  5. Fix a dynamo error in QTensor.
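For context on item 3, a per_channel_quant kernel computes one int8 scale per output channel of a weight matrix. The following is a minimal, hypothetical PyTorch sketch of symmetric per-channel w8a8-style weight quantization; the function name and details are illustrative, not LMDeploy's actual kernel:

```python
import torch

def per_channel_quant(weight: torch.Tensor):
    """Quantize a 2-D weight to int8 with one scale per output channel."""
    # Symmetric quantization: scale each row so its abs-max maps to 127.
    absmax = weight.abs().amax(dim=1, keepdim=True)
    scale = absmax.clamp_min(1e-8) / 127.0
    q = torch.round(weight / scale).clamp(-128, 127).to(torch.int8)
    return q, scale

w = torch.randn(4, 8)
q, s = per_channel_quant(w)
dequant = q.float() * s  # reconstruction error is at most scale / 2 per element
```

Because each output channel gets its own scale, channels with small magnitudes are not crushed by a single outlier row, which is why per-channel schemes are the common choice for weight quantization.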

@jinminxi104 jinminxi104 marked this pull request as ready for review March 25, 2025 12:27
@jinminxi104 jinminxi104 marked this pull request as draft March 25, 2025 12:33
@yao-fengchen yao-fengchen changed the base branch from main to dev March 26, 2025 02:22
@yao-fengchen yao-fengchen force-pushed the ascend_w8a8 branch 2 times, most recently from 2137f28 to dcb3a1e on March 27, 2025 10:40
@jinminxi104 jinminxi104 marked this pull request as ready for review March 27, 2025 13:33
@@ -7,9 +7,12 @@
 import torch
 from torch import nn

+import lmdeploy.pytorch.devices.device_manager as device_manager
Collaborator
It is not a good idea to import pytorch module in lite.

Collaborator Author
I modified this in the subsequent commit.
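The reviewer's concern above (pulling a torch-dependent module into lite at import time) is commonly addressed by deferring the heavy import into the function body that needs it. A stdlib-only sketch of that lazy-import pattern, with colorsys standing in for the heavy dependency; the names are illustrative, not LMDeploy's actual fix:

```python
import sys

def describe_red():
    # Heavy dependency imported lazily: merely importing this module does
    # not pay the cost (colorsys stands in for a torch-dependent module).
    import colorsys
    return colorsys.rgb_to_hsv(1.0, 0.0, 0.0)

hue, sat, val = describe_red()
# After the first call, the dependency is loaded and cached in sys.modules.
loaded = "colorsys" in sys.modules
```

The interpreter caches the module in sys.modules after the first call, so repeated calls pay no extra import cost.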

@@ -6,7 +6,7 @@
 import PIL

 from lmdeploy.messages import PytorchEngineConfig, TurbomindEngineConfig, VisionConfig
-from lmdeploy.pytorch.check_env import try_import_deeplink
+from lmdeploy.pytorch.check_env import check_env_deeplink
Collaborator
do not import pytorch in serve

Collaborator Author

removed

@lvhan028 lvhan028 added the enhancement New feature or request label Apr 1, 2025
@lvhan028 lvhan028 merged commit c02dd78 into InternLM:dev Apr 1, 2025
3 of 5 checks passed
4 participants