
support ascend w8a8 graph_mode #3267

Merged: 5 commits merged into InternLM:dev on Apr 1, 2025
Conversation

yao-fengchen (Collaborator) commented Mar 17, 2025

  1. Support dlinfer smooth_quant.
  2. Support Ascend w8a8 graph mode.
  3. Move the per_channel_quant kernel from the cuda folder to the default folder so it can be shared with CUDA.
  4. Modify FunctionDispatcher so that load_func can find the dlinfer folder.
  5. Fix a dynamo error in QTensor.
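For context on item 3, a per_channel_quant kernel computes one int8 scale per output channel of a weight matrix. The following is a minimal, hypothetical PyTorch sketch of symmetric per-channel w8a8-style weight quantization; the function name and details are illustrative, not LMDeploy's actual kernel:

```python
import torch

def per_channel_quant(weight: torch.Tensor):
    """Quantize a 2-D weight to int8 with one scale per output channel."""
    # Symmetric quantization: scale each row so its abs-max maps to 127.
    absmax = weight.abs().amax(dim=1, keepdim=True)
    scale = absmax.clamp_min(1e-8) / 127.0
    q = torch.round(weight / scale).clamp(-128, 127).to(torch.int8)
    return q, scale

w = torch.randn(4, 8)
q, s = per_channel_quant(w)
dequant = q.float() * s  # reconstruction error is at most scale / 2 per element
```

Because each output channel gets its own scale, channels with small magnitudes are not crushed by a single outlier row, which is why per-channel schemes are the common choice for weight quantization.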

@jinminxi104 jinminxi104 marked this pull request as ready for review March 25, 2025 12:27
@jinminxi104 jinminxi104 marked this pull request as draft March 25, 2025 12:33
@yao-fengchen yao-fengchen changed the base branch from main to dev March 26, 2025 02:22
@yao-fengchen yao-fengchen force-pushed the ascend_w8a8 branch 2 times, most recently from 2137f28 to dcb3a1e on March 27, 2025 10:40
@jinminxi104 jinminxi104 marked this pull request as ready for review March 27, 2025 13:33
@@ -7,9 +7,12 @@
 import torch
 from torch import nn

+import lmdeploy.pytorch.devices.device_manager as device_manager
Collaborator
It is not a good idea to import pytorch module in lite.

Collaborator Author
I modified this in the subsequent commit.
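The reviewer's concern above (pulling a torch-dependent module into lite at import time) is commonly addressed by deferring the heavy import into the function body that needs it. A stdlib-only sketch of that lazy-import pattern, with colorsys standing in for the heavy dependency; the names are illustrative, not LMDeploy's actual fix:

```python
import sys

def describe_red():
    # Heavy dependency imported lazily: merely importing this module does
    # not pay the cost (colorsys stands in for a torch-dependent module).
    import colorsys
    return colorsys.rgb_to_hsv(1.0, 0.0, 0.0)

hue, sat, val = describe_red()
# After the first call, the dependency is loaded and cached in sys.modules.
loaded = "colorsys" in sys.modules
```

The interpreter caches the module in sys.modules after the first call, so repeated calls pay no extra import cost.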

@@ -6,7 +6,7 @@
 import PIL

 from lmdeploy.messages import PytorchEngineConfig, TurbomindEngineConfig, VisionConfig
-from lmdeploy.pytorch.check_env import try_import_deeplink
+from lmdeploy.pytorch.check_env import check_env_deeplink
Collaborator
do not import pytorch in serve

Collaborator Author

removed

@lvhan028 lvhan028 added the enhancement New feature or request label Apr 1, 2025
@lvhan028 lvhan028 merged commit c02dd78 into InternLM:dev Apr 1, 2025
3 of 5 checks passed
4 participants