fix qwen3vl ci error #4476
Conversation
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.
Code Review
The pull request addresses a CI error related to Qwen3VL by modifying the AscendQwen3_VisionTransformer class to handle grid_thw input more flexibly. While the changes introduce support for both torch.Tensor and list[list[int]] input types, there is an opportunity to improve efficiency and maintainability by streamlining the data type conversions and removing the unnecessary numpy dependency.
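For context, here is a minimal standalone sketch of the type handling the review refers to: accept grid_thw as either a torch.Tensor or a list[list[int]] and derive both a plain Python list and a device tensor without a numpy round trip. This is illustrative only; normalize_grid_thw and the example grid values are invented and are not code from the PR.

from typing import Union

import torch


def normalize_grid_thw(
        grid_thw: Union[torch.Tensor, list],
        device: torch.device) -> tuple[list, torch.Tensor]:
    # Keep the plain Python list for callers that need it, and build the
    # device tensor directly instead of going through numpy.
    if isinstance(grid_thw, list):
        grid_thw_list = grid_thw
        grid_thw_tensor = torch.tensor(grid_thw,
                                       device=device,
                                       dtype=torch.int32)
    else:
        grid_thw_list = grid_thw.tolist()
        grid_thw_tensor = grid_thw.to(device=device, dtype=torch.int32)
    return grid_thw_list, grid_thw_tensor


# Example: a single image whose patch grid is t=1, h=32, w=32 (made-up values).
thw_list, thw_tensor = normalize_grid_thw([[1, 32, 32]],
                                          device=torch.device("cpu"))

The relevant hunks from the PR diff are quoted below.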
from functools import partial
from typing import Callable, Optional

import numpy as np

if isinstance(grid_thw, list):
    grid_thw_list = grid_thw
    grid_thw = np.array(grid_thw, dtype=np.int32)
else:
    grid_thw_list = grid_thw.tolist()
    grid_thw = grid_thw.numpy()

pos_embeds = self.fast_pos_embed_interpolate(grid_thw_list)
hidden_states = hidden_states + pos_embeds
rotary_pos_emb = self.rot_pos_emb(grid_thw)
rotary_pos_emb = self.rot_pos_emb(grid_thw_list)
grid_thw_tensor = torch.tensor(grid_thw,
                               device=self.device,
                               dtype=torch.int32)
The current implementation of the forward method involves redundant data type conversions for grid_thw. Specifically, if grid_thw is a torch.Tensor, it's converted to a list, then to a numpy array, and finally back to a torch.Tensor. If grid_thw is a list, it's converted to a numpy array and then to a torch.Tensor. These intermediate conversions to numpy are inefficient and unnecessary. The grid_thw_list (a Python list) and grid_thw_tensor (a PyTorch tensor) can be prepared directly from the input grid_thw without involving numpy, improving performance and code clarity.
if isinstance(grid_thw, list):
    grid_thw_list = grid_thw
    grid_thw_tensor = torch.tensor(grid_thw,
                                   device=self.device,
                                   dtype=torch.int32)
else:
    grid_thw_list = grid_thw.tolist()
    grid_thw_tensor = grid_thw.to(device=self.device, dtype=torch.int32)
pos_embeds = self.fast_pos_embed_interpolate(grid_thw_list)
hidden_states = hidden_states + pos_embeds
rotary_pos_emb = self.rot_pos_emb(grid_thw_list)

def rot_pos_emb(self, grid_thw: list[list[int]]):
    pos_ids = []
    max_grid_size = max(max(h, w) for _, h, w in grid_thw)
    for t, h, w in grid_thw:
        hpos_ids = torch.arange(h).unsqueeze(1).expand(-1, w)
        hpos_ids = hpos_ids.reshape(
            h // self.spatial_merge_size,
            self.spatial_merge_size,
            w // self.spatial_merge_size,
            self.spatial_merge_size,
        )
        hpos_ids = hpos_ids.permute(0, 2, 1, 3)
        hpos_ids = hpos_ids.flatten()

        wpos_ids = torch.arange(w).unsqueeze(0).expand(h, -1)
        wpos_ids = wpos_ids.reshape(
            h // self.spatial_merge_size,
            self.spatial_merge_size,
            w // self.spatial_merge_size,
            self.spatial_merge_size,
        )
        wpos_ids = wpos_ids.permute(0, 2, 1, 3)
        wpos_ids = wpos_ids.flatten()
        pos_ids.append(
            torch.stack([hpos_ids, wpos_ids], dim=-1).repeat(t, 1))
    pos_ids = torch.cat(pos_ids, dim=0)
    rotary_pos_emb_full = self.rotary_pos_emb(max_grid_size)
    rotary_pos_emb = rotary_pos_emb_full[pos_ids].flatten(1)
    return rotary_pos_emb
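Illustrative sketch only, not part of the PR: the loop above produces one (h, w) index pair per patch, reordered so that patches within each spatial_merge_size x spatial_merge_size block end up adjacent. The values below (t=1, h=4, w=4, merge=2) are invented to show the resulting shape.

import torch

t, h, w, merge = 1, 4, 4, 2  # invented example values

hpos = torch.arange(h).unsqueeze(1).expand(-1, w)
hpos = hpos.reshape(h // merge, merge, w // merge, merge)
hpos = hpos.permute(0, 2, 1, 3).flatten()

wpos = torch.arange(w).unsqueeze(0).expand(h, -1)
wpos = wpos.reshape(h // merge, merge, w // merge, merge)
wpos = wpos.permute(0, 2, 1, 3).flatten()

pos_ids = torch.stack([hpos, wpos], dim=-1).repeat(t, 1)
print(pos_ids.shape)  # torch.Size([16, 2]): one (h, w) pair per patch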
Yes, I have deleted that code.
Please update the PR message and rebase your code to avoid the prefix-caching CI failure.
This pull request has conflicts; please resolve them before we can evaluate the pull request.
Signed-off-by: 李少鹏 <[email protected]>
What this PR does / why we need it?
Does this PR introduce any user-facing change?
How was this patch tested?