Labels: bug (Something isn't working), enhancement (New feature or request)
Description
Trying to run UniCeption inference on the Titan partition of the cluster fails with a `NotImplementedError` from xformers: no attention kernel supports bf16 on this hardware.
This is likely related to the automatic bf16 casting used in UniCeption. The Titan GPUs report compute capability (6, 1) (Pascal), while the xformers bf16 attention kernels require A100-class hardware (capability 8.0) or newer. UniCeption should probably provide a switch in the config, or auto-detect hardware compatibility, to decide whether to use mixed-precision inference; a sketch of such a check follows.
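A minimal sketch of that auto-detection, assuming a hypothetical helper (`pick_inference_dtype` is not an existing UniCeption API):

```python
import torch

def pick_inference_dtype() -> torch.dtype:
    """Pick a safe inference dtype for the current GPU (hypothetical helper)."""
    if not torch.cuda.is_available():
        return torch.float32
    # bf16 attention kernels require Ampere (compute capability 8.0) or newer;
    # torch.cuda.is_bf16_supported() wraps that check.
    if torch.cuda.is_bf16_supported():
        return torch.bfloat16
    major, _ = torch.cuda.get_device_capability()
    # fp16 is generally usable from Volta (7.0) onward; Pascal cards such as
    # the capability (6, 1) Titans here are safest in full fp32.
    return torch.float16 if major >= 7 else torch.float32
```

The factory could then run the encoder under `torch.autocast("cuda", dtype=pick_inference_dtype())`, with a config flag to override the detected value.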
The full stderr is provided below:
```
Scripts/UnitTest/test_matching.py:27:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Module/Frontend/Matching.py:468: in estimate
result = self.context["model"](view1, view2)
/data2/datasets/yutianch/.conda/envs/AirVIO/lib/python3.11/site-packages/torch/nn/modules/module.py:1553: in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
/data2/datasets/yutianch/.conda/envs/AirVIO/lib/python3.11/site-packages/torch/nn/modules/module.py:1562: in _call_impl
return forward_call(*args, **kwargs)
Module/Network/UniCeption/uniception/models/factory/match_anything.py:344: in forward
feat1, feat2 = self._encode_symmetrized(view1, view2)
Module/Network/UniCeption/uniception/models/factory/match_anything.py:298: in _encode_symmetrized
feat1, feat2 = self._encode_image_pairs(img1, img2, data_norm_type=view1["data_norm_type"])
Module/Network/UniCeption/uniception/models/factory/match_anything.py:277: in _encode_image_pairs
encoder_output = self.encoder(encoder_input)
/data2/datasets/yutianch/.conda/envs/AirVIO/lib/python3.11/site-packages/torch/nn/modules/module.py:1553: in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
/data2/datasets/yutianch/.conda/envs/AirVIO/lib/python3.11/site-packages/torch/nn/modules/module.py:1562: in _call_impl
return forward_call(*args, **kwargs)
Module/Network/UniCeption/uniception/models/encoders/dinov2.py:109: in forward
features = self.model.forward_features(encoder_input.image)["x_norm_patchtokens"]
../.cache/torch/hub/facebookresearch_dinov2_main/dinov2/models/vision_transformer.py:261: in forward_features
x = blk(x)
/data2/datasets/yutianch/.conda/envs/AirVIO/lib/python3.11/site-packages/torch/nn/modules/module.py:1553: in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
/data2/datasets/yutianch/.conda/envs/AirVIO/lib/python3.11/site-packages/torch/nn/modules/module.py:1562: in _call_impl
return forward_call(*args, **kwargs)
../.cache/torch/hub/facebookresearch_dinov2_main/dinov2/layers/block.py:254: in forward
return super().forward(x_or_x_list)
../.cache/torch/hub/facebookresearch_dinov2_main/dinov2/layers/block.py:112: in forward
x = x + attn_residual_func(x)
../.cache/torch/hub/facebookresearch_dinov2_main/dinov2/layers/block.py:91: in attn_residual_func
return self.ls1(self.attn(self.norm1(x)))
/data2/datasets/yutianch/.conda/envs/AirVIO/lib/python3.11/site-packages/torch/nn/modules/module.py:1553: in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
/data2/datasets/yutianch/.conda/envs/AirVIO/lib/python3.11/site-packages/torch/nn/modules/module.py:1562: in _call_impl
return forward_call(*args, **kwargs)
../.cache/torch/hub/facebookresearch_dinov2_main/dinov2/layers/attention.py:84: in forward
x = memory_efficient_attention(q, k, v, attn_bias=attn_bias)
/data2/datasets/yutianch/.conda/envs/AirVIO/lib/python3.11/site-packages/xformers/ops/fmha/__init__.py:276: in memory_efficient_attention
return _memory_efficient_attention(
/data2/datasets/yutianch/.conda/envs/AirVIO/lib/python3.11/site-packages/xformers/ops/fmha/__init__.py:395: in _memory_efficient_attention
return _memory_efficient_attention_forward(
/data2/datasets/yutianch/.conda/envs/AirVIO/lib/python3.11/site-packages/xformers/ops/fmha/__init__.py:414: in _memory_efficient_attention_forward
op = _dispatch_fw(inp, False)
/data2/datasets/yutianch/.conda/envs/AirVIO/lib/python3.11/site-packages/xformers/ops/fmha/dispatch.py:119: in _dispatch_fw
return _run_priority_list(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
name = 'memory_efficient_attention_forward', priority_list = deque([<class 'xformers.ops.fmha.decoder.FwOp'>, <class 'xformers.ops.fmha.flash.FwOp'>, <class 'xformers.ops.fmha.cutlass.FwOp'>, <class 'xformers.ops.fmha.small_k.FwOp'>])
inp = Inputs(query=tensor([[[[ 2.7969e+00, -6.8750e-01, 9.1016e-01, ..., -1.1094e+00,
-6.9336e-02, 3.0625e+00]...02]]]], device='cuda:0', dtype=torch.bfloat16), attn_bias=None, p=0.0, scale=None, output_dtype=None, is_partial=False)
def _run_priority_list(name: str, priority_list: Sequence[T], inp: Inputs) -> T:
not_supported_reasons: List[List[str]] = []
for op in priority_list:
not_supported = op.not_supported_reasons(inp)
if not not_supported:
return op
not_supported_reasons.append(not_supported)
# Let's write a nice message explaining what we tried and why it's not supported
msg = f"""No operator found for `{name}` with inputs:
{textwrap.indent(_format_inputs_description(inp), ' ')}"""
for op, not_supported in zip(priority_list, not_supported_reasons):
msg += "\n" + _format_not_supported_reasons(op, not_supported)
> raise NotImplementedError(msg)
E NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
E query : shape=(2, 257, 16, 64) (torch.bfloat16)
E key : shape=(2, 257, 16, 64) (torch.bfloat16)
E value : shape=(2, 257, 16, 64) (torch.bfloat16)
E attn_bias : <class 'NoneType'>
E p : 0.0
E `decoderF` is not supported because:
E requires device with capability > (7, 0) but your GPU has capability (6, 1) (too old)
E attn_bias type is <class 'NoneType'>
E bf16 is only supported on A100+ GPUs
E   `flshattF@<version>` is not supported because:
E requires device with capability > (8, 0) but your GPU has capability (6, 1) (too old)
E bf16 is only supported on A100+ GPUs
E `cutlassF-pt` is not supported because:
E bf16 is only supported on A100+ GPUs
E `smallkF` is not supported because:
E max(query.shape[-1] != value.shape[-1]) > 32
E dtype=torch.bfloat16 (supported: {torch.float32})
E bf16 is only supported on A100+ GPUs
E unsupported embed per head: 64
/data2/datasets/yutianch/.conda/envs/AirVIO/lib/python3.11/site-packages/xformers/ops/fmha/dispatch.py:55: NotImplementedError
--------------------------------------------------------------------------------------------------------------------------- Captured stdout call ----------------------------------------------------------------------------------------------------------------------------
Warning, cannot find cuda-compiled version of RoPE2D, using a slow pytorch version instead
Loading pretrained dinov2_vitl14 from torch hub
--------------------------------------------------------------------------------------------------------------------------- Captured stderr call ----------------------------------------------------------------------------------------------------------------------------
Using cache found in /home/yutianch/.cache/torch/hub/facebookresearch_dinov2_main
```
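For reference, the failure reproduces outside UniCeption with the exact shapes and dtype from the error above. A minimal sketch, assuming a CUDA build of xformers and a pre-Ampere GPU like the capability (6, 1) Titans:

```python
import torch
from xformers.ops import memory_efficient_attention

# Shapes and dtype copied from the error message: (B, M, H, K) = (2, 257, 16, 64), bf16.
q = torch.randn(2, 257, 16, 64, device="cuda", dtype=torch.bfloat16)
k = torch.randn_like(q)
v = torch.randn_like(q)

# On capability (6, 1) every candidate kernel (decoderF, flash, cutlassF-pt,
# smallkF) rejects bf16, so dispatch raises NotImplementedError.
out = memory_efficient_attention(q, k, v)
```

Note that `cutlassF-pt` is rejected above only because of the bf16 dtype, so the same call with fp32 tensors should dispatch successfully on this GPU, which supports falling back to full precision rather than only disabling xformers.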