
DINOv2 cannot run on older GPUs (e.g. Nvidia Titan) #7

@MarkChenYutian

Description

Trying to run UniCeption inference on Titan partition of the cluster results in this error:

[Screenshot: NotImplementedError traceback, 2024-11-19 7:48 PM]

This is likely related to the automatic bf16 casting used in UniCeption. We should probably provide a switch in the config, or auto-detect hardware compatibility, to decide whether to use mixed-precision inference.
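A minimal sketch of what the auto-detection could look like (names here are hypothetical, not existing UniCeption config keys): bf16 kernels are generally only supported on Ampere-class GPUs, i.e. compute capability 8.0 or newer, which matches the "bf16 is only supported on A100+ GPUs" messages in the traceback below.

```python
import torch


def bf16_supported() -> bool:
    """Return True if the current CUDA device can run bf16 kernels.

    Assumption: bf16 support requires compute capability >= 8.0 (Ampere),
    consistent with the xFormers error messages in this issue.
    """
    if not torch.cuda.is_available():
        return False
    major, _minor = torch.cuda.get_device_capability()
    return major >= 8


# Pick the inference dtype once, at model-load time.
inference_dtype = torch.bfloat16 if bf16_supported() else torch.float32
```

On the Titan partition (capability 6.1) this would select `torch.float32` and avoid the xFormers dispatch failure entirely, at the cost of speed and memory.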

Full stderr is provided below:

Scripts/UnitTest/test_matching.py:27: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Module/Frontend/Matching.py:468: in estimate
    result = self.context["model"](view1, view2)
/data2/datasets/yutianch/.conda/envs/AirVIO/lib/python3.11/site-packages/torch/nn/modules/module.py:1553: in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
/data2/datasets/yutianch/.conda/envs/AirVIO/lib/python3.11/site-packages/torch/nn/modules/module.py:1562: in _call_impl
    return forward_call(*args, **kwargs)
Module/Network/UniCeption/uniception/models/factory/match_anything.py:344: in forward
    feat1, feat2 = self._encode_symmetrized(view1, view2)
Module/Network/UniCeption/uniception/models/factory/match_anything.py:298: in _encode_symmetrized
    feat1, feat2 = self._encode_image_pairs(img1, img2, data_norm_type=view1["data_norm_type"])
Module/Network/UniCeption/uniception/models/factory/match_anything.py:277: in _encode_image_pairs
    encoder_output = self.encoder(encoder_input)
/data2/datasets/yutianch/.conda/envs/AirVIO/lib/python3.11/site-packages/torch/nn/modules/module.py:1553: in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
/data2/datasets/yutianch/.conda/envs/AirVIO/lib/python3.11/site-packages/torch/nn/modules/module.py:1562: in _call_impl
    return forward_call(*args, **kwargs)
Module/Network/UniCeption/uniception/models/encoders/dinov2.py:109: in forward
    features = self.model.forward_features(encoder_input.image)["x_norm_patchtokens"]
../.cache/torch/hub/facebookresearch_dinov2_main/dinov2/models/vision_transformer.py:261: in forward_features
    x = blk(x)
/data2/datasets/yutianch/.conda/envs/AirVIO/lib/python3.11/site-packages/torch/nn/modules/module.py:1553: in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
/data2/datasets/yutianch/.conda/envs/AirVIO/lib/python3.11/site-packages/torch/nn/modules/module.py:1562: in _call_impl
    return forward_call(*args, **kwargs)
../.cache/torch/hub/facebookresearch_dinov2_main/dinov2/layers/block.py:254: in forward
    return super().forward(x_or_x_list)
../.cache/torch/hub/facebookresearch_dinov2_main/dinov2/layers/block.py:112: in forward
    x = x + attn_residual_func(x)
../.cache/torch/hub/facebookresearch_dinov2_main/dinov2/layers/block.py:91: in attn_residual_func
    return self.ls1(self.attn(self.norm1(x)))
/data2/datasets/yutianch/.conda/envs/AirVIO/lib/python3.11/site-packages/torch/nn/modules/module.py:1553: in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
/data2/datasets/yutianch/.conda/envs/AirVIO/lib/python3.11/site-packages/torch/nn/modules/module.py:1562: in _call_impl
    return forward_call(*args, **kwargs)
../.cache/torch/hub/facebookresearch_dinov2_main/dinov2/layers/attention.py:84: in forward
    x = memory_efficient_attention(q, k, v, attn_bias=attn_bias)
/data2/datasets/yutianch/.conda/envs/AirVIO/lib/python3.11/site-packages/xformers/ops/fmha/__init__.py:276: in memory_efficient_attention
    return _memory_efficient_attention(
/data2/datasets/yutianch/.conda/envs/AirVIO/lib/python3.11/site-packages/xformers/ops/fmha/__init__.py:395: in _memory_efficient_attention
    return _memory_efficient_attention_forward(
/data2/datasets/yutianch/.conda/envs/AirVIO/lib/python3.11/site-packages/xformers/ops/fmha/__init__.py:414: in _memory_efficient_attention_forward
    op = _dispatch_fw(inp, False)
/data2/datasets/yutianch/.conda/envs/AirVIO/lib/python3.11/site-packages/xformers/ops/fmha/dispatch.py:119: in _dispatch_fw
    return _run_priority_list(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

name = 'memory_efficient_attention_forward', priority_list = deque([<class 'xformers.ops.fmha.decoder.FwOp'>, <class 'xformers.ops.fmha.flash.FwOp'>, <class 'xformers.ops.fmha.cutlass.FwOp'>, <class 'xformers.ops.fmha.small_k.FwOp'>])
inp = Inputs(query=tensor([[[[ 2.7969e+00, -6.8750e-01,  9.1016e-01,  ..., -1.1094e+00,
           -6.9336e-02,  3.0625e+00]...02]]]], device='cuda:0', dtype=torch.bfloat16), attn_bias=None, p=0.0, scale=None, output_dtype=None, is_partial=False)

    def _run_priority_list(name: str, priority_list: Sequence[T], inp: Inputs) -> T:
        not_supported_reasons: List[List[str]] = []
        for op in priority_list:
            not_supported = op.not_supported_reasons(inp)
            if not not_supported:
                return op
            not_supported_reasons.append(not_supported)
    
        # Let's write a nice message explaining what we tried and why it's not supported
        msg = f"""No operator found for `{name}` with inputs:
    {textwrap.indent(_format_inputs_description(inp), '     ')}"""
        for op, not_supported in zip(priority_list, not_supported_reasons):
            msg += "\n" + _format_not_supported_reasons(op, not_supported)
>       raise NotImplementedError(msg)
E       NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
E            query       : shape=(2, 257, 16, 64) (torch.bfloat16)
E            key         : shape=(2, 257, 16, 64) (torch.bfloat16)
E            value       : shape=(2, 257, 16, 64) (torch.bfloat16)
E            attn_bias   : <class 'NoneType'>
E            p           : 0.0
E       `decoderF` is not supported because:
E           requires device with capability > (7, 0) but your GPU has capability (6, 1) (too old)
E           attn_bias type is <class 'NoneType'>
E           bf16 is only supported on A100+ GPUs
E       `[email protected]` is not supported because:
E           requires device with capability > (8, 0) but your GPU has capability (6, 1) (too old)
E           bf16 is only supported on A100+ GPUs
E       `cutlassF-pt` is not supported because:
E           bf16 is only supported on A100+ GPUs
E       `smallkF` is not supported because:
E           max(query.shape[-1] != value.shape[-1]) > 32
E           dtype=torch.bfloat16 (supported: {torch.float32})
E           bf16 is only supported on A100+ GPUs
E           unsupported embed per head: 64

/data2/datasets/yutianch/.conda/envs/AirVIO/lib/python3.11/site-packages/xformers/ops/fmha/dispatch.py:55: NotImplementedError
--------------------------------------------------------------------------------------------------------------------------- Captured stdout call ----------------------------------------------------------------------------------------------------------------------------
Warning, cannot find cuda-compiled version of RoPE2D, using a slow pytorch version instead
Loading pretrained dinov2_vitl14 from torch hub
--------------------------------------------------------------------------------------------------------------------------- Captured stderr call ----------------------------------------------------------------------------------------------------------------------------
Using cache found in /home/yutianch/.cache/torch/hub/facebookresearch_dinov2_main
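As a possible short-term workaround (an assumption about upstream behavior, not verified here): the torch-hub DINOv2 code appears to gate its xFormers attention path behind an `XFORMERS_DISABLED` environment variable, falling back to plain PyTorch attention when it is set. If that holds, setting the variable before the model is loaded would sidestep `memory_efficient_attention` on older GPUs:

```python
import os

# Assumption: upstream DINOv2 checks XFORMERS_DISABLED at import time and
# falls back to standard PyTorch attention when it is set. This must run
# before the DINOv2 modules are imported / loaded from torch hub.
os.environ["XFORMERS_DISABLED"] = "1"
```

This trades the memory savings of xFormers for compatibility; the bf16-vs-fp32 question would still need to be handled separately.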


    Labels

    bug (Something isn't working), enhancement (New feature or request)
