I'm interested in understanding the hardware-level support for sparse model acceleration on Snapdragon 8 Gen 2/3 devices, specifically across the CPU, GPU, and AI accelerators (NPU/DSP). Does the platform provide native hardware acceleration for sparse neural networks, such as N:M sparsity patterns or random sparsity?
I notice that AIMET supports model pruning, but I'm unclear whether this sparsity can be effectively leveraged by the underlying Snapdragon hardware. If the hardware doesn't natively support sparse acceleration, are there any frameworks or libraries specifically designed for Snapdragon platforms that can efficiently execute sparse models and deliver actual performance improvements?
Specifically, I'm looking to understand:
Which processing units (CPU/GPU/NPU/DSP) on Snapdragon 8 Gen 2/3 have hardware-level support for sparse models?
What sparsity patterns are supported (structured N:M, unstructured random sparsity, etc.)? (A short sketch of a 2:4 pattern follows this list.)
If hardware support is limited, what software solutions exist to effectively run sparse models on these devices?
Any insights regarding the practical implementation and performance benefits of model sparsity on Snapdragon platforms would be greatly appreciated.
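For context, here is a minimal NumPy sketch of what a 2:4 (N:M) structured sparsity pattern looks like. This only illustrates the pattern itself, not any Snapdragon-specific acceleration path; the function name, array shapes, and magnitude-based selection are assumptions made purely for the example.

```python
import numpy as np

def apply_2_4_sparsity(weights: np.ndarray) -> np.ndarray:
    """Zero out the 2 smallest-magnitude weights in every group of 4.

    This reproduces the 2:4 (N:M) structured sparsity pattern: within each
    contiguous group of 4 weights along the last axis, at most 2 are nonzero.
    The tensor shape is unchanged; only values are zeroed.
    """
    assert weights.shape[-1] % 4 == 0, "last dim must be a multiple of 4"
    groups = weights.reshape(-1, 4)
    # Indices of the 2 smallest-magnitude entries in each group of 4.
    drop_idx = np.argsort(np.abs(groups), axis=1)[:, :2]
    mask = np.ones_like(groups, dtype=bool)
    np.put_along_axis(mask, drop_idx, False, axis=1)
    return (groups * mask).reshape(weights.shape)

# Example: a 2x8 weight matrix; after masking, each group of 4 keeps 2 values.
w = np.random.randn(2, 8).astype(np.float32)
print(apply_2_4_sparsity(w))
```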
The model pruning techniques you find in AIMET are in the category of structured pruning, which reduces the dimensionality of layers; these techniques are not intended to exploit sparsity. Having said this, we recommend quantization techniques over model pruning, since pruning generally causes model accuracy drops that need to be recovered via fine-tuning.
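To make that distinction concrete, here is a minimal PyTorch sketch (not AIMET's API; the layer sizes and the 50% ratio are arbitrary assumptions) contrasting structured pruning, which physically shrinks a layer, with weight sparsity, which keeps the shape and zeroes entries. Only the latter depends on sparse-execution support to yield any speedup.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Linear(in_features=8, out_features=8)

# --- Structured pruning (conceptually what AIMET-style compression does) ---
# Keep only half of the output channels; the weight tensor physically shrinks
# from (8, 8) to (4, 8), so the dense matmul is simply smaller and faster.
keep = torch.topk(layer.weight.norm(dim=1), k=4).indices
pruned = nn.Linear(in_features=8, out_features=4)
with torch.no_grad():
    pruned.weight.copy_(layer.weight[keep])
    pruned.bias.copy_(layer.bias[keep])
print("structured pruning:", tuple(layer.weight.shape), "->", tuple(pruned.weight.shape))

# --- Unstructured (magnitude) sparsity ---
# Zero out the ~50% smallest-magnitude weights; the shape stays (8, 8), so any
# speedup requires kernels/hardware that can skip the zeros at execution time.
with torch.no_grad():
    threshold = layer.weight.abs().flatten().median()
    sparse_weight = torch.where(layer.weight.abs() >= threshold,
                                layer.weight, torch.zeros_like(layer.weight))
print("unstructured sparsity: shape", tuple(sparse_weight.shape),
      "zeros:", (sparse_weight == 0).sum().item(), "/", sparse_weight.numel())
```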
We are not the right experts to comment on HW-level sparsity support on different Snapdragon devices.