Does the benchmark in this repository account for the time and performance overhead of quantization and dequantization? Specifically, what proportion of total inference time is spent on dequantization, and what proportion of peak memory usage during inference is attributable to dequantization?
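For context, a minimal sketch of how the dequantization share of inference time could be measured, independent of this repository's actual benchmark harness. The `quantize`/`dequantize` helpers below are illustrative symmetric int8 routines written for this example, not functions from the repo:

```python
import time
import numpy as np

SCALE = 0.05  # illustrative fixed quantization scale

def quantize(x, scale=SCALE):
    # symmetric int8 quantization (illustrative, not the repo's scheme)
    return np.clip(np.round(x / scale), -128, 127).astype(np.int8)

def dequantize(q, scale=SCALE):
    # convert int8 weights back to float32 before compute
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((1024, 1024)).astype(np.float32)
x = rng.standard_normal((256, 1024)).astype(np.float32)
qw = quantize(w)

t0 = time.perf_counter()
w_deq = dequantize(qw)   # dequantization step being measured
t1 = time.perf_counter()
_ = x @ w_deq            # the actual compute (stand-in for inference)
t2 = time.perf_counter()

dequant_time = t1 - t0
total_time = t2 - t0
dequant_share = dequant_time / total_time
print(f"dequant share of total: {dequant_share:.1%}")
```

A similar split (timing the dequantization path separately from the compute path) would answer the time-proportion half of the question; the memory half would additionally require tracking the transient float32 buffer that dequantization allocates.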