A cycle-accurate simulation framework for HERO (A Hybrid GEMM and Direct Convolution Accelerator), a novel AI inference architecture. A full explanation of HERO's internals is available here (document still WIP). The backend simulation environment is built using SystemC. The framework can be used to estimate the performance of an arbitrary PyTorch model, provided that it uses only supported layer types, on a range of possible HERO configurations.
- Clone this repo
- Include the frontend.py script (migration to PyPI in progress)
- Download the latest backend release here
- Install the backend Linux deb package
config = {
    "filter_count": 32,
    "channel_count": 18,
    "directly_supported_kernels": [(1, 1), (3, 3)],
    "ifmap_mem_ub": 2**20 // 18 * 18,
    "allow_ifmap_distribution": True,
    "ofmap_mem_ub": 2**20,
    "allow_ofmap_distribution": True,
    "reuse_chain_bank_size": 512,
    "weight_bank_size": 16,
    "groups_supported": False,
}
from frontend import eval_network
result_dataframe = eval_network(model=pytorch_model, arch_config=config)

- Layer types are limited to Linear and Conv2d
- DRAM bandwidth restrictions are not considered in latency estimations
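
The example config writes `ifmap_mem_ub` as `2**20 // 18 * 18`, which rounds 2**20 down to the nearest multiple of 18, the same value used for `channel_count`. Assuming that alignment is intentional (the source does not say so explicitly), a small helper can keep the two values in sync when experimenting with other channel counts. The function name `aligned_upper_bound` is hypothetical and is not part of frontend.py.

```python
def aligned_upper_bound(capacity: int, channel_count: int) -> int:
    """Round `capacity` down to the nearest multiple of `channel_count`.

    Hypothetical helper, not part of frontend.py. It reproduces the
    `2**20 // 18 * 18` expression from the example config for any
    channel count.
    """
    if channel_count <= 0:
        raise ValueError("channel_count must be positive")
    return capacity // channel_count * channel_count


channel_count = 18
ifmap_mem_ub = aligned_upper_bound(2**20, channel_count)

# The bound is a whole multiple of the channel count.
assert ifmap_mem_ub % channel_count == 0
print(ifmap_mem_ub)  # 1048572
```

The result can then be placed in the config as `"ifmap_mem_ub": ifmap_mem_ub` alongside `"channel_count": channel_count`, so changing one automatically updates the other.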


