Hi, when I apply the hyperparameters of the 24M model as specified in the table below, I obtain a model with only ~2M parameters. Thoughts?