Limited Context Length - YaRN/RoPE?

#17
by rjmehta - opened

This model has a 16k context length. That is very small given that Llama/Qwen supports 128k. Can this model support 64k with a scaling factor of 4.0?

It technically can, but you will not get good results because it was not trained for that context length. But be happy they didn't give us an 8k model.
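To illustrate what a scaling factor of 4.0 actually does, here is a minimal sketch of linear RoPE position interpolation, the simplest scaling scheme (YaRN refines it by scaling high- and low-frequency bands differently). The head dimension and base are illustrative values, not this model's actual config:

```python
def rope_inv_freqs(dim: int, base: float = 10000.0) -> list[float]:
    """Standard RoPE inverse frequencies, one per rotary pair."""
    return [base ** (-2.0 * i / dim) for i in range(dim // 2)]

def rope_angles(pos: int, inv_freqs: list[float], factor: float = 1.0) -> list[float]:
    """Rotation angles at `pos`, with linear position interpolation by `factor`.

    Dividing the position by `factor` compresses a 64k position range into
    the 16k range the model was trained on. The trade-off: nearby tokens
    become harder to tell apart, which is why quality degrades without
    further fine-tuning at the longer length.
    """
    return [(pos / factor) * f for f in inv_freqs]

inv = rope_inv_freqs(128)
# Position 65536 with factor 4.0 rotates exactly like position 16384 unscaled:
assert rope_angles(65536, inv, factor=4.0) == rope_angles(16384, inv, factor=1.0)
```

In practice, with Transformers you can often set this on supported architectures via the model config, e.g. `rope_scaling={"type": "linear", "factor": 4.0}` (or `"yarn"` where the architecture supports it), but whether that works here depends on this model's implementation.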
