arith.remsi|remui should be handled when they appear as tt.addptr inputs, as triton-shared does. This would require substantial changes to the pass, since there are cases we currently do not handle because we have not needed them (see the triton-shared pass).
Before investing the effort, we may first want to evaluate how common this case is in our benchmark suite.
If we decide to go for it, take inspiration from triton-shared; otherwise, reject this issue.
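
For context, the pattern in question is an offset computed with a modulo before it reaches tt.addptr, e.g. `(start + i) % N`. The plain-Python sketch below (no Triton required) only illustrates why such an access is still structured: when the block is no larger than the modulus, the wrapped offsets can be described as at most two contiguous chunks. All function names here are hypothetical, and this is an assumption about one possible treatment, not a description of what triton-shared or our pass actually does.

```python
def wrapped_offsets(start, block, modulus):
    """Offsets as a kernel would compute them: (start + i) % modulus."""
    return [(start + i) % modulus for i in range(block)]


def split_contiguous(start, block, modulus):
    """Describe the same offsets as at most two contiguous (begin, length) chunks.

    Assumption: block <= modulus, so the access wraps around at most once.
    """
    if block > modulus:
        raise ValueError("sketch only covers block <= modulus")
    begin = start % modulus
    first_len = min(block, modulus - begin)  # elements before the wrap point
    chunks = [(begin, first_len)]
    remaining = block - first_len
    if remaining > 0:
        chunks.append((0, remaining))  # elements after wrapping back to 0
    return chunks


def expand(chunks):
    """Turn (begin, length) chunks back into an explicit offset list."""
    return [begin + i for begin, length in chunks for i in range(length)]


if __name__ == "__main__":
    print(wrapped_offsets(6, 4, 8))
    print(split_contiguous(6, 4, 8))
```

If the evaluation on the benchmark suite shows that kernels mostly use the modulo in this "wrap once" form, that would suggest a restricted version of the feature may be enough.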