Consider spans in output

`split_parser`, `split` and `parser` currently output tokens and their predictions.

It may be worth considering an alternative output that gives the span of each reference/token (for example, its start and end character offsets in the input text) rather than the token strings themselves.
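
To make the idea concrete, here is a minimal sketch of what a span-based output could look like. It is not the current behaviour of any of the commands: `tokens_to_spans` is a hypothetical helper, the example text and labels are made up, and it assumes that the tokens appear in the input text verbatim and in order.

```python
from typing import List, Tuple


def tokens_to_spans(text: str, tokens: List[str]) -> List[Tuple[int, int]]:
    """Hypothetical helper: map each token to (start, end) character offsets.

    Assumes tokens occur in `text` verbatim and in order; `end` is exclusive,
    so text[start:end] == token.
    """
    spans = []
    cursor = 0  # only search to the right of the previous token
    for token in tokens:
        start = text.find(token, cursor)
        if start == -1:
            raise ValueError(f"token {token!r} not found after offset {cursor}")
        end = start + len(token)
        spans.append((start, end))
        cursor = end
    return spans


# Made-up input and labels, purely for illustration.
text = "Smith J. 2019. Nature."
tokens = ["Smith", "J.", "2019.", "Nature."]
predictions = ["author", "author", "year", "title"]

spans = tokens_to_spans(text, tokens)
# Span-based output: offsets plus prediction instead of the token string.
span_output = [(start, end, label) for (start, end), label in zip(spans, predictions)]
```

With spans, the original text can be recovered exactly by slicing (`text[start:end]`), including whitespace and punctuation that may be lost when only tokens are returned.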