Duplicate subjects

The message parsing is very rudimentary and assumes any "h3" tag with an "a" child represents the start of a new paper.

If a paper actually starts elsewhere in the message, I can end up with duplicates.

The fix is probably to check that the first block really does look like a paper title before treating it as the start of a new paper.
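
A minimal sketch of what that check might look like, written against Python's standard html.parser rather than whatever parser the real code uses. The names PaperTitleParser, looks_like_paper_title and parse_papers are hypothetical, and the heuristic itself (non-empty link text plus an absolute http(s) link) is only an assumption about what a paper title looks like; the real rule depends on the messages being parsed.

```python
from html.parser import HTMLParser


def looks_like_paper_title(title, href):
    # Hypothetical heuristic: a title needs some visible text and an
    # absolute http(s) link. Tighten this to match the real messages.
    return bool(title) and href.startswith(("http://", "https://"))


class PaperTitleParser(HTMLParser):
    """Collect (title, href) pairs from <h3><a href=...>title</a></h3> blocks."""

    def __init__(self):
        super().__init__()
        self.papers = []      # accepted (title, href) pairs, in document order
        self._in_h3 = False   # True while inside an <h3>
        self._href = None     # href of the <a> currently open inside an <h3>
        self._text = []       # text fragments collected inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "h3":
            self._in_h3 = True
        elif tag == "a" and self._in_h3:
            # A missing or valueless href becomes "" and is rejected later.
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace so wrapped titles compare cleanly.
            title = " ".join("".join(self._text).split())
            if looks_like_paper_title(title, self._href):
                self.papers.append((title, self._href))
            self._href = None
        elif tag == "h3":
            self._in_h3 = False
            self._href = None


def parse_papers(html):
    parser = PaperTitleParser()
    parser.feed(html)
    parser.close()
    return parser.papers
```

With this in place, an "h3" whose "a" child has no usable link or no text is skipped instead of opening a new paper, so it cannot split or repeat the paper around it.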