How about deduplication first? Eliminate multiple instances of the
same file--perhaps even annotating the messages to indicate where the
file has been retained--reducing space with no loss of information.
Right, that too would be very valuable. If I understand it correctly, your idea is roughly equivalent to (3) here:

The only downside I can think of is what Bruce mentioned: it may be too much to implement and test before this sweep.
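
To make the scope a bit more concrete, here is a rough sketch of what content-based deduplication could look like. It is only an illustration under my own assumptions, not a proposal for the actual implementation: the function names (`find_duplicates`, `deduplicate`), the choice of SHA-256, and the rule "keep the first path in sorted order" are all hypothetical, and the returned notes merely stand in for whatever annotation of the messages we would really want.

```python
import hashlib
from pathlib import Path


def find_duplicates(root):
    """Group files under root by the SHA-256 digest of their contents.

    Returns a dict mapping each retained file to a list of its duplicates.
    Hypothetical helper: reads each file fully into memory, which is fine
    for a sketch but not for very large files.
    """
    by_digest = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_digest.setdefault(digest, []).append(path)
    # Assumption: the first path in sorted order is the copy we keep.
    return {paths[0]: paths[1:] for paths in by_digest.values() if len(paths) > 1}


def deduplicate(root, dry_run=True):
    """Remove duplicates and return notes saying where each copy is retained.

    With dry_run=True (the default) nothing is deleted; only the notes
    are produced, so the result can be reviewed first.
    """
    notes = []
    for kept, dupes in find_duplicates(root).items():
        for dupe in dupes:
            notes.append(f"{dupe} -> retained at {kept}")
            if not dry_run:
                dupe.unlink()
    return notes
```

Running `deduplicate(some_dir)` with the default `dry_run=True` only reports what would be removed, which would at least let us estimate the space savings before deciding whether the full thing can be implemented and tested in time.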


