#suggestion feature request to strip out duplicate attachments in replies



I'm skeptical that this is actually true. In a group export, all
attachments are Base64-encoded and included as part of the
messages.mbox file. They may reside in separate MIME parts within each
message but that's all.
This may have changed, since the post is from 2015, but I'm guessing only in the details:

"When we receive an email, we pull all the attachments out of the
message and store them elsewhere (Amazon S3) while storing the
(hopefully now much smaller) email in a database. When viewing the
original message, or downloading the archives, or editing a message,
we reverse the process, pulling the attachments from S3 to rebuild the
original email."

Hence my "how hard could it be?" attitude about implementing this feature, at least going forward. I can see how it would be a major pain to try to apply it retroactively to references kept hither and yon throughout the database.

But "how hard could it be?" is an infamous phrase, and I've been bitten by it time and again. My suggestion has at least one booby-trap: it likely requires reference counting or an equivalently messy way to know when it is safe to delete the target storage.
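
To make that booby-trap concrete, here is a minimal sketch of what I have in mind. It is purely illustrative and not how the actual service works: the AttachmentStore class and its add/get/release methods are names I made up, and a pair of in-memory dicts stands in for the external storage and the database. Attachments are keyed by a hash of their content, so a reply that re-sends the same file costs only a counter increment, and the stored bytes are deleted only when the last message pointing at them goes away.

```python
import hashlib


class AttachmentStore:
    """Hypothetical in-memory stand-in for external attachment storage."""

    def __init__(self):
        self._blobs = {}     # content digest -> attachment bytes
        self._refcount = {}  # content digest -> number of messages using it

    def add(self, data: bytes) -> str:
        """Store an attachment, or just count another reference to it."""
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self._blobs:
            self._blobs[digest] = data  # first copy: actually store the bytes
        self._refcount[digest] = self._refcount.get(digest, 0) + 1
        return digest  # the message keeps this key instead of the bytes

    def get(self, digest: str) -> bytes:
        """Fetch the bytes back, e.g. to rebuild the original email."""
        return self._blobs[digest]

    def release(self, digest: str) -> bool:
        """Drop one reference; return True if the stored bytes were deleted."""
        remaining = self._refcount[digest] - 1
        if remaining > 0:
            self._refcount[digest] = remaining
            return False  # some other message still needs this attachment
        del self._refcount[digest]
        del self._blobs[digest]
        return True

    def blob_count(self) -> int:
        """Number of distinct attachments actually stored."""
        return len(self._blobs)


store = AttachmentStore()
original_key = store.add(b"vacation photo bytes")  # original message
reply_key = store.add(b"vacation photo bytes")     # reply re-sends the same file
print(original_key == reply_key, store.blob_count())
```

Deleting the original message would call release() once and leave the bytes in place for the reply; only the second release() removes them. The messy part in real life is keeping that counter correct across crashes, edits, and everything already in the archive.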

So I'll take Mark's word for it.


