moderated #suggestion feature request to strip out duplicate attachments in replies


KWKloeber
 

Mark

Re-reading the exchange let me clarify that I am usually curt and to the point - I meant nothing derogatory in the original comment.  Rather you HAVE to make money.  Hopefully LOTS of money so the platform can continue ..... nothing nefarious or diabolical about that.  So I would hope that, while improving the service, you are concentrating on those things to cement the future of the platform.

NOW that said, my experience is that there is oftentimes a wide, gray line between whether it's better to "optimize and improve" existing services/features or to invent new features.  My personal leaning is that I wish as much effort (or more) was put into fixing (ok, improving/optimizing) searching, some image issues, and whatever else is on the list of existing features that I "miss" operating as well or as efficiently as possible, versus new features that I don't "miss" because I've never had them (such as the app development).
Just my .02 here.
Thx for all the hard work you do.


On Tue, Apr 6, 2021 at 07:32 PM, KWKloeber wrote:
Sure, naturally it would depend on how many of a group's members repost pics and how much storage gets "wasted" by that but at some point groups out of storage need to either laboriously go thru and delete, or moderate every msg, or ante up to the level -- which adds to the bottom line.
Not just for reposting -- It would be handy to be able to do that even when composing a new msg, and not need to upload a new image.  Yah one can always throw in a link to a pic, but it ain't the same as having it display in-line on the msg.

Thx
-k


On Tue, Apr 6, 2021 at 06:02 PM, Mark Fletcher wrote:
On Tue, Apr 6, 2021 at 8:01 AM KWKloeber via groups.io <KWKloeber=aol.com@groups.io> wrote:

Realizing of course that this would negatively impact the business model bottom line I’m not all that hopeful!
 
Were I just such a diabolical business man. Sorry to disappoint.
 
The truth is that this would be a pretty extensive change to the code base, with the associated risk. I'd like to do it, sure. But I can't see how this would be higher priority right now than the other things I'm working on, like bug fixes, the app, and new features.
 
Actually, I suspect it wouldn't affect the bottom line much one way or the other.
 
Thanks,
Mark


 

David wrote:

> a) No need to store the attachments on the server multiple times
> b) No need to email the same attachments multiple times.

Hmm... Taken together with Mark's comment I think that brings us back to the original request from Donald (the OP): to strip the duplicates.

But for embedded images maybe only in the case of a detected top-post? That is, avoid stripping out a deliberately quoted image, to address Ken S's concern.

> There was a problem in one of my groups where someone attached many MB
> of images, then modified his post a dozen or so times. Some people
> were screaming at the volume of mail that they were downloading. This
> was one of the reasons I have disabled editing of posts on my groups.

Yeah... I think editing is a different issue which may require a different solution. If the images were inline in the text stripping them out of the edited text might severely compromise the member's intended meaning.

Shal


 

Bruce,

> I'm skeptical that this is actually true. In a group export, all
> attachments are BASE64 encoded and included as part of the
> messages.mbox file. They may reside in separate MIME parts within each
> message but that's all.

This may have changed, since the post is from 2015, but I'm guessing only in detail:

"When we receive an email, we pull all the attachments out of the
message and store them elsewhere (Amazon S3) while storing the
(hopefully now much smaller) email in a database. When viewing the
original message, or downloading the archives, or editing a message,
we reverse the process, pulling the attachments from S3 to rebuild the
original email."
https://beta.groups.io/g/main/message/2512

Hence my "how hard could it be?" attitude about implementing this feature. At least going forward; I could see how it would be a major pain to try and apply it retroactively to references kept hither and yon throughout the database.

But "how hard could it be?" is an infamous phrase, and I've been bitten by it time and again. My suggestion has at least one booby-trap: it likely requires reference counting or an equivalently messy way to know when it is safe to delete the target storage.
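
To make the reference-counting booby-trap concrete, here is a minimal Python sketch. It is not how Groups.io stores anything; the class name `DedupStore`, the in-memory dictionaries, and the choice of SHA-256 are all assumptions made for illustration. The point is only that each stored file needs a count of the messages pointing at it, and that the file can be deleted only when that count reaches zero.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store with reference counts (hypothetical, in-memory)."""

    def __init__(self):
        self.blobs = {}     # digest -> file bytes, stored once
        self.refcount = {}  # digest -> number of messages referencing the file

    def add(self, data: bytes) -> str:
        """Store data once per distinct content and count one more reference."""
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.blobs:
            self.blobs[digest] = data
        self.refcount[digest] = self.refcount.get(digest, 0) + 1
        return digest

    def release(self, digest: str) -> bool:
        """Drop one reference; delete the file only when no message uses it.

        Returns True if the stored file was actually deleted.
        """
        self.refcount[digest] -= 1
        if self.refcount[digest] == 0:
            del self.refcount[digest]
            del self.blobs[digest]
            return True
        return False
```

In this sketch, deleting one of two messages that share an image leaves the stored file in place; only deleting the last one frees the storage.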

So I'll take Mark's word at face value.

Shal


Bruce Bowman
 

On Tue, Apr 6, 2021 at 04:55 PM, Shal Farley wrote:
Images and attachments already occupy separate storage from the body text of the message.

Shal -- I'm skeptical that this is actually true. In a group export, all attachments are BASE64 encoded and included as part of the messages.mbox file. They may reside in separate MIME parts within each message but that's all.*

If so, implementation of this kind of functionality would likely require a complete rework of the groups.io message base (with, as Mark said, significant risk of breaking something else).

Regards,
Bruce

*How attachments are indexed so as to provide such functionality as the Emailed Photos folder is unclear to me.
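
For anyone who wants to check how much duplication an export actually contains, the following is a rough Python sketch using only the standard library. It assumes the export is an ordinary mbox file as described above; the function name `attachment_digests` is made up for this example, and treating "any MIME part with a filename" as an attachment is a simplification.

```python
import hashlib
import mailbox
from collections import defaultdict


def attachment_digests(mbox_path):
    """Map SHA-256 digest -> list of (subject, filename) for each attachment found."""
    seen = defaultdict(list)
    for msg in mailbox.mbox(mbox_path):
        for part in msg.walk():
            filename = part.get_filename()
            if filename is None:
                continue  # body text or container part, not an attachment
            payload = part.get_payload(decode=True)  # undoes the BASE64 encoding
            if payload is None:
                continue
            digest = hashlib.sha256(payload).hexdigest()
            seen[digest].append((msg["Subject"], filename))
    return seen
```

Any digest whose list has more than one entry is a file that appears in more than one message of the export.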


KWKloeber
 

Sure, naturally it would depend on how many of a group's members repost pics and how much storage gets "wasted" by that but at some point groups out of storage need to either laboriously go thru and delete, or moderate every msg, or ante up to the level -- which adds to the bottom line.
Not just for reposting -- It would be handy to be able to do that even when composing a new msg, and not need to upload a new image.  Yah one can always throw in a link to a pic, but it ain't the same as having it display in-line on the msg.

Thx
-k




 

On Tue, Apr 6, 2021 at 8:01 AM KWKloeber via groups.io <KWKloeber=aol.com@groups.io> wrote:

Realizing of course that this would negatively impact the business model bottom line I’m not all that hopeful!


Were I just such a diabolical business man. Sorry to disappoint.

The truth is that this would be a pretty extensive change to the code base, with the associated risk. I'd like to do it, sure. But I can't see how this would be higher priority right now than the other things I'm working on, like bug fixes, the app, and new features.

Actually, I suspect it wouldn't affect the bottom line much one way or the other.

Thanks,
Mark


 

Hi Ken S,

> My concern is that some of our users will look at the latest post on a
> subject and simply move on to the next if they don't see the image
> that was being discussed rather than going back through the thread to
> see the image.

In what I proposed it would still be shown there.

I proposed consolidating the storage rather than "stripping out" the duplicate.

> As a dabbler in software, I would also believe that the code to keep
> the image attached to posts going out in e-mail or on the site while
> removing them from the stored post would be rather difficult or
> require a dual database,

I'm not quite sure what difficulty you're seeing.

Images and attachments already occupy separate storage from the body text of the message. In the web view it would be rather simple to have multiple instances of the displayed image point to the same image file storage. If you're a dabbler in HTML code the storage is pointed to by the URL used in the src attribute of the <img> tag in whatever page or text displays the image.

In messages sent out by email I know that some processing of the content is done by Groups.io, but I don't know the details. However I'm confident that Mark can ensure that consolidating the storage doesn't "break" the content; whether for individual or digest.

Shal


David Kirkby
 

On Tue, 6 Apr 2021 at 17:01, Ken Schweizer <kensch888@...> wrote:
My concern is that some of our users will look at the latest post on a subject and simply move on to the next if they don't see the image that was being discussed rather than going back through the thread to see the image.

I don’t really see how that’s any different from people reading any post where someone has trimmed off text that was previously written. It’s generally considered good practice to reply to a post quoting what one is replying to, and nothing else.

Looking at your post, nobody would make any sense of it unless they had read the original text. I fail to see how images are any different. 


As a dabbler in software, I would also believe that the code to keep the image attached to posts going out in e-mail or on the site while removing them from the stored post would be rather difficult or require a dual database, so I thought I'd throw the option out for

I don’t see the need for two databases, as I would prefer if the server didn’t email out the images. There is 

a) No need to store the attachments on the server multiple times
b) No need to email the same attachments multiple times. 

There was a problem in one of my groups where someone attached many MB of images, then modified his post a dozen or so times. Some people were screaming at the volume of mail that they were downloading. This was one of the reasons I have disabled editing of posts on my groups. 

Again thanks for replying,
Ken S

Dave Kirkby 
--
Dr. David Kirkby,
Kirkby Microwave Ltd,
drkirkby@...
https://www.kirkbymicrowave.co.uk/
Telephone 01621-680100./ +44 1621 680100

Registered in England & Wales, company number 08914892.
Registered office:
Stokes Hall Lodge, Burnham Rd, Althorne, Chelmsford, Essex, CM3 6DT, United Kingdom


Ken Schweizer
 

Hi Shal,

> You quoted me, so I'll answer. Though I'm not sure you're addressing the
> implementation that I was discussing with ken (K).
>
>> If this feature is implemented it should have an "owner's switch" to
>> enable or disable it for their group.
>
> Why would you want to turn it off?

My concern is that some of our users will look at the latest post on a subject and simply move on to the next if they don't see the image that was being discussed rather than going back through the thread to see the image.

As a dabbler in software, I would also believe that the code to keep the image attached to posts going out in e-mail or on the site while removing them from the stored post would be rather difficult or require a dual database, so I thought I'd throw the option out for consideration. If you see a way around these situations then disregard my suggestion. Maybe I'm just trying to keep the responsibility in the hands of the owner.

Again thanks for replying,
Ken S

“You do what you can for as long as you can, and when you finally can’t, you do the next best thing. You back up but you don’t give up.” ―Chuck Yeager


KWKloeber
 

>>> As such, what I suggested differs in that significant respect from what Donald (the OP) suggested; but it solves the same problem (excess storage charges due to duplicate images), removing the need for moderator busy-work.
<<<


Shal-
Realizing of course that this would negatively impact the business model bottom line I’m not all that hopeful!

Ken K


David Kirkby
 

On Sun, 4 Apr 2021 at 00:50, Donald Hellen <donhellen@...> wrote:
Sometimes a member posts a picture (or file attachment) with some text
in the message and others reply to that message and don't delete the
picture, so it gets posted several times, each time taking up
additional space in picture storage.

I'd like to see something that would strip out pictures after the
initial one was posted as long as it had the same exact file size. The
same might be possible for file attachments.
 
I'd like to see that too.  I guess an md5 checksum of the file would be the best, as that ensures the file is indeed identical or different. If not, your suggestion of the same size would be pretty damn good. Even if someone modifies a jpeg, the chances of their saved version being the same size are small. For a bitmap (bmp) though, which I think is rarely used, a modification of the same file would result in the same size, but a different checksum.
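
The bitmap case can be shown in a few lines of Python. This is only an illustration: the two byte strings below are made-up stand-ins for image data, and the helper names `same_size` and `same_md5` are invented for the example.

```python
import hashlib


def same_size(a: bytes, b: bytes) -> bool:
    """Duplicate test based on file size only."""
    return len(a) == len(b)


def same_md5(a: bytes, b: bytes) -> bool:
    """Duplicate test based on the MD5 checksum of the content."""
    return hashlib.md5(a).digest() == hashlib.md5(b).digest()


# Two made-up uncompressed "bitmaps": same length, one byte changed.
original = bytes([0, 0, 0, 255] * 4)
edited = bytes([0, 0, 0, 255] * 3 + [9, 0, 0, 255])

print(same_size(original, edited))  # True: size alone would call these duplicates
print(same_md5(original, edited))   # False: the checksum tells them apart
```

MD5 is fine for catching accidental duplicates like these, but it is not safe against deliberately crafted collisions; `hashlib.sha256` is a drop-in alternative if that matters.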

Dave


 

Ken S,

You quoted me, so I'll answer. Though I'm not sure you're addressing the implementation that I was discussing with ken (K).

> If this feature is implemented it should have an "owner's switch" to
> enable or disable it for their group.

Why would you want to turn it off?

What I suggested makes no changes to how messages are handled by email, nor any changes in how they are presented on the web site. It changes only a detail of how attachments and images (inline or attached) are stored. And in particular it potentially reduces the group's storage charges, which seems only a benefit.

As such, what I suggested differs in that significant respect from what Donald (the OP) suggested; but it solves the same problem (excess storage charges due to duplicate images), removing the need for moderator busy-work.

Shal


Ken Schweizer
 

If this feature is implemented it should have an "owner's switch" to enable or disable it for their group.

JMO,
Ken S

“Some people occasionally stumble across the truth. But then they pick themselves right up and move on like nothing happened.”
Winston Churchill

-----Original Message-----
From: main@beta.groups.io [mailto:main@beta.groups.io] On Behalf Of Shal
Farley
Sent: Sunday, April 4, 2021 9:54 PM
To: main@beta.groups.io
Subject: Re: [beta] #suggestion feature request to strip out duplicate
attachments in replies

ken,

> IMO, ideal would be for the quoted message in the reply to
> automatically display the original image, rather than post a new copy
> or a broken photo icon.

That would be feasible in the web view of the messages. The reply would
have the URL of the original image placed in its HTML <img> element,
replacing whatever src reference it had on arrival.

When the reply is sent out to members by email it may be best to leave
the quoted image as it was received, to avoid the various ills
associated with sending it as an https access.

> I have no clue how coding would accomplish that, but we're using code
> to drill holes in rocks on Mars right now. That seems a bit more
> difficult.

Different types of difficulties, but yeah.

Were Groups.io to store a hash value with each image it ought to be
pretty straightforward to see if any image file incoming in a reply
matches an image file already stored in this group (or topic).

Shal




 

ken,

> IMO, ideal would be for the quoted message in the reply to
> automatically display the original image, rather than post a new copy
> or a broken photo icon.

That would be feasible in the web view of the messages. The reply would have the URL of the original image placed in its HTML <img> element, replacing whatever src reference it had on arrival.
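
As a rough picture of what that src replacement could look like, here is a small Python sketch. Everything in it is an assumption for illustration: the function name `point_to_original`, the `cid:` reference, and the example URL are invented, and a regular expression that only handles double-quoted src attributes is a shortcut, not a real HTML parser.

```python
import re


def point_to_original(html: str, src_map: dict) -> str:
    """Replace each <img> src listed in src_map with the URL of the stored original.

    src_map maps the src value as it arrived (for example a cid: reference)
    to the URL of the image that is already stored. Unknown src values are kept.
    """
    def swap(match):
        old_src = match.group(2)
        return match.group(1) + src_map.get(old_src, old_src) + match.group(3)

    # Group 1: everything up to the opening quote of src; group 2: the src value.
    return re.sub(r'(<img\b[^>]*?\bsrc=")([^"]*)(")', swap, html)


reply_html = '<p>Nice shot!</p><img alt="photo" src="cid:abc123">'
stored = {"cid:abc123": "https://example.invalid/g/demo/photo/1.jpg"}
print(point_to_original(reply_html, stored))
```

The reply keeps displaying the image in the web view, but no second copy of the file needs to be stored for it.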

When the reply is sent out to members by email it may be best to leave the quoted image as it was received, to avoid the various ills associated with sending it as an https access.

> I have no clue how coding would accomplish that, but we're using code
> to drill holes in rocks on Mars right now. That seems a bit more
> difficult.

Different types of difficulties, but yeah.

Were Groups.io to store a hash value with each image it ought to be pretty straightforward to see if any image file incoming in a reply matches an image file already stored in this group (or topic).

Shal


KWKloeber
 

IMO, ideal would be for the quoted message in the reply to automatically display the original image, rather than post a new copy or a broken photo icon.

I have no clue how coding would accomplish that, but we're using code to drill holes in rocks on Mars right now.  That seems a bit more difficult.

-ken  


Bruce Bowman
 

On Sat, Apr 3, 2021 at 07:55 PM, Donald Hellen wrote:
> I'd like to see something that would strip out pictures after the
> initial one was posted as long as it had the same exact file size. The
> same might be possible for file attachments.

As a friendly amendment, may I suggest restricting this function to previous messages within the same topic?

This would seem to satisfy your use case without searching the entire message base.

> I think it would be very rare that a subsequent different picture or
> file posted in a reply would be the same file size.

If restricted to a single topic, a bytewise comparison with previous attachments may be possible. Failing that, I'd prefer to apply a hashing function of some kind and not rely solely on file size for dupe detection.
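
A minimal Python sketch of the topic-restricted check, under stated assumptions: the function name `find_duplicate_in_topic` and the idea of a per-topic dictionary of attachment ids to bytes are hypothetical, since how Groups.io actually indexes attachments is not known here. File size is used only as a cheap first filter; the decision itself rests on the bytewise comparison.

```python
from typing import Optional


def find_duplicate_in_topic(new_data: bytes, topic_attachments: dict) -> Optional[str]:
    """Return the id of an earlier attachment in this topic with identical bytes.

    topic_attachments maps a (hypothetical) attachment id to its bytes and holds
    only attachments from earlier messages in the same topic. Returns None when
    nothing matches.
    """
    for att_id, old_data in topic_attachments.items():
        # Compare sizes first; do the full bytewise comparison only on a size match.
        if len(old_data) == len(new_data) and old_data == new_data:
            return att_id
    return None
```

Because only one topic's attachments are scanned, the comparison stays small even without a stored hash.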

Regards,
Bruce


Donald Hellen
 

Sometimes a member posts a picture (or file attachment) with some text
in the message and others reply to that message and don't delete the
picture, so it gets posted several times, each time taking up
additional space in picture storage.

I'd like to see something that would strip out pictures after the
initial one was posted as long as it had the same exact file size. The
same might be possible for file attachments.

I think it would be very rare that a subsequent different picture or
file posted in a reply would be the same file size.

This would save moderators from going to the group site and deleting
the attachments from the follow-up replies. It would be better than
setting up things to moderate all posts with attachments.

Donald


----------------------------------------------------
Some ham radio groups you may be interested in:
https://groups.io/g/ICOM https://groups.io/g/Ham-Antennas
https://groups.io/g/HamRadioHelp https://groups.io/g/Baofeng
https://groups.io/g/CHIRP https://rf-amplifiers.groups.io/g/main