Compress Attachments using single instance store idea

Is it possible to compress attachments using a single instance store, for example by taking ten invoices from the same supplier and storing only a single instance of the embedded fonts and logos? This would make more sense than saving every supplier PDF at its full size.

I don’t want to compress each PDF, because you lose picture quality, and if you don’t store the embedded font it really messes up the look of the invoice.

But if I am storing dozens of invoices from the same supplier, the logo and embedded fonts are identical, and the embedded fonts may even be identical across a number of suppliers.

When Microsoft Windows creates Windows images for deployment, it uses a concept called Single Instance Store (SIS), as do many backup programs.

While I agree that archiving attachments (which is one of my ideas already in the ideas category) should be implemented, I think SIS would address the tremendous waste of space that occurs when adding attachments.

I have only added 80 supplier invoices for fewer than ten suppliers and this is already taking up 6 MB, despite the fact that most of the attachments are between 20 KB and 60 KB, with a couple around 100 KB.

The vast bulk of the space is embedded fonts and logo images, most of which will be duplicated for the same supplier, and the fonts may even be duplicated across suppliers. SIS to the rescue!
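To make the idea concrete, here is a minimal sketch of a single instance store in Python. It is only an illustration, not how Manager or Windows actually implements anything: the class name `SingleInstanceStore`, the sizes, and the way each invoice is split into "parts" are all made up for the example. It also assumes the embedded font and logo are byte-identical across invoices, which is not guaranteed for real PDFs (fonts are often embedded as per-document subsets).

```python
import hashlib


class SingleInstanceStore:
    """Toy content-addressed store: each distinct blob is kept only once."""

    def __init__(self):
        self.blobs = {}      # SHA-256 hex digest -> bytes
        self.documents = {}  # document name -> ordered list of digests

    def add_document(self, name, parts):
        """Store a document given as a list of byte strings (its parts)."""
        digests = []
        for part in parts:
            digest = hashlib.sha256(part).hexdigest()
            # Only the first copy of identical content is actually stored.
            self.blobs.setdefault(digest, part)
            digests.append(digest)
        self.documents[name] = digests

    def get_document(self, name):
        """Rebuild the original bytes from the stored parts."""
        return b"".join(self.blobs[d] for d in self.documents[name])

    def stored_bytes(self):
        return sum(len(blob) for blob in self.blobs.values())


# Hypothetical data: ten invoices sharing one font and one logo.
font = b"F" * 40_000
logo = b"L" * 20_000

store = SingleInstanceStore()
raw_total = 0
for i in range(10):
    parts = [font, logo, f"invoice {i}".encode()]
    store.add_document(f"invoice-{i}.pdf", parts)
    raw_total += sum(len(p) for p in parts)

print("stored naively:      ", raw_total, "bytes")
print("stored de-duplicated:", store.stored_bytes(), "bytes")
```

The saving depends entirely on how much of each attachment really is identical; hashing whole PDF files instead of their parts would find no duplicates unless two files were exactly the same.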

I don’t think this is a problem. Manager can easily handle 1,000x bigger size.

I know what you mean: the database size can balloon quickly because of duplicate data, but this is just a trade-off between performance and database size. Backup programs de-duplicate and compress because performance is not a priority; you want your backups to take the least amount of space possible.

For accounting software, performance is a priority. Adding de-duplication and real-time compression would make the program slower, and for what benefit? Saving space on the hard drive? It’s not a good trade-off, considering hard drives are now at 1 TB, which means that even if your Manager file were 1,000x bigger, it would still be less than 1% of your hard-drive capacity.

Also, just because the Manager database is uncompressed, it doesn’t affect your backup program. The backup program will still compress your files, and the Manager database compresses well.
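As a rough illustration of why duplicated data compresses well after the fact, here is a small Python sketch using the standard `lzma` module. It does not measure a real Manager file: the "attachments" are random bytes, with a shared block standing in for identical fonts and logos, and all sizes are invented for the example.

```python
import lzma
import os

# Hypothetical attachments: a 50 KB block shared by every file
# (standing in for identical fonts/logos) plus 2 KB of unique content.
shared = os.urandom(50_000)
files = [shared + os.urandom(2_000) for _ in range(20)]

raw = b"".join(files)        # what an uncompressed store would hold
packed = lzma.compress(raw)  # what a compressing backup might hold

print("uncompressed:", len(raw), "bytes")
print("compressed:  ", len(packed), "bytes")
```

Random bytes do not compress on their own, so any reduction here comes from the compressor spotting the repeated block. Real attachments will behave differently depending on how much they actually share.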

I disagree for two reasons.

I can’t speak for others, but the most common reason that I can see for using attachments is to attach a supplier’s invoice to a purchase invoice. People are very rarely, if ever, going to be opening these attachments. It is simply the electronic equivalent of printing out all supplier invoices and storing them in a binder. So speed is really not an issue here. I doubt very much that I will view supplier invoices very often, if at all. I simply want to associate the supplier invoice with my purchase invoice and do away with paper filing.

Take your cloud platform, for example. Say you have 1,000 users on your system using 1 MB each, so you are using 1 GB on your cloud system to host 1,000 users. Now they each add 1,000 attachments to their purchase invoices. Going by my 80 invoices taking up 6 MB, 1,000 attachments will take up 75 MB per user. So your hosting system balloons from 1 GB to 75 GB just to store attachments.
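The arithmetic behind those figures, as a quick Python check. It uses round decimal units (1 MB = 1,000 KB, 1 GB = 1,000 MB), and the variable names are just for illustration.

```python
# Figures from my own file: 80 invoices taking up 6 MB.
total_mb = 6
invoices = 80
avg_kb = total_mb * 1000 / invoices        # average attachment size in KB

# One user adding 1,000 attachments at that average size.
attachments_per_user = 1000
per_user_mb = avg_kb * attachments_per_user / 1000

# 1,000 users each doing the same.
users = 1000
total_gb = per_user_mb * users / 1000

print(f"average attachment: {avg_kb:.0f} KB")
print(f"per user:           {per_user_mb:.0f} MB")
print(f"all users:          {total_gb:.0f} GB")
```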

Granted, you might have a 2 TB hard drive, so it’s not a problem. But I just think increasing the database by 75 times just to store attachments is not forward thinking, especially as a lot of companies will have a lot more than 1,000 supplier invoices. I am a very small company and I already have 600 invoices. A much bigger company might easily have 100 times that, meaning that they alone would use several GB.

Once people start using attachments more and more, your cloud hosting storage will grow very quickly.

I am planning on buying the server platform next year, as I want my accountant to have more regular access to my accounts, so it won’t be a problem for me. But I just feel that it’s a waste of space, as the vast bulk of the attachment data is embedded fonts and logos. Anyway, it’s just a suggestion that I am throwing out there.

@dalacor, I totally understand the reasoning behind your idea, but I agree with lubos that for accounting software, performance is a priority (it’s good for business; time is money).

In my opinion, the money spent on extra storage could still be less than the value of the time that would be wasted if performance were slower with that idea.

Your idea is not wrong, but the pros and cons have to be weighed so that both parties, the subscriber included, end up in a win-win situation. (Just a thought.)

@steigen I don’t see how performance will be affected, as attachments are most likely to be added and then never viewed. So there is no performance impact.

It doesn’t bother me, as I will be buying the server edition next year, but I don’t see any performance loss with SIS compression if nobody opens the attachments.

However, archiving will be implemented, so this will achieve the same result of reducing database size, just in a different way.

@dalacor but you are not proposing compression of individual documents. That wouldn’t solve anything, because PDF files are already compressed by default. You are proposing compression of the entire database so that it is de-duplicated (two documents contain the same logo, but the logo itself is stored just once).

There are compressing file systems which can automatically de-duplicate your entire hard drive (or just a single folder) in real time, not just the Manager database. Even the default file system for Windows (NTFS) supports some kind of real-time compression.

Does anybody use it? I don’t know. Because storage is cheap and plentiful, people almost always choose better speed over storage efficiency.

Yes, I hadn’t realised that the entire database would be compressed. But you are right. Like I said, I am not concerned, as I will be using the server version next year. I think the ballooning of space just took me by surprise and I automatically started looking for a solution!