1000^2 vs 1024^2 bytes per MB in File:FileSize

Started by blue-j, November 22, 2020, 03:49:33 PM

blue-j

The print conversion of FileSize uses 1024^2 bytes per MB, and I want to output 1000^2 bytes per MB.  I have tried to solve this myself without luck, and searching the forums did not yield fruit.  Can anyone guide me down the right path?  I'm getting facile with custom composite tags and config, but not quite at the "ExifTool Freak" level!  Thanks in advance.  : )

- J

Phil Harvey

You'll have to write your own FileSize user-defined tag to override the default conversion.  Look at the sample config file for examples, and the ConvertFileSize function for how the current conversion is done.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

blue-j

Apple, Google, Ubuntu, most hard drive manufacturers... many have settled on base 10 file size metrics.  For example:

https://wiki.ubuntu.com/UnitsPolicy

https://support.apple.com/en-us/HT201402

Would you consider either making the change natively (as we appear to be incorrectly using SI units with IEC metrics at present, according to some viewpoints) or adding the option to retrieve SI calculations?

My amateur attempt:

sub ConvertFileSizeSI($)
{
    my $val = shift;
    $val < 2000 and return "$val bytes";
    $val < 10000 and return sprintf('%.1f kB', $val / 1000);
    $val < 2000000 and return sprintf('%.0f kB', $val / 1000);
    $val < 10000000 and return sprintf('%.1f MB', $val / 1000000);
    $val < 2000000000 and return sprintf('%.0f MB', $val / 1000000);
    $val < 10000000000 and return sprintf('%.1f GB', $val / 1000000000);
    return sprintf('%.0f GB', $val / 1000000000);
}
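
One way to wire a function like this into ExifTool, along the lines Phil suggested, might be a user-defined Composite tag in a config file. This is only a sketch under stated assumptions: the tag name FileSizeSI and the file name filesize_si.config are made up for illustration, and the conversion is passed as a code reference so that it does not matter which package evaluates it.

```perl
# Hypothetical ExifTool config file, e.g. saved as filesize_si.config.
# FileSizeSI is an illustrative tag name, not a built-in ExifTool tag.

%Image::ExifTool::UserDefined = (
    'Image::ExifTool::Composite' => {
        FileSizeSI => {
            Require   => 'FileSize',
            # Pass the raw byte count through unchanged
            ValueConv => '$val',
            # Code reference to the conversion defined below
            PrintConv => \&ConvertFileSizeSI,
        },
    },
);

# Same thresholds as the attempt above, with 1000-based divisors.
# The ($) prototype is dropped because prototypes are ignored
# when a sub is called through a reference.
sub ConvertFileSizeSI
{
    my $val = shift;
    $val < 2000 and return "$val bytes";
    $val < 10000 and return sprintf('%.1f kB', $val / 1000);
    $val < 2000000 and return sprintf('%.0f kB', $val / 1000);
    $val < 10000000 and return sprintf('%.1f MB', $val / 1000000);
    $val < 2000000000 and return sprintf('%.0f MB', $val / 1000000);
    $val < 10000000000 and return sprintf('%.1f GB', $val / 1000000000);
    return sprintf('%.0f GB', $val / 1000000000);
}

1;  # end of config file
```

Assuming that file name, it could be tried with something like "exiftool -config filesize_si.config -FileSizeSI DIR" (the -config option has to come first on the command line). Using a separate tag name leaves the built-in FileSize output untouched.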



blue-j

(Upon further research, I misspoke.   The IEC in 1998 also abandoned using the conventional terms, as the attachment shows.)

Phil Harvey

I guess I'm living in the past.  I'll consider changing it if other people make this suggestion.

- Phil

blue-j

You're not alone!  It's been a very contentious matter.  I can see the logic for base 2 because of binary being the basis of computing, but base 10 is nice because it's much easier math (and almost all of us have 10 fingers).  I'm only looking at this because most of my users and I are Mac (and Ubuntu) users, and both platforms have decided to go base 10.

It seems like it was driven by the storage industry, and I don't know why really, it's not obviously better for sales to me.  Maybe it's because the math is just much easier in the end?  But I can't have Mac users constantly thinking my software sucks because file sizes are wildly different to what they see in the Finder.

v16p20

I think both calculations can be used and displayed, as long as each is labelled accordingly. The current calculation is "wrong" in this respect because mismatched prefixes are used (decimal prefixes on binary values).

Phil Harvey

This is true, and I've thought about this, but I don't know how many people know what a MiB is.  Certainly when ExifTool was created not many people knew about this, and I have still rarely seen the binary prefixes being used.

However, maybe this is a good idea, and would be a first step in making the transition to SI (because otherwise people using ExifTool wouldn't be aware of the change).

What do people think?

- Phil

Alan Clifford

According to a footnote reference in Wikipedia, "Although these prefixes are not part of the SI, they should be used in the field of information technology to avoid the incorrect usage of the SI prefixes."  https://en.wikipedia.org/wiki/Giga-

ls -h on my Mac uses M but uses bytes / 1024²
cellini:mp3_export alan$ ls -lh 12-This\ Old\ Guitar.mp3
-rw-r--r--  1 alan  staff   2.9M 25 Nov 16:38 12-This Old Guitar.mp3
cellini:mp3_export alan$ ls -l 12-This\ Old\ Guitar.mp3
-rw-r--r--  1 alan  staff  3080122 25 Nov 16:38 12-This Old Guitar.mp3


but for df you can choose -h or -H.  But not for du

So it all looks like a mess!

My disk is 500G or 466Gi but these figures are only useful if, for instance, comparing it to file sizes.  They need to be in the same units.

So after that rambling, units of 1000 and M, G etc seem to be the standard.
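
For reference, the arithmetic behind the figures quoted above (the 3080122-byte file and the 500G disk) works out roughly as follows:

```latex
\begin{align*}
\frac{3\,080\,122}{1024^{2}} &\approx 2.94 \quad \text{(MiB, shown as 2.9M by ls -lh)} \\
\frac{3\,080\,122}{1000^{2}} &\approx 3.08 \quad \text{(MB)} \\
\frac{500 \times 1000^{3}}{1024^{3}} &\approx 465.7 \quad \text{(GiB, i.e. the 466Gi figure)}
\end{align*}
```

The gap widens with each prefix: about 4.9% at the mega level (1024^2 / 1000^2 = 1.048576) and about 7.4% at the giga level.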



Phil Harvey

Interesting.  I didn't know about ls -lh.  The Finder displays MB and uses bytes / 1000000.

Eventually I should change to 1000 with SI prefixes, but that change will not be obvious.  I think maybe ExifTool should transition through about a year of using binary prefixes (without changing the numbers) first.

- Phil

blue-j

Why not just make the SI one another field?  FileSizeSI...?  I'm being selfish because a custom config to do so is a bit daunting, but it also seems like a rational sibling to your phased approach, perhaps?   I thank you for your review of this matter!   - J

blue-j

(For whatever it's worth, Apple changed this over a decade ago.  I'm curious how they handled the switch socially.  If memory serves, which it too often does not, they didn't make much noise about it at all.  I still remember a dev in my company explaining it to me when 10.6 came out.)

Phil Harvey

I've gone ahead and changed the prefixes in 12.11.  Somewhere around 12.50 I'll make the transition to SI unless someone objects.

- Phil

blue-j

Thanks so much!  I appreciate your openness to this.  - J