Custom tag incorrectly interpreted by -json option as a number

Started by omarserenity, June 20, 2024, 11:54:55 PM


omarserenity

I'm running ExifTool 12.76 on Ubuntu 24.04

I have a custom tag set up in my ExifTool config:
Fingerprint => {
            Require => {
                0 => 'FileName',
                1 => 'Directory',
            },
            ValueConv => q{
                my $fingerprint = `FingerprintPhotos "$val[1]/$val[0]"`;
                chomp $fingerprint;
                return $fingerprint;
            },
        }

FingerprintPhotos is a Python script that uses the imagehash library to get a fingerprint of an image:
#!/usr/bin/python3

# import the necessary packages
from PIL import Image
import imagehash
import os
import sys


#import shelve
#import glob

def GetImageHash(imagePath):
    image = Image.open(imagePath)
    return str(imagehash.dhash(image))



if len(sys.argv) != 2:
    print(len(sys.argv))
    raise ValueError("FingerprintPhotos requires one argument: ImagePath")
   

print(GetImageHash(sys.argv[1]))

The problem is that when I use the -json option in ExifTool and include -Fingerprint, a fingerprint value that looks like it could possibly be a number output by Bash won't be quoted in the JSON. jq then interprets it as a number when I try to convert my JSON output file to CSV.

For example, a fingerprint value of 12109050303470e4 is stored in the JSON file unquoted:
"Fingerprint": 12109050303470e4,

So, when I use jq to convert the JSON file to CSV, the unquoted fingerprints get interpreted as numbers, like this:
1.2109050303470E+17
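
To illustrate why the unquoted value turns into a number, here is a minimal Python sketch using the standard json module. The file name a.jpg and the surrounding JSON are made up for the example; only the Fingerprint value comes from the post above. It also shows one possible reader-side workaround (the parse_float and parse_int hooks), which is an assumption about how one might post-process the file, not something suggested in this thread.

```python
import json

# Hypothetical one-record file; the Fingerprint value is the unquoted one
# from the post above.
raw = '[{"SourceFile": "a.jpg", "Fingerprint": 12109050303470e4}]'

# Digits followed by "e" and more digits are valid JSON number syntax,
# so a standard parser returns a float here.
default = json.loads(raw)[0]["Fingerprint"]
print(type(default).__name__, default)

# Returning the literal text unchanged keeps the original 16 characters.
# Caveat: this turns every number in the document into a string,
# not only Fingerprint.
as_text = json.loads(raw, parse_float=str, parse_int=str)[0]["Fingerprint"]
print(as_text)
```

The hook approach only helps when the JSON is read with Python. jq has no equivalent switch that I know of, so for a jq pipeline the value has to be quoted before it reaches jq.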

Is this a Bash problem, an ExifTool problem, or a problem with my script or config file? I've tried using PrintConv => instead of ValueConv => in my config, but it made no difference.



StarGeek

"It didn't work" isn't helpful. What was the exact command used and the output.
Read FAQ #3 and use that cmd
Please use the Code button for exiftool output

Please include your OS/Exiftool version/filetype


StarGeek

Yes, from the docs on the -csv option
Quote: Note that this option is fundamentally different than all other output format options because it requires information from all input files to be buffered in memory before the output is written. This may result in excessive memory usage when processing a very large number of files with a single command.

Maybe add a leading character to the output from your phash script to force it to be a string. Then you can strip it away later in the process, or maybe during the conversion. A quick search finds this StackOverflow answer, which uses a regex to strip away leading characters. Let's say you prefix a "P" to your script's output. The filter would be something like this, I think:

.Fingerprint  |= sub("^P"; "")
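
For anyone doing the conversion in Python rather than jq, the prefix idea can be sketched as below. This is only an illustration under assumptions: it assumes the fingerprint script was changed to print "P" followed by the hash, that ExifTool then writes the value as a quoted string, and that the JSON is a list of objects with SourceFile and Fingerprint keys. The names strip_prefix and json_to_csv and the sample record are made up for the example.

```python
import csv
import io
import json

PREFIX = "P"  # hypothetical marker, matching the "P" example in the post


def strip_prefix(value, prefix=PREFIX):
    """Remove the marker from a prefixed string; leave anything else alone."""
    if isinstance(value, str) and value.startswith(prefix):
        return value[len(prefix):]
    return value


def json_to_csv(json_text, fields=("SourceFile", "Fingerprint")):
    """Convert a JSON list of objects to CSV text, unprefixing Fingerprint."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(fields)
    for record in json.loads(json_text):
        row = []
        for name in fields:
            value = record.get(name, "")
            # Only the Fingerprint column carries the marker.
            if name == "Fingerprint":
                value = strip_prefix(value)
            row.append(value)
        writer.writerow(row)
    return out.getvalue()


# Made-up sample record in the shape assumed above.
sample = '[{"SourceFile": "a.jpg", "Fingerprint": "P12109050303470e4"}]'
print(json_to_csv(sample))
```

Because the value is already a string when it is read, the fingerprint reaches the CSV as text. A spreadsheet program may still reinterpret it on import, which is a separate issue from the JSON quoting.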

omarserenity

Quote from: StarGeek on June 21, 2024, 10:33:55 AM
Yes, from the docs on the -csv option
Quote: Note that this option is fundamentally different than all other output format options because it requires information from all input files to be buffered in memory before the output is written. This may result in excessive memory usage when processing a very large number of files with a single command.

Maybe add a leading character to the output from your phash script to force it to be a string. Then you can strip it away later in the process, or maybe during the conversion. A quick search finds this StackOverflow answer, which uses a regex to strip away leading characters. Let's say you prefix a "P" to your script's output. The filter would be something like this, I think:

.Fingerprint  |= sub("^P"; "")

Thanks. That's exactly what I was thinking of doing. Thanks, StarGeek

Phil Harvey

In ExifTool 12.88 I'll expand the API StructFormat "JSONQ" setting to quote all JSON values.

- Phil