How To Output JSON using Image::ExifTool Library Module

Started by kennycarruthers, August 22, 2017, 12:37:47 PM


kennycarruthers

I'm trying to use the Image::ExifTool Library Module within a Perl script, but I can't figure out how to get it to generate JSON data. The "Options" field doesn't seem to have any support for setting the output format the way the command line does.

Library Documentation I'm referencing:
https://exiftool.org/ExifTool.html

In the opening example, a hash containing the tag/value pairs is created. But I'd rather have a string created that is similar (or ideally identical) to the JSON that would be produced if you called ExifTool from the command line like so:

exiftool -G -j my_image.jpg

Is that possible?

The reason for this is that I'm doing some preliminary investigation into using PerlEmbed in a C++ program to call through to ExifTool, so that I can compare its performance against spawning multiple processes and piping back and forth to them.

http://perldoc.perl.org/perlembed.html

In the PerlEmbed documentation, they have a very straightforward example of how to call a Perl script that sets a variable to a string, and then extract that string back into a C-string. So if I can get the JSON above created and stored in a Perl string, then I should be able to get it into a C-string pretty easily and then do whatever I want with it in my native code. At least that's the theory...

Phil Harvey

The JSON output option and most other text output formats are features of the application, not the library.  The coding of this is very simple, all except for the routine to quote the values if necessary.  I attach a routine for you to use.  This routine uses some non-public functions of Image::ExifTool::XMP, but these functions aren't likely to change in future versions.

#!/usr/bin/perl -w

use Image::ExifTool;
use Image::ExifTool::XMP;

# lookup for JSON characters that we escape specially
my %jsonChar = ( '"'=>'"', '\\'=>'\\', "\t"=>'t', "\n"=>'n', "\r"=>'r' );

#------------------------------------------------------------------------------
# Escape string for JSON
# Inputs: 0) string, 1) flag to force numbers to be quoted too
# Returns: Escaped string (quoted if necessary)
sub EscapeJSON($;$)
{
    my ($str, $quote) = @_;
    unless ($quote) {
        # JSON boolean (true or false)
        return lc($str) if $str =~ /^(true|false)$/i;
        # JSON/PHP number (see json.org for numerical format)
        # return $str if $str =~ /^-?(\d|[1-9]\d+)(\.\d+)?(e[-+]?\d+)?$/i;
        # (these big numbers caused problems for some JSON parsers, so be more conservative)
        return $str if $str =~ /^-?(\d|[1-9]\d{1,14})(\.\d{1,16})?(e[-+]?\d{1,3})?$/i;
    }
    # encode JSON string as Base64 if necessary
    if (Image::ExifTool::XMP::IsUTF8(\$str) < 0) {
        return '"base64:' . Image::ExifTool::XMP::EncodeBase64($str, 1) . '"';
    }
    # escape special characters
    $str =~ s/(["\t\n\r\\])/\\$jsonChar{$1}/sg;
    # escape other control characters with \u
    $str =~ s/([\0-\x1f])/sprintf("\\u%.4X",ord $1)/sge;
    # JSON strings must be valid UTF8
    Image::ExifTool::XMP::FixUTF8(\$str);
    return '"' . $str . '"';    # return the quoted string
}



- Phil
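
For readers who want to see the "very simple" part of the coding, here is a minimal sketch of one way to build a JSON string from the library, roughly in the spirit of "exiftool -G -j". It is not the application's own code. It swaps in the core JSON::PP module for the escaping instead of the EscapeJSON routine above, so the output is not identical to the command line (numeric-looking strings stay quoted, keys are sorted, and invalid UTF-8 is not Base64-encoded). TagsToJSON and ExtractTags are hypothetical helper names, and the binary-data placeholder text is an assumption.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use JSON::PP;    # core module; used here in place of the EscapeJSON routine

# Build a JSON string from a hash of "Group:Tag" => value pairs.
# (hypothetical helper, not part of Image::ExifTool)
sub TagsToJSON
{
    my ($file, $tags) = @_;
    my %out = (SourceFile => $file);
    foreach my $key (keys %$tags) {
        my $val = $tags->{$key};
        # binary values come back as SCALAR refs, which JSON::PP can't
        # encode, so substitute a short placeholder (assumed wording)
        $val = '(Binary data ' . length($$val) . ' bytes)' if ref $val eq 'SCALAR';
        $out{$key} = $val;
    }
    # no ->utf8 here: the values are assumed to be UTF-8 encoded bytes already
    return JSON::PP->new->canonical->pretty->encode([ \%out ]);
}

# Extract tags from a file, keyed by "Group:Tag" as with the -G option.
# (hypothetical helper; needs Image::ExifTool installed)
sub ExtractTags
{
    my ($file) = @_;
    require Image::ExifTool;        # loaded lazily, only when actually used
    my $et = Image::ExifTool->new;
    $et->Options(Duplicates => 0);  # keep one value per tag name
    $et->ExtractInfo($file);
    my %tags;
    foreach my $tag ($et->GetFoundTags('File')) {
        my $group = $et->GetGroup($tag, 0);             # family 0 group name
        my $name  = Image::ExifTool::GetTagName($tag);  # key without " (N)" suffix
        $tags{"$group:$name"} = $et->GetValue($tag);
    }
    return \%tags;
}

# usage: perl script.pl my_image.jpg
if (@ARGV) {
    my $file = $ARGV[0];
    our $json = TagsToJSON($file, ExtractTags($file));
    print $json;
}
```

Because the result ends up in the package variable $json, an embedding program should be able to read it the same way the perlembed example reads its string variable. If output closer to the command line matters, the JSON::PP call could be replaced by a loop that passes each value through EscapeJSON.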

kennycarruthers

Thanks for the quick reply Phil, much appreciated.

The snippet is working and I have a (very) rough test app running that is using perlembed. I'm far from having a proper implementation given my complete lack of Perl knowledge, but what I do have working is parsing images in roughly half the time compared to invoking exiftool from the command line using the -@ argument to give it an input file containing all file paths.

I'm curious to see how hard it is to get perlembed running separate interpreters in multiple threads. That would help a lot, since this is pretty much CPU-bound rather than I/O-bound when run on an SSD.