command-line export and substitution

Started by pankus, April 06, 2016, 08:14:37 AM

pankus

Dear all,
I'm trying to process a very large collection of pictures tagged with FotoStation. I would like to export the IPTC data to a CSV or JSON file so that I can load it into a DB, and at the same time I would like to reformat some of the values.
For instance, I'm using this command:
exiftool -iptc:all -j DIR > out.txt
It works very well, but two IPTC values need some adjustment. Both DocumentHistory and ExifCameraInfo were filled (in FotoStation) with numerous whitespace and newline characters, and I would like to strip these unneeded characters from my output.
So I tried
exiftool -iptc:all -j DIR -p '${ExifCameraInfo;s/:/__/g}' > out2.txt
but, of course, my out2.txt contains only the ExifCameraInfo values, without any of the other IPTC keys.

What is the correct syntax for applying a regexp substitution from the command line while exporting the output?
Any help would be greatly appreciated.

Phil Harvey

Substitutions on individual tag values are only possible with the -p output formatting, and -p can't be combined with other output format styles such as -j.

The API Filter option will allow you to filter the values of all tags, but this is only useful if a single substitution works for all tags.

I don't understand the substitution you have shown. It replaces each colon with two underscores, but I thought you wanted to get rid of white space. However, the filter option looks like this with the substitution you have given:

exiftool -api filter="s/:/__/g" -iptc:all -j DIR > out.json

Of course, tampering with the colons will mess up the date/time formats, so this specific example certainly won't do what you want.
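
Since the Filter option applies one substitution to every tag, a different route for per-tag cleanup is to leave the -j export untouched and post-process the resulting JSON. The Python sketch below is not an ExifTool feature, only a minimal illustration under a few assumptions: the export was written to out.json as in the command above, the JSON keys are literally named DocumentHistory and ExifCameraInfo (they would differ if group names were added to the output), and the names clean_records, clean_file and out_clean.json are hypothetical.

```python
import json
import re

# Tag names taken from the thread; adjust them to the keys in your JSON.
TAGS_TO_CLEAN = {"DocumentHistory", "ExifCameraInfo"}


def collapse_whitespace(value):
    """Collapse runs of spaces, tabs and newlines into single spaces."""
    if isinstance(value, str):
        return re.sub(r"\s+", " ", value).strip()
    if isinstance(value, list):
        # List-type tags are exported as JSON arrays; clean each item.
        return [collapse_whitespace(item) for item in value]
    return value  # numbers, booleans, etc. are left untouched


def clean_records(records, tags=TAGS_TO_CLEAN):
    """Return copies of the records with only the chosen tags cleaned."""
    return [
        {key: collapse_whitespace(val) if key in tags else val
         for key, val in record.items()}
        for record in records
    ]


def clean_file(src="out.json", dst="out_clean.json"):
    """Read the exiftool -j export, clean it, and write a new file."""
    with open(src, encoding="utf-8") as fh:
        records = json.load(fh)
    with open(dst, "w", encoding="utf-8") as fh:
        json.dump(clean_records(records), fh, indent=2, ensure_ascii=False)
```

Calling clean_file() after the export would write a cleaned copy next to the original. All other keys, including date/time values with their colons, pass through unchanged.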

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

pankus

Thank you so much for your quick answer, Phil.
Of course, the regexp I quickly copied and pasted from my experiments does not represent my true intent...
So, if I understand correctly, with the approach described above the substitution is global. On the other hand, with the -p option, can I apply substitutions to single tags and then choose all the IPTC tags I need, overriding the other output formats?

thank you again

Francesco

Phil Harvey

Hi Francesco,

The -p option specifies the complete output format.  You could use this option instead of -j, but then you would need to specify all tags individually, and format the JSON output yourself.  For example:

exiftool -p my.fmt DIR > out.json

where my.fmt is:

[{
  "SourceFile": "$directory/$filename",
  "ExifCameraInfo": "${ExifCameraInfo;s/:/__/g}",
  "DocumentHistory": "${DocumentHistory;s/ /_/g}",
  "Keywords": "$keywords"
}]


However, this approach is problematic because some values would produce invalid JSON output (e.g. quotes in values), so some tricky filtering would be necessary to avoid this (the API Filter option was designed for this type of thing).
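
The quoting problem can be reproduced outside ExifTool. The short Python sketch below pastes a value between hand-written quotes, the way a -p format file does, and compares the result with what a JSON encoder produces. The tag value is invented for illustration, and is_valid_json is a hypothetical helper.

```python
import json

# Hypothetical tag value that happens to contain double quotes.
value = 'Shot with the "old" lens'

# Hand-built JSON: the value is inserted between literal quotes unescaped.
naive = '[{"DocumentHistory": "' + value + '"}]'

# JSON built by an encoder: the inner quotes are escaped as \".
safe = json.dumps([{"DocumentHistory": value}])


def is_valid_json(text):
    """Return True if the text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False
```

Here is_valid_json(naive) is False while is_valid_json(safe) is True, which is why a hand-written format file needs extra escaping for such values.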

- Phil

pankus

Thank you very much, Phil.
You saved my day.

sincerely,
Francesco