-tagsFromFile through XML misses some tags, sometimes

Started by rjlittlefield, September 02, 2015, 09:45:31 PM


rjlittlefield

This is a follow-up to https://exiftool.org/forum/index.php/topic,6627.msg33056.html .

I implemented the two-phase copy that we discussed there. 

Everything looked good for a while, and then a user reported to me that some of his IPTC tags were not getting preserved.

Here's a summary of what goes wrong (transcript of a Windows cmd session).

Quote
exiftool -ver
10.00

Rem ## The source file contains 4 tags in group XMP-iptcCore.

exiftool -a -G1 -XMP-iptcCore:all source.jpg
[XMP-iptcCore]  Location                        : Southwest National Park
[XMP-iptcCore]  Creator Region                  : Tasmania
[XMP-iptcCore]  Creator Country                 : Australia
[XMP-iptcCore]  Creator Work URL                : www.GrantDixonPhotography.com.au


Rem ## Using -tagsFromFile direct from source to target preserves all 4 tags
Rem ## though it does rearrange them.

copy/y NoExif.jpg target.jpg
        1 file(s) copied.

exiftool -tagsFromFile source.jpg target.jpg
    1 image files updated

exiftool -a -G1 -XMP-iptcCore:all target.jpg
[XMP-iptcCore]  Creator Country                 : Australia
[XMP-iptcCore]  Creator Region                  : Tasmania
[XMP-iptcCore]  Creator Work URL                : www.GrantDixonPhotography.com.au
[XMP-iptcCore]  Location                        : Southwest National Park


Rem ## Two-phase copy through XML preserves only one tag...

copy/y NoExif.jpg target.jpg
        1 file(s) copied.

exiftool -all -makernotes -X -b source.jpg > source_tags.xml

exiftool -tagsFromFile source_tags.xml -all:all target.jpg
    1 image files updated

exiftool -a -G1 -XMP-iptcCore:all target.jpg
[XMP-iptcCore]  Location                        : Southwest National Park

Rem ## ...even though all four tags are present in the XML.

findstr iptcCore source_tags.xml
  xmlns:XMP-iptcCore='http://ns.exiftool.ca/XMP/XMP-iptcCore/1.0/'
<XMP-iptcCore:Location>Southwest National Park</XMP-iptcCore:Location>
<XMP-iptcCore:CreatorRegion>Tasmania</XMP-iptcCore:CreatorRegion>
<XMP-iptcCore:CreatorCountry>Australia</XMP-iptcCore:CreatorCountry>
<XMP-iptcCore:CreatorWorkURL>www.GrantDixonPhotography.com.au</XMP-iptcCore:CreatorWorkURL>


Rem ## Two-phase copy through .jpg instead of XML preserves all 4 tags.

copy/y NoExif.jpg tags.jpg
        1 file(s) copied.

copy/y NoExif.jpg target.jpg
        1 file(s) copied.

exiftool -tagsFromFile source.jpg -all:all tags.jpg
    1 image files updated

exiftool -tagsFromFile tags.jpg -all:all target.jpg
    1 image files updated

exiftool -a -G1 -XMP-iptcCore:all target.jpg
[XMP-iptcCore]  Creator Country                 : Australia
[XMP-iptcCore]  Creator Region                  : Tasmania
[XMP-iptcCore]  Creator Work URL                : www.GrantDixonPhotography.com.au
[XMP-iptcCore]  Location                        : Southwest National Park


The source.jpg and NoExif.jpg files used for this illustration can be downloaded from here:
http://janrik.net/temp/ExifTool/20150902/NoExif.jpg
http://janrik.net/temp/ExifTool/20150902/source.jpg
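
The by-eye comparison in the transcript above could also be scripted. The following is only a sketch in Python, assuming the listing format that -a -G1 produces in the transcript. The function names parse_listing and missing_tags are made up for illustration and are not part of ExifTool. Listings are compared as sets, so the rearranged order after a direct copy does not count as a difference.

```python
def parse_listing(text):
    """Parse lines like '[XMP-iptcCore]  Location  : value' into a set of
    (group, name, value) tuples. Identical duplicate lines collapse to one."""
    tags = set()
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("["):
            continue  # skip anything that is not a '[Group] Name : value' line
        group, _, rest = line[1:].partition("]")
        name, sep, value = rest.partition(":")  # split on the first colon only
        if not sep:
            continue
        tags.add((group.strip(), name.strip(), value.strip()))
    return tags


def missing_tags(source_listing, target_listing):
    """Return tags present in the source listing but absent from the target."""
    return sorted(parse_listing(source_listing) - parse_listing(target_listing))


# Listings copied from the transcript above.
SOURCE = """\
[XMP-iptcCore]  Location                        : Southwest National Park
[XMP-iptcCore]  Creator Region                  : Tasmania
[XMP-iptcCore]  Creator Country                 : Australia
[XMP-iptcCore]  Creator Work URL                : www.GrantDixonPhotography.com.au
"""

AFTER_DIRECT_COPY = """\
[XMP-iptcCore]  Creator Country                 : Australia
[XMP-iptcCore]  Creator Region                  : Tasmania
[XMP-iptcCore]  Creator Work URL                : www.GrantDixonPhotography.com.au
[XMP-iptcCore]  Location                        : Southwest National Park
"""

AFTER_XML_COPY = """\
[XMP-iptcCore]  Location                        : Southwest National Park
"""

if __name__ == "__main__":
    for group, name, value in missing_tags(SOURCE, AFTER_XML_COPY):
        print(f"missing after XML copy: [{group}] {name} = {value}")
```

In practice the two listings would come from running the same exiftool listing command on source.jpg and target.jpg and capturing the output.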

The problem also occurs with a TIFF file from the same user, and I was able to reproduce several variations of it by creating new files and by editing his files in Photoshop.  Omitting -makernotes when generating the XML sometimes solves the problem and sometimes doesn't.  In any event, we know from the other thread that -makernotes is required there to handle certain tags as desired.

I'm reporting this problem mainly so that it's documented.

For my own work, I'm switching strategies to avoid XML.  Using -tagsFromFile between image files seems more reliable.

I hope this is helpful.  Thanks again for your assistance!

Best regards,
--Rik

Phil Harvey

Hi Rik,

Right.  Preserving structured information through the -X output is a bit tricky, and I need to look at this in more detail.  You should add -struct to the -X output to preserve structured information, but this doesn't solve your specific problem because adding -all:all to the import command somehow breaks the import of this XMP-iptcCore structured information.  I will look into this to see if I can fix it.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

Phil Harvey

OK.  I've found the problem.  ExifTool 10.01 (to be released soon) should work for you when importing structured XML, so all you will need to do is add -struct to your -X output.

- Phil

rjlittlefield

Excellent -- many thanks!

Now I would ask for some advice about what strategy I should be using.

For two-phase copy, I really like the idea of using an intermediate format that is human-readable and works for all formats of images.  XML seemed perfect for that.

But now, with very little experience, I've already run into two instances of strange behavior with XML where direct image-to-image copying of tags worked as desired.

Of course new problems can turn up at any time, but I'm wondering what is the difference in risk between the two approaches.

Is going through XML close to exactly the same as going direct image-to-image, or does going through XML involve a lot of different code and perhaps some other limitations/restructurings that I haven't run into yet? 

Phrasing it differently, have I just tripped over a couple of weird cases in getting started, or is going with XML taking me into largely unexplored territory?

Again, thanks!

--Rik



Phil Harvey

Hi Rik,

ExifTool doesn't officially support XML, so reading back the -X output involves a bit of magic to get the tags in the proper places.

Usually a binary format would be preferable, but I see why human readable may be desirable.

Having said this, I think XML should do what you want.   The biggest difference is that you need to get the extraction settings the same as they would be if you were copying tags directly.  When copying, the following options are set internally:

Binary (-b)
CoordFormat (-c "%d %d %.8f")
Duplicates (-a)
RequestAll (-api RequestAll)
StrictDate (-api StrictDate)
Struct (-struct)
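
As a rough sketch of the above, a script that drives the two-phase copy could pass these settings explicitly when writing the XML. Python is used here only for illustration. The function and file names are hypothetical, the option spellings are copied from the list above, and -all, -makernotes, -X and the import command come from the transcript earlier in this thread. Whether this makes the XML route fully equivalent to a direct copy has not been verified here.

```python
import shlex
import subprocess

# Command-line spellings of the options listed above, as given in the post.
COPY_EQUIVALENT_OPTIONS = [
    "-b",                  # Binary
    "-c", "%d %d %.8f",    # CoordFormat
    "-a",                  # Duplicates
    "-api", "RequestAll",  # RequestAll
    "-api", "StrictDate",  # StrictDate
    "-struct",             # Struct
]


def build_export_command(source, exiftool="exiftool"):
    """Argument list for phase 1: dump the tags of `source` as -X XML on stdout."""
    return [exiftool, "-all", "-makernotes", "-X", *COPY_EQUIVALENT_OPTIONS, source]


def build_import_command(xml_path, target, exiftool="exiftool"):
    """Argument list for phase 2: copy tags from the XML file into `target`."""
    return [exiftool, "-tagsFromFile", xml_path, "-all:all", target]


def two_phase_copy(source, xml_path, target):
    """Run both phases; requires exiftool on the PATH (assumed, not checked)."""
    with open(xml_path, "wb") as out:
        subprocess.run(build_export_command(source), stdout=out, check=True)
    subprocess.run(build_import_command(xml_path, target), check=True)


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be inspected first.
    print(shlex.join(build_export_command("source.jpg")))
    print(shlex.join(build_import_command("source_tags.xml", "target.jpg")))
```

Passing the arguments as a list avoids shell quoting issues with the spaces and percent signs in the CoordFormat string.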

BTW, version 10.01 is now available.

- Phil

rjlittlefield

Thanks for the information and analysis. 

For me I think the part about "ExifTool doesn't officially support XML" seals the deal in favor of binary.  I shall go forth and do it that way.

Thanks again!

--Rik