Character set not copied when copying IPTC data between two JPEG files?

Started by Mac2, November 09, 2013, 08:08:05 AM

Mac2

I have a source file with this IPTC data (stripped):

Envelope Record Version : 4
Coded Character Set     : UTF8
Caption-Abstract        : výlet Řevničov - Lašovice


I copy the IPTC record from the source file to the target file with the following ARGs file:

-m
-overwrite_original_in_place
-tagsfromfile
source.jpg
-iptc:all
target.jpg


The resulting Caption-Abstract in target.jpg looks like this:

Caption-Abstract: výlet ?evni?ov - Lašovice


Which is obviously wrong. Note that a) no character set encoding is set in the target file and b) ExifTool reports:

Warning: Some character(s) could not be encoded in Latin

Does ExifTool not copy the UTF8 character set field from the source to the target when the entire IPTC record is copied?
I can fix it by changing the ARGS file to use an explicit output character set:

-m
-overwrite_original_in_place
-tagsfromfile
source.jpg
-iptc:all
-codedcharacterset=utf8
target.jpg


Is this what has to be done (OK for me) or should ExifTool copy the characterset encoding automatically?
To me it looks as if ExifTool retains the character set of the destination file and tries to convert what's in the source file. Failing that, it just copies the text and reports a warning.


Phil Harvey

The CodedCharacterSet is marked as "unsafe" for copying.  See the Notes for CodedCharacterSet in the IPTC Tags documentation for an explanation.

If you are sure the target file doesn't contain an incompatible encoding, then the right thing to do is set CodedCharacterSet to "UTF8" when you copy the IPTC, as you have done in your second args file.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).
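
One way to check that precondition before writing CodedCharacterSet is to look at what the target currently declares. This is only a minimal sketch: the function name show_iptc_encoding and the file name are hypothetical. If no CodedCharacterSet line is printed, the record declares no encoding, which is the situation in which the "could not be encoded in Latin" warning above appeared.

```shell
# Hypothetical helper: show the declared IPTC encoding and a sample text tag.
# -a allows duplicate tags, -G1 prints the group name in front of each tag.
show_iptc_encoding() {
    exiftool -a -G1 -iptc:codedcharacterset -iptc:caption-abstract "$1"
}
```

Calling it as show_iptc_encoding target.jpg before and after the copy makes it easy to see whether the declaration and the caption text ended up consistent.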

Mac2

Hi, Phil

I've read the encoding documentation before posting. Even after re-reading, I'm not sure that I understand all implications...

I copy the entire IPTC record from the source to the target file. Is there other data which may be present in the target file and which may be affected by the IPTC codedCharacterSet? I thought that iptc:all covers all IPTC data, and thus converting the character set on-the-fly while copying is safe.

And is it not more dangerous to retain whatever encoding is present in the target file while copying a UTF-8 IPTC record (that is, when no target encoding is specified)?

herb

Hello Phil, hello Mac2,

sorry to interrupt with a technical question.
I know that option -unsafe also copies all "unsafe" tags.

But does exiftool -tagsfromfile <sourcefile> -iptc:all -unsafe <targetfile>
- copy "all IPTC tags" and "all unsafe" tags or
- copy "all IPTC tags" and "all unsafe tags in IPTC group"?

Thanks for your comments in advance.
Best regards
Herb

Phil Harvey

Mac2: The problem scenario is if the target file contains, say, a lot of Latin-encoded IPTC.  Then if the source file contains, say, only IPTC:Keywords, the encoding of the rest of the target's IPTC tags will be messed up if you write CodedCharacterSet.  In this case, you should probably do something like this:

exiftool -tagsfromfile @ -iptc:all -tagsfromfile SOURCE -iptc:all -codedcharacterset=utf8 DESTINATION

Herb: The Unsafe shortcut only includes unsafe EXIF tags.  You are right that it would probably make sense if it included CodedCharacterSet, but I wasn't thinking about IPTC when I created that shortcut.

- Phil
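
On the point about the Unsafe shortcut: the Notes for CodedCharacterSet say the tag is marked unsafe so that it is not copied by default in a group operation. Assuming that naming the tag explicitly therefore does copy it, a sketch of carrying over the source's own encoding declaration, rather than hard-coding UTF8, might look like the following. The function name copy_iptc_with_charset is hypothetical, and the caveat about Latin-encoded IPTC already present in the destination still applies.

```shell
# Hypothetical helper: copy all IPTC tags plus the (unsafe) encoding declaration.
# -iptc:all on its own skips CodedCharacterSet; naming the tag explicitly
# is what requests it.  $1 = source file, $2 = destination file.
copy_iptc_with_charset() {
    exiftool -tagsfromfile "$1" -iptc:all -iptc:codedcharacterset "$2"
}
```

Usage would be copy_iptc_with_charset src.jpg dst.jpg; it is worth inspecting the result on a test copy first.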

Mac2

Ah, got it.

-iptc:all does not replace the IPTC record in DEST but merges the data from SOURCE. Did not think about this.

Would something like this also work? Delete the IPTC record in DEST and then copy over from SRC and convert to UTF8?

exiftool -iptc:all= -tagsfromfile src.jpg -iptc:all -codedcharacterset=utf8 dst.jpg



Phil Harvey

Yes.  If you want to delete all existing IPTC in the target before copying in the IPTC from the other file, that is the way to do it.

- Phil

Mac2

Very good, thank you.

Would the same pattern apply when copying EXIF data? Does ExifTool here also merge the tags from the SOURCE and the DEST?

Phil Harvey

When copying tags from one file to another, ExifTool never deletes tags in the destination unless they are specifically overwritten by tags that you copy.  The same applies to all metadata formats.

I think your confusion is that when copying, -GROUP:all copies all (writable and not unsafe) individual tags from the specified GROUP.

- Phil

Mac2

Thanks, Phil

So I need to remember that to replace a complete GROUP, I need to include a -GROUP:all= in my command line. I will change my code accordingly because a merge is often not desirable.
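
The "delete the group first, then copy" pattern can be sketched generically. This is only an illustration for JPEG files: the function name replace_group and its argument order are hypothetical, and unsafe tags are still skipped by -GROUP:all, so for IPTC the -codedcharacterset=utf8 argument from earlier in the thread would still have to be added.

```shell
# Hypothetical helper: replace a whole metadata group instead of merging into it.
# $1 = group name (e.g. exif, iptc), $2 = source file, $3 = destination file.
replace_group() {
    # -GROUP:all= deletes the destination's existing tags in that group;
    # -GROUP:all then copies the writable, non-unsafe tags from the source.
    exiftool "-$1:all=" -tagsfromfile "$2" "-$1:all" "$3"
}
```

For example, replace_group exif src.jpg dst.jpg runs exiftool -exif:all= -tagsfromfile src.jpg -exif:all dst.jpg.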

herb

Hello,

please allow an additional (technical) question:
In version 8.94 the following feature was introduced: Added ability to read/write IPTC as a block.

Therefore I thought that IPTC metadata could be replaced as a whole, and I also thought this was done using the option -iptc:all.
But this thread tells me that I am wrong.

What does the command that replaces IPTC as a block look like?

Best regards
Herb

Phil Harvey

Hi Herb,

You're right, I should have mentioned this.  Extracting/copying IPTC as a block is done with -iptc.  See the Extra tags documentation for all of the block tags.

- Phil
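
As a sketch of what that could look like on the command line (the function name copy_iptc_block and the file names are hypothetical), note the bare -iptc, which names the block tag, as opposed to -iptc:all, which names the individual tags. Since the record is transferred as one unit, it would presumably replace the destination's IPTC rather than merge into it, but that is an inference worth verifying on a test file.

```shell
# Hypothetical helper: copy the IPTC record as a single block, not tag by tag.
# $1 = source file, $2 = destination file.
copy_iptc_block() {
    exiftool -tagsfromfile "$1" -iptc "$2"
}
```

Usage would be copy_iptc_block src.jpg dst.jpg.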