Curious IPTC output from an image

Started by ScannerBoy, July 29, 2020, 09:30:58 PM


ScannerBoy

In trying to sort out character sets for an app, I started displaying the metadata with different utilities, one of them being ExifTool.
The image I am using for my tests is from https://en.wikipedia.org/wiki/USS_Ronald_Reagan (top right corner of the page).
It contains a number of text strings embedded in some character encoding, which ExifTool identifies as UTF8, but the strings containing what appear to be UTF-8 characters do not seem to be converted when the results are dumped to the screen.
One example of the output:
Coded Character Set             : UTF8
By-line                         : Photographer├¡s Mate 3rd Class (
-----------------------------------------------==

Presumably this was intended to be displayed as "Photographer's Mate ....", but it also appears to have been badly encoded by the originators.
There are a number of other lines, all with what seem to be identical issues.
When inspecting the actual bytes behind the displayed characters, they are 0xC3 and 0xAD, which, according to what I can find, is the UTF-8 encoding of an i with an acute accent (í).
PS: output is from ExifTool versions 11.44 and 11.90.
Any thoughts?
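
For anyone who wants to reproduce the byte-level check, here is a minimal Python sketch. It assumes the By-line value really contains the bytes 0xC3 0xAD as described above, and it simply decodes that same byte string under UTF-8 and under two common Windows console code pages (cp437 and cp850, picked here as assumptions about what the viewer might have used) to compare what each would display.

```python
# The two bytes reported behind the odd characters, in context.
raw = b"Photographer\xc3\xads Mate"

# Decode the same bytes under several encodings to compare the display.
as_utf8 = raw.decode("utf-8")    # bytes read as UTF-8
as_cp437 = raw.decode("cp437")   # bytes read as the US OEM code page
as_cp850 = raw.decode("cp850")   # bytes read as the Western European OEM code page

for label, text in [("utf-8", as_utf8), ("cp437", as_cp437), ("cp850", as_cp850)]:
    print(f"{label:6}: {text}")
```

Read as UTF-8 the two bytes become a single "í"; read as cp437 or cp850 each byte becomes its own character, which gives the "├¡" pair shown in the output above.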

StarGeek

See FAQ #18.

The Windows command line sucks when it comes to non-ASCII characters.  As a quick fix, I usually add the -L (Latin) option to get the correct output:
C:\>exiftool -g1 -a -s -by-line -L y:\!temp\USSRONALDREAGANgoodshot.jpg
---- IPTC ----
By-line                         : Photographer's Mate 3rd Class (A
"It didn't work" isn't helpful. What was the exact command used and the output.
Read FAQ #3 and use that cmd
Please use the Code button for exiftool output

Please include your OS/Exiftool version/filetype

ScannerBoy

Thank you very much for that information, particularly for the FAQ link. As a casual, once-in-a-while user, this is the first time I have run into this sort of issue.
The one thing still puzzling me is that the output I quoted came from a redirect to a file, which I then opened in Notepad++, which is set to UTF-8 without BOM.

StarGeek

I think the character encoding overall is just messed up for the data in that image.  I'm pretty sure that the data is supposed to be
Photographer’s Mate 3rd Class (A
with a fancy/smart quote.  This is the result I get in IPTC:By-line when I change the code page to 65001.  But the related tags EXIF:Artist and XMP:Creator are still messed up:
---- IFD0 ----
Artist                          : Photographerís Mate 3rd Class (A
---- IPTC ----
By-line                         : Photographer's Mate 3rd Class (A
---- XMP-dc ----
Creator                         : Photographerís Mate 3rd Class (A


When I redirect to a file, Notepad++, which is usually really good with character encoding, can't display any of these correctly.
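
One possible way a smart quote could have ended up as "í" can be sketched in Python. This is only a guess at the history of the data, not something confirmed for this image: it assumes the text was first stored as Windows-1252, then misread as Mac Roman by some tool, and then written out as UTF-8.

```python
# Assumed original text, with U+2019 RIGHT SINGLE QUOTATION MARK.
intended = "Photographer\u2019s Mate"

# Step 1: store the text as Windows-1252, where the smart quote is byte 0x92.
stored = intended.encode("cp1252")

# Step 2 (assumption): a tool misreads those bytes as Mac Roman,
# where byte 0x92 is the letter i with an acute accent.
misread = stored.decode("mac_roman")

# Step 3: the misread text is then saved as UTF-8.
written = misread.encode("utf-8")

print(misread)
print(written.hex())
```

Under those assumptions the final UTF-8 bytes contain 0xC3 0xAD in place of the quote, which would be valid UTF-8 and still the wrong character.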

ScannerBoy

That fancy i with accent is what I get when I look up the UTF-8 byte sequence 0xC3 0xAD at
https://www.utf8-chartable.de/ , which is consistent with the claimed IPTC character set of UTF-8, but it does not display what one would expect  ::)
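
The same lookup can be done locally with Python's standard library instead of a chart. The helper name `describe` below is made up for this illustration; it decodes a hex byte string as UTF-8 and reports each character's code point and Unicode name.

```python
import unicodedata


def describe(hex_bytes: str) -> str:
    """Decode a hex byte string as UTF-8 and name each resulting character."""
    text = bytes.fromhex(hex_bytes).decode("utf-8")
    return ", ".join(f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text)


print(describe("C3 AD"))     # the bytes found in the By-line value
print(describe("C3 AF"))     # a neighbouring sequence, for comparison
print(describe("E2 80 99"))  # UTF-8 for the smart quote one would expect
```

This reports C3 AD as LATIN SMALL LETTER I WITH ACUTE, C3 AF as LATIN SMALL LETTER I WITH DIAERESIS, and E2 80 99 as RIGHT SINGLE QUOTATION MARK, so a correctly encoded UTF-8 smart quote would be three bytes, not two.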