ExifTool Forum

ExifTool => Bug Reports / Feature Requests => Topic started by: bugmenot on December 22, 2021, 12:22:35 AM

Title: Keywords are not written or read properly
Post by: bugmenot on December 22, 2021, 12:22:35 AM
I'm using exiftool 12.38 on Windows 10 and am seeing inconsistent results with JPG Keywords.  I've tried tagging images using both exiftool and the Windows GUI (adding them in this dialog (https://i.imgur.com/HAJcyQl.png)), and each tool reads back a different set of keywords.  My end goal is to read the Keywords in a PHP application, and at this point I haven't ruled out that PHP is the one at fault.

Here's a test script that can demonstrate the problem:

<?php
// Create test image
$im = imagecreate(400, 300);
imagejpeg($im, 'test.jpg');
imagedestroy($im);

// Add tag
echo `exiftool -overwrite_original -keywords+="delete" test.jpg`;

// Load tags with PHP's EXIF extension
ini_set('exif.decode_unicode_motorola', 'UCS-2LE');
$data = exif_read_data('test.jpg');
// exif_read_data() returns false on failure, so check for an array first
if (is_array($data) && array_key_exists('Keywords', $data)) {
    var_dump($data['Keywords']);
} else {
    echo "No keywords found on test.jpg!\n";
}

// Read tags with exiftool
echo `exiftool -keywords test.jpg`;

if (file_exists('windows.jpg')) {
    $data = exif_read_data('windows.jpg');
    if (is_array($data) && array_key_exists('Keywords', $data)) {
        var_dump($data['Keywords']);
    } else {
        echo "No keywords found on windows.jpg!\n";
    }

    // Read tags with exiftool
    var_dump(`exiftool -keywords windows.jpg`);
}


Explanation of what's going on:

windows.jpg is a file I created in Paint.  Then I added a "delete" tag to it with Windows Explorer (https://i.imgur.com/HAJcyQl.png).

Here is the output on my system:
    1 image files updated
No keywords found on test.jpg!
Keywords                        : delete
string(6) "delete"
Warning: [minor] Fixed incorrect URI for xmlns:MicrosoftPhoto - windows.jpg
NULL


When PHP generates a JPG and uses exiftool to add a keyword, PHP can't find the keyword but exiftool can read it.
When Windows writes the keywords, exiftool can't find the keywords but PHP's exif function can.



FWIW, the Windows GUI always shows all tags, regardless of which application added them.  Windows will show tags that are "invisible" to exiftool and tags that are not visible to PHP's exif functions.
Title: Re: Keywords are not written or read properly
Post by: StarGeek on December 22, 2021, 01:02:32 AM
Looking over the php.net entry on exif_read_data (https://www.php.net/manual/en/function.exif-read-data.php), I don't see anything that indicates that exif_read_data can read any metadata other than EXIF.  Keywords is an IPTC tag.  I'm guessing what you need to use is iptcparse (https://www.php.net/manual/en/function.iptcparse.php).

Use the command in FAQ #3 (https://exiftool.org/faq.html#Q3) to find where the tags you're writing are actually located.
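
As a rough sketch of that step from PHP: the FAQ #3 command is, as far as I recall, along the lines of exiftool -a -G1 -s FILE, so treat the exact flag combination as an assumption and check the FAQ itself. listAllTagsCommand() is a hypothetical helper name used only for illustration.

```php
<?php
// Sketch only: build a "list every tag with its group" command for one file.
// listAllTagsCommand() is a hypothetical helper, not part of ExifTool or PHP.
function listAllTagsCommand(string $file): string
{
    // -a: keep duplicate tags, -G1: show the specific group name, -s: short tag names
    return 'exiftool -a -G1 -s ' . escapeshellarg($file);
}

// Running it requires exiftool to be on the PATH:
// echo shell_exec(listAllTagsCommand('test.jpg'));
```

The group name printed in front of each tag should show which block a keyword was written to.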
Title: Re: Keywords are not written or read properly
Post by: bugmenot on December 22, 2021, 04:33:02 AM
Forgive my ignorance, but what's the difference between Keywords and XPKeywords?

Here's the actual source of PHP's parser:
https://github.com/php/php-src/blob/master/ext/exif/exif.c#L3416-L3422
Title: Re: Keywords are not written or read properly
Post by: StarGeek on December 22, 2021, 11:42:43 AM
Keywords is an IPTC tag (https://exiftool.org/TagNames/IPTC.html) which is part of the older IPTC IIM standard.  It has its own location in a file.  In JPEGs it is in the APP13 block.  It is a list-type tag where every entry is completely separate from the others, which, I'm guessing, the iptcparse function would return as an array (verified (https://www.php.net/manual/en/function.iptcparse.php#Vu45360)).  Even though it is an older standard, it still has widespread support among programs.

XPKeywords is a Microsoft tag introduced with Windows XP.  It is located in the EXIF group (https://exiftool.org/TagNames/EXIF.html).  It is a single semicolon-separated string, so you would have to parse out the entries yourself.  It has almost no support in programs beyond the Windows OS.
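
A minimal sketch of that parsing step, assuming the keyword string has already been read into a PHP variable. splitXpKeywords() is a hypothetical helper name, and trimming whitespace around each entry is an assumption about how Windows formats the string.

```php
<?php
// Hypothetical helper: split a semicolon-separated keyword string into a list.
function splitXpKeywords(string $raw): array
{
    // Split on ';' and strip surrounding whitespace from each entry
    $parts = array_map('trim', explode(';', $raw));
    // Drop empty entries left by trailing or doubled semicolons, then reindex
    return array_values(array_filter($parts, fn($p) => $p !== ''));
}

print_r(splitXpKeywords('delete; keep;archive;'));
```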
Title: Re: Keywords are not written or read properly
Post by: bugmenot on December 22, 2021, 11:18:34 PM
In that case, this seems to be a problem of competing standards (https://xkcd.com/927/) so I'll call this not a bug.  This code will read both types, and later the two lists can be merged:

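<?php
$filename = 'test.jpg'; // example path; set this to the image to inspect
$data = @exif_read_data($filename);
var_dump($data['Keywords'] ?? ''); // semicolon-delimited string
getimagesize($filename, $info);
// APP13 is absent when the file has no IPTC block, so fall back to an empty list
print_r(iptcparse($info['APP13'] ?? '')['2#025'] ?? []); // array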
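
The merge itself could look something like this. It is only a sketch: mergeKeywords() is a hypothetical helper name, and the case-sensitive de-duplication is an assumption about what "merged" should mean here.

```php
<?php
// Hypothetical helper: combine the semicolon-delimited EXIF string with the
// IPTC keyword array into one de-duplicated list.
function mergeKeywords(string $xpKeywords, array $iptcKeywords): array
{
    $fromXp = array_map('trim', explode(';', $xpKeywords));
    $all = array_merge($fromXp, array_map('trim', $iptcKeywords));
    // Remove empty entries (e.g. when the EXIF string is empty)
    $all = array_filter($all, fn($k) => $k !== '');
    // array_unique() keeps the first occurrence and compares case-sensitively
    return array_values(array_unique($all));
}

print_r(mergeKeywords('delete;holiday', ['delete', 'beach']));
```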


Thanks for the help StarGeek