DOpus vs. ExifTool long comments

Started by mazeckenrode, March 26, 2021, 03:01:33 PM

mazeckenrode

Me again.  :-)

This issue is somewhat similar to one or more that I've brought up before, but I don't believe I've brought up this specific one before, or can't find it if I have. I may also bring it up in the Directory Opus forum, in case anything useful shakes loose, but I'll start here.

As I've stated in previous posts, I use both DOpus and ExifTool extensively for image metadata, mostly in PNG files. A significant portion of that work involves scanned pages from physical publications such as books and magazines, paper documents, and page images extracted or converted from PDFs. Adding unique and useful metadata to (DOpus field names) DESCRIPTION, SUBJECT, TITLE, COMMENT and TAGS (aka KEYWORDS) (all of which actually go to various EXIF, XMP and IPTC tags) has always been a challenge. ExifTool has been instrumental in streamlining the process, but I also want the commonly displayed metadata fields (mainly COMMENT and DATE TAKEN, for my purposes) to be displayable by DOpus via mouse-hover tooltips. I previously found that I have to add those fields using DOpus before any manipulation by ExifTool: the two tools don't add new metadata in the same location within PNG files, and DOpus won't always display it when created by ExifTool, but ExifTool can edit it wherever it is found.

For page images derived from a 13-page PDF document, my filenames might look like:

2020-12-15 23;59;59 - General - xFGHI-00001 Invoice 1234567890 - 01.png
2020-12-15 23;59;59 - General - xFGHI-00001 Invoice 1234567890 - 02.png
[...]
2020-12-15 23;59;59 - General - xFGHI-00001 Invoice 1234567890 - 13.png


I use DOpus to write the same value to what it calls DESCRIPTION, SUBJECT, TITLE and COMMENT in all 13 files, such as:

Generic Company invoice 1234567890, account ABCDEFGHI-00001, 15 Dec 2020, total due $0.01, due date 5 Jan 2021, p 1/13; File as downloaded: "General_bill_December_15_2020.pdf" (296,175) [13 pp, 4347 w, 20097/23854 ch, 590 l]; Source: <https://www.general.com/gw/bill/docs/getpdf/gndoc?docName=General-Bill-12.15.2020&docId=YNAhNz8sXlzYk3n3dHidUX8hWkmiYZ5R9SbXOkGvcDcfsKLKtI22MkilMpEIdbItYozvYAGlzR0nmgg3Tdu6ZsAL1hxvnosmFcGx1sOSSd3fivVEkSQh2xQOPlDhouAU9yDpaJkhXGvV3vgjKBZWcB6rGbsAo6s6Uo72YGK2tDS8FbwP0PCQaYuknwWo0>

Then DOpus again to add TAGS (aka KEYWORDS), example:

General; Company; PDF; Document; Screenshot; 2020-12-15; 2020; December; Bill; Invoice; $0.01; Due_2021-01-05; Due_2021; Due_January; Account_ABCDEFGHI-00001; Invoice_1234567890; Page_1

Ultimately wanting each page image to have metadata accurately reflecting its own unique page number, I'm using the fairly complicated ExifTool command, created with much help from you guys last summer (thanks again), that gets the page number from the filename (digits following - at the end for this example, though they could be followed by more text in some cases) and uses it to adjust p 1/13 to the appropriate number in DESCRIPTION, SUBJECT, TITLE and COMMENT in all files, and likewise bump Page_1 in TAGS/KEYWORDS.
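
For anyone following along without the original command, the text transformation itself can be sketched in Python. This is only an illustration of the string handling described above, not the ExifTool command from last summer, and it writes no metadata. The function names and the regular expressions are my own assumptions about the filename and comment layout shown in the examples.

```python
import re
from pathlib import PurePath


def page_number_from_filename(filename):
    """Return the page number taken from the last ' - <digits>' in the file stem.

    Assumption: the page number is the final ' - ' followed by digits,
    and any trailing text after it contains no further ' - <digits>'.
    """
    stem = PurePath(filename).stem
    matches = re.findall(r" - (\d+)", stem)
    if not matches:
        raise ValueError(f"no page number found in {filename!r}")
    return int(matches[-1])


def set_page_in_comment(text, page):
    """Rewrite the first 'p N/TOTAL' so that N becomes the given page number."""
    return re.sub(
        r"\bp \d+/(\d+)",
        lambda m: f"p {page}/{m.group(1)}",
        text,
        count=1,
    )


def set_page_in_tags(tags, page):
    """Replace any 'Page_N' keyword with 'Page_<page>', leaving other tags alone."""
    return [f"Page_{page}" if re.fullmatch(r"Page_\d+", t) else t for t in tags]


if __name__ == "__main__":
    name = "2020-12-15 23;59;59 - General - xFGHI-00001 Invoice 1234567890 - 07.png"
    page = page_number_from_filename(name)
    print(set_page_in_comment("due date 5 Jan 2021, p 1/13; File as downloaded", page))
    print(set_page_in_tags(["Invoice_1234567890", "Page_1"], page))
```

Note that the updated string can differ in length from the original by one character (p 1/13 vs. p 10/13), which matches the 928/929 lengths mentioned below.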

The problem is that, in some but not all cases, after I've run the ExifTool command to adjust the page numbers, DOpus won't display the updated COMMENT, neither via mouse-hover tooltip nor via the Set Metadata dialog / Metadata pane. The length of the data string appears to be a factor: right now, I've got a batch of just-updated files with 928/929-character strings in DESCRIPTION, SUBJECT, TITLE and COMMENT, and DOpus won't display COMMENT for them, but much shorter updated strings have continued to be displayed (the threshold seems to be 512/513 characters). ExifTool confirms that the metadata is, in fact, there in the COMMENT, that it's identical to the strings in DESCRIPTION/SUBJECT/TITLE for each file, and that it's at most one character off from the original (and correctly displayed) string written before the ExifTool manipulation. Furthermore, I can manually copy the string from any of the other fields, paste it into COMMENT, and then it will be displayed. But the copied/pasted/displayed string is exactly the same as the MIA one, as far as any method I can use to examine it goes.

Is there some way to examine the before-and-after files in the attached 7-zip for any fundamental differences in the storing of XMP:UserComment and/or EXIF:XPComment from one to the next that could possibly contribute to an understanding of this issue?

Attached: "2021-03-26 14;50;00 - MAZE - MIA Comment Test.7z" (8,385)

Contents:

"2021-03-26 14;50;00 - MAZE - MIA Comment Test\"
  "0 No Meta\"
    "2020-12-15 23;59;59 - General - xFGHI-00001 Invoice 1234567890 - 07.png" (197) [1 x 1 x 1]
  "1 DOpus\"
    "2020-12-15 23;59;59 - General - xFGHI-00001 Invoice 1234567890 - 07.png" (8,179) [1 x 1 x 1]
    "2020-12-15 23;59;59 - General - xFGHI-00001 Invoice 1234567890 - 07.png.json" (6,643)
  "2 DOpus+ExifTool\"
    "2020-12-15 23;59;59 - General - xFGHI-00001 Invoice 1234567890 - 07.png" (11,700) [1 x 1 x 1]
    "2020-12-15 23;59;59 - General - xFGHI-00001 Invoice 1234567890 - 07.png.json" (7,734)


Phil Harvey

Quote from: mazeckenrode on March 26, 2021, 03:01:33 PM
DOpus won't always display it when created by ExifTool

Before I read any further... what version of ExifTool are you using?

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

mazeckenrode

Ah, yes, sorry... currently ExifTool v12.17, but I've encountered this issue maybe half a dozen times since last summer. All involved much smaller batches of PNGs, which didn't quite meet my threshold of bothersomeness to bring it up, until now.

Phil Harvey

I see this problem is years old.

Use the exiftool -v3 option to see the details.  My first post in the other thread essentially explains the problem.

- Phil

mazeckenrode

Quote from: Phil Harvey on March 26, 2021, 09:46:32 PM
I see this problem is years old.
[...]
My first post in the other thread essentially explains the problem.

Thanks, but I'm missing something here. I don't understand the connection with the other thread, nor that thread's explanation as it pertains to the issue raised in this thread.

The 2018 thread deals with metadata imported/created by ExifTool into an otherwise unpopulated PNG being invisible to Directory Opus (explained by ExifTool creating EXIF only in eXIf and not in zTXt, where DOpus expects it).

This [current] thread deals with certain metadata, first created by DOpus, subsequently updated by ExifTool, being invisible to DOpus, but only if its length is greater than 512 characters.

Quote
Use the exiftool -v3 option to see the details.

I have done that now, and apologize for having forgotten about using -v3 since the 2018 thread. Lacking your expertise, my understanding of the output is limited, but what I think I'm seeing is that when DOpus is used to create EXIF:XPComment in PNG files, it creates it in zTXt, and when ExifTool is then used to update that existing field as previously created by DOpus, the field's data then exists in both zTXt and eXIf. This appears to be the case whether or not EXIF:XPComment contains a string greater than 512 characters in length. But in my tests, only the ones with 513+ characters that were updated by ExifTool are invisible to DOpus.
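
As a cross-check on my reading of the -v3 output, the chunk layout of a PNG can also be listed independently of either tool. Below is a minimal Python sketch, assuming only the standard PNG chunk layout (4-byte big-endian length, 4-byte type, data, 4-byte CRC). The function name is made up for illustration, it does not verify CRCs, and it does not decode the EXIF data itself; it only shows which chunks are present, how large they are, and the keyword of any text chunk.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
TEXT_CHUNK_TYPES = {"tEXt", "zTXt", "iTXt"}


def list_png_chunks(data):
    """Return a list of (chunk_type, data_length, keyword) tuples.

    keyword is the null-terminated keyword for tEXt/zTXt/iTXt chunks
    and an empty string for every other chunk type.
    """
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    chunks = []
    pos = 8
    while pos + 8 <= len(data):
        # Each chunk starts with a big-endian length and a 4-byte type.
        length, raw_type = struct.unpack(">I4s", data[pos:pos + 8])
        chunk_type = raw_type.decode("latin-1")
        body = data[pos + 8:pos + 8 + length]
        keyword = ""
        if chunk_type in TEXT_CHUNK_TYPES:
            keyword = body.split(b"\x00", 1)[0].decode("latin-1")
        chunks.append((chunk_type, length, keyword))
        pos += 12 + length  # length + type + data + CRC
        if chunk_type == "IEND":
            break
    return chunks


if __name__ == "__main__":
    import sys
    from pathlib import Path

    for path in sys.argv[1:]:
        print(path)
        for chunk_type, length, keyword in list_png_chunks(Path(path).read_bytes()):
            print(f"  {chunk_type:4}  {length:8}  {keyword}")
```

Running it against the "1 DOpus" and "2 DOpus+ExifTool" copies of the same page should show whether the second file carries both a zTXt chunk and an eXIf chunk, as the -v3 output suggests.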

Another mystery is why other EXIF fields containing the same data didn't become invisible to DOpus, but that's for the DOpus forum.