Group, Tag Name, Tag Id

Started by Mac2, April 21, 2022, 05:14:32 AM

Mac2

In the past, I used Group Name\Tag Name\Tag Id to uniquely identify a tag value in my database.
This was my understanding at the time (many years ago) when I implemented support for ExifTool in my application.

But I understand now that only the group name and tag name are needed, and that the tag id is something ExifTool considers internal and can change at any time. Is this correct?

I'm currently developing a new tool, and identifying tags only by Group Name\Tag Name would be simpler. And simpler is generally better  ;)

Phil Harvey

This is complex due to the MakerNote situation.  There are many locations for some tags in the maker notes, and different tags may have the same group, name and even (in theory) ID.  You would have to add the table name to that list to be sure you have a unique tag.  In general, the ID won't change for any "real" tag because the IDs are usually stored in the file.  The only exception is for tags where the ID isn't stored, like indices into a binary data block -- these may change if ExifTool changes the way it reads the binary data, but again, this generally only happens in MakerNotes.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

Mac2

I was maybe not precise with my wording regarding group / table name.

I use the name attribute of the <table> element in the XML output as the group name, together with the id and name attributes of the <tag> element. I import this into the database when I ask ExifTool to list its tables (groups) and keys with -listx. This is the core. Using this info, I can store data in the database when I extract data from files later. When ExifTool delivers data for a "new" tag not included in the -listx output, I create a new database entry using the table name, tag id and tag name from the XML output. I call these dynamic tags, and I've seen them, e.g., for maker notes, QuickTime metadata, etc.

Anyway, this produces tag keys like XMP::dc\rights\Rights or Exif::Main\37510\UserComment.
When I look at maker notes, ExifTool emits the data with tag keys like Nikon::ShotInfoD6\50372\FlashMasterOutput, which looks the same.
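As a rough illustration of the key scheme described here, the following Python sketch builds Table Name\Tag Id\Tag Name keys from XML shaped like -listx output. The sample XML is hand-written and trimmed to the attributes used (it is not captured ExifTool output), and the function name keys_from_listx is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the shape of `exiftool -listx` output.
# Trimmed to the attributes this sketch needs; not captured output.
SAMPLE_LISTX = """<taginfo>
<table name='XMP::dc'>
 <tag id='rights' name='Rights'/>
</table>
<table name='Exif::Main'>
 <tag id='37510' name='UserComment'/>
</table>
</taginfo>"""


def keys_from_listx(xml_text):
    """Return one 'table\\id\\name' key per <tag> element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    keys = []
    for table in root.iter("table"):
        for tag in table.iter("tag"):
            # Join table name, tag id and tag name with a backslash.
            keys.append("\\".join([table.get("name"), tag.get("id"), tag.get("name")]))
    return keys


listx_keys = keys_from_listx(SAMPLE_LISTX)
for key in listx_keys:
    print(key)
```

In a real import, the XML text would come from running ExifTool with -listx instead of the embedded sample.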

If I interpret what you say correctly, the shortened key Nikon::ShotInfoD6\FlashMasterOutput may not be unique (at least not for standard EXIF/XMP/IPTC... tags)?
It may appear with different ids? Or is there a fourth attribute (table) I should use when producing a unique key?

Phil Harvey

To do this properly, you should be using "table name", "tag id" and "tag index" from the -listx output to match with tags in the -X -t output.

Note that none of these are guaranteed to be consistent between releases.  The only guarantee is that -listx will match -X for a given ExifTool version.
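A minimal sketch of that matching step, in Python. It assumes the -X -t output carries the table name, tag id and optional index as et:table, et:id and et:index attributes on each tag element. The sample below is hand-written for illustration, its namespace URIs are assumptions, and the code matches attributes by local name so that it does not depend on them. A missing index is treated as 0, following the convention Mac2 describes in this thread. The helper names local and lookup_keys are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the shape of `exiftool -X -t` output (illustrative only;
# the namespace URIs here are assumptions, not copied from real output).
SAMPLE_X_T = """<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
<rdf:Description rdf:about='a.jpg'
  xmlns:et='http://ns.exiftool.org/1.0/'
  xmlns:ExifIFD='http://ns.exiftool.org/EXIF/ExifIFD/1.0/'>
 <ExifIFD:UserComment et:id='37510' et:table='Exif::Main'>hello</ExifIFD:UserComment>
</rdf:Description>
</rdf:RDF>"""


def local(name):
    """Strip the '{namespace}' prefix ElementTree puts on qualified names."""
    return name.rsplit("}", 1)[-1]


def lookup_keys(xml_text):
    """Map (table, id, index) -> (tag name, value) for every tagged element."""
    found = {}
    for elem in ET.fromstring(xml_text).iter():
        attrs = {local(k): v for k, v in elem.attrib.items()}
        if "table" in attrs and "id" in attrs:
            # Treat a missing index as "0" (the thread's convention).
            key = (attrs["table"], attrs["id"], attrs.get("index", "0"))
            found[key] = (local(elem.tag), elem.text)
    return found


extracted = lookup_keys(SAMPLE_X_T)
print(extracted)
```

The resulting (table, id, index) tuples can then be looked up against entries imported from -listx for the same ExifTool version.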

- Phil

Mac2

What I have used over the past years is the name attribute of <table> and the id and name attributes of <tag>:

<table name='XMP::dc' g0='XMP' g1='XMP-dc' g2='Other'>
<tag id='creator' name='Creator' type='string' writable='true' flags='List,Seq' g2='Author'>


My tag keys are thus Table Name\Tag Id\Tag Name or Table Name\Tag Id\Tag Name\Index, when the index attribute is included in -listx or -X.
My software assumes Index = 0 by default.
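The key format with its default index could be handled along these lines. This is only a sketch in Python: the function names format_key and parse_key are hypothetical, and it assumes that no table name, tag id or tag name contains a backslash.

```python
SEP = "\\"  # key separator used in this thread


def format_key(table, tag_id, name, index=0):
    """Build a full 'table\\id\\name\\index' key; index defaults to 0."""
    return SEP.join([table, str(tag_id), name, str(index)])


def parse_key(key):
    """Split a key into (table, id, name, index), accepting the 3-part short form."""
    parts = key.split(SEP)
    if len(parts) == 3:
        parts.append("0")  # no index given: assume 0
    if len(parts) != 4:
        raise ValueError(f"unexpected key format: {key!r}")
    table, tag_id, name, index = parts
    return table, tag_id, name, int(index)


print(format_key("XMP::dc", "creator", "Creator"))
print(parse_key("Exif::Main\\37510\\UserComment"))
```

Normalizing every key to the four-part form before storing it means the short and long spellings of the same tag cannot end up as two separate database entries.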

This worked well.
For my new tool, I just wondered whether keys of the form Table Name\Tag Name would be sufficient. Apparently, they are not.

The challenge when doing anything persistent with ExifTool data (e.g. store metadata extracted for years in a database) is to have a unique id / key for each tag.
I need to find and use metadata extracted with ExifTool 5 years ago and yesterday, to make it useful for my users.

Over the years you have changed tag ids and names(?) occasionally. Which is of course your privilege and always made good sense.

Such changes require me to include an upgrade/migration step for my application, in which the existing data in the database is mapped from the "old" to the "new" tag keys. There can only be one key per tag.
Sometimes, to my shame, I did not even notice that tag names / ids were changed (despite reading the release notes)  :-[ until users noticed and reported it.
Either there was nothing in the release notes or I did not understand them correctly.

I'm sure not only my software is affected by this, but many applications which use ExifTool to extract metadata and then store this metadata somewhere.
Your table, tag id, tag name, index schema is ideal for creating persistence keys for database storage. And also for the user side, since keys like XMP::dc\creator\Creator\0 are easy to use for people.

I don't know if there is a better solution to persist tag data extracted with ExifTool for a decade or three.

Or perhaps you could provide a migration history log which lists groups / tags that have been changed over time (from => to), where possible, e.g. not for on-the-fly tag creation.
Software relying on table\tag id\tag name\index keys could then automatically perform migrations when needed.
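To make the idea concrete, here is a small Python sketch of such a migration step. No such log is published by ExifTool as far as this thread shows, so the from => to mapping, the keys in it and the function name migrate are all hypothetical.

```python
def migrate(rows, mapping):
    """Return a copy of rows with old keys renamed according to mapping.

    rows:    dict of tag key -> stored value
    mapping: dict of old tag key -> new tag key (a hypothetical 'from => to' log)
    """
    migrated = {}
    for key, value in rows.items():
        new_key = mapping.get(key, key)  # unchanged keys map to themselves
        if new_key in migrated:
            # Only one key per tag is allowed, so refuse to merge silently.
            raise ValueError(f"migration would create duplicate key: {new_key!r}")
        migrated[new_key] = value
    return migrated


# Made-up keys, purely for illustration.
stored = {
    "Vendor::Table\\1\\OldName\\0": "42",
    "XMP::dc\\creator\\Creator\\0": "Mac2",
}
log = {"Vendor::Table\\1\\OldName\\0": "Vendor::Table\\1\\NewName\\0"}

upgraded = migrate(stored, log)
print(upgraded)
```

Raising on a collision rather than overwriting keeps the "one key per tag" rule visible, so a conflicting rename has to be resolved by hand.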

Then again, AFAIK these changes have only affected rarely used or fringe tags, not the important EXIF/IPTC/XMP/GPS tags, which are the most useful for users.

Phil Harvey

You are doing it the right way.

IPTC, XMP and GPS tags are very stable, and your scheme should work well into the future for these.

This is also true for most useful EXIF tags.  The exceptions shouldn't contain very useful information (eg. StripOffsets).

MakerNotes are a whole other can of worms, and things change as we learn more about the decoding of these.  Providing detailed migration information for these would be a fair bit of work, and not feasible in many situations.

- Phil

Mac2

Thanks, Phil, for taking the time to reply.

Quote from: Phil Harvey on April 21, 2022, 01:43:22 PM
You are doing it the right way.

Excellent. I'll then keep the full keys with table\id\name{\index}.

Quote from: Phil Harvey on April 21, 2022, 01:43:22 PM
MakerNotes are a whole other can of worms, and things change as we learn more about the decoding of these.  Providing detailed migration information for these would be a fair bit of work, and not feasible in many situations.

Agreed. It is how it is.
Maker notes are not used by many users. My software by default hides them and does not import data for most of them. This reduces user confusion and keeps the database small.

However, there are sometimes good reasons to directly access maker notes - because a camera or device vendor does not fill the corresponding EXIF tag or the maker notes hold 'better' data.
I fear this will become even worse for video files.

I usually recommend that users copy the maker note data they're interested in into a more stable tag, e.g., one of the XMP text fields or an Attribute (a concept of my software).
Users directly working with maker notes should be aware of the fact that maker notes are both useful and potentially evil at the same time. And that their tag keys may change over time. ;D