How to uniquely identify tag ID

Started by mattburns, January 30, 2017, 11:43:28 AM


mattburns

Is there a way to format the output so that I can generate a tag ID that is globally unique? The Tag ID is only unique if you know the folder/group it's in.

For example, if I use:

exiftool -j -U -G -H -a image

Then I get things like:


  ...
  "EXIF:Make": {
    "id": "0x010f",
    "val": "NIKON"
  },
  ...
  "EXIF:GPSTimeStamp": {
    "id": "0x0007",
    "val": "00:00:00"
  },
  ...


In the example above, 0x0007 is only useful if you know I'm talking about the GPS tags. I can use the output above to get the group name and tag name, but I can't rely on those names staying consistent, because they change once an unknown tag is identified. For example:


  "MakerNotes:Nikon_UnknownInfo_0x0003": {
    "id": "0x0003",
    "val": 0
  },


This may change in newer versions of ExifTool.

Ideally, I'd like a way to output the "fullID" or "fullPath" of the tag which effectively includes the address, or some kind of namespace. Something like this:


  "MakerNotes:Nikon_UnknownInfo_0x0003": {
    "fullID": "0x8769:0x927c:0x002c:0x0003",
    "fullPath": "ExifOffset:MakerNoteNikon2:UnknownInfo:0x0003",
    "id": "0x0003",
    "val": 0
  },


Any ideas for how I could achieve anything like that?

Phil Harvey

A tag is uniquely identified in the -X -t output by a combination of the "et:id", "et:table" and "et:index" properties.

The -t feature doesn't yet work with the JSON output.

- Phil

Edit:  I'll add this feature to the -j and -php outputs for ExifTool 10.41
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

mattburns

Thanks Phil,

That's very helpful.

However, if I have the output:


<Nikon:Nikon_UnknownInfo_0x0003 et:id='0x0003' et:table='Nikon::UnknownInfo'>0</Nikon:Nikon_UnknownInfo_0x0003>


Then I can concatenate et:table and et:id to make an ID, but wouldn't that be prone to changing in subsequent versions of ExifTool? eg:


  Nikon::UnknownInfo::0x0003


"UnknownInfo" could be renamed later?

Phil Harvey

Ouch.  You want something that is consistent across ExifTool versions.  Not possible.  Tags may move into different tables as more information is decoded, which will change all of the coordinates for the tag (id, table and index).

- Phil

mattburns

OK, fair enough, thanks Phil. I'll just prefix my IDs with the ExifTool version number, that'll do for now.
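A minimal sketch of such a version-prefixed key, assuming the -X -t line quoted earlier in this thread is available as a string. The function name tag_key, the "|" separator and the version string "10.40" are placeholders chosen for illustration, not anything ExifTool defines:

```python
import re

# Pulls the et:id, et:table and et:index attributes out of one element
# line of the -X -t output (attribute names as quoted in this thread).
ATTR_RE = re.compile(r"et:(id|table|index)='([^']*)'")

def tag_key(xml_line, exiftool_version):
    """Build a key that is only meaningful for one ExifTool version."""
    attrs = dict(ATTR_RE.findall(xml_line))
    parts = [exiftool_version, attrs["table"], attrs["id"]]
    # et:index is only present for some tags, so it is appended when found.
    if "index" in attrs:
        parts.append(attrs["index"])
    # "|" is used because table names already contain "::".
    return "|".join(parts)

line = ("<Nikon:Nikon_UnknownInfo_0x0003 et:id='0x0003' "
        "et:table='Nikon::UnknownInfo'>0</Nikon:Nikon_UnknownInfo_0x0003>")
print(tag_key(line, "10.40"))  # version string is a placeholder
```

Keys built this way are not comparable across ExifTool versions, which is the limitation Phil describes above.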

Crazy idea: Would it be worth me attempting to write a script to try to do a reverse lookup of tag names to the tag IDs from the "-listx" xml database? Or is that also impossible?

Thanks again for your time :)

Phil Harvey

The design of the -X output is that you can use this information to find the exact tag in the -listx database.

I don't know exactly what you mean by a reverse lookup.  Maybe this is what you meant?

- Phil
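A rough sketch of that kind of lookup in Python. The XML fragment below is hand-written for illustration and is not real -listx output; the element and attribute names (table/name, tag/id, tag/index) and the decimal form of the IDs are assumptions that should be checked against the actual -listx database. build_index, lookup and normalize_id are hypothetical helper names:

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for a small piece of the -listx database (assumed shape).
LISTX_SAMPLE = """<taginfo>
<table name='Exif::Main' g0='EXIF' g1='IFD0' g2='Image'>
 <tag id='271' name='Make' type='string' writable='true'/>
</table>
<table name='GPS::Main' g0='EXIF' g1='GPS' g2='Location'>
 <tag id='7' name='GPSTimeStamp' type='rational64u' writable='true'/>
</table>
</taginfo>"""

def normalize_id(tag_id):
    # -H prints hex IDs such as 0x0007; the sample database uses decimal.
    return str(int(tag_id, 16)) if tag_id.lower().startswith("0x") else tag_id

def build_index(listx_xml):
    """Map (table, id, index) to the tag name."""
    index = {}
    for table in ET.fromstring(listx_xml).iter("table"):
        for tag in table.iter("tag"):
            key = (table.get("name"), normalize_id(tag.get("id")), tag.get("index"))
            index[key] = tag.get("name")
    return index

def lookup(index, table, tag_id, tag_index=None):
    """Return the tag name for the coordinates from -X -t, or None."""
    return index.get((table, normalize_id(tag_id), tag_index))

index = build_index(LISTX_SAMPLE)
print(lookup(index, "GPS::Main", "0x0007"))
print(lookup(index, "Exif::Main", "0x010f"))
```

The index has to be rebuilt from the -listx output of the same ExifTool version that produced the -X output, since tags can move between tables.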

rur

Although very late to the party, this topic seems to be the best place to present an alternative to the "reverse lookup" approach described by user mattburns. It also addresses repeated values that go missing even when the -a option is used.

I came across an issue using the -csv option on a particular mp4 file
exiftool -a -HandlerClass -HandlerType -HandlerDescription  test.mp4
produces this result:
Handler Class                  : Media Handler
Handler Class                  : Data Handler
Handler Type                    : Video Track
Handler Type                    : URL
Handler Description            : VideoHandler
Handler Description            : DataHandler

This produces two "Handler" struct results, but no clue as to their context

exiftool -a -HandlerClass -HandlerType -HandlerDescription -csv test.mp4
produces this result:
SourceFile,HandlerClass,HandlerType,HandlerDescription
test.mp4,Data Handler,URL,DataHandler

Notice that only one "Handler" struct result is produced

exiftool -a -G0:1:2:3:4:5:6:7:8:9 -HandlerClass -HandlerType -HandlerDescription test.mp4
produces this result:
[QuickTime:Track1:Video:Main:Copy1:MOV-Movie-Track-Media-Handler:ID-4] Handler Class: Media Handler
[QuickTime:Track1:Video:Main:MOV-Movie-Track-Media-MediaInfo-Handler:ID-4] Handler Class: Data Handler
[QuickTime:Track1:Video:Main:Copy1:MOV-Movie-Track-Media-Handler:ID-8] Handler Type: Video Track
[QuickTime:Track1:Video:Main:MOV-Movie-Track-Media-MediaInfo-Handler:ID-8] Handler Type: URL
[QuickTime:Track1:Video:Main:Copy1:MOV-Movie-Track-Media-Handler:ID-24] Handler Description: VideoHandler
[QuickTime:Track1:Video:Main:MOV-Movie-Track-Media-MediaInfo-Handler:ID-24] Handler Description: DataHandler


This provides the hierarchical navigation path, and with the same -G option the -csv output uses these six group-qualified names as column headers, correctly producing the desired (i.e. complete) results.

As you can see, the tagID is duplicated and a more complete path is required to make it unique (in this case -G0:1:2:3:4 would suffice)

So I have a unique identifier using this approach. However, I was expecting/hoping for the multiple values to be included in a "comma, separated" list for the plain HandlerClass,HandlerType,HandlerDescription columns. Is there a way to produce such a list as the column value?

Further, I do not grok the change in syntax with "MOV-Movie-Track-Media-MediaInfo-Handler" and "Copy1:MOV-Movie-Track-Media-Handler". It appears to be a simple replacement of ":" with "-" to squeeze deep structures together (with some redundancy). Is there any reference that explains this format?

Finally, perhaps the -j json output could be produced with some additional switches, such as -jh for hierarchical json (instead of a vector of scalar structs). Perhaps a -jp option could include an additional attribute, e.g. et:path, to render the complete json hierarchical path to the value. The XML output could mimic this approach as well.

Phil Harvey

Quote from: rur on February 20, 2025, 10:50:44 AM
I was expecting/hoping for the multiple values to be included in a "comma, separated" list for the plain HandlerClass,HandlerType,HandlerDescription columns. Is there a way to produce such a list as the column value?

No.  The best you can do here is to use -G4 -a and the duplicates will have separate columns.
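If one joined value per tag is still needed, it can be built outside ExifTool. Below is a minimal Python sketch of such post-processing (not an ExifTool feature), assuming the bracketed -G... -a text output shown earlier in the thread has been captured as a string. SAMPLE and join_duplicates are names made up for this illustration:

```python
import re

# Lines copied from the -G0:1:2:3:4:5:6:7:8:9 output quoted in this thread.
SAMPLE = """\
[QuickTime:Track1:Video:Main:Copy1:MOV-Movie-Track-Media-Handler:ID-4] Handler Class: Media Handler
[QuickTime:Track1:Video:Main:MOV-Movie-Track-Media-MediaInfo-Handler:ID-4] Handler Class: Data Handler
[QuickTime:Track1:Video:Main:Copy1:MOV-Movie-Track-Media-Handler:ID-8] Handler Type: Video Track
[QuickTime:Track1:Video:Main:MOV-Movie-Track-Media-MediaInfo-Handler:ID-8] Handler Type: URL
"""

# [group path] Tag Description: value
LINE_RE = re.compile(r"^\[([^\]]+)\]\s*([^:]+?)\s*:\s*(.*)$")

def join_duplicates(output, sep=", "):
    """Collect every value seen for a tag and join them in output order."""
    merged = {}
    for line in output.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip blank or unexpected lines
        _group, tag, value = match.groups()
        merged.setdefault(tag, []).append(value)
    return {tag: sep.join(values) for tag, values in merged.items()}

for tag, joined in join_duplicates(SAMPLE).items():
    print(f"{tag}: {joined}")
```

Values that themselves contain the separator would become ambiguous, so a different sep may be safer for real data.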

Quote
Further, I do not grok the change in syntax with "MOV-Movie-Track-Media-MediaInfo-Handler" and "Copy1:MOV-Movie-Track-Media-Handler". It appears to be a simple replacement of ":" with "-" to squeeze deep structures together (with some redundancy). Is there any reference that explains this format?

The group names are described in various places (here for example).  The family 5 group names are composed of a hyphen-separated listing of the hierarchy of where the tag was stored.  This isn't documented further because it is sort of an experimental feature.
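To make the layout concrete, here is a heuristic Python sketch that splits one of the bracketed group strings from the output above. It assumes that a "Copy#" component is the family 4 instance, that an "ID-..." component carries the tag ID, and that the remaining hyphenated component is the family 5 path. These rules are inferred from the QuickTime example only, and split_group is a hypothetical helper; family 1 names such as XMP-dc also contain hyphens and would need extra handling:

```python
import re

def split_group(group):
    """Split a '[a:b:c]' group string into rough parts (heuristic)."""
    info = {"copy": None, "path": [], "tag_id": None, "other": []}
    for part in group.strip("[]").split(":"):
        if re.fullmatch(r"Copy\d+", part):
            info["copy"] = part              # instance number, e.g. Copy1
        elif part.startswith("ID-"):
            info["tag_id"] = part[3:]        # tag ID without the ID- prefix
        elif "-" in part:
            info["path"] = part.split("-")   # hyphen-separated hierarchy
        else:
            info["other"].append(part)       # remaining group names
    return info

first = split_group(
    "[QuickTime:Track1:Video:Main:Copy1:MOV-Movie-Track-Media-Handler:ID-4]")
second = split_group(
    "[QuickTime:Track1:Video:Main:MOV-Movie-Track-Media-MediaInfo-Handler:ID-4]")
print(first["path"])
print(second["path"])
```

Seen this way, the two Handler Class values differ in their path (Media-Handler versus Media-MediaInfo-Handler), and "Copy1" marks one of them as a repeated instance of the same tag name.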

Quote
Finally, perhaps the -j json output could be produced with some additional switches, such as -jh for hierarchical json (instead of a vector of scalar structs). Perhaps a -jp option could include an additional attribute, e.g. et:path, to render the complete json hierarchical path to the value. The XML output could mimic this approach as well.

Various other options may be combined with -j to achieve results similar to what you are requesting.  See the -j option documentation for details.

- Phil