UserDefined Parsing to List of Structs?

Started by LegacyDev, May 05, 2023, 05:54:21 PM


LegacyDev

Hello,

I am extending the LNK Module to fix errors and read the TargetIDList structure from shell link files.
Everything I need is now being parsed correctly from the LNK files, but I am struggling to find the correct UserDefined syntax to produce output like the following sample XMP output for an image (formatted as JSON):
    ...
    "History": [{
      "Action": "saved",
      "Changed": "/metadata",
      "InstanceID": "xmp.iid:aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
      "SoftwareAgent": "Adobe Photoshop Camera Raw 14.5",
      "When": "2022:08:30 09:42:16-05:00"
    },{
      "Action": "saved",
      "Changed": "/metadata",
      "InstanceID": "xmp.iid:aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
      "SoftwareAgent": "Adobe Photoshop Camera Raw 14.5 (Windows)",
      "When": "2023:03:31 10:52:58-05:00"
    }]
    ...

Below is the command line I am using on the LNK file:
exiftool -config TEST.config -g -a -struct -json -all:all "Shortcut1.lnk" -v

And here is the abbreviated output:
  ExifToolVersion = 12.30
  FileName = Shortcut1.lnk
  Directory = .
  ...
  FileType = LNK
  FileTypeExtension = LNK
  MIMEType = application/octet-stream
  + [BinaryData directory, 76 bytes]
  | Flags = 524443
  | FileAttributes = 32
  ...
  | IconIndex = 0
  | RunWindow = 1
  | HotKey = 0
  TargetIDList (SubDirectory) -->
  + [IDList directory, 655 bytes]
  | TargetIDList_ROOT (SubDirectory) -->
  | + [BinaryData directory, 20 bytes]
  | | ItemType = 31
  | | SortIndex = 80
  | | CLSID = 224 79 208 32 234 58 105 16 162 216 8 0 43 48 48 157
  | TargetIDList_VOLUME (SubDirectory) -->
  | + [BinaryData directory, 25 bytes]
  | | ItemType = 47
  | | Name = T:\
  | TargetIDList_FILE (SubDirectory) -->
  | + [BinaryData directory, 108 bytes]
  | | ItemType = 49
  | | FileSize = 4096
  | | PrimaryName = AAAAAAAAAAAAAA
  | TargetIDList_FILE (SubDirectory) -->
  | + [BinaryData directory, 90 bytes]
  | | ItemType = 49
  | | FileSize = 4096
  | | PrimaryName = BBBBBBBB
  | TargetIDList_FILE (SubDirectory) -->
  | + [BinaryData directory, 90 bytes]
  | | ItemType = 49
  | | FileSize = 0
  | | PrimaryName = CCCCCCCC
  | TargetIDList_FILE (SubDirectory) -->
  | + [BinaryData directory, 150 bytes]
  | | ItemType = 49
  | | FileSize = 327680
  | | PrimaryName = DDDDDDDDDDDDDDDDDDDDDDDDDDDD
  | TargetIDList_FILE (SubDirectory) -->
  | + [BinaryData directory, 170 bytes]
  | | ItemType = 50
  | | FileSize = 7633639
  | | PrimaryName = EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE.TXT
  ...
[{
  "SourceFile": "Shortcut1.lnk",
  "ExifTool": {
    "ExifToolVersion": 12.30
  },
  "File": {
    "FileName": "Shortcut1.lnk",
    "Directory": ".",
    ...
    "FileType": "LNK",
    "FileTypeExtension": "lnk",
    "MIMEType": "application/octet-stream"
  },
  "LNK": {
    "Flags": "IDList, LinkInfo, RelativePath, WorkingDir, Unicode, TargetMetadata",
    "FileAttributes": "Archive",
    ...
    "IconIndex": "(none)",
    "RunWindow": "Normal",
    "HotKey": "(none)",
    ...
  },
  "LNKExt": {
    "SortIndex": "0x50",
    "CLSID": "224 79 208 32 234 58 105 16 162 216 8 0 43 48 48 157",
    "Name": "T:\\",
    "ItemType": "0x32",
    "FileSize": 7633639,
    "PrimaryName": "EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE.TXT"
  }
}]

As you can see from the verbose output, all of the ItemID members of the IDList are parsed correctly from the LNK file.  However, the members are pulled out of the individual items and placed in the main "LNKExt" group after parsing.  Is there a way to group them similar to the XMP sample output provided at the top?  I have tried all sorts of different things using Structs/SubDirectories/etc, but nothing seems to produce the output I'm looking for.
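
One way to picture why the repeated members collapse: if a group is written as a single JSON object keyed by tag name, repeated names such as ItemType, FileSize and PrimaryName can only keep one value each, whereas a list of per-item hashes keeps them all. Below is a minimal plain-Perl sketch of that difference. It is not ExifTool code and makes no claim about ExifTool's internals; the @items data is a hand-typed stand-in for a few of the parsed items, and %flat and %structured are hypothetical names used only for illustration.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical stand-in for a few parsed ItemID members
# (values taken from the verbose output above).
my @items = (
    { ItemType => '0x1F', SortIndex => '0x50' },
    { ItemType => '0x2F', Name => 'T:\\' },
    { ItemType => '0x31', FileSize => 4096, PrimaryName => 'AAAAAAAAAAAAAA' },
    { ItemType => '0x32', FileSize => 7633639,
      PrimaryName => 'EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE.TXT' },
);

# Flat storage: one slot per tag name, so later items overwrite earlier ones.
my %flat;
for my $item (@items) {
    $flat{$_} = $item->{$_} for keys %$item;
}

# Structured storage: one hash per item, kept in the original order.
my %structured = ( TargetIDList => [ @items ] );

my $json = JSON::PP->new->canonical->pretty;
print $json->encode(\%flat);        # a single ItemType/FileSize/PrimaryName survives
print $json->encode(\%structured);  # every item keeps its own members
```

The flat hash ends up with only the last item's ItemType, FileSize and PrimaryName, which matches the shape of the "LNKExt" group shown above, while the structured form matches the list-of-structs layout being asked for.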

Here is the start of my TEST.config file:
%Image::ExifTool::UserDefined = (
    'Image::ExifTool::LNK::Main' => {
        0x10000 => {
            Name => 'TargetIDList',
            SubDirectory => { TagTable => 'Image::ExifTool::UserDefined::LNKExt::TargetIDList' },
        },
    },
);

Below is %Image::ExifTool::UserDefined::LNKExt::TargetIDList, which is a modified copy of %Image::ExifTool::LNK::ItemID:
%Image::ExifTool::UserDefined::LNKExt::TargetIDList = (
    GROUPS => { 0 => 'LNKExt', 1 => 'LNKExt', 2 => 'Other' },
    PROCESS_PROC => \&ProcessItemID,
    0x1 => { # ROOT
        Name => 'TargetIDList_ROOT',
        SubDirectory => { TagTable => 'Image::ExifTool::UserDefined::LNKExt::TargetIDList::Item_ROOT' },
    },
    0x2 => { # VOLUME
        Name => 'TargetIDList_VOLUME',
        SubDirectory => { TagTable => 'Image::ExifTool::UserDefined::LNKExt::TargetIDList::Item_VOLUME' },
    },
    0x3 => { # FILE
        Name => 'TargetIDList_FILE',
        SubDirectory => { TagTable => 'Image::ExifTool::UserDefined::LNKExt::TargetIDList::Item_FILE' },
    },
);
I'm not sure how I should set up %Image::ExifTool::UserDefined::LNKExt::TargetIDList or ProcessItemID to produce a list of items like the example at the top. Looking through the modules for various formats, it appears that every module that outputs in that format is XMP-based. Is there any way to produce this output for non-XMP, custom-parsed data?

For clarity, here is an example of how the output would ideally look after parsing the LNK TargetIDList:
  ...
  "LNKExt": {
    "TargetIDList": [{
      "ItemType": "0x1F",
      "SortIndex": "0x50",
      "CLSID": "224 79 208 32 234 58 105 16 162 216 8 0 43 48 48 157"
    },{
      "ItemType": "0x2F",
      "Name": "T:\\"
    },{
      "ItemType": "0x31",
      "FileSize": 4096,
      "PrimaryName": "AAAAAAAAAAAAAA"
    },{
      "ItemType": "0x31",
      "FileSize": 4096,
      "PrimaryName": "BBBBBBBB"
    },{
      "ItemType": "0x31",
      "FileSize": 0,
      "PrimaryName": "CCCCCCCC"
    },{
      "ItemType": "0x31",
      "FileSize": 327680,
      "PrimaryName": "DDDDDDDDDDDDDDDDDDDDDDDDDDDD"
    },{
      "ItemType": "0x32",
      "FileSize": 7633639,
      "PrimaryName": "EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE.TXT"
    }]
  }
  ...

Once I get everything working properly, I will gladly share the updated LNK module for inclusion in ExifTool.

Phil Harvey

Thanks.  It will take me at least a few days to get the time to look at this in detail and answer your questions.

- Phil

Edit:  (or a few weeks -- the nice weather has arrived!)
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

Phil Harvey

Sorry for the long delay in responding.

What you are asking is possible, but would require writing dedicated code to be able to parse the input into a structured format.  Currently XMP is the only metadata type with built-in structure support.
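
To give a rough idea of the parsing half of such dedicated code, here is a minimal plain-Perl sketch that splits an IDList buffer into one hash per ItemID. It assumes the layout described in MS-SHLLINK (each ItemID begins with a 2-byte little-endian size that includes the size field itself, and a size of 0 terminates the list). Treating the first data byte as the item type is an assumption based on the verbose output earlier in this thread. The function name parse_id_list and the sample buffer are hypothetical, and how the resulting list would be handed to ExifTool as a structure is not covered here.

```perl
use strict;
use warnings;

# Split an IDList buffer into one hash per ItemID.
# Assumed layout: 2-byte little-endian size (including the size field itself),
# then the item data; a size of 0 marks the end of the list.
sub parse_id_list {
    my ($buf) = @_;
    my $pos = 0;
    my @items;
    while ($pos + 2 <= length($buf)) {
        my $size = unpack('v', substr($buf, $pos, 2));
        last if $size == 0;                                   # terminator
        last if $size < 2 or $pos + $size > length($buf);     # truncated or corrupt
        my $data = substr($buf, $pos + 2, $size - 2);
        push @items, {
            Size     => $size,
            # Assumption: the first data byte identifies the item type
            ItemType => length($data) ? sprintf('0x%.2X', ord($data)) : undef,
            Data     => $data,
        };
        $pos += $size;
    }
    return \@items;
}

# Hypothetical two-item list followed by the terminator
my $buf = pack('v C a3', 6, 0x2F, "T:\\")
        . pack('v C a2', 5, 0x31, 'AB')
        . pack('v', 0);
my $items = parse_id_list($buf);
printf "%s (%d bytes)\n", $_->{ItemType}, $_->{Size} for @$items;
```

Each element of the returned list corresponds to one entry of the desired "TargetIDList" array; decoding the members of each item (Name, FileSize, PrimaryName and so on) would still have to be done per item type.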

I see why you might want to do this, but it will be a bit of work.

I should be able to help more if you could provide a sample file containing information such as in your example.

- Phil