please allow users to get and set tags by ID instead of just by name

Started by dae65, July 24, 2020, 02:01:11 AM


dae65

Dear Phil,

As a follow-up to the other thread, I've written a wrapper SetNewValueByID() method and a FindTagInfoByID function which together enable SetNewValue() to set tag values by ID, instead of just by name. The wrapper simply ensures that a group is provided, and that a new ByID option is true. In other words, these would be equivalent:


$et->SetNewValue     ( '©alb', 'My Album', Group => 'ItemList', ByID => 1 );
$et->SetNewValueByID ( '©alb', 'My Album', Group => 'ItemList' );


I was wondering if you'd be able to have a look at it (attached). This seems to work as far as the ItemList and UserData groups are concerned, and I suppose it will work for any other group where tag IDs are unique within a table. I understand that a solution for some other tables isn't going to be as straightforward. Maybe, for the time being, it could issue a warning against using it with such groups?

The rationale behind this is to let users determine more easily which underlying tags will actually be written to, or read from, a file. Where tag names are ambiguous in a single table (e.g., 'Title' in QuickTime ItemList), users will be able to pick a particular tag ID (titl or ©nam?) if it matters which one ends up being used. The other solution (here) works too, but I think this one will be seen as less complicated from a user's point of view.

I hope you like the idea. Please find a diff file attached. I tried to meddle as little as possible with your code. Image::ExifTool is a great tool. Thank you for the tremendous amount of work you have put into it.

Best.

Phil Harvey

I don't like this solution because it doesn't work for the command line.  But something has to be done here so I will think about this.  I'm thinking along the lines of making a new group which is the tag ID, and allowing you to specify that group when writing.  This would work fine except that the tagID's don't follow the ExifTool convention for group names (/^[A-Z][-_A-Za-z0-9]+$/), but I think I can work around this, maybe by converting other characters to hex and adding a leading "ID_" or something to the group name.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

dae65

Quote from: Phil Harvey on July 25, 2020, 06:49:22 AM
I don't like this solution because it doesn't work for the command line.
Not yet... ;) This solely depends on how command-line arguments are parsed. Preferably, we could also add some --set-tag-by-id <ID> <VALUE> command-line action/option, and thereby preserve -TAG= as it currently is.

Quote from: Phil Harvey on July 25, 2020, 06:49:22 AM
But something has to be done here so I will think about this.
Thanks. :) I really appreciate it.

Quote from: Phil Harvey on July 25, 2020, 06:49:22 AM
I'm thinking along the lines of making a new group which is the tag ID, and allowing you to specify that group when writing.
I'm afraid tag IDs would thereby become ambiguous, just like tag names already are, across underlying tables: ©alb, for instance, appears under both udta and ilst atoms (or in both UserData and ItemList groups). Either such a new table would have to replicate that on yet another level, or users would be required to provide a group or subgroup, just as with our provisional SetNewValueByID above.
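To illustrate the ambiguity described above, here is a minimal sketch of a lookup keyed by group first and tag ID second. The %tables hash is a hypothetical miniature, not ExifTool's real tag tables, and find_tag_by_id and groups_for_id are names made up for this example.

```perl
use strict;
use warnings;

# Hypothetical miniature of two QuickTime tag tables, keyed by group and
# then by raw tag ID. The real tables live inside Image::ExifTool.
my %tables = (
    ItemList => { "\xa9alb" => 'Album', 'albm' => 'Album' },
    UserData => { "\xa9alb" => 'Album', 'albm' => 'Album' },
);

# Return the tag name for a (group, ID) pair, or undef if it is unknown.
sub find_tag_by_id {
    my ($group, $id) = @_;
    return undef unless defined $group and exists $tables{$group};
    return $tables{$group}{$id};
}

# Return every group in which a bare ID occurs, sorted by name.
sub groups_for_id {
    my ($id) = @_;
    my @groups = sort grep { exists $tables{$_}{$id} } keys %tables;
    return @groups;
}

# A bare ID matches more than one group, so a group is still required.
print join(', ', groups_for_id("\xa9alb")), "\n";
print find_tag_by_id('UserData', "\xa9alb"), "\n";
```

The point of the sketch is only that an ID alone does not identify a tag; the (group, ID) pair does.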

Honestly, I guess the idea I suggested above would be easier to implement, as it already makes use of the existing tables from Image::ExifTool::TagLookup, and of the existing machinery of SetNewValue. All we would need to do is write a few additional "byID" wrapper methods.

Quote from: Phil Harvey on July 25, 2020, 06:49:22 AM
This would work fine except that the tagID's don't follow the ExifTool convention for group names (/^[A-Z][-_A-Za-z0-9]+$/), but I think I can work around this, maybe by converting other characters to hex and adding a leading "ID_" or something to the group name.
I respectfully think this is not a good idea. At the very least, this would duplicate tables. Moreover, I don't see the point of doing this, since we already have a run-time, non-hard-coded, way to lookup tags by ID.

Thanks a lot for your time, Phil.
Best.

Phil Harvey

Quote from: dae65 on July 25, 2020, 09:11:53 AM
Preferably, we could also add some --set-tag-by-id <ID> <VALUE> command-line action/option, and thereby preserve -TAG= as it currently is.

This has 2 problems.

1. A new option is required (and there are already too many of these).  But admittedly this isn't a big problem.

2. Tag ID's in general can not be represented in ASCII, so using them directly on the command line is problematic.

- Phil

dae65

Right. So maybe, yes, we could use an alias in a group for an ID in the same group. Such an alias, I insist, would need to be really just an alias, and not a tag name in another group. Otherwise we might get ambiguous aliases, which would defeat the purpose.

Also, I'd like to suggest that aliases should be meant to be used only, or primarily, on the command line, and that the API should be able to work with IDs directly as such. While the command-line argument vector is being parsed, the API or the exiftool command-line application itself would translate aliases into IDs, and pass IDs to the relevant methods, not aliases, so that other applications using Image::ExifTool wouldn't need to deal with aliases if they already had the IDs.

Thank you.

Phil Harvey

Interesting idea.

But I'm not sure you are grasping the flexibility of the ExifTool groups.  What I was proposing would look something like this in the API:

$et->SetNewValue( 'ItemList:ID_a9alb:Album' => 'some value' );

And using a similar approach on the command line.  So no new options are necessary for the command line or SetNewValue function.

I hear your objection about having to translate 'ID_a9alb' (or something) to '©alb' internally (and vice versa for your app), but building this lookup table would be fast and easy, and done on demand.  Also, for the app, I could allow "ID_©alb", so all you would have to do is add "ID_" to the start, i.e.:

$et->SetNewValue( 'ItemList:ID_©alb:Album' => 'some value' );

I agree that using the actual ID's would be simpler, and could work via the API as you mention, but as you also point out an alias similar to above would be required anyway for the command line.

Disclaimer: I haven't coded all this so I don't know how easy it would be, but I don't foresee any problems.

Edit: Problem. Tag ID's are case sensitive, but ExifTool group names are not. :(

Edit2: Not so much of a problem after all.  I can just make these case sensitive.  Not very consistent though.

Edit3: I've accomplished this with 30 added/changed lines of code, so it was even a bit simpler than your mod with 40 added/changed lines.  I think I'll include this in 12.02 for you to play with, but leave it undocumented for now until I can get some feedback from you. It turns out that no aliases were necessary because I only ever needed to convert from real Tag ID's to ASCII-friendly ID's, and not back again.
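For readers wondering what the conversion from a raw tag ID to an ASCII-friendly group name might look like, here is a minimal sketch. It is not ExifTool's actual code: the helper name id_to_group and the exact set of characters left unescaped are assumptions, chosen so that ©alb (written "\xa9alb") comes out as the ID_a9alb form used in the example above.

```perl
use strict;
use warnings;

# Hypothetical helper (not ExifTool's own code): turn a raw tag ID into
# an ASCII-only group name by hex-escaping unusual characters and adding
# a prefix.
sub id_to_group {
    my ($id, $prefix) = @_;
    $prefix = 'ID_' unless defined $prefix;
    # Replace every character outside [-_A-Za-z0-9] with its hex code
    # (at least two lowercase digits).
    (my $safe = $id) =~ s/([^-_A-Za-z0-9])/sprintf('%02x', ord $1)/ge;
    return $prefix . $safe;
}

print id_to_group("\xa9alb"), "\n";   # prints ID_a9alb
print id_to_group("albm"), "\n";      # prints ID_albm
```

Note that this mapping cannot be reversed reliably (a literal ID "a9alb" would produce the same name), which is only acceptable because the conversion never has to run backwards.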

- Phil

Phil Harvey

ExifTool 12.02 is now available.

dae65

Quote from: Phil Harvey on July 27, 2020, 04:38:59 PM
ExifTool 12.02 is now available.
Quote from: Phil Harvey on July 27, 2020, 02:03:51 PM
I've accomplished this with 30 added/changed lines of code, so it was even a bit simpler than your mod with 40 added/changed lines.  I think I'll include this in 12.02 for you to play with, but leave it undocumented for now until I can get some feedback from you. It turns out that no aliases were necessary because I only ever needed to convert from real Tag ID's to ASCII-friendly ID's, and not back again
Thank you so much! :) I'll give feedback soon.

dae65

Some feedback about ExifTool 12.02:

Syntax

Edit: Let me think this over.




Unable to get values

GetValue is not working with IDs. Both:

$et->GetValue ( 'Album', { Group1 => 'ItemList:ID_©alb' } );
$et->GetValue ( 'Album', { Group1 => 'ItemList:ID_albm' } );

seem to behave like this, i.e., as if no ID was specified:

$et->GetValue ( 'Album', { Group1 => 'ItemList' } );

That is, it will return whatever 'Album' tag is found first, according to Preferred and Avoid rules. Neither is the command-line tool able to get values by ID. This prints nothing:

$ exiftool -ItemList:ID_albm:Album FILE



Hash

The ImageInfo() function still returns a hash with tag names as keys, and so users are unable to know what underlying tags the values were read from in the file. I understand ambiguous names are, to some extent, disambiguated by an index, e.g. Album (1), but users can't tell whether 'Album' or 'Album (1)' corresponds to ©alb or to albm.  This is an issue because duplicate values will sooner or later be written to files.

For example, in a file where albm is set, but ©alb isn't, the following will actually copy albm to ©alb, and unknown to the user, the file will end up with 2 tags with the same content.


my $info = Image::ExifTool::ImageInfo ( $file );
my $et = new Image::ExifTool ( $file );
$et->SetNewValue ( 'ItemList:ID_©alb:Album', $info->{Album} );
$et->WriteInfo ( $file );


To prevent this from happening, please provide us with an ImageInfoByID() function that returns a hash with the following structure:


$info =
{
  ItemList =>
  {
    ©alb => 'My Album A',
    albm => 'My Album B',
  },
  UserData =>
  {
    ©alb => 'My Album D',
    albm => 'My Album E',
  }
};


I'd be happy to try to write (a draft of) one if you're willing to make it available.  :)

Thank you!

Phil Harvey

Quote from: dae65 on July 28, 2020, 12:55:38 PM

$et->GetValue ( 'Album', { Group1 => 'ItemList:ID_©alb' } );
$et->GetValue ( 'Album', { Group1 => 'ItemList:ID_albm' } );

This won't work because Group1 is not a SetNewValue option.  Try Group => 'ItemList:ID_a9alb' or the way I specified.  Using the copyright symbol is problematic, and will only work if your code uses the proper encoding.  I suggest using "\xa9" instead.

Quote
The ImageInfo() function still returns a hash with tag names as keys, and so users are unable to know what underlying tags the values were read from in the file.

You can use $et->GetGroup($key, 7) to get the tag ID group name.  The complication is that you first have to set the ExifTool SaveIDGroup option so it generates this name for you -- I'll try to figure out a way to avoid this.

Quote
please provide us with an ImageInfoByID() function that returns a hash

I really try to avoid adding dedicated functions like this.  The code will soon become a rat's nest of functions tailored to the different ways that people want the information returned.  In the current scheme, there is nothing unique about the group names in family 7, and I don't think it is a good idea to add a function that treats them specially.

- Phil

dae65

GetValue()

Quote from: Phil Harvey on July 28, 2020, 09:03:41 PM
Try Group => 'ItemList:ID_a9alb' or the way I specified.

Thanks, Phil. GetValue is really not working with IDs. Neither is GetNewValue, as it seems. This well-behaved script prints 12 lines:


#!/usr/bin/perl -w
use feature 'say';   # required for say()
use Image::ExifTool;
my $file = shift;
my $et = Image::ExifTool->new;

$et->SetNewValue ( "ItemList:ID_a9alb:Album", 'a' );
$et->SetNewValue ( "UserData:ID_a9alb:Album", 'b' );
$et->SetNewValue ( "ItemList:ID_albm:Album",  'c' );
$et->SetNewValue ( "UserData:ID_albm:Album",  'd' );

say STDOUT $et->GetNewValue ( "ItemList:ID_a9alb:Album" );
say STDOUT $et->GetNewValue ( "UserData:ID_a9alb:Album" );
say STDOUT $et->GetNewValue ( "ItemList:ID_albm:Album" );
say STDOUT $et->GetNewValue ( "UserData:ID_albm:Album" );

$et->WriteInfo ( $file );
$et->ImageInfo ( $file );

say STDOUT $et->GetValue ( "ItemList:ID_a9alb:Album" );
say STDOUT $et->GetValue ( "UserData:ID_a9alb:Album" );
say STDOUT $et->GetValue ( "ItemList:ID_albm:Album" );
say STDOUT $et->GetValue ( "UserData:ID_albm:Album" );

say STDOUT $et->GetValue ( 'Album', { Group => "ItemList:ID_a9alb" } );
say STDOUT $et->GetValue ( 'Album', { Group => "UserData:ID_a9alb" } );
say STDOUT $et->GetValue ( 'Album', { Group => "ItemList:ID_albm"  } );
say STDOUT $et->GetValue ( 'Album', { Group => "UserData:ID_albm"  } );


We expect this: (Line numbers added.)


1   a
2   b
3   c
4   d
5   a
6   b
7   c
8   d
9   a
10  b
11  c
12  d


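I'm getting this, with lines 1 to 8 empty: (No less strangely, b was assigned to a tag in the least preferred group: UserData.)


1   (empty)
2   (empty)
3   (empty)
4   (empty)
5   (empty)
6   (empty)
7   (empty)
8   (empty)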
9   b
10  b
11  b
12  b


GetValue returns an empty list at line 2969 in ExifTool.pm (v12.02):


2966     # start with the raw value
2967     my $value = $$rawValue{$tag};
2968     if (not defined $value) {
2969         return () unless ref $tag;
2970         # get the value of a structure field


At this point, $value comes out undefined, because the $rawValue hash reference has no ItemList:ID_a9alb:Album key: lines 2954-2957 failed to set $tag to a proper tag key of the form Album ($i).


2954         if (@keys) {
2955             $key = $self->GroupMatches($gp, \@keys);
2956             $tag = $key if $key;
2957         }


SaveIDGroup option
Quote from: Phil Harvey on July 28, 2020, 09:03:41 PM
You can use $et->GetGroup($key, 7) to get the tag ID group name.  The complication is that you first have to set the ExifTool SaveIDGroup option so it generates this name for you.
Great! Thanks for the new SaveIDGroup option. It works.

ImageInfo()

Quote from: Phil Harvey on July 28, 2020, 09:03:41 PM
I really try to avoid adding dedicated functions like this.  The code will soon become a rats nest of functions tailored to the different ways that people want the information returned.

Fair enough. :D Using the new SaveIDGroup option, I managed to get my hash more or less the way I wanted, like this:


$et->Options ( SaveIDGroup => 1 );
my $ItemList = $et->ImageInfo ( $file, { Group1 => 'ItemList' } );
my $UserData = $et->ImageInfo ( $file, { Group1 => 'UserData' } );
my %ilst = map { $et->GetGroup ( $_, 7 ) => $ItemList->{$_} } keys %$ItemList;
my %udta = map { $et->GetGroup ( $_, 7 ) => $UserData->{$_} } keys %$UserData;
my $info = { ItemList => \%ilst, UserData => \%udta };


If I only had a reliable way of getting ©alb keys instead of ID_a9alb keys... but alas!  :-\

Phil Harvey

Thanks for testing this.

Quote from: dae65 on July 29, 2020, 06:13:13 AM

say STDOUT $et->GetNewValue ( "ItemList:ID_a9alb:Album" );
say STDOUT $et->GetNewValue ( "UserData:ID_a9alb:Album" );
say STDOUT $et->GetNewValue ( "ItemList:ID_albm:Album" );
say STDOUT $et->GetNewValue ( "UserData:ID_albm:Album" );

Right.  GetNewValue() only works with family 0 and 1 groups.  I'll have to look into this.

Quote
say STDOUT $et->GetValue ( "ItemList:ID_a9alb:Album" );
say STDOUT $et->GetValue ( "UserData:ID_a9alb:Album" );
say STDOUT $et->GetValue ( "ItemList:ID_albm:Album" );
say STDOUT $et->GetValue ( "UserData:ID_albm:Album" );

This would work if you had set the SaveIDGroup option.  (I'm trying to figure out a way to avoid this.)

Quote
say STDOUT $et->GetValue ( 'Album', { Group => "ItemList:ID_a9alb" } );
say STDOUT $et->GetValue ( 'Album', { Group => "UserData:ID_a9alb" } );
say STDOUT $et->GetValue ( 'Album', { Group => "ItemList:ID_albm"  } );
say STDOUT $et->GetValue ( 'Album', { Group => "UserData:ID_albm"  } );

This doesn't work because GetValue() doesn't have a Group option.
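Since GetValue() takes a tag key rather than a Group option, one way to pick a specific tag is to search the keys returned by ImageInfo() and compare their groups. The following is a rough sketch, not part of ExifTool: find_key is a made-up helper, and it only assumes that the object passed in provides GetGroup($key, $family) as used elsewhere in this thread. Family 7 names are only generated in 12.02 when the SaveIDGroup option is set.

```perl
use strict;
use warnings;

# Return the tag key in %$info whose name, family-1 group and family-7
# (tag ID) group all match, or undef if there is none.
# $et must provide GetGroup($key, $family), as Image::ExifTool does.
sub find_key {
    my ($et, $info, $name, $group1, $id_group) = @_;
    for my $key (sort keys %$info) {
        (my $base = $key) =~ s/ \(\d+\)$//;   # "Album (1)" -> "Album"
        next unless $base eq $name;
        next unless $et->GetGroup($key, 1) eq $group1;
        next unless $et->GetGroup($key, 7) eq $id_group;
        return $key;
    }
    return undef;
}

# Intended use with a real object (assumes Image::ExifTool is installed):
#   my $key = find_key($et, $info, 'Album', 'ItemList', 'ID_a9alb');
#   my $val = defined $key ? $et->GetValue($key) : undef;
```

The key found this way can be passed straight to GetValue(), which avoids relying on the Preferred and Avoid rules to choose between duplicate names.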

Quote
If I only had a reliable way to get ©alb instead of ID_a9alb.... but alas!  :-\

Did you see the GetTagID() function?
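As a sketch of how GetTagID() could give the nested hash asked for earlier, keyed by real tag IDs rather than by ID_ group names: info_by_id below is a hypothetical helper, not an ExifTool function, and it assumes only that the object provides GetGroup($key, 1) and GetTagID($key).

```perl
use strict;
use warnings;

# Group extracted values by family-1 group, then by raw tag ID.
# $et is expected to be an Image::ExifTool object on which ImageInfo()
# has already been called; $info is the hash reference it returned.
sub info_by_id {
    my ($et, $info) = @_;
    my %by_id;
    for my $key (keys %$info) {
        my $group = $et->GetGroup($key, 1);   # e.g. "ItemList"
        my $id    = $et->GetTagID($key);      # e.g. "\xa9alb"
        # Skip tags that have no usable group or ID.
        next unless defined $group and defined $id and length $id;
        $by_id{$group}{$id} = $info->{$key};
    }
    return \%by_id;
}

# Intended use (assumes Image::ExifTool is installed):
#   my $et    = Image::ExifTool->new;
#   my $info  = $et->ImageInfo($file);
#   my $by_id = info_by_id($et, $info);
#   print $by_id->{ItemList}{"\xa9alb"}, "\n";
```

This keeps the regrouping in the calling code, so no dedicated ImageInfoByID() function is needed in the library.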

- Phil

Edit: Also, I'm leaning towards changing the prefix to "ID-" instead of "ID_" because it is more consistent with the way other groups are named (specifically the "XMP-" groups).

Phil Harvey

This change also has ramifications when copying.  One would like to have the ability to copy tags from one file to another with the same tag ID.  Usually this would be accomplished on the command line via:

exiftool -tagsfromfile SRCFILE -all:all DSTFILE

where -all:all is the incantation to make ExifTool preserve the family 1 group (location) of the tag.  I'll have to enhance this to also allow family 7 (tag ID) to be maintained.  So the command would be:

exiftool -tagsfromfile SRCFILE -all:7all:all DSTFILE

which I admit is sort of confusing.  But at least this will be possible (with the next release).

This corresponds to adding the ability to specify family 7 group names in SetNewValuesFromFile().  I'm currently testing a new version with this enhancement and also the ability to specify family 7 group names in GetNewValue() as you pointed out.

- Phil

Phil Harvey

ExifTool 12.03 is now available.

As well as the changes I mentioned, you no longer need to use the API SaveIDGroup option to access the family 7 group names.

Note that these group names now start with "ID-" instead of "ID_".

And this feature is now documented.

- Phil