Picasa.ini to XMP-mwg-rs/XMP-MP Region tags

Started by StarGeek, March 03, 2015, 05:16:54 PM


StarGeek

I'm currently trying to make User-Defined tags that read the .picasa.ini file that might be in the same directory as the target file and check whether it contains any facial recognition regions for that file.  If so, they create the XMP-mwg-rs (Metadata Working Group/Picasa) and XMP-MP (Microsoft Photo) Region tags so the regions can be inserted directly into a file without having to use some other program.

I seem to have most of it correct: the numbers come out nearly the same as when I use AvPicFaceXmpTagger, off by 0.0000001 at worst.  But I seem to have messed up returning the data from the tag.  When I add -struct, it returns data that matches the XMP-mwg-rs and XMP-MP data already in my file.  Without that option, I get output similar to this:
Picasa Ini To MWG Region        : HASH(0x3268b74)
Picasa Ini To MP Region         : HASH(0x3268934)

while the conversion tags from this thread (which this is heavily based upon) won't show any info without the -struct option.

Hopefully, there's just a simple mistake that's due to my lack of knowledge about Perl.  Any chance you can take a look at this Phil?

%Image::ExifTool::UserDefined = (
    'Image::ExifTool::Composite' => {
        PicasaIniToMWGRegion => {
            Require => {
                0 => 'Directory',
                1 => 'FileName',
                2 => 'ImageWidth',
                3 => 'ImageHeight',
            },
            ValueConv => q{
                my ($Filename, $PicasaIni, $section, %ContactHash, @RegList);
                my $DoneFlag = 0;
                $PicasaIni = "$val[0]/.picasa.ini";
                $Filename = $val[1];
                $ContactHash{'ffffffffffffffff'} = 'ffffffffffffffff'; # Picasa's default setting for unnamed faces.
                open (INI, $PicasaIni) || return undef;
                while ( ($_=<INI>) && !$DoneFlag) {
                    chomp;
                    if (/^\s*\[(.+)\](?:$)/) {
                        $section = $1;
                    }
                    # First section should be the contact list.  Put that into a hash so we can pull the names out later
                    if ($section eq 'Contacts2') {
                        if (/^([\da-f]{1,16})=(.*);;$/) {
                            # put them into hash
                            $ContactHash{$1} = $2;
                        }
                    }
                    # The other sections are named after the image files
                    elsif ($section eq $Filename) {
                        $DoneFlag = 1;
                        # assumes the faces= line directly follows the section header
                        $_ = <INI>;
                        chomp;
                        if (/^faces=(.*)$/) {
                            my @temp = split(/;/, $1);
                            foreach (@temp) {
                                # skip entries that don't match, so $1/$2 are never stale
                                next unless /rect64\(([\da-f]{12,16})\),([\da-f]{14,16})/;
                                my $contact = $2;
                                my $left  = substr($1, 0, -8);
                                my $right = substr($1, -8);
                                # using 7 as arbitrary precision.  I've seen lengths vary from 6 to 8
                                my @rec;
                                $rec[0] = sprintf("%.7f", ((hex($left) & 0xFFFF0000) >> 16) / 65535);
                                $rec[1] = sprintf("%.7f", (hex($left) & 0xFFFF) / 65535);
                                $rec[2] = sprintf("%.7f", ((hex($right) & 0xFFFF0000) >> 16) / 65535);
                                $rec[3] = sprintf("%.7f", (hex($right) & 0xFFFF) / 65535);
                                # push a new anonymous hash for each face; pushing a reference
                                # to one shared hash would make every entry point at the last face
                                push @RegList, {
                                    Area => {
                                        X => $rec[0] + ($rec[2] - $rec[0]) / 2,
                                        Y => $rec[1] + ($rec[3] - $rec[1]) / 2,
                                        W => $rec[2] - $rec[0],
                                        H => $rec[3] - $rec[1],
                                        Unit => 'normalized',
                                    },
                                    Name => $ContactHash{$contact},
                                    Type => 'Face',
                                };
                            }
                        }
                    }
                }
                close (INI);
                return {
                    AppliedToDimensions => { W => $val[2], H => $val[3], Unit => 'pixel' },
                    RegionList => \@RegList,
                };
            },
        },
        PicasaIniToMPRegion => {
            Require => {
                0 => 'Directory',
                1 => 'FileName',
            },
            ValueConv => q{
                my ($Filename, $PicasaIni, $section, %ContactHash, @RegList);
                my $DoneFlag = 0;
                $PicasaIni = "$val[0]/.picasa.ini";
                $Filename = $val[1];
                $ContactHash{'ffffffffffffffff'} = 'ffffffffffffffff'; # Picasa's default setting for unnamed faces.
                open (INI, $PicasaIni) || return undef;
                while ( ($_=<INI>) && !$DoneFlag) {
                    chomp;
                    if (/^\s*\[(.+)\](?:$)/) {
                        $section = $1;
                    }
                    # First section should be the contact list.  Put that into a hash so we can pull the names out later
                    if ($section eq 'Contacts2') {
                        if (/^([\da-f]{1,16})=(.*);;$/) {
                            # put them into hash
                            $ContactHash{$1} = $2;
                        }
                    }
                    # The other sections are named after the image files
                    elsif ($section eq $Filename) {
                        $DoneFlag = 1;
                        # assumes the faces= line directly follows the section header
                        $_ = <INI>;
                        chomp;
                        if (/^faces=(.*)$/) {
                            my @FaceRegions = split(/;/, $1);
                            foreach (@FaceRegions) {
                                # skip entries that don't match, so $1/$2 are never stale
                                next unless /rect64\(([\da-f]{12,16})\),([\da-f]{14,16})/;
                                my $contact = $2;
                                my $left  = substr($1, 0, -8);
                                my $right = substr($1, -8);
                                # using 7 as arbitrary precision.  I've seen lengths vary from 6 to 8
                                # MP rectangles are "x, y, width, height", so subtract the top-left corner
                                my @rec;
                                $rec[0] = sprintf("%.7f", ((hex($left) & 0xFFFF0000) >> 16) / 65535);
                                $rec[1] = sprintf("%.7f", (hex($left) & 0xFFFF) / 65535);
                                $rec[2] = sprintf("%.7f", ((hex($right) & 0xFFFF0000) >> 16) / 65535) - $rec[0];
                                $rec[3] = sprintf("%.7f", (hex($right) & 0xFFFF) / 65535) - $rec[1];
                                push @RegList, {
                                    PersonDisplayName => $ContactHash{$contact},
                                    Rectangle => join(', ', @rec),
                                };
                            }
                        }
                    }
                }
                close (INI);
                return { Regions => \@RegList };
            },
        },
    },
);
#------------------------------------------------------------------------------
1;  #end


edit: better alignment
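
For readers following the arithmetic in the config above, here is a minimal standalone sketch of the rect64 decoding. It assumes the layout that config implies: four 16-bit fields (left, top, right, bottom), each divided by 65535, with the hex string padded back to 16 digits because the regex accepts 12 to 16. The helper names `decode_rect64` and `rect_to_mwg_area` and the sample value are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

# Decode a Picasa rect64 hex string into normalized (left, top, right, bottom).
# Each coordinate is a 16-bit field; dividing by 65535 follows the config above.
sub decode_rect64 {
    my ($hex) = @_;
    # Pad back to 16 hex digits (assumption: shorter strings lost leading zeros).
    $hex = ('0' x (16 - length $hex)) . $hex;
    return map { hex($_) / 65535 } unpack('(A4)4', $hex);
}

# Convert corner coordinates to the centre-based area used by MWG regions.
sub rect_to_mwg_area {
    my ($left, $top, $right, $bottom) = @_;
    return {
        X    => ($left + $right) / 2,
        Y    => ($top + $bottom) / 2,
        W    => $right - $left,
        H    => $bottom - $top,
        Unit => 'normalized',
    };
}

my @rect = decode_rect64('3f845bcb59418507');   # hypothetical sample value
my $area = rect_to_mwg_area(@rect);
printf "X=%.7f Y=%.7f W=%.7f H=%.7f\n", @$area{qw(X Y W H)};
```

The MP variant differs only in its output: it keeps the top-left corner and reports width and height, rather than the centre point.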
* Did you read FAQ #3 and use the command listed there?
* Please use the Code button for exiftool code/output.
 
* Please include your OS, Exiftool version, and type of file you're processing (MP4, JPG, etc).

Phil Harvey

Wow, that's very interesting.  Creating a user-defined Composite Struct tag when there is no -struct option in effect.

You aren't doing anything wrong Perl-wise.  The problem is that ExifTool has no mechanism to generate flattened tags from a user-defined Struct tag.  Thinking a bit about this, I don't even know how I would go about adding this ability.

But do you really need this?  Perhaps it would be sufficient to just return the serialized structure without the -struct option, like this:

my $struct = {
    AppliedToDimensions => { W => $val[2], H => $val[3], Unit => 'pixel' },
    RegionList => \@RegList,
};
unless ($self->Options('Struct')) {
    require 'Image/ExifTool/XMPStruct.pl';
    $struct = Image::ExifTool::XMP::SerializeStruct($struct);
}
return $struct;


Note that the MyRegion tag in the thread you referenced is never extracted without the -struct option, so you aren't exposed to the underlying HASH as you are in your case.  An alternative is to just return undef unless the Struct option is set.  I'm not really sure if somehow generating flattened Composite tags is an alternative.
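
To illustrate where the HASH(0x...) output comes from, here is a small standalone sketch. A Perl hash reference used as a plain string prints its type and address, so a structure has to be serialized before it can be shown as text. The `serialize_simple` routine is a hypothetical, simplified stand-in and not ExifTool's SerializeStruct: it does no escaping of special characters and sorts keys only to keep the output stable.

```perl
use strict;
use warnings;

# Simplified stand-in for a structure serializer (hypothetical, NOT ExifTool's
# SerializeStruct): no escaping of special characters, keys sorted for stability.
sub serialize_simple {
    my ($obj) = @_;
    if (ref $obj eq 'HASH') {
        my @pairs = map { $_ . '=' . serialize_simple($obj->{$_}) } sort keys %$obj;
        return '{' . join(',', @pairs) . '}';
    }
    if (ref $obj eq 'ARRAY') {
        return '[' . join(',', map { serialize_simple($_) } @$obj) . ']';
    }
    return defined $obj ? "$obj" : '';
}

my $struct = { RegionList => [ { Name => 'George', Type => 'Face' } ] };

print "As a plain string: $struct\n";    # something like HASH(0x...)
print "Serialized:        ", serialize_simple($struct), "\n";
```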

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

StarGeek

Quote from: Phil Harvey on March 03, 2015, 07:31:42 PM
The problem is that ExifTool has no mechanism to generate flattened tags from a user-defined Struct tag.  Thinking a bit about this, I don't even know how I would go about adding this ability.

Sorry if I'm not making myself clear.  I don't need flattened tags.  I want to create tags from the .picasa.ini data that I can copy into -RegionInfo and -RegionInfoMP.  So I'm looking for something like this as the end command:

exiftool -config PicasaIniConvert  "-regioninfo<PicasaIniToMWGRegion" "-regioninfomp<PicasaIniToMPRegion" FILE/DIR

To be honest, I don't quite understand the -struct option, but I'm certainly willing to use it if that's what is needed to get this idea to work.  I've tried adding it at various times in my testing, but never with any results.


StarGeek

Hold on, I think I was making a stupid mistake and I was not using the right config file.  I think it's working :D

Yep, it looks like a stupid mistake on my end.  I may be getting HASH(...) output when I treat it as a flattened tag, but when I copy it to the structured tag, it appears to work.  I need to do some more testing, tweak the code a bit, and then add in the bit from the other thread along with a test for a change in directory, but for now it's looking good.

Thanks for the feedback.  I'm sure I'll have a few more questions later.

Phil Harvey

Right.  It should be fine to copy this tag to the appropriate structured tag.  The problem is only if you try to display it when extracting.

This is exciting because there have been a few requests for this ability.  If you come up with something that other people can use, perhaps I can add it to the ExifTool distribution.

- Phil

Edit:  To keep from re-loading the INI file each time, you could define this in your config file:

$Image::ExifTool::myIniFile = '';
@Image::ExifTool::myIniData = ();

sub Image::ExifTool::LoadINI($)
{
    my $iniFile = shift;
    if ($iniFile eq $Image::ExifTool::myIniFile) {
        return undef unless @Image::ExifTool::myIniData;
    } else {
        $Image::ExifTool::myIniFile = $iniFile;
        undef @Image::ExifTool::myIniData;
        local *INI;
        open(INI, $iniFile) or return undef;
        while (<INI>) {
            chomp;
            push @Image::ExifTool::myIniData, $_;
        }
        close(INI);
        @Image::ExifTool::myIniData or return undef;
    }
    return 1;
}


Then the first statement in your ValueConv could be "LoadINI("$val[0]/.picasa.ini") or return undef;", and you would parse @Image::ExifTool::myIniData in your loop:

foreach (@Image::ExifTool::myIniData) {
...
}


Or better yet, since you seem to be parsing the entire file each time, you could save the necessary reduced data instead of the raw lines of the file, and save both memory and parsing time for subsequent files.
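
A minimal sketch of that idea, under stated assumptions: the whole .picasa.ini is parsed once into a contact hash and per-file face lists, and the parsed result is cached by path. Because names are looked up only after the whole file has been read, the position of the Contacts2 section does not matter. The names `parse_picasa_ini`, `load_picasa_ini` and `%ini_cache` are hypothetical, and the line formats are the ones matched by the regexes earlier in this thread.

```perl
use strict;
use warnings;

my %ini_cache;   # hypothetical cache: ini path => parsed data (undef if unreadable)

# Parse the lines of a .picasa.ini into contacts and per-file face lists.
sub parse_picasa_ini {
    my ($lines) = @_;
    my (%contacts, %faces);
    my $section = '';
    foreach my $line (@$lines) {
        (my $text = $line) =~ s/[\r\n]+$//;     # tolerate CRLF line endings
        if ($text =~ /^\s*\[(.+)\]\s*$/) {
            $section = $1;
        } elsif ($section eq 'Contacts2' and $text =~ /^([\da-f]{1,16})=(.*);;$/) {
            $contacts{$1} = $2;
        } elsif ($text =~ /^faces=/) {
            # each entry looks like rect64(HEX),CONTACT_ID
            while ($text =~ /rect64\(([\da-f]{1,16})\),([\da-f]{1,16})/g) {
                push @{ $faces{$section} }, { rect => $1, id => $2 };
            }
        }
    }
    return { contacts => \%contacts, faces => \%faces };
}

# Load and cache one .picasa.ini per path; returns undef if it cannot be read.
sub load_picasa_ini {
    my ($path) = @_;
    return $ini_cache{$path} if exists $ini_cache{$path};
    my $parsed;
    if (open(my $fh, '<', $path)) {
        my @lines = <$fh>;
        close($fh);
        $parsed = parse_picasa_ini(\@lines);
    }
    return $ini_cache{$path} = $parsed;
}

# Usage with in-memory lines; the file section deliberately precedes Contacts2.
my $data = parse_picasa_ini([
    "[photo.jpg]\n",
    "faces=rect64(3f845bcb59418507),423ab0d97fa066db\n",
    "[Contacts2]\n",
    "423ab0d97fa066db=George;;\n",
]);
for my $face (@{ $data->{faces}{'photo.jpg'} || [] }) {
    my $id   = $face->{id};
    my $name = exists $data->{contacts}{$id} ? $data->{contacts}{$id} : $id;
    print "$face->{rect} => $name\n";
}
```

Keying the cache on the full path means a change of directory simply produces a new entry, so no separate "has the directory changed" check is needed in this sketch.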

StarGeek

Quote from: Phil Harvey on March 04, 2015, 07:25:17 AM
Or better yet, since you seem to be parsing the entire file each time, you could save the necessary reduced data instead of the raw lines of the file, and save both memory and parsing time for subsequent files. 

This is actually what I was planning on doing next.  It's actually going to be required.  I had assumed that the .picasa.ini file had the contact list at the top, since that is what I saw in my tests.  But checking with other directories, it seems that it can appear anywhere in the file.  That led to regions that didn't have names attached to them.

Additionally, since each directory would have a different .picasa.ini, I'll have to make sure to check that the directory hasn't changed.

Phil Harvey

Quote from: StarGeek on March 04, 2015, 10:27:33 AM
Additionally, since each directory would have a different .picasa.ini, I'll have to make sure to check that the directory hasn't changed.

My code shows one way to do this.

- Phil

StarGeek

Quote from: Phil Harvey on March 04, 2015, 11:13:46 AM
My code shows one way to do this.

D'oh! 

I think I just about have it.  There's still one error to work out, then I want to test it a bit, but the output data looks good. 

I've learned some good stuff going through this, but still have some things about perl I haven't been able to wrap my head around.

Phil Harvey

One source of confusion when writing complex ValueConv expressions like this is that they are evaluated in the Image::ExifTool namespace, which is why I needed to use variables like $Image::ExifTool::Xxxx.
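
The namespace point can be seen in a small standalone sketch. It assumes nothing about ExifTool internals beyond the statement above: `My::Tool` is a hypothetical stand-in package, and `run_conv` simply evaluates a code string inside it, so unqualified variable names in the string resolve to that package.

```perl
use strict;
use warnings;

{
    package My::Tool;             # hypothetical stand-in for Image::ExifTool

    # Evaluate a code string in this package, much as a conversion string is.
    sub run_conv {
        my ($code) = @_;
        no strict 'vars';         # let the string use unqualified package variables
        my $result = eval $code;
        die $@ if $@;
        return $result;
    }

    our $counter = 10;            # this is $My::Tool::counter
}

our $counter = 99;                # this is $main::counter

print My::Tool::run_conv('$counter'), "\n";         # prints 10
print My::Tool::run_conv('$main::counter'), "\n";   # prints 99
print My::Tool::run_conv('__PACKAGE__'), "\n";      # prints My::Tool
```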

- Phil

StarGeek

Upon expanding my testing I've hit a snag.  The .picasa.ini file doesn't always contain all the person names that might be in that directory, so I'll have to load Picasa's Contacts.xml file.  That leads to two questions.

Is there a way for me to specify a path to the Contacts.xml file on the command line?  Something like -PicasaContactsFile=/path/to/contacts.xml?  Otherwise I guess it will have to be hardcoded.

Second, is there something built into ExifTool that I could use to read/parse an XML file before I go and write something?

StarGeek

The final snag is having to decode html entities.  Picasa html encodes some of the characters in the contact list.  The usual suspects such as quotes, greater/less than signs, ampersands, etc.  HTML::Entities doesn't seem to be available, so if there's not another option available, I'll have to add one in that will cover the basics.

Other than these problems, it looks like it's ready.  I tested it against a directory with 1,592 files that already had RegionInfo and RegionInfoMP tags, which were previously inserted with AvPicFaceXmpTagger and the conversion user tags.  I compared them by truncating the numbers down to 6 decimal places.  That gave me 170 files that were off by .000001.  Truncating down to 5 decimal places left only 7 that were off by .00001, and those were the difference between numbers like 0.09131 and 0.0913099.  Since these differences amount to less than a fraction of a pixel (unless the picture is huge), I wouldn't think they are worth worrying about.

Other than these issues, it's just about ready for use.

Phil Harvey

Hi StarGeek,

Quote from: StarGeek on March 05, 2015, 03:37:02 PM
Is there a way for me to specify a path to the Contacts.xml file on the command line?

I was actually thinking about this when you started this thread.  There may be a way (by creating a user-defined Writable Composite tag to set this path).  It will be a bit tricky though, and I've never tried it before.  Also, I'll be away tomorrow, so I won't be able to help with this until the weekend.

Quote
Second, is there something built into ExifTool that I could use to read/parse an XML file before I go and write something?

You could create a new Image::ExifTool object, and call ImageInfo with a reference to the XML data.  This is quite do-able.

Quote from: StarGeek on March 05, 2015, 06:46:28 PM
The final snag is having to decode html entities.  Picasa html encodes some of the characters in the contact list.  The usual suspects such as quotes, greater/less than signs, ampersands, etc.  HTML::Entities doesn't seem to be available, so if there's not another option available, I'll have to add one in that will cover the basics.

Sure, just do this:

require Image::ExifTool::HTML;
my $unescaped = Image::ExifTool::HTML::UnescapeHTML($val);
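
For comparison, here is a minimal standalone sketch of the kind of "basics only" decoder mentioned in the question. It is a hypothetical fallback, not the ExifTool routine: it handles the five common named entities plus numeric references and leaves anything else untouched.

```perl
use strict;
use warnings;

# Hypothetical fallback covering only the handful of entities mentioned above.
my %entity = (amp => '&', lt => '<', gt => '>', quot => '"', apos => "'");

# Decode in a single pass, so "&amp;lt;" becomes "&lt;" and is not decoded twice.
# Unknown named entities are left exactly as they were.
sub unescape_basic {
    my ($text) = @_;
    $text =~ s{&(?:#(\d+)|#x([0-9a-fA-F]+)|(\w+));}{
        defined $1 ? chr($1)
      : defined $2 ? chr(hex $2)
      : exists $entity{$3} ? $entity{$3}
      : "&$3;"
    }ge;
    return $text;
}

print unescape_basic('Tom &amp; &quot;Jerry&quot; &lt;3'), "\n";
```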


Quote
Other than these problems, it looks like it's ready. [...]

Excellent!

- Phil

StarGeek

Quote from: Phil Harvey on March 05, 2015, 07:37:13 PM
There may be a way (by creating a user-defined Writable Composite tag to set this path).  It will be a bit tricky though, and I've never tried it before.  Also, I'll be away tomorrow, so I won't be able to help with this until the weekend.

If you think it's tricky, then I have no clue where to even start.  I'm already messing with forces beyond my comprehension :D

Quote
You could create a new Image::ExifTool object, and call ImageInfo with a reference to the XML data.  This is quite do-able.

I played around with this but wasn't able to get anything to work until I altered the file.  Picasa doesn't put the declaration(?) at the top of the file.  When I added that, I was able to get some output, but I don't think it'll be within my ability to deal with it.  Here's a short example of a contacts.xml file:
<contacts>
<contact id="423ab0d97fa066db" name="George" modified_time="2015-03-05T13:30:35-08:00" local_contact="1"/>
<contact id="520b28d9aa4039aa" name="Andy" modified_time="2015-03-02T11:17:20-08:00" local_contact="1"/>
<contact id="25e26bd972416fa1" name="Ed" modified_time="2015-03-02T11:17:20-08:00" local_contact="1"/>
</contacts>


I'm currently reading it line by line and using a simple regex to pull out the id and name.  It works well enough. 
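
A minimal sketch of that line-by-line approach, for illustration only. It assumes one `<contact .../>` element per line with double-quoted attributes, as in the sample above, and is not a general XML parser; `parse_contacts` is a hypothetical name.

```perl
use strict;
use warnings;

# Pull id => name pairs out of contacts.xml lines with simple regexes.
sub parse_contacts {
    my (@lines) = @_;
    my %names;
    foreach my $line (@lines) {
        next unless $line =~ /<contact\b/;      # skips <contacts> and </contacts>
        my ($id)   = $line =~ /\bid="([\da-f]+)"/;
        my ($name) = $line =~ /\bname="([^"]*)"/;
        $names{$id} = $name if defined $id and defined $name;
    }
    return \%names;
}

my $names = parse_contacts(
    '<contacts>',
    ' <contact id="423ab0d97fa066db" name="George" modified_time="2015-03-05T13:30:35-08:00" local_contact="1"/>',
    ' <contact id="520b28d9aa4039aa" name="Andy" modified_time="2015-03-02T11:17:20-08:00" local_contact="1"/>',
    '</contacts>',
);
print "$_ => $names->{$_}\n" for sort keys %$names;
```

Matching the two attributes separately keeps the sketch independent of attribute order within the element.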

Unless someone else really wants to play with it, I'll wait until you get back and can give me some clues on the user-defined Writable Composite tag idea before I post it.  That will give me some time to put it to work and see if any other bugs pop up.   Plus I want to create a version that will filter out the unknown names (where Name=ffffffffffffffff).

Thanks for all your help!

Phil Harvey

Hi StarGeek,

Quote from: StarGeek on March 05, 2015, 10:12:00 PM
If you think it's tricky, then I have no clue where to even start.  I'm already messing with forces beyond my comprehension :D

That's how we learn. :)  I've been really impressed with your ingenuity so far.

I've added a new UserParam API option to ExifTool 9.89 that will make this easier.  The syntax is:

exiftool -api userparam=PicasaContactsFile=/path/to/contacts.xml

(I thought about adding a -userparam exiftool option to make this simpler, but I'll hold off for that now since you can go through the -api option, even though it is a bit confusing with the two equal signs.)

Then in your ValueConv, you can access this via $self->Options(UserParam => 'PicasaContactsFile');

Also, you can access it in any expression involving tag names with $picasacontactsfile.

Quote
Quote
You could create a new Image::ExifTool object, and call ImageInfo with a reference to the XML data.  This is quite do-able.

I played around with this but wasn't able to get anything to work until I altered the file.  Picasa doesn't put the declaration(?) at the top of the file.  When I added that, I was able to get some output, but I don't think it'll be within my ability to deal with it.  Here's a short example of a contacts.xml file:
<contacts>
<contact id="423ab0d97fa066db" name="George" modified_time="2015-03-05T13:30:35-08:00" local_contact="1"/>
<contact id="520b28d9aa4039aa" name="Andy" modified_time="2015-03-02T11:17:20-08:00" local_contact="1"/>
<contact id="25e26bd972416fa1" name="Ed" modified_time="2015-03-02T11:17:20-08:00" local_contact="1"/>
</contacts>


I'm currently reading it line by line and using a simple regex to pull out the id and name.  It works well enough. 

Yeah, XML is a pain.  Doing this with ExifTool may not be any easier since this XML structure doesn't map into unique tag names.  You would have to deal with duplicate tags with names like ContactsContactId and ContactsContactName.

- Phil

StarGeek

Is it possible to access the ExifTool version number without adding it to the Require part of the tag?  The contact file loading is in a subroutine, and I'd prefer to access the version directly there if possible.

I want to try and keep this as compatible as possible and not force an upgrade for a feature that not everyone would use.