Map directory tree/folder structure onto Hierarchical Subject

Started by exitfool, December 14, 2012, 02:27:57 PM

exitfool

Hi,

I have some 35,000+ images in Lightroom. They've been organized manually and the directory tree follows a personal vocabulary. It's time to move over to keywording and it would be great if exiftool could map the directory tree structure onto Lightroom's Hierarchical Subjects. Something like:

exiftool -XMP:HierarchicalSubject="PORTFOLIO|NATURE|ANIMALS" IMG.jpg

except that we need to recurse and build up the hierarchy from the folder structure. Is this a common use case, Phil? This is on a Mac, so I have access to the Unix command line if we need to build this up from other commands.

Thanks.

Phil Harvey

This is easy to do.  The only trick is that you need to replace the "/" in Directory with "|", which can be done with a user-defined Composite tag and this conversion:

    ValueConv => '$val =~ tr{/}{|}; $val',

If you want more details, search for "UserDefined Composite FileName" in this forum because there are examples of doing similar things with the FileName.

The command would then be:

exiftool "-hierarchicalsubject+=mydirectory" -r DIR

to add the directory as a HierarchicalSubject to all images in directory DIR and sub-directories (assuming your Composite tag is called "MyDirectory").  Also note that this will create "_original" backups of all your images.

But it will be up to you to figure out how to get LR to re-load the metadata from these files (I suspect that this isn't automatic).

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

exitfool

Thanks for the detailed reply Phil. My config file loads:

print "LOADED!\n";

%Image::ExifTool::UserDefined = (
    'Image::ExifTool::Composite' => {
        MyDirectory => {
            Require => 'Directory',
            ValueConv => '$val =~ tr{/}{|}; $val',
        },
    },
);

but the values are not getting interpolated:

exiftool '-hierarchicalsubject+=mydirectory' -r DIR

yields:

Hierarchical Subject            : MyDirectory

I've tried adding {}, ${}, etc but I think I'm missing something more fundamental in the config file.

Thanks again.

exitfool

This seems a bit clunky, but works (I think, still testing)...
Change ExifTool_config to:

        HierarchicalSubject => {

and

exiftool "-Keywords=<$mydirectory"

Phil Harvey

Sorry, my command was wrong.  It should have been

exiftool "-hierarchicalsubject+<mydirectory" -r DIR

- Phil

exitfool

That introduces a complication. PORTFOLIO|NATURE|ANIMALS now becomes:

.|PORTFOLIO|NATURE|ANIMALS with a relative directory (such as .)

or

|home|user|test|PORTFOLIO|NATURE|ANIMALS for an absolute directory (such as /home/user/test/)

So both start with an unwanted bit. Interestingly, my method of using HierarchicalSubject in ExifTool_config makes this work as expected with both relative and absolute paths.
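To make the two unwanted prefixes concrete, here is a minimal sketch that uses the shell's `tr` as a stand-in for the `tr{/}{|}` in the ValueConv. The function name `dir_to_hier` is hypothetical and is not part of ExifTool or the config file.

```shell
# Stand-in for the ValueConv's tr{/}{|}: translate every "/" to "|".
dir_to_hier() {
    printf '%s\n' "$1" | tr '/' '|'
}

# Relative directory, as produced when "." is given to exiftool -r.
dir_to_hier "./PORTFOLIO/NATURE/ANIMALS"

# Absolute directory.
dir_to_hier "/home/user/test/PORTFOLIO/NATURE/ANIMALS"
```

The first call keeps the leading "." as its own level, and the second turns the leading "/" into a leading "|", which matches the two results above.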

exitfool

Hmm... it looks like setting HierarchicalSubject from within the .config is reporting the correct values but they're not actually set in the image files (they're being masked by being created on the fly from the .config?).

So it's back to Phil's method, but I'm not sure how to inject that much perl into the .config file so I'm using an external script:

$ find . -type d -print0 | perl ../perl/tree.pl > todo.sh
$ chmod +x todo.sh
$ ./todo.sh

tree.pl generates:

exiftool -hierarchicalsubject="c" ./c
exiftool -hierarchicalsubject="c|coffee" ./c/coffee
exiftool -hierarchicalsubject="c|craters" ./c/craters
exiftool -hierarchicalsubject="a" ./a
exiftool -hierarchicalsubject="a|antartica" ./a/antartica
exiftool -hierarchicalsubject="a|animals" ./a/animals

it contains (excuse my rusty perl):

use strict;
use warnings;
use File::Spec;

while (<>) {
    chomp( $_ ); # remove possible newline
    my @lines = split('\0', $_); # split on null character from find -print0

    foreach my $line (@lines) {
        my @dirs  = File::Spec->splitdir( $line );

        shift( @dirs ); # remove prefix . from find
        next if (!@dirs);

        print 'exiftool -hierarchicalsubject="' .
#            '-overwrite_original_in_place'      .
            join( '|', @dirs )                  .
            '" '                                .
            $line                               .
            "\n";
    }
}

Phil Harvey

Yes.  Either specify the directory exactly as you want the hierarchical keywords to appear.  For example:

cd Pictures
exiftool "-hierarchicalkeywords+<mydirectory" *


Or it is easy to remove the first directory in the ValueConv of MyDirectory:

            ValueConv => q{
                $val =~ tr{/}{|};    # translate "/" to "|"
                $val =~ s/^.*?\|//;  # remove the root directory
                return $val;
            },


(gotta luv Perl when you can generate an expression containing "\|//" that actually makes sense.)
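A rough way to preview what this two-step conversion does to a given path, sketched in the shell rather than Perl. The function name `dir_to_hier_no_root` is hypothetical. `sed` has no non-greedy `.*?`, so `[^|]*` is used here as an assumed equivalent for "everything up to the first |".

```shell
# Mimic the two ValueConv steps: "/" -> "|", then drop the first level.
dir_to_hier_no_root() {
    printf '%s\n' "$1" | tr '/' '|' | sed 's/^[^|]*|//'
}

# Relative path: the leading "." level is removed.
dir_to_hier_no_root "./PORTFOLIO/NATURE/ANIMALS"

# Absolute path: only the empty level before the first "|" is removed.
dir_to_hier_no_root "/home/user/test/PORTFOLIO/NATURE/ANIMALS"
```

With a relative path this leaves PORTFOLIO|NATURE|ANIMALS. With an absolute path the "root" that gets removed is just the empty string before the leading "|", so home|user|test|... remains. This suggests running from a relative directory if only the picture folders should end up in the hierarchy.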

- Phil

exitfool

Thanks. Using * instead of . works with -r. I find it somewhat peculiar that semantically identical globs produce different output. Your patch makes it work regardless.

I think you did a typo with:

exiftool "-hierarchicalkeywords+<mydirectory" *

and that should be:

exiftool "-hierarchicalsubject+<mydirectory" *

Phil Harvey

Quote from: exitfool on December 17, 2012, 07:52:40 PM
I find it somewhat peculiar that semantically identical globs produce different output.

Interesting, because it makes perfect sense to me.  The directory that you specify is the first directory in the hierarchy.  If you specify ".", then that comes first.  "." is not a glob, it is a directory name.

However, "*" is a glob, and expands to the names of all files (subdirectories) in the current directory, so these subdirectory names are then first.
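The difference can be seen without exiftool at all. The following is a small sketch in which `show_args` is a hypothetical stand-in for any command and the scratch directory names are made up for the demonstration.

```shell
# Print each argument on its own line, as a stand-in for any command.
show_args() {
    printf 'arg: %s\n' "$@"
}

# Hypothetical scratch tree, created only for this demonstration.
demo=$(mktemp -d)
mkdir -p "$demo/PORTFOLIO/NATURE" "$demo/TRAVEL"

# "." reaches the command unchanged, as one directory name.
(cd "$demo" && show_args .)

# "*" is expanded by the shell into the names inside the directory.
(cd "$demo" && show_args *)

rm -rf "$demo"
```

The first call receives the single argument ".", while the second receives PORTFOLIO and TRAVEL as separate arguments, so those names are what the command sees as the top of each tree.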

Quote
I think you did a typo[...]

Quite right.

- Phil

exitfool

It seems the only way to get Lightroom to repopulate its keyword list from exiftool-modified images is by reimporting the images (somebody please correct me!). As they're already in Lightroom and have been developed, this is a no-go. After all the trouble it appears I've reached a dead end. Sorry for going off-topic.

Phil Harvey

I suggest asking this question in a Lightroom forum.  There must be some way to do it.

- Phil

exitfool

Ok. I've looked more closely at this. Lightroom can import metadata changes made to .xmp (sidecar) files. I tried with:

exiftool "-HierarchicalSubject=Africa, Africa|Animals, Africa|Animals|Warthogs" DSC_0023.xmp

but Lightroom sees nothing. Setting that structure in LR generates:

   <lr:hierarchicalSubject>
    <rdf:Bag>
     <rdf:li>Africa</rdf:li>
     <rdf:li>Africa|Animals</rdf:li>
     <rdf:li>Africa|Animals|Warthogs</rdf:li>
    </rdf:Bag>
   </lr:hierarchicalSubject>

while exiftool generates:

<lr:hierarchicalSubject>
   <rdf:Bag>
    <rdf:li>Africa, Africa|Animals, Africa|Animals|Warthogs</rdf:li>
   </rdf:Bag>
  </lr:hierarchicalSubject>

Can we get exiftool to more closely match LR's output to see if it will work?
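The single rdf:li above is what one would expect from a list-type tag that was given one value: exiftool writes one list item per assignment, so the whole comma-separated string became a single item. Below is a minimal sketch of producing one item per level instead. The helper name `hier_levels` is hypothetical and is not an exiftool feature.

```shell
# Print one line per level of a hierarchy such as "Africa|Animals|Warthogs".
hier_levels() {
    rest=$1
    prefix=
    while [ -n "$rest" ]; do
        first=${rest%%"|"*}               # text before the first "|"
        case $rest in
            *"|"*) rest=${rest#*"|"} ;;   # drop that component
            *)     rest= ;;               # last component reached
        esac
        prefix=${prefix:+"$prefix|"}$first
        printf '%s\n' "$prefix"
    done
}

hier_levels "Africa|Animals|Warthogs"
```

Each printed line would then go into its own assignment, for example `exiftool -HierarchicalSubject=Africa "-HierarchicalSubject=Africa|Animals" "-HierarchicalSubject=Africa|Animals|Warthogs" DSC_0023.xmp`. Adding `-sep ", "` to the original command is another way to have exiftool split the single string into separate list items. Whether Lightroom then picks up the change is not something this sketch can confirm.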

Phil Harvey
