Re: Create and populate directory trees

Started by sjDelaney, May 08, 2015, 08:04:28 PM


sjDelaney

I need to create and populate directories\folders based on "Keywords" extracted from jpeg image files.



Some Background...

I use keywords to group, sort, and filter on names of pet owners and their dogs...  e.g. "~ Sandy Miller: Duchess", "~ Sandy Muller: Duke", "~Bill Smith: Beaux"...
I use other tags as well (e.g. {Dog Club}, {Facility}, "AKC", "Brag shot", etc.) within Lightroom, which I hope can be excluded when creating folders.

A sample image would contain a string of keywords: " AKC, FVDTC, ~ Sandy Miller: Duchess: Australian Shepherd, Masters Standard, Group A ". Note that I use a tilde "~" as a special character to sort and identify these keywords, and colons as separators for what is essentially a concatenated string. Before going on about why I do this, and whether there's a better way...  let me just say...  I have done it for years, it's quick and easy, and it works for me within Lightroom. That said, the data is structured and accessible, and I want to extend its use through ExifTool by creating and populating structured directories on a hard drive.

The Raw (NEF) files are managed from folders that are created based on the date the images are shot... 20150311, 20150415...
File names are always date-time-sequence e.g. 20150421-110426, 20150421-110426-2, 20150421-110426-3...
Jpegs are generated as proofs and/or finished image files after post-processing from LR(NEF) and/or Photoshop(PSD).
All new jpegs contain the keywords passed on to them from the generating application.
Jpeg keywords can/will be modified without affecting RAW/NEF/PSD files and will have no impact on the LR catalog (unless a plugin like Syncomatic is used, which is rare).

While my Raw files are managed well within this directory structure...

\Old Root\20150421\


I need to copy jpegs into a structured tree like this...

\New Root\                      Folder I select at runtime
    \Sandy Miller\              Based on Keyword
        \Duchess\               Subfolder Based on Keyword (and special character???)
        \Duke\                  Subfolder Based on Keyword (and special character???)
    \Bill Smith\                Based on Keyword
        \Beaux\                 Subfolder Based on Keyword (and special character???)
    \...

OR

\New Root\                      Folder I select at runtime
    \Sandy Miller: Duchess\     Based on Keyword
    \Sandy Miller: Duke\        Based on Keyword
    \Bill Smith: Beaux\         Based on Keyword
    \...

I need to embed these steps within a process so that
1. LR ingests the raw files into a flat folder
2. Keywords are added to the LR catalog
3. Jpegs are generated to a separate folder (not added or recognized by the LR catalog)
4. All read, find/replace and copy actions are done on jpegs (outside of LR)
5. A populated tree structure is created

6. Which I can then upload to my website (Zenfolio).

I hope you can help me with steps 4. & 5. 

Before I go on...
  • "Keywords" is the field that Lightroom updates and filters from its Library Panel and is referred to in Lightroom as "Keywords" - and is recognized as "Keywords" by Photo Mechanic, iPhoto, Amok, and most other Windows/Mac image software.
  • Whether that field resides as exif or IPTC, in an XMP or DNG or jpeg is not strictly part of my question
  • I'm not looking for a better way to tag images in Lightroom
  • I AM looking for a method to feed quasi-structured data (above and beyond createdate %y.%d.%m) into ExifTool to create and copy into directories based on a metadata element commonly known as "Keywords".

I've researched various technical forums for quite a while now, read much of the documentation, and I do see the examples for creating folders based on createdate %y/%d/%m, but I see no samples using other data elements, e.g. keywords, although the capability is explicitly mentioned, which strongly suggests it can be done. I just haven't been able to pull together the syntax that will pipe keywords into ExifTool to create folders\subfolders as output.

I realize and appreciate the power of ExifTool, the wide acceptance it has within the computing and digital imaging communities, and the time and practice it undoubtedly takes to become a reasonably competent user.  BUT, right now I'm stuck, and I could really use an answer to this riddle.  I'm sure the answer will become obvious to me once I see it.

Thanks for your time.  I know you're all extremely busy, so if anyone can direct me to someone I can reach out to, who can possibly step me through this, I'd greatly appreciate it. 

-sd
 

Steve Delaney
steve@sjdelaney.com
847-431-4681

Phil Harvey

Hi Steve,

This command will copy JPEG images from a source folder to a hierarchy based on the pet owners and the pet names as you described:

exiftool -o dummy/ "-directory<OUTDIR/${subject;$_=/~ *(.*?) *: *(.*?) *:/ ? qq($1/$2) : undef}" -ext jpg DIR

Here I have assumed that the keywords are stored in XMP:Subject.  You should change OUTDIR to the root output folder name, and DIR to the input folder name.  Add a -r option to also copy JPG's from nested folders within DIR.

It might also be worthwhile to add a -v option so you can get some indication of what ExifTool is doing.

This solution uses the advanced formatting expression feature of ExifTool, with a Perl expression that looks for the tilde and colons and extracts the strings between them (removing spaces at either end), then separates them with a slash for use as a directory name.
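To see what that expression captures, here is a minimal Python sketch of the same pattern. It is only an illustration, not ExifTool itself: Python's re module is used because it accepts the same pattern syntax, the function name owner_pet_path is hypothetical, and the sample string is the one quoted in the first post.

```python
import re

# Same pattern as in the command above: tilde, owner, colon, pet, colon.
PATTERN = re.compile(r"~ *(.*?) *: *(.*?) *:")

def owner_pet_path(keywords):
    """Return 'owner/pet' when the pattern matches, else None (Perl's undef)."""
    m = PATTERN.search(keywords)
    return f"{m.group(1)}/{m.group(2)}" if m else None

sample = " AKC, FVDTC, ~ Sandy Miller: Duchess: Australian Shepherd, Masters Standard, Group A "
print(owner_pet_path(sample))        # Sandy Miller/Duchess
print(owner_pet_path("AKC, FVDTC"))  # None
```

The non-greedy groups stop at the first colon they can, and the " *" on either side keeps the surrounding spaces out of the folder names.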

I don't know if you found this page, but example 5 explains why the -o dummy/ is needed.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

sjDelaney

Thanks Phil, I'm almost there.

Here is the syntax I'm using with a sample of the results...

MBP:~ sjD$ exiftool -o dummy/ '-directory</Volumes/"LaCie 01"/ET2/${subject;$_=/~ *(.*?) *: *(.*?) *:/ ? qq($1/$2) : undef}' -ext jpg /Volumes/"LaCie 01"/LR2

Warning: [minor] Tag 'subject' not defined - /Volumes/LaCie 01/LR2/20150411-085549-2.jpg
Warning: No writable tags set from /Volumes/LaCie 01/LR2/20150411-085549-2.jpg
Error: 'dummy/20150411-085549-2.jpg' already exists - /Volumes/LaCie 01/LR2/20150411-085549-2.jpg
....
Warning: [minor] Tag 'subject' not defined - /Volumes/LaCie 01/LR2/20150411-171332.jpg
Warning: No writable tags set from /Volumes/LaCie 01/LR2/20150411-171332.jpg
Error: 'dummy/20150411-171332.jpg' already exists - /Volumes/LaCie 01/LR2/20150411-171332.jpg
    1 directories scanned
    0 image files updated
   56 files weren't updated due to errors
MBP:~ sjD$

I'm attaching a screenshot showing that the data I want to read is embedded somewhere in the file. While OS X Finder refers to IPTC IIM/Keywords, aka "Keywords", other apps, e.g. Windows Live Photo Gallery, use xmp/dc:subject to display "Descriptive Tags". I'm sure I'm not putting the same data in two locations; am I?? Both apps display the same information from the same file. One calls it Keywords and the other calls it Descriptive Tags - and ExifTool calls it... Subject?

Phil Harvey

One problem that your output points out:  If Subject doesn't exist, the file is copied to dummy/.  I didn't think about this, but I guess it isn't a big problem (just delete dummy/ afterwards).
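The fallback described here can be pictured with a small Python sketch. It only models the decision as described in this thread (matched keyword goes to OUTDIR/owner/pet, everything else lands in dummy/); the function name destination and the "/out" root are hypothetical, and nothing here calls ExifTool.

```python
import re
from pathlib import PurePosixPath

PATTERN = re.compile(r"~ *(.*?) *: *(.*?) *:")

def destination(outdir, keywords, fallback="dummy"):
    """Choose the folder for one file; keywords is None when the tag is absent."""
    m = PATTERN.search(keywords) if keywords else None
    if m is None:
        # No usable keyword: the file ends up in the -o folder.
        return PurePosixPath(fallback)
    return PurePosixPath(outdir, m.group(1), m.group(2))

print(destination("/out", "~ Bill Smith: Beaux: Lab"))  # /out/Bill Smith/Beaux
print(destination("/out", None))                        # dummy
```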

See FAQ 2 to determine the name of the tag containing your keywords.

- Phil

sjDelaney

Thanks,

I executed exiftool -s /Volumes/"LaCie 01"/LR2/20150411-171332.jpg which presented the following information for one jpg. I see that I do have the same data stored in more than one location; Subject, Keywords, and HierarchicalSubject.  So I tried each "tag" with exiftool -o...  using the same file, with the results you see at the bottom.  (I left the dummy/ parm in since it isn't really interfering).  Like cable tv... more choices, none good.

ExifToolVersion                 : 9.94
FileName                        : 20150411-171332.jpg
Directory                       : /Volumes/LaCie 01/LR2
FileSize                        : 642 kB
FileModifyDate                  : 2015:05:09 09:41:33-05:00
FileAccessDate                  : 2015:05:09 12:22:49-05:00
FileInodeChangeDate             : 2015:05:09 09:41:33-05:00
FilePermissions                 : rw-r--r--
FileType                        : JPEG
FileTypeExtension               : jpg
MIMEType                        : image/jpeg
ExifByteOrder                   : Little-endian (Intel, II)
Orientation                     : Horizontal (normal)
XResolution                     : 300
YResolution                     : 300
ResolutionUnit                  : inches
Software                        : Adobe Photoshop Lightroom 6.0 (Macintosh)
ModifyDate                      : 2015:05:09 09:41:33
Artist                          : Steve Delaney
Copyright                       : Pawprint Pictures
ExifVersion                     : 0230
DateTimeOriginal                : 2015:04:11 17:13:32
CreateDate                      : 2015:04:11 17:13:32
SubSecTimeOriginal              : 03
SubSecTimeDigitized             : 03
Compression                     : JPEG (old-style)
ThumbnailOffset                 : 432
ThumbnailLength                 : 13912
XMPToolkit                      : Adobe XMP Core 5.6-c011 79.156380, 2014/05/21-23:38:37
CreatorTool                     : Adobe Photoshop Lightroom 6.0 (Macintosh)
MetadataDate                    : 2015:05:09 09:41:33-05:00
Label                           : Yellow
Format                          : image/jpeg
DocumentID                      : xmp.did:fc8bdf59-1027-441a-a0f4-d8413b0ade7d
OriginalDocumentID              : 24BB68426C6198311656FCCD434ECF34
InstanceID                      : xmp.iid:bb6b6a7b-1da0-476c-8fb4-e89b410c72b0
RawFileName                     : 20150411-171332.jpg
Creator                         : Steve Delaney
Rights                          : Pawprint Pictures
Subject                         : ~ Alan Smith: Wrap
DerivedFromInstanceID           : xmp.iid:1f6a4880-e44a-4320-bd34-b18e693e131a
DerivedFromDocumentID           : 24BB68426C6198311656FCCD434ECF34
DerivedFromOriginalDocumentID   : 24BB68426C6198311656FCCD434ECF34
HistoryAction                   : saved
HistoryInstanceID               : xmp.iid:bb6b6a7b-1da0-476c-8fb4-e89b410c72b0
HistoryWhen                     : 2015:05:09 09:41:33-05:00
HistorySoftwareAgent            : Adobe Photoshop Lightroom 6.0 (Macintosh)
HistoryChanged                  : /metadata
HierarchicalSubject             : ~ Alan Smith: Wrap
DisplayedUnitsX                 : inches
DisplayedUnitsY                 : inches
CurrentIPTCDigest               : f4c0367bfb7ab46f2108303e43bdaeeb
CodedCharacterSet               : UTF8
ApplicationRecordVersion        : 4
Keywords                        : ~ Alan Smith: Wrap
DateCreated                     : 2015:04:11
TimeCreated                     : 17:13:32
DigitalCreationDate             : 2015:04:11
DigitalCreationTime             : 17:13:32
By-line                         : Steve Delaney
CopyrightNotice                 : Pawprint Pictures
PhotoshopThumbnail              : (Binary data 13912 bytes, use -b option to extract)
IPTCDigest                      : f4c0367bfb7ab46f2108303e43bdaeeb
ProfileCMMType                  : Lino
ProfileVersion                  : 2.1.0
ProfileClass                    : Display Device Profile
ColorSpaceData                  : RGB
ProfileConnectionSpace          : XYZ
ProfileDateTime                 : 1998:02:09 06:49:00
ProfileFileSignature            : acsp
PrimaryPlatform                 : Microsoft Corporation
CMMFlags                        : Not Embedded, Independent
DeviceManufacturer              : IEC
DeviceModel                     : sRGB
DeviceAttributes                : Reflective, Glossy, Positive, Color
RenderingIntent                 : Perceptual
ConnectionSpaceIlluminant       : 0.9642 1 0.82491
ProfileCreator                  : HP
ProfileID                       : 0
ProfileCopyright                : Copyright (c) 1998 Hewlett-Packard Company
ProfileDescription              : sRGB IEC61966-2.1
MediaWhitePoint                 : 0.95045 1 1.08905
MediaBlackPoint                 : 0 0 0
RedMatrixColumn                 : 0.43607 0.22249 0.01392
GreenMatrixColumn               : 0.38515 0.71687 0.09708
BlueMatrixColumn                : 0.14307 0.06061 0.7141
DeviceMfgDesc                   : IEC http://www.iec.ch
DeviceModelDesc                 : IEC 61966-2.1 Default RGB colour space - sRGB
ViewingCondDesc                 : Reference Viewing Condition in IEC61966-2.1
ViewingCondIlluminant           : 19.6445 20.3718 16.8089
ViewingCondSurround             : 3.92889 4.07439 3.36179
ViewingCondIlluminantType       : D50
Luminance                       : 76.03647 80 87.12462
MeasurementObserver             : CIE 1931
MeasurementBacking              : 0 0 0
MeasurementGeometry             : Unknown
MeasurementFlare                : 0.999%
MeasurementIlluminant           : D65
Technology                      : Cathode Ray Tube Display
RedTRC                          : (Binary data 2060 bytes, use -b option to extract)
GreenTRC                        : (Binary data 2060 bytes, use -b option to extract)
BlueTRC                         : (Binary data 2060 bytes, use -b option to extract)
DCTEncodeVersion                : 100
APP14Flags0                     : [14], Encoded with Blend=1 downsampling
APP14Flags1                     : (none)
ColorTransform                  : YCbCr
ImageWidth                      : 2100
ImageHeight                     : 1400
EncodingProcess                 : Baseline DCT, Huffman coding
BitsPerSample                   : 8
ColorComponents                 : 3
YCbCrSubSampling                : YCbCr4:4:4 (1 1)
DateTimeCreated                 : 2015:04:11 17:13:32
DigitalCreationDateTime         : 2015:04:11 17:13:32
ImageSize                       : 2100x1400
Megapixels                      : 2.9
SubSecCreateDate                : 2015:04:11 17:13:32.03
SubSecDateTimeOriginal          : 2015:04:11 17:13:32.03
ThumbnailImage                  : (Binary data 13912 bytes, use -b option to extract)
MBP:~ sjD$

MBP:~ sjD$ exiftool -o dummy/ '-directory</Volumes/"LaCie 01"/ET2/${Subject;$_=/~ *(.*?) *: *(.*?) *:/ ? qq($1/$2) : undef}' /Volumes/"LaCie 01"/LR2/20150411-171332.jpg
Warning: [minor] Tag 'Subject' not defined - /Volumes/LaCie 01/LR2/20150411-171332.jpg
Warning: No writable tags set from /Volumes/LaCie 01/LR2/20150411-171332.jpg
Error: 'dummy/20150411-171332.jpg' already exists - /Volumes/LaCie 01/LR2/20150411-171332.jpg
    0 image files updated
    1 files weren't updated due to errors


MBP:~ sjD$ exiftool -o dummy/ '-directory</Volumes/"LaCie 01"/ET2/${Keywords;$_=/~ *(.*?) *: *(.*?) *:/ ? qq($1/$2) : undef}' /Volumes/"LaCie 01"/LR2/20150411-171332.jpg
Warning: [minor] Tag 'Keywords' not defined - /Volumes/LaCie 01/LR2/20150411-171332.jpg
Warning: No writable tags set from /Volumes/LaCie 01/LR2/20150411-171332.jpg
Error: 'dummy/20150411-171332.jpg' already exists - /Volumes/LaCie 01/LR2/20150411-171332.jpg
    0 image files updated
    1 files weren't updated due to errors


MBP:~ sjD$ exiftool -o dummy/ '-directory</Volumes/"LaCie 01"/ET2/${HierarchicalSubject;$_=/~ *(.*?) *: *(.*?) *:/ ? qq($1/$2) : undef}' /Volumes/"LaCie 01"/LR2/20150411-171332.jpg
Warning: [minor] Tag 'HierarchicalSubject' not defined - /Volumes/LaCie 01/LR2/20150411-171332.jpg
Warning: No writable tags set from /Volumes/LaCie 01/LR2/20150411-171332.jpg
Error: 'dummy/20150411-171332.jpg' already exists - /Volumes/LaCie 01/LR2/20150411-171332.jpg
    0 image files updated
    1 files weren't updated due to errors
MBP:~ sjD$

Phil Harvey

Hi Steve,

If you take a look at my expression: /~ *(.*?) *: *(.*?) *:/

It is expecting:
"~" - a tilde
" *" - followed by zero or more spaces
"(.*?)" - followed by the shortest string possible (captured as $1)
" *" - followed by zero or more spaces
":" - followed by a colon
" *" - followed by zero or more spaces
"(.*?)" - followed by the shortest string possible (captured as $2)
" *" - followed by zero or more spaces
":" - followed by a colon

And if this expression doesn't match, "undef" is returned.

ExifTool says that Subject is not defined because this expression doesn't match.

If the second colon doesn't always exist, then we need some other way to identify the end of the pet name.  How about this?:

/~ *(.*?) *: *(.*?) *(:|, |$)/

which will stop at ":" or ", " or the end of the string.
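A quick way to compare the two expressions is to run them side by side on a few keyword strings. The sketch below does that in Python, purely as an illustration of the regex behaviour: the names STRICT, RELAXED and extract are hypothetical, and the second test string is made up in the style of the keywords from the first post.

```python
import re

STRICT = re.compile(r"~ *(.*?) *: *(.*?) *:")            # needs a second colon
RELAXED = re.compile(r"~ *(.*?) *: *(.*?) *(:|, |$)")    # colon, ", " or end of string

def extract(pattern, keywords):
    """Return (owner, pet) if the pattern matches, else None."""
    m = pattern.search(keywords)
    return (m.group(1), m.group(2)) if m else None

tests = [
    "~ Alan Smith: Wrap",                          # the value from the exiftool -s listing
    "AKC, ~ Sandy Muller: Duke, Group A",          # made-up example, pet name ends at ", "
    "~ Sandy Miller: Duchess: Australian Shepherd",
]
for kw in tests:
    print(repr(kw), extract(STRICT, kw), extract(RELAXED, kw))
```

With "~ Alan Smith: Wrap" the strict pattern finds nothing because there is no second colon, while the relaxed one ends the pet name at the end of the string.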

- Phil

sjDelaney

I'm still chewing on this. Everything you've laid out for me makes sense (IOW I understand at some level), and the flexibility of not needing a second colon is really useful.  But I'm still trying to get it to work, and trying to understand how my LR-generated jpegs could be the problem.  Here's where I am right now...
1.  Took a day or two off to let my brain defog, and attend to life issues (mostly honey do's).
2.  Read some tutorials on exiftool, perl, digital image formats, etc...
3.  Scoured my command line repeatedly for missing or misplaced expressions

My little business really needs a solution ASAP. But I'm patiently trying to accept that I need to hone my long-lost programming skills.  OK, I'll date myself: my background includes Assembly, C, C+, VB, COBOL, CICS, MS DOS, PC DOS, WANG, Burroughs, Unisys, ARPANET, 300 BAUD, and Intel 286, etc, etc, etc...  I've next to nil grasp of Perl, PHP, C#, or the Mac OS I'm working on now. The learning curve is daunting, but "this old dog" is willing to learn new tricks.

Thanks for your help. I'll let you know when I've figured it out.

Phil Harvey

I can date myself too, but you impress me with your COBOL knowledge, and I don't even know what CICS is.

I started with FORTRAN and punchcards on an IBM mainframe around 1977, then BASIC and Z-80 machine code (compiled by hand until I got a compiler) on a TRS-80, BASIC on a Commodore PET, 6502 assembly language on Apple II, C and 8080 assembly on CP/M and MS-DOS, C/C++ on SunOS, Windows and classic Mac, then finally various languages (including Perl) on Linux and OS/X.

- Phil

sjDelaney