Trying to edit a large number of files quickly...

Started by Fox, September 28, 2012, 11:47:09 AM

Fox

ExifTool looks like it can do what I'm interested in. I don't mind trying to do it myself, although I'd prefer to avoid writing code if I can. I'm reasonably comfortable with Perl, but all the work I've done with it has been in the Unix world, and all my photos live on my Windows 7 PC.

I was using Sony's PMB to manage my photo library. I like that when I modify 'date taken' info or add ratings/tags to the files, it never edits the original file and just stores the info in a .modd file. But that means when I write the pictures to DVD for storage or to give to family/friends etc., all of that is lost. In fact I recently formatted and reinstalled, and the tag information got messed up (dates/ratings still seem to be OK); I had to reinstall PMB to see it. (Sony is trying to get people to move to PlayMemories, but I did not like that as the tagging and importing features seem buggy. It's OK as a viewer; the problem might be updates to the modd files in the new tool.)

All I'm interested in doing is:
1. parsing the modd files through a directory structure ie:
Pictures/pic1.modd
Pictures/pic2.modd
Pictures/Subdir/pic3.modd
etc...
2. getting the tag, date taken and rating data from those files
3. writing it back to the jpg exif/xmp data directly:
Pictures/pic1.jpg
Pictures/pic2.jpg
Pictures/Subdir/pic3.jpg
etc..

The modd files look sort of like XML, are ASCII, and should be easy to parse. If I were on Unix I don't think it would take me very long to do this, assuming I could figure out the exiftool command line to write the data back, but on Windows... I'm totally lost.

Is this something that is reasonable to do? I don't mind trying to do it myself if needed, but if someone else has done (or can do) something like this, I'd happily defer.

Thanks in advance. (If someone is willing to take a stab at it, I can definitely provide example modd files for parsing. Otherwise I might just need help getting Perl to work on Windows and possibly invoking exiftool.)

Phil Harvey

The Sony .modd files are very similar to the Apple .plist format.  ExifTool will read them as plain XML, but the result isn't very pleasing.  It has been on my to-do list for a while to add support for these flavours of XML, but I haven't gotten around to it yet because I hate XML so much. :P

Your project is certainly do-able, and the biggest problem is parsing the XML files.  The rest is simple using the ExifTool API.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

Fox

Well, I have a wrapper script written in Perl that does what I specified; I'd be happy to send it to you if you like. I tested it on a small sample and it worked!

So this is what I have:

  • It globs through a parent directory and all subdirectories looking for .modd files.
  • If a .modd file exists, it looks for files with .JPG and .jpg extensions (this is a simple array in the script, so it's really easy to add more).
    If neither is found, it reports a message saying it can't find the file and won't do anything.
  • It can optionally rename <file>.JPG.modd to <file>.modd (because it would look for <file>.JPG.JPG otherwise). I haven't tested reading this back into PMB yet, and I'm not sure why some files have this name structure.
  • It needs a path to FilterTree.xml to cross-reference the tags, since a tag shows up as "{07C7076C-A0CC-4C00-AD19-D648B04A3E30}" instead of "Christmas".
  • Once it has the information, it passes what it found (date, labels or rating) to exiftool and edits the file it found.
  • One caveat: I don't understand the date encoding used. The XML file shows this:
<key>DateTimeOriginal</key><real>31064.884537037036000</real>

    The real is the date value, I think; it should be January 17th 1985 for this example. I thought it might be epoch time, but the first number is too small and the second much too large.

I only tested it on Linux. I tried to use the file-system-independent Perl modules to do this, but I'm not a programmer and am still pretty clueless about how to set it up for Windows. I was hoping you could take what I've done and incorporate it (and review it for anything stupid, since I'm working with a small sample). It's about 168 lines of code now. Do you have any ideas about the date encoding, or know where I could look? I tried a few Google searches but didn't turn up much. I can email you the code, attach it here, or just put it in a post if you like.

Thanks.

Phil Harvey

Actually, I did some more checking after posting this but didn't update this thread.

ExifTool actually does process plist files inside other (zip-based) formats already.  I think maybe the problem with adding plist support was more in recognizing the plist file (ExifTool doesn't use the extension to recognize the file format), but I still need to spend some more time looking at this.  I have moved this up in my to-do list.

- Phil

Fox

I don't know anything about plist files, but assuming they're similar to the modd files, my script could probably be updated to handle both. Are you interested in looking at my wrapper script?

Any ideas about the date encoding? (If it's something exiftool can figure out on its own without me decoding it, all the better; I'm just not sure how.)

Phil Harvey

The date/time looks like the number of days since 1900.  I don't think looking at your code will help because I would use my XMP processor to parse this XML anyway.  But thanks for the offer.

- Phil

Fox

Actually, I think it's the number of days since 12/30/1899 (weird, I know). Counting from 1900 with Date::Calc I was 2 days off; I checked manually and my results matched Date::Calc as well.

The fractional part seems to be the fraction of the day that has passed. Noon is exactly .5. So I think I'm handling that OK.
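For readers following along, the conversion described here can be sketched in a few lines of Perl. This is only an illustration, not the wrapper script itself: it uses the core POSIX module instead of Date::Calc, the sub name modd_real_to_datetime is made up, and it assumes the 12/30/1899 base date, under which 1970-01-01 falls on day 25569.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Serial day number of 1970-01-01 when day 0 is 1899-12-30
# (assumption: the base date reported in this thread).
my $UNIX_EPOCH_SERIAL = 25569;

# Convert a MODD "real" date (whole days plus fraction of a day) to an
# EXIF-style "YYYY:MM:DD HH:MM:SS" string. The name is hypothetical.
sub modd_real_to_datetime {
    my ($real) = @_;
    # Round to the nearest second to absorb floating-point error.
    my $seconds = sprintf('%.0f', ($real - $UNIX_EPOCH_SERIAL) * 86400);
    # gmtime is used so that no local time zone shift is applied.
    return strftime('%Y:%m:%d %H:%M:%S', gmtime($seconds));
}

print modd_real_to_datetime('31064.884537037036000'), "\n";
print modd_real_to_datetime(31064.5), "\n";
```

With the sample value from the MODD file this prints 1985:01:17 21:13:44, and 31064.5 comes out as noon on the same day. Rounding to the nearest second rather than truncating is a deliberate choice, since the fraction rarely lands exactly on a second boundary.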

I think I have everything working in my wrapper script; I just need to install Perl at home, then I'll try testing it on Windows.  Just curious if you've made any progress on ExifTool reading these?


Phil Harvey

Sorry, no progress yet.  I still have the feeling that I tried this before and failed because I couldn't find a good way to identify a plist/modd file.

12/31/1899 would make sense because I believe the Julian day 1 is on Jan 1, but 12/30 I don't understand.  Perhaps it is because their code puts an extra leap day in 1900 where it doesn't belong.  (I think this leap day is the reason that Apple uses 1904 as the base date instead of 1900.)

- Phil

Fox

My best guess is that 12/31/1899 was a Sunday and they wanted that to be day #1. It doesn't treat 1900 as a leap year (I checked). Either way, my algorithm works (it seems to round up by 1 second, which I'm sure I could fix, but honestly I don't care).

As for finding the files, I just do this (the subroutine calls itself to work recursively through directories, and assumes the files are simply named .modd):
# This function only collects .modd files; find those first, then look for jpgs to edit.
my @moddFiles;   # filled in by getModds

sub getModds {
  my $root_dir = $_[0];

  my @all_files = <$root_dir/*>;
  foreach (@all_files) {
     if (-d $_) {
        getModds($_);              # recurse into subdirectories
     } elsif ($_ =~ /\.modd$/i) {  # escaped dot, so only a literal ".modd" matches
        push @moddFiles, $_;
     }
  }

} # end getModds
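As an aside, the same recursive search can be sketched with the core File::Find module, which avoids the file glob (the glob can misbehave when a directory name contains spaces). This is an alternative illustration, not the code from the wrapper script, and the sub name find_modds is made up.

```perl
use strict;
use warnings;
use File::Find;

# Return a sorted list of all .modd files below $root_dir (recursive).
# The sub name is hypothetical.
sub find_modds {
    my ($root_dir) = @_;
    my @modd_files;
    find(
        {
            no_chdir => 1,    # keep $_ set to the full path of each entry
            wanted   => sub {
                push @modd_files, $_ if -f $_ && /\.modd$/i;
            },
        },
        $root_dir
    );
    my @sorted = sort @modd_files;
    return @sorted;
}

# Example: list the .modd files under a directory given on the command line.
if (@ARGV) {
    print "$_\n" for find_modds($ARGV[0]);
}
```

Returning the list instead of filling a global array makes the function easier to reuse and test.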

Fox

Just curious if you've made any progress on this? I finally sorted all my files and am ready to back them up. I should be able to figure out how to get perl installed on my PC and use my wrapper script, but if I can just get a new version of exiftool soon to save myself the trouble I'll take it.  :)

Phil Harvey

I'm sorry, no progress yet.

I guess my dislike of XML is getting the better of me. :(

- Phil

Phil Harvey

I have implemented parsing of PLIST and MODD XML, and testing is going well so far, so this feature should appear when ExifTool 9.16 is released (hopefully tomorrow).

There is a problem with the MODD dates though.  The PLIST format uses a "date" element for date/time values.  Recognizing this is easy, and I am doing this automatically.  Unfortunately though, the MODD files seem to use a "real" element to store dates, which sort of sucks.  The only thing I can do here is to recognize specific tag names, which isn't ideal.  So far I only have the DateTimeOriginal tag.  MODD also uses ASCII-hex encoding instead of the PLIST-mandated Base64 encoding for "data" elements, which sucks too, but I can work around this using a simple heuristic.
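To illustrate the kind of heuristic meant here (this is a guess for illustration only, not ExifTool's actual rule), a "data" value could be treated as ASCII-hex when it is an even-length run of hex digits and as Base64 otherwise. The sub name decode_data_element is hypothetical, and MIME::Base64 is a core Perl module.

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);

# Decode the text of a <data> element to raw bytes.
# Heuristic (an assumption, not ExifTool's real logic): an even-length
# string made only of hex digits is ASCII-hex; anything else is Base64.
sub decode_data_element {
    my ($text) = @_;
    (my $compact = $text) =~ s/\s+//g;    # ignore line breaks and spaces
    if (length($compact) % 2 == 0 && $compact =~ /^[0-9a-fA-F]+$/) {
        return pack('H*', $compact);
    }
    return decode_base64($compact);
}

print decode_data_element('48444d56'), "\n";    # ASCII-hex, as in MODD
print decode_data_element('SERNVg=='), "\n";    # Base64, as in PLIST
```

Both calls print HDMV. A short Base64 string that happens to contain only hex digits would be misread by this test, which is why it can only ever be a heuristic.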

- Phil

Edit:  I have implemented the date conversion and tested it with the samples I have handy.  With my samples, it is the "real" number of days since December 31, 1899 (not Dec 30 as you found).  I don't know why the difference.  As I mentioned, this date makes more sense anyway, so I'll code it this way.

Fox

Hey great thank you  :)

However, I don't see how I'm supposed to use it.

I finally got my wrapper script working on Windows, and I found this post just as I was about to upload the script to Google Code or something and post instructions for using it.

Phil Harvey

How you use it depends on what you want to do.  This is the basic usage:

> Image-ExifTool-9.16//exiftool ../pics/20081227133915.modd
ExifTool Version Number         : 9.16
File Name                       : 20081227133915.modd
Directory                       : ../pics
File Size                       : 2.6 kB
File Modification Date/Time     : 2009:01:27 08:21:08-05:00
File Access Date/Time           : 2013:02:07 14:50:33-05:00
File Inode Change Date/Time     : 2012:06:13 14:10:11-04:00
File Permissions                : rw-r--r--
File Type                       : PLIST
MIME Type                       : application/xml
AVCHD Division Info List Clip Information File: 48444d5630313030000000dc000000f60000014c000002d4000002d8000000000000000000000000000000b00000010100000000002255100006f6000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001e8048444d56000000000000000000000000000000000000000000000000000000001600010000000001001001000000000000b6d0001c908000000052000100000000010003001011151b43300000000000000000000000000000000000001100158161756e640000000000000000000000000000000012001590756e640000000000000000000000000000000000000001840001000110110004001c004e0000000e0000003c0000000000000004000540020001d53d0005c0020002034b000ac0040003cdfe000b8004000411c4001080060005dedb0011000600060e3a2188000432400add32f6214e33ae374e34644e76351a656535d27c90368893a03740aa3f37f6c0f838acd8bc3964eef83a1b058a3ad31d133b8934223c3f4b043cf762363dad78ce3e658fd73f1ba6513fd3bd2b2089d53d213fec6521f6034b22ac1a4f236430fc241a482924d05fdf258876e5263e8e1626f6a4c027acbb572862d2e2291ae8be29d0ff713a89160a2b3f2d953bf544743cad5bfc3d63727f3e1b89403ed1a0453f87b714303fcdfe30f5e4d221adfbb0326211c433182a1233d041142486581e253e6eb435f4854536ac9c402762b3f32818ca6b28d0e1842986f8f32a3f10c22af527772bab3e9f3c63544c3d196b912dd184103e8799a13f3db08b3ff5c7f530abdedb2163f7ac22180e3a22ce24e223863b31243c525e34f4685a35aa7f1f366096703718af4727cec7402886dcc8000000000000005a0000001800000001100001000000001800000046434c45580000000000000030000000000000000000000000000000000000000000000000000000000000000401088003000000120001010010110b1b01640028000000000000
AVCHD Division Info List File Size: 87588864
Check Code                      : EBFDFA1A
Date/Time Original              : 2008:12:27 13:39:15
Duration                        : 40.559999999999995
Favorite List                   :
File Size                       : 87588864
Geolocation Latitude            : 53.807949166666660
Geolocation Longitude           : -3.056602500000000
Geolocation Map Datum           : WGS-84
Label List                      : {AB68DF41-628D-449B-BC4F-7A6E50C2FDB4}
XML File Type                   : ModdXML


Note that the clip information file will be extracted as binary data in the next release (version 9.17).

In your original post you wanted to copy the date/time original from these files to the corresponding JPEG:

exiftool -tagsfromfile %d%f.modd -datetimeoriginal -ext jpg DIR

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

Fox

Thanks that seems to be working. At least nominally.

What about the rating (generally 1 to 5) or the tags? I tried adding -xmp-xmp:Rating and -xmp-dc:subject to the command line, but it didn't do anything. It did correctly update the date for the files that had one.

1. Is it possible to have it go recursively? It only worked on the main directory. (The routine I posted above works recursively.)
2. I'm not sure how it happened, but I have a bunch of .modd files named basename.JPG.modd instead of basename.modd, and it can't parse those. It says "Warning: Error opening file - c:/path/to/basename.modd". In my code I added a -fixModd flag that renames the files to basename.modd. Could you have it either read those files or offer an option to rename them? I certainly didn't create them by hand, and I have no idea how they got that way.
3. I don't see a rating in your output above.
4. I do see a tag, but it's encoded. I know that in PMB/PlayMemories there's a FilterTree.xml that lets you cross-reference {AB68DF41-628D-449B-BC4F-7A6E50C2FDB4} to something in English. I'm assuming that even if I got that tag to work, it would just write that ugly string instead of what I want, say "Vacations" or whatever.

Unrelated to this: I was trying to pass the date in and had it in the wrong format. I originally had -DateTimeOriginal"YYYY,M,DD, HH,MM,SS", then "YYYY:M:DD HH:MM:SS"; once I changed the month to MM it worked. But I only got an error message when the date was the only thing I tried to edit. If I tried to write one of the 2 tags above along with it, exiftool updated the file with that information but not the date, and did not print any error message that I could find. So what I'm saying is: if you pass it only a bad input, exiftool gives you a nice error message, but if you pass a good flag and a bad flag, it updates the file with the good flag and silently ignores the bad one. It would be nice to get some sort of warning for the bad flag.

Phil Harvey

Quote from: Fox on February 07, 2013, 05:18:41 PM
What about the rating (generally 1 to 5) or the tags? I tried adding -xmp-xmp:Rating and -xmp-dc:subject to the command line, but it didn't do anything. It did correctly update the date for the files that had one.

Where is the rating stored?  You can copy it from anywhere to anywhere you want.  See the -tagsFromFile option documentation for more information.

Quote: 1. Is it possible to have it go recursively? It only worked on the main directory. (The routine I posted above works recursively.)

Yes.  Add the -r option.

Quote: 2. I'm not sure how it happened, but I have a bunch of .modd files named basename.JPG.modd instead of basename.modd, and it can't parse those. It says "Warning: Error opening file - c:/path/to/basename.modd". In my code I added a -fixModd flag that renames the files to basename.modd. Could you have it either read those files or offer an option to rename them? I certainly didn't create them by hand, and I have no idea how they got that way.

You can process these with another command.  See the -tagsfromfile documentation.

Quote: Unrelated to this: I was trying to pass the date in and had it in the wrong format. I originally had -DateTimeOriginal"YYYY,M,DD, HH,MM,SS", then "YYYY:M:DD HH:MM:SS"; once I changed the month to MM it worked. But I only got an error message when the date was the only thing I tried to edit. If I tried to write one of the 2 tags above along with it, exiftool updated the file with that information but not the date, and did not print any error message that I could find. So what I'm saying is: if you pass it only a bad input, exiftool gives you a nice error message, but if you pass a good flag and a bad flag, it updates the file with the good flag and silently ignores the bad one. It would be nice to get some sort of warning for the bad flag.

Try adding -v2 if you want more messages. :)

- Phil

Fox

OK, I read through the documentation a couple of times.

The -r works great, thank you!  :)

Changing the %d%f.modd to %d%f.jpg.modd got the extra files, but that seems to need 2 passes. Not a big deal, but is there a wildcard syntax that would work? I tried %d%f[.jpg]?.modd (and a couple of variants), but none seemed to work. I was thinking of Perl regex syntax, although there it would take a group such as (\.jpg)? to match 0 or 1 instances of the .jpg string.

Not specifying any tags picked up all the ratings too, but not the labels. Any ideas there? (Although without the reference tables I'm guessing I'd get that encoded string.)

Regarding the -v2, that definitely gave error messages, so that's handy to know. I did notice some weird behavior: when I passed in some strings it put nothing in, and with others it put the wrong date in. -DateTimeOriginal"1977,4,10 01,01,01" came out as 10/1/1977, which I found strange. (I am passing the correct date now, so it's not an issue for me; just letting you know it seems to parse the date incorrectly sometimes.)

Thanks so much for your help and supporting this tool.  :)

Phil Harvey

Quote from: Fox on February 08, 2013, 10:26:21 AM
Changing the %d%f.modd to %d%f.jpg.modd got the extra files but that seems to need 2 passes (not a big deal, not sure if there's a wild card syntax that would work?)

No, 2 passes are required here.
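As a rough sketch of what the two passes look like when driven from a script, the Perl below builds one exiftool argument list per sidecar naming scheme, using only the options already mentioned in this thread (-tagsfromfile, -datetimeoriginal, -ext, -r). The sub name modd_copy_commands is made up, and DIR is a placeholder for the picture folder.

```perl
use strict;
use warnings;

# Build one exiftool argument list per sidecar naming scheme.
# The sub name is hypothetical; only options from this thread are used.
sub modd_copy_commands {
    my ($dir) = @_;
    my @commands;
    for my $sidecar ('%d%f.modd', '%d%f.jpg.modd') {
        push @commands, [
            'exiftool',
            '-tagsfromfile', $sidecar,
            '-datetimeoriginal',
            '-ext', 'jpg',
            '-r',
            $dir,
        ];
    }
    return @commands;
}

# Print the commands; call system(@$cmd) instead of print to run them.
for my $cmd (modd_copy_commands('DIR')) {
    print join(' ', @$cmd), "\n";
}
```

Passing the arguments to system as a list bypasses the shell, so the percent signs and any spaces in the directory name need no extra quoting.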

Quote: Not specifying any tags picked up all the ratings too, but not the labels. Any ideas there? (Although without the reference tables I'm guessing I'd get that encoded string.)

I can't help much without even knowing the tag names.  If you posted your MODD file it would help.

Quote: I did notice some weird behavior: when I passed in some strings it put nothing in, and with others it put the wrong date in. -DateTimeOriginal"1977,4,10 01,01,01" came out as 10/1/1977

Yes.  The date/time parsing will ignore an entry with only one digit.  The reason is that ExifTool will accept dates like 19770410010101, so it needs to look for a specific number of digits in each field.  This is explained in FAQ number 5.
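For anyone generating dates in a script, one way to sidestep the issue is to zero-pad every field before handing the value to exiftool. A minimal Perl sketch follows; the sub name normalize_datetime is made up, and it assumes the input always has six delimited numeric fields in year, month, day, hour, minute, second order.

```perl
use strict;
use warnings;

# Zero-pad a loosely formatted date/time such as "1977,4,10 01,01,01"
# into "YYYY:MM:DD HH:MM:SS". Returns nothing unless exactly six numeric
# fields are present. The sub name is hypothetical.
sub normalize_datetime {
    my ($input) = @_;
    my @fields = $input =~ /(\d+)/g;
    return unless @fields == 6;
    return sprintf('%04d:%02d:%02d %02d:%02d:%02d', @fields);
}

print normalize_datetime('1977,4,10 01,01,01'), "\n";
```

This prints 1977:04:10 01:01:01. An undelimited string such as 19770410010101 is a single field and is rejected by this sketch, so it only suits input that is known to be delimited.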

Quote: Thanks so much for your help and supporting this tool.  :)

You're welcome. :)

- Phil

Edit:  I just thought of a technique that I could use for date/time parsing that would support single-digit values as well as undelimited date/time strings.  I'll test it and if things work out maybe the next ExifTool version will be able to handle these.

Fox

My modd file is this (it's all on one line; I tried to make it a little more readable here):

<?xml version="1.0" encoding="utf-8"?>
<plist version="1.0">
<dict>
<key>MetaDataList</key>
<array>
<dict>
<key>DateTimeOriginal</key>
<real>31064.884537037036000</real>
<key>FavoriteList</key>
<array/>
<key>LabelList</key>
<array>
<string>{5A932D96-EE43-4618-8EE8-DA9F5AFC3079}</string>
<string>{5802F031-759E-4DEB-AFED-CECAA7FCC926}</string>
</array>
<key>Rating</key>
<integer>3</integer>
</dict>
</array>
<key>XMLFileType</key>
<string>ModdXML</string>
</dict>
</plist>


I have a file called FilterTree.xml which has this in it:

<dict>
<key>DisplayName</key>
<string>Family</string>
<key>IconId</key>
<integer>131329</integer>
<key>IdName</key>
<string>{5A932D96-EE43-4618-8EE8-DA9F5AFC3079}</string>
<key>IdNumber</key>
<integer>102</integer>
<key>_Children</key>
<array>
<dict>
<key>DisplayName</key>
<string>Steve</string>
<key>IconId</key>
<integer>131331</integer>
<key>IdName</key>
<string>{5802F031-759E-4DEB-AFED-CECAA7FCC926}</string>
<key>IdNumber</key>
<integer>104</integer>
<key>_Children</key><array/>
</dict>


The whole file is much larger, but the 5A code should map to "Family" and the 58 code should map to "Steve" (basically mapping the IdName to the DisplayName). I'm guessing you'd need the xref file for this.
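The IdName-to-DisplayName mapping can be sketched in Perl with a single regular expression. This is an illustration rather than the code in the wrapper script: the sub name label_names is made up, and it assumes every dict lists DisplayName before IdName, as in the excerpt above. A real plist parser would be more robust.

```perl
use strict;
use warnings;

# Build a GUID => DisplayName lookup from the text of FilterTree.xml.
# Assumes DisplayName precedes IdName in each <dict>, as in the excerpt.
# The sub name is hypothetical.
sub label_names {
    my ($xml) = @_;
    my %name_for;
    while ($xml =~ m{<key>DisplayName</key>\s*<string>([^<]*)</string>
                     .*?
                     <key>IdName</key>\s*<string>([^<]*)</string>}gsx) {
        $name_for{$2} = $1;
    }
    return %name_for;
}

# Abbreviated copy of the excerpt above.
my $sample = <<'XML';
<dict>
<key>DisplayName</key><string>Family</string>
<key>IdName</key><string>{5A932D96-EE43-4618-8EE8-DA9F5AFC3079}</string>
<key>_Children</key>
<array><dict>
<key>DisplayName</key><string>Steve</string>
<key>IdName</key><string>{5802F031-759E-4DEB-AFED-CECAA7FCC926}</string>
</dict></array>
</dict>
XML

my %name_for = label_names($sample);
my @labels   = ('{5A932D96-EE43-4618-8EE8-DA9F5AFC3079}',
                '{5802F031-759E-4DEB-AFED-CECAA7FCC926}');
# Fall back to the raw GUID when no display name is known.
print join(', ', map { $name_for{$_} // $_ } @labels), "\n";
```

For the two labels in the sample MODD file this prints Family, Steve.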

Phil Harvey

OK, so ExifTool extracts this from your MODD file:

> exiftool ~/Desktop/t.modd -plist:all
Date/Time Original              : 1985:01:17 21:13:44
Favorite List                   :
Label List                      : {5A932D96-EE43-4618-8EE8-DA9F5AFC3079}, {5802F031-759E-4DEB-AFED-CECAA7FCC926}
Rating                          : 3
XML File Type                   : ModdXML


The DateTimeOriginal and Rating look useful, but as you said the LabelList contains useless GUIDs.  ExifTool won't look these GUIDs up for you in FilterTree.xml.

- Phil

Fox

OK, I finally got everything working on my PC at home and updated my whole library correctly.

I'll just post an idiot's guide to using this in case anyone anywhere is trying to do the exact same thing.

1. Download Perl onto your system. I recommend Strawberry Perl: http://learn.perl.org/installing/windows.html
2. Get Date::Calc: after doing the above, type "cpan Date::Calc" at the command prompt.
3. Get exiftool and install it in C:\exiftool: http://www.exiftool.org/
4. Download getTagInfo.pl (the wrapper for the above; it looks for the executable in C:\exiftool\exiftool):
http://code.google.com/p/gettaginfo-exiftool/downloads/detail?name=getTagInfo.pl&can=2&q=
I recommend installing it in the exiftool folder.
5. Open a command prompt and cd to the folder with getTagInfo.pl in it (cd C:\exiftool).
6. Type: "perl getTagInfo.pl -startDir C:\path\to\pics -filterTree C:\path\to\FilterTree.xml". You can optionally add:

-fixFileExt if you have an extra .jpg. in your file names, i.e. <file>.JPG.modd instead of <file>.modd (it will change any file type, but it is meant for jpg or modd files, to shorten the names; it can cause issues if you have jpg in your base name as well as an extra .jpg. in your extensions).

-owOrg to not get _original versions of all your pictures (I recommend backing them up to a different directory structure/hard drive before running this).

-debug prints more info to the log file (the log file, getTagInfo.log, should appear in the run directory).
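For illustration only, here is one way a rename rule like the one behind -fixFileExt could avoid the base-name problem mentioned above. It is not the code from getTagInfo.pl, and the sub name fixed_modd_name is made up: the idea is to strip a .jpg extension only when it sits directly in front of the final .modd.

```perl
use strict;
use warnings;

# Map "<file>.JPG.modd" to "<file>.modd". Only a .jpg extension directly
# before the final .modd is removed, so "jpg" elsewhere in the base name
# is left alone. The sub name is hypothetical.
sub fixed_modd_name {
    my ($name) = @_;
    (my $fixed = $name) =~ s/\.jpg(\.modd)$/$1/i;
    return $fixed;
}

for my $name ('pic1.JPG.modd', 'pic2.modd', 'my.jpg.trip.modd') {
    print "$name -> ", fixed_modd_name($name), "\n";
}
```

Only the first name changes (to pic1.modd); the other two come back untouched. The actual rename would then be a call to Perl's built-in rename with the old and new names.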

Put any paths with spaces in double quotes.

The exact command I ran is below; yours will depend on your user ID and on whether you put your pictures in a public or personal space on your Windows 7 PC:
perl getTagInfo.pl -startDir C:\Users\Public\Pictures -filterTree "C:\Users\Steve\My Documents\Sony PMB\FilterTree.xml" -fixFileExt -owOrg -debug


Additionally, I only set this up to work on jpg files, so if you have gifs or pngs or something, it will skip those. It reports the number of modd files found and the number of files edited. These should match. I had 10 more modd files than files edited, but this was due to a couple of modd files that had no picture associated with them (no idea how this happened; possibly they got manually renamed or deleted) and a handful of non-jpg files that it skipped. It took about 2 hours to update 13630 files on my PC.

If anyone out there is interested in using it, my email is in the comments at the top of the program. I highly recommend the debug mode, although it can generate a rather large text file (about 4-5 lines per picture, so mine was ~68k lines and 5MB of text); I would need it to debug anything that does not appear to work. Please back up your files before use.