Mac/Linux Smart Import From ALL Script

Started by BadlandZ, January 06, 2015, 12:23:11 AM


BadlandZ

Apologies if this lands in the wrong section. Mods, please move it if so (but let me know where to find it).

This is PRE-BETA, non-working, just a concept. Do not expect this to HELP anyone restore ANYTHING. It's for developers only. That said, all you need to know to hack away and help develop is a reasonable level of sysadmin skills on POSIX and some basic shell scripting.

Tested on Raspberry Pi Debian and Ubuntu X64

Back Story:

I had all my pix in iPhoto, on a Mac external drive formatted as encrypted HFS+. Playing around with a Raspberry Pi, I accidentally overwrote the partition table, and as a result NO data-recovery apps in Linux or OSX would recover my photos.

But then I noticed, searching for .jpg's on my backup devices and my MacBook itself, that 90% of my photos were there as email attachments (nice!) or buried for some reason in ~/Library on OSX. So I set out to recover them, and ended up starting a very bad Mac/Linux photo recovery sysadmin script based on ExifTool.

As I worked on it, I started to realize it was more than a recovery idea: it was a decent file finding and labeling tool. Not only could I search my Mac, but also any HD from the past that I could plug in, and sort and find photos.


Summary: It's not done, it's a sad sysadmin script hack that needs to be finished, polished, but in the end could be amazing for so many uses and people. So, I thought I'd share where I am now.


Disclaimer: I searched, but didn't find a script like this.

The find and cut and report commands are working, but I haven't implemented the mv or cp commands yet, because I don't want to overwrite critical data (similar files that could be easily overwritten but are not the same image, maybe need a checksum before write?)
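The "checksum before write" idea could be sketched roughly as below. This is only an illustration, not part of the script above: `safe_copy` is a hypothetical helper name, and it assumes the POSIX `cksum` utility is good enough to tell two files apart (a stronger hash or a byte-for-byte `cmp` could be swapped in).

```shell
#!/bin/sh
# safe_copy SRC DEST (hypothetical helper, not from the original script)
# Copy SRC to DEST only if DEST does not exist yet. If DEST exists, compare
# checksums: identical content is skipped, different content is reported as
# a conflict and nothing is overwritten.
safe_copy() {
    src=$1
    dest=$2
    if [ ! -e "$dest" ]; then
        cp -p "$src" "$dest"
    elif [ "$(cksum < "$src")" = "$(cksum < "$dest")" ]; then
        echo "skip (identical): $src"
    else
        echo "conflict (different content): $src -> $dest" >&2
        return 1
    fi
}
```

A caller could react to the non-zero return by picking a different destination name instead of giving up.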

I invite you to try the commands on different OSs and see if they work. For me, as of now, it's just a NOT FUNCTIONAL SCRIPT to test if I can find and rename any and all files on Linux or OSX without overwriting anything.

If there is one, I'd rather contribute to it. But since I couldn't find one, here's my crude outline:

#!/bin/sh

# Should probably prompt the user to choose "mv" or "cp" before doing anything.
#
# And probably prompt the user for the output directory before beginning.

# Now, find the JPG's and make them all .jpg extensions.

#find . -depth -name '*.jpeg' -exec bash -c 'mv -n $0 ${0/jpeg/jpg}' {} \;
#find . -depth -name '*.jpeg' -exec bash -c 'mv -m $0 ${0/jpeg/jpg}' {} \;
# Strip only the trailing extension, so "jpeg" elsewhere in a name is left alone.
find . -depth -name '*.jpeg' -execdir bash -c 'mv -n "$1" "${1%.jpeg}.jpg"' _ {} \;
find . -depth -name '*.JPEG' -execdir bash -c 'mv -n "$1" "${1%.JPEG}.jpg"' _ {} \;
#find . -depth -name '*.JPG'  -exec bash -c 'mv -n $0 ${0/JPG/jpg}' {} \;
find . -depth -name '*.JPG'  -execdir bash -c 'mv -n "$1" "${1%.JPG}.jpg"' _ {} \;

# need modified if different from Original

# Now, clean up weird characters

find . -depth -name '* *'     -execdir bash -c 'mv -n "$1" "${1// /_}"' _ {} \;
find . -depth -name '*,*'     -execdir bash -c 'mv -n "$1" "${1//,/__}"' _ {} \;
find . -depth -name '*(*'     -execdir bash -c 'mv -n "$1" "${1//(/_}"' _ {} \;
find . -depth -name '*)*'     -execdir bash -c 'mv -n "$1" "${1//)/_}"' _ {} \;
find . -depth -name '*@*'     -execdir bash -c 'mv -n "$1" "${1//@/_}"' _ {} \;
find . -depth -name '*&*'     -execdir bash -c 'mv -n "$1" "${1//&/_and_}"' _ {} \;

# For Each Loops

# need to run an if/fi loop to rename files now
# first part is simple and I hope mv -n will leave stragglers
# then subloops clean up stragglers

# MAIN
# Get the basic file name info
# for a YYYYMMDD-MAKE-MODEL-WxL.jpg output


# First Part of File Name is DATE Originally Taken
exiftool -p '$DateTimeOriginal' * | sed 's/[: ]//g'

# Second Part of File Name is MFG of Camera
exiftool * | grep Make | grep -v Lens | grep -v Model | cut -d':' -f2

# Third Part of File Name is Camera Model
exiftool * | grep "Camera Model Name" | cut -d':' -f2 | sed -e 's/ //g' | sed -e 's/,/-/g'

# Fourth Part of File Name is Image Size
exiftool -imagesize *

# Now subloop out to find the files that were NOT renamed
# and add to FileName -YYYYMMDD (.jpg) of the modifications

# Fifth Part of File Name is a Fail Safe
#  If all other data matches existing File Name,
#  Then Append File Name with Modification date
exiftool *  | grep File\ Modification\ Date\/Time

# Sixth Part of File Name is Fail Safe
#  If all other data matches existing File Name,
#  Then Append File Name with Last Written Date
exiftool *  | grep Modify\ Date

# Check if Modified
# if Mod is not equal to Original Date, append with mod DATE

# Now that everything is safely named, collect it in a single folder
# with a mv -n command.

# Then do a find / | grep -v $TargetFolderName
# and echo "these photos were not fixed and imported"
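
As a rough illustration of how the four parts above could be joined into the YYYYMMDD-MAKE-MODEL-WxL.jpg form, here is a small sketch. `build_name` is a hypothetical helper, and it assumes the tag values have already been read from ExifTool and are passed in as plain strings; it does not call exiftool itself.

```shell
#!/bin/sh
# build_name DATE MAKE MODEL SIZE (hypothetical helper)
#   DATE  e.g. "2014:01:01 12:00:00" (DateTimeOriginal as ExifTool prints it)
#   MAKE  e.g. "NIKON CORPORATION"
#   MODEL e.g. "NIKON D90"
#   SIZE  e.g. "4288x2848"
build_name() {
    # Keep only the YYYY:MM:DD part, then drop the colons.
    date=$(printf '%s' "$1" | cut -c1-10 | tr -d ':')
    # Same cleanup as the sed calls above: no spaces, commas become dashes.
    make=$(printf '%s' "$2" | tr -d ' ' | tr ',' '-')
    model=$(printf '%s' "$3" | tr -d ' ' | tr ',' '-')
    printf '%s-%s-%s-%s.jpg\n' "$date" "$make" "$model" "$4"
}

build_name "2014:01:01 12:00:00" "NIKON CORPORATION" "NIKON D90" "4288x2848"
# prints 20140101-NIKONCORPORATION-NIKOND90-4288x2848.jpg
```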


Phil Harvey

It seems like you may have fallen into this trap.  I haven't gone over your script in detail, but I am fairly certain that this whole thing could be accomplished with a single exiftool command.  This is a common operation for doing things like copying images from memory cards, and looks something like this:

exiftool -r -ext jpg -ext jpeg '-filename<STR' SRCDIR

Read about the -tagsFromFile option for details about the format of STR.  You can use this command to move files and rename them according to the values of extracted tags, all in one command.  You will probably want to add a -d option to the command to format any date/time tags in the file name.  Also see this page for more help manipulating files like this.
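
For illustration only, a filled-in version of that command might look like the sketch below. The choice of DateTimeOriginal and the date format are assumptions, not something specified above; `rename_by_date` is a hypothetical wrapper name. In the -d format, `%%-c` adds a copy number when the target name is already taken, and `%%e` keeps the original extension.

```shell
#!/bin/sh
# rename_by_date SRCDIR (hypothetical wrapper)
# Recursively rename .jpg/.jpeg files under SRCDIR to YYYYMMDD_HHMMSS.ext,
# based on DateTimeOriginal. Files without that tag are left unchanged.
rename_by_date() {
    exiftool -r -ext jpg -ext jpeg \
        -d '%Y%m%d_%H%M%S%%-c.%%e' \
        '-FileName<DateTimeOriginal' \
        "$1"
}
```

It would be called as `rename_by_date /path/to/SRCDIR`, ideally on a copy of the pictures first.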

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

BadlandZ

I knew there was a reason to post before continuing.  ;) It is helpful, in that it will seriously help me condense some of the renaming lines in the loops. Thanks for the pointers; I had seen some of that before, and am reviewing it again now.

But, I don't think it will come close to doing what's needed overall, unless there are some major tricks that aren't clearly documented.

For example, I need to be more sure about safety. I don't see any real options there that are anything other than completely dangerous. I mean:

/Pictures$ find . -depth -name '*.jpg' | wc
  41705   41705 1239077

That's a lot of images to mess with in bulk, and I know for a fact that many of them are pure edits of the originals, where they were cropped, color adjusted, etc. Further, I had run across two photos that were taken by two different people at the exact same timestamp (I know, what are the odds, huh?).

So, after reviewing that, I decided to start writing the shell script. First because I couldn't find anything in the exiftool options that was the equivalent of the mv -n or cp -n functionality (nosquash), so using exiftool itself to rename files seemed dangerous to me. Running a renaming command that doesn't have nosquash is bad. And, even if it did, I'd still need it to do something else if there were an existing file with that name. If there is such an option, it'd also be helpful to use in the script. I'll keep reading the docs.

Also, if I try something "simple" like this:

exiftool '-FileName<${CreateDate}_${Exif:Model}.jpg' -d %Y%m%d_%H%M%S-%%2c *

and run that in a directory with a copy of some pictures to test it, I end up with:

  111 image files updated
  355 image files unchanged

So, clearly, that's not a workable option. And I don't think I see a way, in a single line, to make it fall back to alternate info when the created date is missing: first try the file modified date, and if that's not there either, use the file last-written date in place of the created date.
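
One possible single-command approach to this kind of fallback, sketched here under assumptions: when several assignments to the same tag appear on the ExifTool command line, the last one whose source tag exists takes precedence, so the least-preferred date goes first. The tag order, the date format, and the wrapper name `rename_with_fallbacks` are illustrative choices, not something tested against this photo collection.

```shell
#!/bin/sh
# rename_with_fallbacks FILE_OR_DIR... (hypothetical wrapper)
# Later assignments override earlier ones when their tag is present:
#   FileModifyDate   (filesystem date, always available)  lowest priority
#   CreateDate                                             used if present
#   DateTimeOriginal                                       highest priority
rename_with_fallbacks() {
    exiftool -d '%Y%m%d_%H%M%S%%-c.%%e' \
        '-FileName<FileModifyDate' \
        '-FileName<CreateDate' \
        '-FileName<DateTimeOriginal' \
        "$@"
}
```

Files that end up with the same date would get a copy number from `%%-c` rather than the same name.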

And there are probably a lot of files that won't have a created or modified date in the header, and were bulk written to the drive at the same time (for example, downloads of all my Facebook photos that I lost and want a copy of again). All of those are going to have the exact same fallback date, and all want to be named the exact same thing. Not good.

Plus, when two files have the same creation date, and are in fact variations of the same picture, I want the oldest to use the creation date, the newer one to ALSO use the creation date, but append the file name in a way I know it's a modified copy, like YYYYMMDDHHMMSS-CameraModel-Moded-YYYYMMDDHHMMSS.jpg, because I'm sure I'm going to need that to happen to a lot of photos. And as mentioned, just writing them with creation date will end up erasing a huge number of pictures that were just modified copies of the original, and to me that's both dangerous in file loss, and not helping me identify the files by name.

I may have to make the script create a database (flat text file probably) from a first pass reading information first, then refer to it to know what is the original file, and what is a modified one, and rename them accordingly. Still thinking it through in my head right now though, haven't done it yet.
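
A rough sketch of what that second pass over a flat text file could look like. Everything here is an assumption for illustration: the index is taken to be tab-separated with four columns (path, original date, modification date, camera model, dates already in YYYYMMDDHHMMSS form), and `plan_names` is a hypothetical helper that only prints a rename plan without touching any file.

```shell
#!/bin/sh
# plan_names INDEX_FILE (hypothetical helper)
# For each group sharing the same original date and model, the file with the
# oldest modification date gets the plain name; every later one gets
# "-Moded-<modification date>" appended. Output: path<TAB>planned name.
plan_names() {
    tab=$(printf '\t')
    # Sort by original date, then model, then modification date.
    sort -t "$tab" -k2,2 -k4,4 -k3,3 "$1" |
    awk -F '\t' '{
        key = $2 "-" $4
        if (key != prev) { name = key ".jpg"; prev = key }
        else             { name = key "-Moded-" $3 ".jpg" }
        print $1 "\t" name
    }'
}
```

Two edited copies with the same modification date would still collide in this plan, so a real version would need one more tie-breaker.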

You're right, I did fall into the above mentioned trap! It appears you've pointed me at shortcuts to use as commands in a script. But for safety and full functionality, it still looks like it needs to be a script, AFAIK? I'll keep thinking... Maybe I could just use all 3 dates for safety reasons and do it all in a single line like you suggest, but it would be an unnecessarily long file name for 90% of the files, and still, there doesn't seem to be any -n safety in place?

Phil Harvey

Quote from: BadlandZ on January 07, 2015, 02:47:47 AM
First because I couldn't  find anything in the exiftool options that was the equivalent of the mv -n or cp -n functionality (nosquash)

You didn't find this option because ExifTool will never overwrite an existing file.

If you want to do a dry run, write the TestName tag instead of FileName.
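
A minimal sketch of such a dry run, for illustration: the CreateDate tag and the date format are assumptions carried over from the earlier examples in this thread, and `preview_rename` is a hypothetical wrapper name. Writing TestName only reports what the new name would be; no file is renamed.

```shell
#!/bin/sh
# preview_rename FILE_OR_DIR... (hypothetical wrapper)
# Same arguments as a real rename, but with TestName instead of FileName,
# so ExifTool only prints the old and proposed names.
preview_rename() {
    exiftool -d '%Y%m%d_%H%M%S%%-c.%%e' '-TestName<CreateDate' "$@"
}
```

Once the proposed names look right, swapping TestName for FileName would perform the actual rename.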

- Phil

BadlandZ

Quote from: Phil Harvey on January 07, 2015, 07:40:41 AM
You didn't find this option because ExifTool will never overwrite an existing file.
Oh! YES! Thanks! That's really great to know! That should make it easy to loop and write names without having to create a database! If it doesn't change the name on the first pass, then try again with the secondary date option, then third, etc... That will save me a lot of work.   :)
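
The multi-pass idea could be sketched like this, under a few assumptions: renamed files are moved into a separate destination directory (by putting the directory in the -d format), so each later pass only sees files the earlier passes could not rename. The tag order, `import_passes`, and the requirement that the destination lies outside the source tree and contains no `%` characters are all illustrative assumptions.

```shell
#!/bin/sh
# import_passes SRCDIR DESTDIR (hypothetical helper)
# Pass 1 uses DateTimeOriginal, pass 2 CreateDate, pass 3 FileModifyDate.
# Files missing the tag for a pass stay in SRCDIR for the next pass.
import_passes() {
    src=$1
    dest=$2
    for tag in DateTimeOriginal CreateDate FileModifyDate; do
        exiftool -r -ext jpg -ext jpeg \
            -d "$dest/%Y%m%d_%H%M%S%%-c.%%e" \
            "-FileName<$tag" "$src"
    done
}
```

Since FileModifyDate comes from the filesystem, the last pass should pick up whatever is left.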

It is so nice to have a forum for a tool that's actually actively monitored by the creator! I appreciate all your help, thank you, and great program.

The more I work with it, the closer to a 1 line solution I'm getting. However, since the dates aren't always there, I might need to loop still. Right now I'm playing with stuff like this:

exiftool -directory=changed '-FileName<${CreateDate}_${Exif:Make}_${Exif:Model}_CHANGED.%e' -d %Y%m%d%H%M%S-%%2c-%%c_%%f *