Can I extract the date from the directory name and use it in an exiftool command

Started by ceej, February 02, 2023, 08:18:09 PM

ceej

I had many slides scanned.  They all have a filename of the form: aa_nnnn-mmmm.jpg
They were scanned in groups; each group had a number, date, and description, e.g. bb_kkkk_datestring_desc

The files are stored on my computer in directories with the group name.
The datestring in the directory has at least yyyymm, but sometimes has more (yyyymm-mm, or yyyymmdd or yyyymmdd-dd, or yyyymma, yyyymmb, etc.)

None of the files has a DateTimeOriginal.

Is there a way, using exiftool, that I can create a date-time-string from the directory name (i.e. yyyymmdd 00:00:00, or yyyymm01 00:00:00 if dd is not present in the directory name) and then set the DateTimeOriginal tag to that date-time-string for each file in the directory?

And if I can do that, is there a way to do it recursively?   

Finally, a would-be-nice, can I capture the group (bb_kkkk) and use it to set some other as-yet-to-be decided tag?

Thanks.

Phil Harvey

The variation in directory names is a problem, and will require a bit of thinking.

What do you mean by yyyymma and yyyymmb?  (what are "a" and "b")

I'm thinking that yyyymm-mm gives a range of months, and that everything after the "-" should be ignored?

Also, to help here we will need a complete list of the possible directory name formats.  We will need an algorithm that reliably pulls out all of the date elements from the directory name.  Also, a few actual examples would be useful.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

ceej

Sorry, sometimes (frequently) I give more info than is necessary.

All directory names begin with aa_nnnn_

What follows next is a datestring, which can be any of the following:
a.    A specific year and month, yyyymm
b.    A repeated year month, yyyymmx (where x is a,b,c, ...)
c.    A year and range of months, yyyymm-mm
d.    A year month day, yyyymmdd
e.    A repeated day, yyyymmddx (where x is a,b,c, ...)
f.    A range of days, yyyymmdd-dd

Following the datestring can be (I did not realize that I was not consistent):
    an underscore (_) 80+%
    a hyphen (-) 10+%
    a blank ( ) <5%
NOTE - Since there are only a few, I can, if it would make processing easier, change the datestring ending hyphens and blanks to underscore so all datestrings would terminate the same.
   
Following the datestring may be a description-string or nothing.

Examples:
Sc_0113_198601-02_pinewood-matt-bday
Sc_0114_198603-04_little_league_party
Sc_0115_198604_disneyland_05_swimming
Sc_0116_198604-05-wonderful-machines
Sc_0117_198606a_Boston
Sc_0118_198606b-Boston
Sc_0119_198606c Boston
Sc_0120_19860712-19_farm-camp-misc
Sc_0121_19860820-22-sportcamp_10-10Krun
Sc_0123_198610-12_halloween_christmas
Sc_0124_19861215_karen
Sc-0125-19861231 new years eve

For directories that do not identify a single day, I have no algorithm for determining what day to use for any file in the directory.  Therefore, I am willing to use the earliest, most complete date that I can get from the directory name.  Later, if it is important, I can change dates of groups of files by incrementing the day number.

Therefore, for the various datestring formats above, I would be happy to get the following date values:
a. yyyymm01
b. yyyymm01 (for each x)
c. yyyymm01
d. yyyymmdd
e. yyyymmdd
f. yyyymmdd

The time value for all would be 00:00:00
I have an exiftool command which will increment the time value by 10 seconds from one file to the next, in filename order. 

The end result will be that the files in a directory will have a datetimecreated that is at the beginning of the time range for which I know the images to have been taken, and when ordered by datetimecreated, will show the files in their proper order.

I hope this is clearer and helps.
thanks,
  -ceej

StarGeek

I haven't tested it (sorry, I'm not setting up a whole bunch of temp directories), but try this
exiftool "-DateTimeOriginal<${Directory;$_=(split '\/',$_)[-1];m(\d{6})(\d\d)?/;$_=$1.($2?$2:'01')} 00:00:00" /path/to/files/

Breakdown
1) the directory path is split on the slashes, then the default value is set to the last value in the array
$_=(split '\/',$_)[-1]
2) Match 1 is the captured value of six consecutive digits, followed by optional match 2 of two more digits.  Because at least six digits are captured, the earlier four digit sequence is skipped
m(\d{6})(\d\d)?/;
3) The default value is set to the first six digits, then a conditional operation where if $2 exists, that is added to the string, otherwise 01 is added
$_=$1.($2?$2:'01')

Then the trailing 0s for the time.
"It didn't work" isn't helpful. What was the exact command used and the output.
Read FAQ #3 and use that cmd
Please use the Code button for exiftool output

Please include your OS/Exiftool version/filetype
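
For anyone who wants to sanity-check the pattern against their own folder names before letting exiftool write anything, the three steps in the breakdown above can be sketched in Python. This is only an illustration of the intended logic, not something exiftool runs; the function name date_from_dir is made up for the example, and it assumes the path uses forward slashes.

```python
import re


def date_from_dir(path: str) -> str:
    """Return 'yyyymmdd 00:00:00' built from the last component of path."""
    # Step 1: keep only the last path component (the group directory name).
    name = path.rstrip("/").split("/")[-1]
    # Step 2: six digits (yyyymm) followed by an optional two-digit day.
    # The four-digit group number cannot satisfy the six-digit match.
    m = re.search(r"(\d{6})(\d\d)?", name)
    if m is None:
        raise ValueError(f"no yyyymm date found in {name!r}")
    # Step 3: fall back to day 01 when the name carries no day.
    return m.group(1) + (m.group(2) or "01") + " 00:00:00"


if __name__ == "__main__":
    for example in (
        "Sc_0113_198601-02_pinewood-matt-bday",
        "Sc_0117_198606a_Boston",
        "Sc_0120_19860712-19_farm-camp-misc",
        "Sc-0125-19861231 new years eve",
    ):
        print(example, "->", date_from_dir(example))
```

Running it over a list of real directory names is a cheap way to spot a name form that the pattern does not handle before any metadata is changed.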

StarGeek

And I forgot to add, you cannot use the current directory (i.e. use just a dot .) as the directory input for exiftool to read.  You have to either use a full path or CD to the parent directory of the one you want to process.


ceej

OK, I was finally able to try a test and here are the results:

here is the directory of directories:
Directory of T:\__NOT__CARBONITE__\TESTING\PSE\1980s\1989

02/07/2023  08:06 PM    <DIR>          Sc_0142_198901-03_annonuevo-memday
02/07/2023  08:06 PM    <DIR>          Sc_0143_198904-06_yosemite_matt-ball_Karen
02/07/2023  08:06 PM    <DIR>          Sc_0144_198906-08_fairmeadow-grad_farm
02/07/2023  08:06 PM    <DIR>          Sc_0146_198909-11_halloween-soccer
02/07/2023  08:06 PM    <DIR>          Sc_0147_198912_christmas

Here is the command that I used to run against just the first directory of files:

T:\__NOT__CARBONITE__\TESTING\PSE\1980s\1989>exiftool "-DateTimeOriginal<${Directory;$_=(split '\/',$_)[-1];m(\d{6})(\d\d)?/;$_=$1.($2?$2:'01')} 00:00:00" ./Sc_0142_198901-03_annonuevo-memday


And here are the results for the first three files:
Warning: Search pattern not terminated for 'Directory' - ./Sc_0142_198901-03_annonuevo-memday/SC_7265-7294.jpg
Warning: No writable tags set from ./Sc_0142_198901-03_annonuevo-memday/SC_7265-7294.jpg
Warning: Search pattern not terminated for 'Directory' - ./Sc_0142_198901-03_annonuevo-memday/SC_7266-7295.jpg
Warning: No writable tags set from ./Sc_0142_198901-03_annonuevo-memday/SC_7266-7295.jpg
Warning: Search pattern not terminated for 'Directory' - ./Sc_0142_198901-03_annonuevo-memday/SC_7267-7296.jpg
Warning: No writable tags set from ./Sc_0142_198901-03_annonuevo-memday/SC_7267-7296.jpg

I then created a Test-folder and copied into it the folder Sc_0142_198901-03_annonuevo-memday so it was the only folder. 

 Directory of T:\__NOT__CARBONITE__\TESTING\PSE\1980s\1989\Test-folder

02/09/2023  10:42 AM    <DIR>          .
02/09/2023  10:42 AM    <DIR>          ..
02/09/2023  10:42 AM    <DIR>          Sc_0142_198901-03_annonuevo-memday

Then I cd'ed to Test-folder and ran this command (using -r and .)
exiftool -r "-DateTimeOriginal<${Directory;$_=(split '\/',$_)[-1];m(\d{6})(\d\d)?/;$_=$1.($2?$2:'01')} 00:00:00" .
and received the same output:
Warning: Search pattern not terminated for 'Directory' - ./Sc_0142_198901-03_annonuevo-memday/SC_7265-7294.jpg
Warning: No writable tags set from ./Sc_0142_198901-03_annonuevo-memday/SC_7265-7294.jpg
Warning: Search pattern not terminated for 'Directory' - ./Sc_0142_198901-03_annonuevo-memday/SC_7266-7295.jpg
Warning: No writable tags set from ./Sc_0142_198901-03_annonuevo-memday/SC_7266-7295.jpg
Warning: Search pattern not terminated for 'Directory' - ./Sc_0142_198901-03_annonuevo-memday/SC_7267-7296.jpg
Warning: No writable tags set from ./Sc_0142_198901-03_annonuevo-memday/SC_7267-7296.jpg

I can 'understand' your regex based upon your description, but not sufficiently to even attempt to try to debug/edit it.  Any suggestions?
  -ceej

Phil Harvey

In your expression,

m(\d{6})(\d\d)?/

should be

m/(\d{6})(\d\d)?/

StarGeek was missing a "/" in his command.

- Phil

StarGeek

Yep, technically it's not "Search pattern not terminated"; it was never started.

ceej

Yes, Yes - I actually 'knew' that!  (the m/ bit) But could not see it.
I'm off to referee a set of soccer games and will be home late.
So I'll get to this tomorrow and report back.

PS
I see uses of filenumber and filesequence in various examples, but I do not find them in the main exiftool document.
Are they explained somewhere?
Do they only 'count' the files that are actually processed?  That is, they do not count files excluded by -if and any other such filters?
I'd like to try to use the appropriate one to set/increment the time value by 10 seconds for each file.

StarGeek

Quote from: ceej on February 09, 2023, 05:22:15 PM
I see uses of filenumber and filesequence in various examples, but I do not find them in the main exiftool document.
Are they explained somewhere?

FileSequence can be found on the Extra Tags page

FileNumber is a Composite tag, which means it is created on the fly based upon other tags in the file.

Quote
Do they only 'count' the files that are actually processed?  That is, they do not count files excluded by -if and any other such filters?

From the notes on FileSequence
      sequence number for each source file when extracting or copying information, including files that fail the -if condition of the command-line application, beginning at 0 for the first file
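
To make the consequence of that note concrete, here is a small Python model of the numbering it describes. It is not exiftool code, just a sketch of the documented behaviour; the function name and the file names are invented for the example. A file that fails the filter still uses up a sequence number, so the files that are processed can see gaps.

```python
def sequence_numbers(filenames, passes_if):
    """Return (name, file_sequence, processed_count) for each file that passes.

    file_sequence counts every source file, starting at 0, including files
    that fail the filter. processed_count counts only the files that pass.
    """
    results = []
    processed = 0
    for file_sequence, name in enumerate(filenames):
        if not passes_if(name):
            continue  # the skipped file has still consumed a sequence number
        results.append((name, file_sequence, processed))
        processed += 1
    return results


if __name__ == "__main__":
    files = ["SC_0001.jpg", "cover.jpg", "SC_0002.jpg"]
    for row in sequence_numbers(files, lambda n: n.startswith("SC_")):
        print(row)
```

In this model the second SC_ file gets sequence number 2 but is only the second file processed, which is exactly the kind of gap the question is asking about.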

ceej

Here are the previously existing commands that I have used to zero and increment the time in DateTimeOriginal
exiftool -overwrite_original -DateTimeOriginal"<${DateTimeOriginal;s/ .*//}000000" .
exiftool -overwrite_original -fileOrder FileName "-DateTimeOriginal+<0:0:${filesequence}0" .


So I thought I'd try using the filesequence to do something similar with the command StarGeek provided
exiftool -if "substr($filename,0,3) eq 'SC_'" "-DateTimeOriginal<${Directory;$_=(split '\/',$_)[-1];m/(\d{6})(\d\d)?/;$_=$1.($2?$2:'01')} 00:00:${filesequence}0" .
That command, run on this directory (/Sc_0142_198901-03_annonuevo-memday), changed all of the DateTimeOriginal values to 1/1/1989 12:00:00 a.m.
And to 12/25/1989 on a test on this directory: /Sc_0147_19891225_christmas
This is all great.

But my attempt to use filesequence did not work as I'd hoped.

When I run this version of my earlier command on the output of the above tests (adding -ext jpg to skip the .original files)
exiftool -r -progress -overwrite_original -fileOrder FileName "-DateTimeOriginal+<0:0:${filesequence}0" -ext jpg .
then the times are incremented by 10 seconds as expected.

SO, ...
Is there a way for me to include the 10-second-increment-time-value in the command provided by StarGeek so that I only have to make one pass through each directory?

AND, (assuming I can)...
As I understand it, filesequence includes files skipped via an -if.
So if I use the command recursively (-r), and if there are files in the directories which fail my -if "name starts with SC_", then my time value will have gaps for all those files not processed.  But all sets of SC_ files that are processed will have increasing time values.  So I should only need to worry about processing more files at once than seconds-in-a-day/10 = 8640, yes?

PS
I obliviously do not know the implications of this:
Not generated unless specifically requested or the API RequestAll option is set

Phil Harvey

Quote from: ceej on February 10, 2023, 03:24:30 PM
I obliviously do not know the implications of this:
Not generated unless specifically requested or the API RequestAll option is set

You specified it explicitly in your -if condition, so you don't need to worry about this.

However, you don't want to use FileSequence anyway because you want to count the processed files.  This is tricky, but you can do it by using this expression instead of ${filesequence}

${filename;$_=$Image::ExifTool::myVar || ($Image::ExifTool::myVar=0);++$Image::ExifTool::myVar}

Doing this in the same command as when you copy Directory to DateTimeOriginal would be tricky.  What you tried should work in theory, but would be limited to 6 files before you would get an invalid number of seconds in the time value you are writing.  I think 2 passes is easiest.

- Phil
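
The 6-file limit mentioned above comes from pasting a literal 0 after the counter to form the seconds field. A short Python sketch of that arithmetic (illustrative only; the helper names are made up and nothing here calls exiftool):

```python
def seconds_field(file_sequence: int) -> str:
    """Build the seconds field the way 00:00:${filesequence}0 does."""
    return f"{file_sequence}0"


def is_valid_seconds(field: str) -> bool:
    """A seconds field is usable only if it is a number from 0 to 59."""
    return field.isdigit() and 0 <= int(field) <= 59


# Find the first counter value whose seconds field is out of range.
first_invalid = next(
    n for n in range(100) if not is_valid_seconds(seconds_field(n))
)

if __name__ == "__main__":
    for n in range(8):
        print(n, seconds_field(n), is_valid_seconds(seconds_field(n)))
    print("first invalid counter value:", first_invalid)
```

Counter values 0 through 5 give 00 through 50, and the seventh file (counter value 6) would ask for 60 seconds, which is why shifting an existing time with += in a second pass is the easier route.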

ceej

Phil,
Sometimes I get something stuck in my head and just can't let it go.
But this time I am going to take your advice.
I have reverted to StarGeek's formula with the m/ and my -if.
It runs successfully on a directory of 5 directories with 115 files.
I then run my 10-sec-increment command and it works as always.
The last file time is 12:19:00 a.m. which is (115-1)/6=19:00 minutes exactly.
So this should do just what I want.
Since the times are arbitrary and set this way to ensure that photos with the same date appear in the correct order, regardless of whether by file name or date taken, I do not care if the increasing time value 'jumps' across dates.  I will just make sure that it does not wrap - not a hard problem.
I need to verify it with all directory name forms, but I'm thinking it is going to work.
As usual, I'll be back with questions.
But my sincere thanks to you and StarGeek for your help.
  -ceej
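
The arithmetic in the post above (a 10-second step, 115 files, last file at 19 minutes past midnight, and no more than 8640 files before the time runs into the next day) can be double-checked with a few lines of Python. This is a sketch with made-up helper names and it assumes the first file in filename order keeps the 00:00:00 base time.

```python
from datetime import datetime, timedelta

STEP = timedelta(seconds=10)  # spacing between consecutive files


def timestamp_for(base: datetime, index: int) -> datetime:
    """Timestamp of the file at zero-based position index in filename order."""
    return base + index * STEP


def max_files_same_day() -> int:
    """Number of files that fit in one day at one file per STEP."""
    return timedelta(days=1) // STEP


if __name__ == "__main__":
    base = datetime(1989, 1, 1)  # date taken from the directory name
    print("last of 115 files:", timestamp_for(base, 115 - 1))
    print("files per day:", max_files_same_day())
    print("first file on the next day:", timestamp_for(base, max_files_same_day()))
```

With 115 files the last index is 114, which lands on 00:19:00, and the file at index 8640 would be the first one pushed onto the following date.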