Parsing the image filename to extract the date

Started by mikeowaterloo, September 07, 2024, 01:18:34 PM

mikeowaterloo

I have a number of JPGs. Some have good EXIF dates, others do not. Some have filenames that reflect the date, others do not (e.g. IMG 2354).

I know you can set the EXIF dates from the filename using e.g.

exiftool "-datetimeoriginal<filename" DIR

But I don't want to run that on a bunch of files because in many cases the filename does not contain the date or has a bad date.

What I would like to do is to create a text file with
EXIFDate Date_extracted_from_filename

It's not printing a tag (filename) per se, but rather printing the filename converted to a date, so that I can check it manually and then run ExifTool on only a subset.

Is that possible?

StarGeek

How many digits do you expect to find in the filenames for valid entries? All 14, for year through second? Or fewer than that?

Assuming all 14 digits, here's a messy command that checks for (semi)valid dates from 1980-2025. By semi, I mean it doesn't make sure that the date isn't February 31; any day from 01-31 will be accepted.
exiftool -if "${Filename;tr/0-9//cd;}=~m/(19[89][0-9]|20[01][0-9]|202[0-5])(0[1-9]|1[0-2])(0[1-9]|[12][0-9]|3[01])(0[0-9]|1[0-9]|2[0-3])([0-5][0-9]){2}/" -p "$Filename - $DateTimeOriginal - ${Filename;tr/0-9//cd;s/(19[89][0-9]|20[01][0-9]|202[0-5])(0[1-9]|1[0-2])(0[1-9]|[12][0-9]|3[01])(0[0-9]|1[0-9]|2[0-3])([0-5][0-9])([0-5][0-9]).*/$1:$2:$3 $4:$5:$6/}" /path/to/files/
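For anyone who wants to see the same logic outside of ExifTool, here is a rough Python sketch. It is not part of ExifTool; the function name `date_from_filename` is made up for illustration. It strips the non-digits, applies the same ranges as the regex above, and then adds a calendar check so that impossible dates such as February 31 are rejected.

```python
import re
from datetime import datetime

# Same year/month/day/hour/minute/second ranges as the regex in the command above.
DATE_RE = re.compile(
    r"(19[89][0-9]|20[01][0-9]|202[0-5])"  # year 1980-2025
    r"(0[1-9]|1[0-2])"                      # month 01-12
    r"(0[1-9]|[12][0-9]|3[01])"             # day 01-31
    r"(0[0-9]|1[0-9]|2[0-3])"               # hour 00-23
    r"([0-5][0-9])([0-5][0-9])"             # minute, second
)


def date_from_filename(name):
    """Return 'YYYY:MM:DD HH:MM:SS' if the digits in name form a real date, else None.

    Hypothetical helper for illustration only.
    """
    # Delete everything that is not an ASCII digit, like tr/0-9//cd does.
    digits = re.sub(r"[^0-9]", "", name)
    match = DATE_RE.search(digits)
    if match is None:
        return None
    try:
        # strptime raises ValueError for impossible dates such as February 31.
        parsed = datetime.strptime("".join(match.groups()), "%Y%m%d%H%M%S")
    except ValueError:
        return None
    return parsed.strftime("%Y:%m:%d %H:%M:%S")


if __name__ == "__main__":
    for sample in ["20240906 121212.jpg", "test3.jpg", "20240231 120000.jpg"]:
        print(sample, "->", date_from_filename(sample))
```

The extra `strptime` step is the only difference from the regex-only check; it is a design choice in this sketch, not something the command does.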

Example: only two of the four files have digits that could be a date/time. The output format is
Filename - DateTimeOriginal - Date/time taken from filename

C:\>exiftool -G1 -a -s -filename -DateTimeOriginal Y:\!temp\x\y\2
======== Y:/!temp/x/y/2/103a.jpg
[System]        FileName                        : 103a.jpg
[ExifIFD]       DateTimeOriginal                : 2024:09:04 16:25:07
======== Y:/!temp/x/y/2/2024-09-04 RenameTest000323 more digits 11243.jpg
[System]        FileName                        : 2024-09-04 RenameTest000323 more digits 11243.jpg
[ExifIFD]       DateTimeOriginal                : 2024:09:04 11:02:17
======== Y:/!temp/x/y/2/20240906 121212.jpg
[System]        FileName                        : 20240906 121212.jpg
[ExifIFD]       DateTimeOriginal                : 2024:09:07 11:49:22
======== Y:/!temp/x/y/2/test3.jpg
[System]        FileName                        : test3.jpg
[ExifIFD]       DateTimeOriginal                : 2024:09:04 11:02:17
    1 directories scanned
    4 image files read

C:\>exiftool -if "${Filename;tr/0-9//cd;}=~m/(19[89][0-9]|20[01][0-9]|202[0-5])(0[1-9]|1[0-2])(0[1-9]|[12][0-9]|3[01])(0[0-9]|1[0-9]|2[0-3])([0-5][0-9])([0-5][0-9])/" -p "$Filename - $DateTimeOriginal - ${Filename;tr/0-9//cd;s/(19[89][0-9]|20[01][0-9]|202[0-5])(0[1-9]|1[0-2])(0[1-9]|[12][0-9]|3[01])(0[0-9]|1[0-9]|2[0-3])([0-5][0-9])([0-5][0-9]).*/$1:$2:$3 $4:$5:$6/}" Y:\!temp\x\y\2
2024-09-04 RenameTest000323 more digits 11243.jpg - 2024:09:04 11:02:17 - 2024:09:04 00:03:23
20240906 121212.jpg - 2024:09:07 11:49:22 - 2024:09:06 12:12:12
    1 directories scanned
    2 files failed condition
    2 image files read
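If the -p output above is saved to a text file, checking it by eye could be helped along with a small script. The following is only a sketch in Python, assuming each line has exactly the "Filename - DateTimeOriginal - date from filename" layout shown above; the function names `parse_report_line` and `compare` are invented for this example.

```python
def parse_report_line(line):
    """Split one 'Filename - DateTimeOriginal - FilenameDate' line into three fields."""
    # Split from the right so a ' - ' inside the file name itself is left alone.
    filename, exif_dt, name_dt = line.rstrip("\n").rsplit(" - ", 2)
    return filename, exif_dt, name_dt


def compare(line):
    """Return (filename, same_day, same_second) for one report line."""
    filename, exif_dt, name_dt = parse_report_line(line)
    same_day = exif_dt[:10] == name_dt[:10]   # compare the 'YYYY:MM:DD' part only
    same_second = exif_dt == name_dt          # compare the full timestamp
    return filename, same_day, same_second


# The two lines printed by the example run above.
report = [
    "2024-09-04 RenameTest000323 more digits 11243.jpg - 2024:09:04 11:02:17 - 2024:09:04 00:03:23",
    "20240906 121212.jpg - 2024:09:07 11:49:22 - 2024:09:06 12:12:12",
]

for line in report:
    print(compare(line))
```

For the two sample lines, the first agrees on the day but not the time, and the second disagrees on the day, so both would still need a manual look.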

I used this AI RegEx generator to create the RegEx. It's probably just an interface to ChatGPT or something.
* Did you read FAQ #3 and use the command listed there?
* Please use the Code button for exiftool code/output.
 
* Please include your OS, Exiftool version, and type of file you're processing (MP4, JPG, etc).

mikeowaterloo

Thanks StarGeek.

My files actually have a range of structures. They could be

2004.02.18-08.30.00.jpg
2004 02 18 08.30.00.jpg
2004-02-18-08.30.00.jpg
 
So that presumably makes the regex even more complicated.

I was hoping to avoid writing long regex and use the built-in functionality that does this parsing.

 

mikeowaterloo

The ideal would be something like the testname feature, but in reverse, i.e. go from filename to CreateDate rather than the other way around.

exiftool -d %Y%m%d_%H%M%%-c.%%e "-testname<CreateDate" DIR
The TestName tag is used for dry-run testing of the file renaming feature. The above command is identical to that of the next example except that TestName is written instead of FileName. So instead of renaming the files, this command prints the old and new file names without actually changing anything. For example:
> exiftool -d %Y%m%d_%H%M%%-c.%%e "-testname<CreateDate" tmp
'tmp/a.jpg' --> 'tmp/20031031_1544.jpg'
'tmp/b.jpg' --> 'tmp/20010519_1836.jpg'
    1 directories scanned
    0 image files updated
    2 image files unchanged

StarGeek

Quote from: mikeowaterloo on September 17, 2024, 10:09:44 AM
My files actually have a range of structures. They could be

2004.02.18-08.30.00.jpg
2004 02 18 08.30.00.jpg
2004-02-18-08.30.00.jpg

Those would be covered by the command in your first post. There would be no need to use any RegEx.

My command ignores any characters except for the digits, so the formatting would be ignored. It removes all non-digit characters and then checks whether the first 14 digits could be a valid date/time. So the resulting comparison in these examples would be to check against "20040218083000" and nothing else.
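To make that concrete, here is a tiny Python sketch (outside of ExifTool, with a made-up helper name `digits_only`) showing that the three filename layouts collapse to the same digit string once the non-digits are removed.

```python
import re

# The three layouts from the quoted post.
names = [
    "2004.02.18-08.30.00.jpg",
    "2004 02 18 08.30.00.jpg",
    "2004-02-18-08.30.00.jpg",
]


def digits_only(name):
    """Delete every character that is not an ASCII digit (hypothetical helper)."""
    return re.sub(r"[^0-9]", "", name)


for name in names:
    print(name, "->", digits_only(name))
```

Note that the extension is part of the filename, so an extension containing a digit would add trailing digits after the first 14.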