One to rule them all - Big Photo Metadata Tag and Date cleanup

Started by Kugelblitz, October 24, 2018, 05:40:46 AM


Kugelblitz

Hey Phil,

yes, you are absolutely right.
I appreciate the time and effort you take to help me.

Long story short: it all works except "remove-geo-name-tags.args".
It should remove the geo names (CountryCode, State, Location) from the tags, but they are still there after processing.

-execute
-XMP:Subject-<xmp:CountryCode
-XMP:Subject-<xmp:State
-XMP:Subject-<xmp:Location
-keywords-<iptc:Country-PrimaryLocationCode
-keywords-<iptc:Province-State
-keywords-<iptc:Sub-location
-keywords-<xmp:CountryCode
-keywords-<xmp:State
-keywords-<xmp:Location
-XMP:LastKeywordXMP-<xmp:CountryCode
-XMP:LastKeywordXMP-<xmp:State
-XMP:LastKeywordXMP-<xmp:Location


I checked that the Subject tags and Keywords are correctly separated.
The lines do not work together in the args file, but each line works when I run it on its own as a single command, like this:


exiftool.exe "E:\Tests\2018-10-23\2018-10-23 test\" "-keywords-<xmp:CountryCode" -overwrite_original
exiftool.exe "E:\Tests\2018-10-23\2018-10-23 test\" "-XMP:Subject-<xmp:State" -overwrite_original
exiftool.exe "E:\Tests\2018-10-23\2018-10-23 test\" "-XMP:Subject-<xmp:Location" -overwrite_original
...


And I do not have a clue why it works as single commands but not in the args file.

Phil Harvey

Your argfile works for me if I remove the leading -execute.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

Kugelblitz

Hello Phil,

awesome that it works for you, so it is possible.

It just does not work here for me.

I made a 3-minute video showing my whole process and hope you can spot where I go wrong.
https://kisd.de/~martinb/foren/exiftool/debugging.mp4

I have also attached all the files I used to this post.
The image I tested with is this one:
https://kisd.de/~martinb/foren/exiftool/IMG_0800.JPG

Thank you for your patience, time and effort. I have donated a little tip for you.

Phil Harvey

Hi Martin,

I need the image you used at the start of your test to be able to reproduce this.  The final image doesn't help.

- Phil

Kugelblitz

Hello Phil,

sorry, my fault.

I have replaced the image in the link with the "before" image.
https://kisd.de/~martinb/foren/exiftool/IMG_0800.JPG
Thank you

Phil Harvey

Ah, OK.   Here we go:

> exiftool -@ w-separate-tags.args -overwrite_original "IMG_0800.JPG"
Warning: [minor] Tag 'XMP:LastKeywordXMP' not defined - IMG_0800.JPG
    1 image files updated
> exiftool -sep "xx" IMG_0800.JPG -subject -keywords
Subject                         : BELxxBelgienxxOostendexxTurkijenxxVlaanderen
Keywords                        : BELxxBelgienxxOostendexxTurkijenxxVlaanderen
> exiftool -@ remove-geo-name-tags.args -overwrite_original "IMG_0800.JPG"
    1 image files updated
> exiftool -sep "xx" IMG_0800.JPG -subject -keywords
Subject                         : BELxxBelgienxxOostendexxVlaanderen
Keywords                        : BELxxBelgienxxOostendexxVlaanderen
> exiftool -addtagsfromfile @ -@ remove-geo-name-tags.args -overwrite_original "IMG_0800.JPG"
    1 image files updated
> exiftool -sep "xx" IMG_0800.JPG -subject -keywords
Subject                         : BelgienxxOostende
Keywords                        : BelgienxxOostende


I should have thought of this (Note 5 from the -tagsFromFile documentation):

            5) The normal behaviour of copied tags differs from that of
            assigned tags for list-type tags and conditional replacements
            because each copy operation on a tag overrides any previous
            operations.  While this avoids duplicate list items when copying
            groups of tags from a file containing redundant information, it
            also prevents values of different tags from being copied into the
            same list when this is the intent.  So a -addTagsFromFile option
            is provided which allows copying of multiple tags into the same
            list.  eg)

                exiftool -addtagsfromfile @ '-subject<make' '-subject<model' ...

            Similarly, -addTagsFromFile must be used when conditionally
            replacing a tag to prevent overriding earlier conditions.

            Other than these differences, the -tagsFromFile and
            -addTagsFromFile options are equivalent.


The only reason it worked for me earlier is because I happened to test with only XMP:Location, which was the last item removed in your argfile.  Earlier items wouldn't have been removed.
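
The difference Note 5 describes can be modelled outside ExifTool. The following is a minimal Python sketch, not ExifTool's implementation: `apply_removals` and its `accumulate` flag are hypothetical names, and the assignment of BEL, Vlaanderen and Turkijen to CountryCode, State and Location is inferred from the output above rather than stated anywhere.

```python
def apply_removals(values, removals, accumulate):
    """Model removing items from one list-type tag.

    removals: the values that successive "-TAG-<SOURCE" operations would
    delete, in argfile order.
    accumulate=False models -tagsFromFile, where each copy operation on a
    tag overrides the previous ones, so only the last removal survives.
    accumulate=True models -addTagsFromFile, where all of them are kept.
    """
    pending = list(removals) if accumulate else list(removals[-1:])
    return [v for v in values if v not in pending]


subject = ["BEL", "Belgien", "Oostende", "Turkijen", "Vlaanderen"]
# Assumed order: CountryCode, State, Location (inferred, not from the thread)
removals = ["BEL", "Vlaanderen", "Turkijen"]

print(apply_removals(subject, removals, accumulate=False))
print(apply_removals(subject, removals, accumulate=True))
```

With `accumulate=False` only Turkijen is dropped, which matches the listing after the plain argfile run; with `accumulate=True` all three values go, which matches the listing after the -addtagsfromfile run.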

- Phil

Kugelblitz

Thank you Phil, it works.
You can skip reading this post and continue with the next one, where I write about "4th - FIX DATE", which does not work yet.

Everyone else: this is the first part of the documentation, showing what works and how.

Long story short, this is the command:
exiftool -@ w-remove-tags-geolat-geolon.args -@ w-separate-tags.args -execute -addtagsfromfile @ -@ w-remove-geo-name-tags.args -common_args -r -overwrite_original -forcewrite=exif -m -P -L "E:\Tests\2018-10-23\2018-10-23 test"  1>processed_files_list-2018-10-23.txt 2>error_log_2018-10-23.txt

The command combines two commands:

exiftool -@ w-remove-tags-geolat-geolon.args -@ w-separate-tags.args
exiftool -addtagsfromfile @ -@ w-remove-geo-name-tags.args
The first is in front of -execute and the second is after it.

Both commands run args files.
The first file, "w-remove-tags-geolat-geolon.args", removes unwanted tags (geo:lat=, geo:lon= and similar) from Subject and Keywords.
The second file, "w-separate-tags.args", fixes the keyword separation.
The third file, "w-remove-geo-name-tags.args", removes keywords that are already present in the location fields.
I have attached the args files to this post.

The "-common_args" option means that everything after it belongs to both commands.
5th some ADDITIONAL COMMANDS THAT I USE
-common_args -r -overwrite_original -forcewrite=exif -m -P -L "E:\Tests\2018-10-23\2018-10-23 test"  1>processed_files_list-2018-10-23.txt 2>error_log_2018-10-23.txt
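
To make the -execute / -common_args split easier to picture, here is a small Python sketch that cuts an argument list into the separate commands. It is only a model of the behaviour described above, not ExifTool's own parser; `split_commands` is a hypothetical helper, and it assumes the -@ argfiles have not been expanded yet.

```python
def split_commands(argv):
    """Split arguments into one list per command.

    Everything after -common_args is appended to every command;
    -execute marks the boundary between two commands.
    """
    if "-common_args" in argv:
        cut = argv.index("-common_args")
        head, common = argv[:cut], argv[cut + 1:]
    else:
        head, common = list(argv), []

    commands, current = [], []
    for arg in head:
        if arg == "-execute":
            commands.append(current + common)
            current = []
        else:
            current.append(arg)
    commands.append(current + common)
    return commands


argv = [
    "-@", "w-remove-tags-geolat-geolon.args", "-@", "w-separate-tags.args",
    "-execute",
    "-addtagsfromfile", "@", "-@", "w-remove-geo-name-tags.args",
    "-common_args", "-r", "-overwrite_original", "DIR",
]
for command in split_commands(argv):
    print(command)
```

Both resulting lists end with -r, -overwrite_original and DIR, which is why the options only have to be written once.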

The Args Files fix the following issues from my first post.

3rd - GEOSETTER - LOCATION DATA
w-remove-tags-geolat-geolon.args
-execute
-subject<${subject@;$_=undef if /^(geo:lat=|geo:lon=|ffffffffffffffff|iPhone|iphone)/}
-Keywords<${Keywords@;$_=undef if /^(geo:lat=|geo:lon=|ffffffffffffffff|iPhone|iphone)/}
-XMP:regionpersondisplayname<${XMP:regionpersondisplayname@;$_=undef if /^(ffffffffffffffff)/}
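
For readers who do not read Perl: as far as I understand it, the "@" after the tag name applies the expression to each list item separately, and setting $_ to undef drops that item. A rough Python equivalent of the Subject/Keywords filter, with made-up sample values and a hypothetical `drop_unwanted` helper, could look like this:

```python
import re

# Same pattern as in the args file: match only at the start of an item.
UNWANTED = re.compile(r"^(geo:lat=|geo:lon=|ffffffffffffffff|iPhone|iphone)")


def drop_unwanted(items):
    """Keep only the list items that do not start with an unwanted prefix."""
    return [item for item in items if not UNWANTED.match(item)]


# Sample keywords invented for illustration
keywords = ["geo:lat=51.2", "geo:lon=2.9", "Oostende", "iPhone 6", "Belgien"]
print(drop_unwanted(keywords))
```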


and

w-remove-geo-name-tags.args
-XMP:Subject-<xmp:CountryCode
-XMP:Subject-<xmp:State
-XMP:Subject-<xmp:Location
-keywords-<iptc:Country-PrimaryLocationCode
-keywords-<iptc:Province-State
-keywords-<iptc:Sub-location
-keywords-<xmp:CountryCode
-keywords-<xmp:State
-keywords-<xmp:Location
-XMP:LastKeywordXMP-<xmp:CountryCode
-XMP:LastKeywordXMP-<xmp:State
-XMP:LastKeywordXMP-<xmp:Location


1st - MAKE THEM EQUAL
w-separate-tags.args
-execute
-Sep
,
-Keywords<${Keywords;s/\;/\,/g}
-XMP:Subject<${XMP:Subject;s/\;/\,/g}
-XMP:LastKeywordXMP<${XMP:LastKeywordXMP;s/\;/\,/g}
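
The idea behind this args file, sketched in Python: first turn every semicolon into a comma (the s/\;/\,/g part), then split the value into separate items at the separator given with -Sep. This is a simplified model with a hypothetical `split_keywords` helper; it keeps any whitespace around the items, and it is an assumption that this mirrors ExifTool closely enough for illustration.

```python
def split_keywords(raw, sep=","):
    """Normalise ';' to ',' and split the string into list items."""
    normalised = raw.replace(";", ",")
    return normalised.split(sep)


# Sample value invented for illustration
print(split_keywords("Belgien;Oostende,Vlaanderen"))
```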


2nd - REMOVE DUPLICATE KEYWORDS
I have no Duplicate Keywords

Kugelblitz

4th - FIX DATE

Most of this is not working yet and I have no clue how to do it right.
Basically I want nested "if this then that" Conditions that run different args depending on the outcome.

Long Story Short

Case A. Check if File has a Takendate
if the file has a Takendate - skip it.
-if
not $DateTimeOriginal


if it has no Takendate Check Case B

Case B.
Check if the time of the FilemodifyDate is equal to 12:00:00
-if
Time of the File eq 12:00:00
-CreateDate<FileModifyDate
-ModifyDate<FileModifyDate

if not Check Case C

Case C.
Check if the Folder name contains the same Date as the FilemodifyDate
-d "%Y-%m-%d"
-if
"$directory !~/^$FileModifyDate/"
-exif:CreateDate<${directory;$_ = /(\d{4}-\d{2}-\d{2})/ ? "$1 12:00:00" : undef}


Case C1
if Yes 
-CreateDate<FileModifyDate
-ModifyDate<FileModifyDate

Case C2
if not
-exif:CreateDate<${directory;$_ = /(\d{4}-\d{2}-\d{2})/ ? "$1 12:00:00" : undef} 

Long Story

Case A. The dates of some files are messed up by programs that overwrite the file date with the current date whenever they change the file,
like Picasa, which is excellent at detecting faces and writing them into the file's metadata.
That is not a big issue if the files have a Takendate or Createdate, but some pictures, such as screenshots and scanned pictures, do not have one.
So I want to check whether the file has no Takendate and no Createdate.
If that is the case, go to the next condition.

Case B.
In some files, for example scanned photos, I have added an "estimated" date manually. In that case I set the time to noon, 12:00:00.
So the second condition checks the time: if it is 12:00:00, the date was set by me; if it is not, go on to Case C.

Case C.
Some files were not edited or changed by any program, and the file date is equal to the exact date and time they were taken.
To check whether that is true, I want to compare the date of the folder with the date of the file.
The folder name follows the scheme "YYYY-MM-DD Location Event", for example "2017-12-31 Koblenz New Year".
So if the file date is equal to the date in the folder name, the date and time are correct and shall be set as the CreateDate. Maybe allow plus or minus one day, because pictures taken after midnight sometimes sit in the same folder and surely do not have the same day. I guess that is difficult at New Year, as a picture taken after midnight would have the date 2018-01-01.
Case C1.
So if the file date matches the folder date, take the file date as the Takendate and Createdate.
Case C2.
If the file date does not match the folder date, take the date from the folder and write it to the file with the time 12:00:00 noon, so I know it has been set "manually" to this date.
-exif:CreateDate<${directory;$_ = /(\d{4}-\d{2}-\d{2})/ ? "$1 12:00:00" : undef} 
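
To pin down the logic I am after, here are Cases A to C2 written as a small Python function. It is only a sketch of the decision tree as described in this post, not something ExifTool runs; `choose_create_date` is a hypothetical name, it ignores the "plus or minus one day" idea, and it does not handle an invalid date in the folder name.

```python
import re
from datetime import datetime


def choose_create_date(date_time_original, file_modify_date, folder_name):
    """Return the datetime to write as CreateDate, or None to skip the file."""
    # Case A: the file already has a taken date, so leave it alone.
    if date_time_original is not None:
        return None
    # Case B: a time of exactly noon marks a date that was set by hand.
    if file_modify_date.strftime("%H:%M:%S") == "12:00:00":
        return file_modify_date
    # Case C: compare the file date with the YYYY-MM-DD in the folder name.
    match = re.search(r"\d{4}-\d{2}-\d{2}", folder_name)
    if match is None:
        return None  # no date in the folder name, nothing to compare against
    folder_date = datetime.strptime(match.group(0), "%Y-%m-%d")
    if folder_date.date() == file_modify_date.date():
        return file_modify_date  # Case C1: file date is trusted
    return folder_date.replace(hour=12)  # Case C2: folder date at noon


folder = "2017-12-31 Koblenz New Year"
print(choose_create_date(None, datetime(2017, 12, 31, 22, 15, 0), folder))
print(choose_create_date(None, datetime(2018, 11, 4, 9, 30, 0), folder))
```

The first call is Case C1 and keeps the file date; the second is Case C2 and falls back to the folder date at noon.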


Phil Harvey

Try this command:

exiftool -@ my.args -common_args -d %Y-%m-%d DIR

and this argfile:

-if
not $datetimeoriginal and $filemodifydate# !~ /12:00:00/ and $directory =~ /$FileModifyDate/
-CreateDate#<FileModifyDate#
-ModifyDate#<FileModifyDate#
-execute
-if
not $datetimeoriginal and $filemodifydate# !~ /12:00:00/ and $directory !~ /$FileModifyDate/
-exif:CreateDate#<${directory;$_ = /(\d{4}-\d{2}-\d{2})/ ? "$1 12:00:00" : undef}


- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

Kugelblitz

Hello Phil,

thanks for your reply. It is a start, but not the solution yet.
There are so many date and time fields, which makes it quite complicated.

Taken Date = ModifyDate = DateTimeOriginal = Create Date =  DateTimeDigitized = XMP ModifyDate
and
DateTimeCreated = XMP:DateCreated= CreateD Date = IPTC (DateCreated + TimeCreated)

How can I make sure that the values I take from one field to write to another field are not empty,
so that I do not lose any data by running

"Field with information"<"Field with no Information"

Phil Harvey

Quote from: Kugelblitz on November 04, 2018, 07:19:03 AM
So that I do not lose any data by running

"Field with information"<"Field with no Information"

This won't happen.  If the tag on the right doesn't exist, then the argument has no effect.
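
In other words, the copy is skipped rather than writing an empty value. A toy Python model of that rule, with a plain dict standing in for the file's metadata and a hypothetical `copy_tag` helper (not an ExifTool API):

```python
def copy_tag(tags, dst, src):
    """Model of "-DST<SRC": copy only when the source tag exists."""
    if src in tags:
        tags[dst] = tags[src]
    return tags


tags = {"CreateDate": "2017:12:31 12:00:00"}
copy_tag(tags, "CreateDate", "DateTimeOriginal")  # source missing: no effect
print(tags)
```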

- Phil

Kugelblitz

Or maybe there is a more intelligent approach: compare all date and time fields with each other, look for the "oldest", and take that as CreateDate and so on.

Phil Harvey

The best way to do this would be a user-defined Composite tag.

- Phil

Kugelblitz

Hello Phil,
thank you for the hint about the user-defined Composite tag.

Getting the dates and times right with the conditions (if not defined / or / and) in several args files sort of worked, but not always, and it got quite messy and time-consuming.
So I will not share the mess here.

Searching here for "user-defined Composite tags", I found the solution in this post:
set all to oldest known date https://exiftool.org/forum/index.php/topic,7986.msg40751.html#msg40751

oldest_datetime_config
%Image::ExifTool::UserDefined = (
    'Image::ExifTool::Composite' => {
        # Select oldest date from a number of date tags
        OldestDateTime => {
            Desire => {
                0 => 'FileModifyDate',
                1 => 'MDItemFSContentChangeDate',
                2 => 'FileCreateDate',
                3 => 'MDItemFSCreationDate',
                4 => 'ModifyDate',
                5 => 'CreateDate',
                6 => 'DateTimeCreated',
                7 => 'DateTimeOriginal',
                8 => 'XMP:DateCreated',
                9 => 'XMP:DateAcquired',
            },
            ValueConv => q{
                my $oldest = undef;
                for my $date (@val) {
                    next if not defined $date or $date lt '1970:01:02';
                    $date =~ s/[+-]\d{2}:\d{2}$//; # Strip TimeZone
                    if ($date && (!$oldest || $date lt $oldest)) {
                        $oldest = $date;
                    }
                }
                return $oldest;
            },
        },
    },
);

1;


exiftool Command
exiftool -config oldest_datetime_config -filename "-FileModifyDate<OldestDateTime" "-CreateDate<OldestDateTime" "-ModifyDate<OldestDateTime" -@ w-remove-tags-geolat-geolon.args -@ w-separate-tags.args -execute -addtagsfromfile @ -@ w-remove-geo-name-tags.args -common_args -r -overwrite_original -forcewrite=exif -m -P -L -v2 "E:\Tests\2018-10-23\2018-10-23 test"  1>processed_files_list-2018-10-23.txt 2>error_log_2018-10-23.txt

And that works well.
In the post "set all to oldest known date", dates of 1.1.1970 are ignored.
I would like to add something similar: any date whose time is set to 12:00:00 noon should be treated as the "oldest date", so that my manually edited dates are used.

I guess when that works, this command is ready to run.

Phil Harvey

Sorry for the delay.  Try this:

%Image::ExifTool::UserDefined = (
    'Image::ExifTool::Composite' => {
        # Select oldest date from a number of date tags
        OldestDateTime => {
            Desire => {
                0 => 'FileModifyDate',
                1 => 'MDItemFSContentChangeDate',
                2 => 'FileCreateDate',
                3 => 'MDItemFSCreationDate',
                4 => 'ModifyDate',
                5 => 'CreateDate',
                6 => 'DateTimeCreated',
                7 => 'DateTimeOriginal',
                8 => 'XMP:DateCreated',
                9 => 'XMP:DateAcquired',
            },
            ValueConv => q{
                my $oldest = undef;
                for my $date (@val) {
                    next if not defined $date or $date lt '1970:01:02';
                    $date =~ s/[+-]\d{2}:\d{2}$//; # Strip TimeZone
                    return $date if $date =~ / 12:00:00$/;
                    if ($date && (!$oldest || $date lt $oldest)) {
                        $oldest = $date;
                    }
                }
                return $oldest;
            },
        },
    },
);

1;
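
For anyone who wants to check the selection logic without running ExifTool, the ValueConv above translates almost line by line into Python. This is a sketch for experimenting only: `oldest_datetime` and the `prefer_noon` switch are hypothetical names, and the sample values are invented. The list stands in for @val, with None for a tag that does not exist.

```python
import re


def oldest_datetime(values, prefer_noon=True):
    """Pick the oldest date string, optionally preferring a noon time."""
    oldest = None
    for date in values:
        # Skip missing tags and anything before 1970:01:02.
        if date is None or date < "1970:01:02":
            continue
        date = re.sub(r"[+-]\d{2}:\d{2}$", "", date)  # strip time zone
        # A time of exactly noon marks a manually set date: use it at once.
        if prefer_noon and date.endswith(" 12:00:00"):
            return date
        if date and (oldest is None or date < oldest):
            oldest = date
    return oldest


values = [
    "2018:11:04 09:30:00+01:00",  # e.g. FileModifyDate
    None,                         # a tag that is not present
    "2017:12:31 12:00:00",        # manually set, time is noon
    "2016:05:01 08:00:00",
    "1970:01:01 00:00:00",        # ignored as too old
]
print(oldest_datetime(values, prefer_noon=False))
print(oldest_datetime(values))
```

Note that the noon shortcut returns the first noon date it meets in the Desire order, which is not necessarily the oldest one if several tags carry a noon time.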


- Phil