130k files to tag. Best practice?

Started by mrXif-geoTagger, September 20, 2019, 02:45:59 PM


mrXif-geoTagger

I've been searching high and low in the forum and am unable to settle on a best practice. Let me explain what I'm trying to do and what my current assumptions are.
At the end I would like your view and advice!

How I roll:
1) I have a fairly large photo collection (130,000+ photos).
2) I am currently scanning all my paper copies and negatives (10,000+). The oldest photos are from around 1900.
3) I live in Norway and my current OS is Windows 10 (codepage/charset issues)
4) The main collection is on a hard drive (backups on NAS)
5) Main frontend (FE) is Google Photos (GP) (high quality/free option) (supports who, when, where, +GP specifics: colors, objects, etc)
6) I share/receive a lot of photos on Facebook/Messenger and GP

I've lived long enough to know that my current FE will change in a few years when the next best thing comes along. I want to write tags once.
I've also lived long enough to accept that EXIFtool will probably stay with us for a long time. That's good, love the thing :- )

My objectives are:
a) In FE/OS: All dates should be the date when photo was taken (main rule)
b) File should contain correct geotag
c) Strings written in Norwegian should be stored/retrieved in/from the file/tag
d) I will tag the files with "the 4Ws": Who's in the photo, where/when/why was it taken
e) Files/tags should be readable for all OS/FE (that support the standards)

My current implementation based on my assumptions (correct me if I'm wrong):
a) All dates are written like: see example below
b) Geotags are written outside EXIFtool and EXIFtool only copies from other files (if I specify)
c) My current codepage/language settings still give me warnings although strings are correctly written ("Warning: FileName encoding not specified...")
d) Who and why (event) is covered in filename only. Orig filename is stored in tag: see example below
e) I adhere to limits (file/folder length) highlighted here: https://en.wikipedia.org/wiki/Filename#Comparison_of_filename_limitations

Based on this, my batch file looks like this (the exiftool command is spread over separate lines for clarity. In reality it's all in one line):
@echo off
chcp 1252
exiftool
  -FileModifyDate="1989:09:10 15:15:15 +01:00"
  -FileCreateDate="1989:09:10 15:15:15 +01:00"
  -ExifIFD:DateTimeOriginal="1989:09:10 15:15:15 +01:00"
  -UserComment"<$Filename"
  -ImageDescription"<$Filename"
  -artist="Artist Name"
  -comment="Comment#1 ÆØÅ"
  -charset cp1252
  -overwrite_original
  "C:\path1 with spaces\file1.jpg"

I don't want to continue tagging and then, somewhere down the road, realize I made a bad decision. Can you spot the errors? Will I meet my objectives? Can I do something better/simpler? Thank you for your patience and input...

Alan Clifford

As far as I am aware, the exif datetimeoriginal doesn't have a timezone.  Without checking, I don't know if your command would automatically put the timezone in the OffsetTimeOriginal tag.

StarGeek

I'm going to start off by saying I can't be of much help when it comes to codepage/charset issues except to point you to FAQ #18 and FAQ #10.  I don't use characters other than basic ones in my file names, and I ended up so frustrated by Windows' handling of character sets that I usually copy the data I want to embed to the clipboard and run a short Perl script that converts non-ASCII characters to HTML entities for use with the -E (escapeHTML) option.

Also, the following is about image files.  Video practices would be different.

Quote from: mrXif-geoTagger on September 20, 2019, 02:45:59 PM
5) Main frontend (FE) is Google Photos (GP) (high quality/free option) (supports who, when, where, +GP specifics: colors, objects, etc)

Just a reminder, the free option will recompress large images.  I haven't looked into what the actual details are currently but it's something to keep in mind if you want your original images to remain unaltered.  I do believe that it's an extremely good algorithm, though.

Quote
a) All dates are written like: see example below
...
  -FileModifyDate="1989:09:10 15:15:15 +01:00"
  -FileCreateDate="1989:09:10 15:15:15 +01:00"

One thing to remember is that these items are metadata about the File, not the Image.  It's not logical, for example, to set the date of a jpeg file to something before 1992, as jpegs didn't exist at that time.  Additionally, the FileModifyDate will be altered by the OS anytime the file is opened for a write operation, unless the program doing so prevents this (see exiftool's -P (preserve) option as an example).

An additional problem, when it comes to exiftool, is that the Perl date/time routines that exiftool relies on will fail if you try to set these file system timestamps to a date between January 1, 1900 and December 31, 1969.  You will most often end up with a date 100 years or more after the time set.  There is no real workaround for this.

Some low end DAM (Digital Asset Management) programs will depend upon this data, but better quality ones should only fall back on these if there isn't anything else.  The best practice, IMO, is to make sure the embedded timestamps are set properly and not worry about the system timestamps unless absolutely necessary.

Quote
-ExifIFD:DateTimeOriginal="1989:09:10 15:15:15 +01:00"

The best practice when it comes to exiftool, IMO, is to not be specific when it comes to the group unless you absolutely need to be.  Using DateTimeOriginal as the example, you give the proper place in your command, but what if there's also an XMP:DateTimeOriginal set?  The above command will set only the EXIF tag, not the XMP tag, and you'll end up with data out of sync.  Also, what if you weren't sure of the proper group for the tag and set it incorrectly?  Other programs might not find the data.  Using -DateTimeOriginal="1989:09:10 15:15:15" will set the tag in the proper place and update tags with the same name but different groups as needed.

With regard to the timezone as Alan mentioned, EXIF timestamps don't contain the timezone.  It's held in a different tag, OffsetTimeOriginal.  So if you wanted to set both, the command would be -DateTimeOriginal="1989:09:10 15:15:15" -OffsetTimeOriginal=+01:00 though if your process was the result of some scripting, you wouldn't have to try and separate the two, you could just use
-DateTimeOriginal="1989:09:10 15:15:15+01:00" -OffsetTimeOriginal="1989:09:10 15:15:15+01:00"
and exiftool would figure it out.  Or even easier, use the SubSecDateTimeOriginal tag
-SubSecDateTimeOriginal="1989:09:10 15:15:15+01:00"
and it will write both.  (Only just now saw that it was a writable tag, going to have to go over my workflow AGAIN :D) Additionally, if you decided to use XMP tags, the XMP:DateTimeOriginal tag can include the timezone, so it would be set with -XMP:DateTimeOriginal="1989:09:10 15:15:15+01:00".
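
If a front end prefers to pass the two values separately rather than rely on exiftool to pull them apart, the split can be done up front. Below is a minimal sketch in Python; the helper names are hypothetical, and it assumes the timestamp always has the form "YYYY:MM:DD HH:MM:SS" followed by an offset such as +01:00, with or without a space before the offset.

```python
import re

# Hypothetical helper, not part of exiftool: matches
# "YYYY:MM:DD HH:MM:SS" followed by an optional space and "+HH:MM" / "-HH:MM".
_STAMP = re.compile(
    r"^(\d{4}:\d{2}:\d{2} \d{2}:\d{2}:\d{2})\s*([+-]\d{2}:\d{2})$"
)


def split_exif_timestamp(value):
    """Return (timestamp_without_offset, offset) for a combined string."""
    match = _STAMP.match(value.strip())
    if match is None:
        raise ValueError(f"unexpected timestamp format: {value!r}")
    return match.group(1), match.group(2)


def datetime_original_args(value):
    """Build the two exiftool arguments named in the post above."""
    stamp, offset = split_exif_timestamp(value)
    return [f"-DateTimeOriginal={stamp}", f"-OffsetTimeOriginal={offset}"]


if __name__ == "__main__":
    for arg in datetime_original_args("1989:09:10 15:15:15 +01:00"):
        print(arg)
```

The resulting strings are meant to be passed as separate arguments (for example in a list given to a process launcher), so no shell quoting is applied here.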

Quote
b) Geotags are written outside EXIFtool and EXIFtool only copies from other files (if I specify)

The only thing to remember with writing geotags is that there are four tags to write: the obvious GPSLatitude/GPSLongitude, and then the directional reference tags, GPSLatitudeRef/GPSLongitudeRef.  The latter two are especially necessary in cases where the GPS coordinates are negative, e.g. Western and/or Southern hemisphere.
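
To make the relationship between signed coordinates and the four tags concrete, here is a small Python sketch. The function name and the sample coordinates are illustrative only; it simply maps a signed decimal latitude/longitude to an unsigned value plus a hemisphere letter.

```python
def gps_tag_values(latitude, longitude):
    """Map signed decimal degrees to the four GPS tag values.

    Hypothetical helper for illustration: negative latitude means South,
    negative longitude means West.
    """
    if not -90.0 <= latitude <= 90.0:
        raise ValueError(f"latitude out of range: {latitude}")
    if not -180.0 <= longitude <= 180.0:
        raise ValueError(f"longitude out of range: {longitude}")
    return {
        "GPSLatitude": abs(latitude),
        "GPSLatitudeRef": "N" if latitude >= 0 else "S",
        "GPSLongitude": abs(longitude),
        "GPSLongitudeRef": "E" if longitude >= 0 else "W",
    }


if __name__ == "__main__":
    # Illustrative coordinates: one northern/eastern, one southern/western.
    print(gps_tag_values(59.9139, 10.7522))
    print(gps_tag_values(-34.6037, -58.3816))
```

Dropping the two Ref values for the second sample would leave a position that reads as northern/eastern, which is why all four tags belong together.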

Quote
d) <...> Orig filename is stored in tag: see example below
...
  -UserComment"<$Filename"
  -ImageDescription"<$Filename"

UserComment is a rarely used tag.  Placing the filename there is reasonably safe, as it's unlikely to be altered by most programs.  But it's also unlikely to be displayed by most programs. ImageDescription is a bit more problematic.  According to the Metadata Working Group, it's supposed to be kept in sync with IPTC:Caption-Abstract and XMP:Description.  This set of tags contains "A textual description, including captions, of the image."  While most programs won't directly read or edit ImageDescription, they will change it when the description is changed.  Lightroom, for example, will set all three at once.

There are two alternatives specifically set up for holding the original filename.  The first alternative is the pair IPTC:ObjectName and XMP:Title.  According to the spec, these are "A shorthand reference for the digital image" with a notation that "Many use the Title field to store the filename of the image".

One problem, though, is that Adobe has decided it's going to do something different.  It uses these tags to hold a short description of the file, basically a synopsis of the Description.  So instead, they use XMP:PreservedFileName for the purpose of holding the original filename.

Basically, if you plan on possibly using Adobe products in the future, you might want to use XMP:PreservedFileName.  If not, use either of the other two (or both).  Or use all three for maximum compatibility.

Quote
-artist="Artist Name"

Related tags for this are IPTC:By-line and XMP:Creator.  But Artist has pretty good support among most programs.

Quote
-comment="Comment#1 ÆØÅ"

The Comment tag is very fragile and should not be depended upon for preserving data.  It is, for the most part, a jpeg tag that isn't part of the EXIF, IPTC, or XMP file groups.  Many older programs will overwrite this, often with the name of the program.

The proper place, IMO, would be to write to XMP:Description, IPTC:Caption-Abstract and/or EXIF:ImageDescription.

With regards to writing data with exiftool, one really good starting point is to use the MWG tags.  They'll write to the more common tags as a group and do some other bookkeeping for you.  Using your example, instead of writing

  -ExifIFD:DateTimeOriginal="1989:09:10 15:15:15 +01:00"
  -UserComment"<$Filename"
  -ImageDescription"<$Filename"
  -artist="Artist Name"
  -comment="Comment#1 ÆØÅ"


you could use
-MWG:DateTimeOriginal="1989:09:10 15:15:15 +01:00"
-MWG:Description="Comment#1 ÆØÅ"
-MWG:Creator="Artist Name"

and all the appropriate tags would be filled and be well protected against future front end changes.
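
For a scripted front end, the same three MWG assignments can be assembled as an argument list. The following Python sketch is only an illustration under stated assumptions: the function name is hypothetical, it assumes exiftool is on the PATH when the list is eventually run, and it does nothing about the codepage/charset questions raised earlier in the thread.

```python
def mwg_args(path, taken, description, creator):
    """Build an exiftool argument list using the three MWG tags from the post.

    Hypothetical helper: each tag assignment is one list element, so values
    containing spaces need no extra quoting when the list is handed to a
    process launcher such as subprocess.run.
    """
    return [
        "exiftool",
        f"-MWG:DateTimeOriginal={taken}",
        f"-MWG:Description={description}",
        f"-MWG:Creator={creator}",
        path,
    ]


if __name__ == "__main__":
    args = mwg_args(
        r"C:\path1 with spaces\file1.jpg",
        "1989:09:10 15:15:15 +01:00",
        "Comment#1",
        "Artist Name",
    )
    # Printed rather than executed, so the sketch runs without exiftool installed.
    print(args)
```

Running the list would be a matter of passing it to subprocess.run(args, check=True); that step is left out here on purpose.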

If you want to do further research, you can check out the IPTC Photo Metadata Standard 2017.1.  Also, the Metadata Working Group has a standard for reconciling differences in data between the groups but their website has been down for many months at this point.  You can find a copy of the pdf attached to this post.
* Did you read FAQ #3 and use the command listed there?
* Please use the Code button for exiftool code/output.
 
* Please include your OS, Exiftool version, and type of file you're processing (MP4, JPG, etc).

StarGeek

Quote from: Alan Clifford on September 20, 2019, 03:21:30 PM
As far as I am aware, the exif datetimeoriginal doesn't have a timezone.  Without checking, I don't know if your command would automatically put the timezone in the OffsetTimeOriginal tag.

It doesn't.  But while typing up that long post, I just discovered that SubSecDateTimeOriginal is writable and will write to DateTimeOriginal and OffsetTimeOriginal at the same time.

mrXif-geoTagger

Hey hey hey! Thank you guys. Thank you so much for taking the time to inform me. Great read. I'll retreat to the lab and do some more research based on your feedback. I will return with my new discoveries and you can comment on them again!!!

BTW, sorry, forgot:
I'm running Windows 10,
want to tag jpgs and
am running ExifTool Version Number 11.62.
Use GeoSetter 3.4.82 for geotagging.
Use a self-developed frontend to exiftool (VB6).  :P

StarGeek

Quote from: mrXif-geoTagger on September 20, 2019, 04:45:22 PM
Use GeoSetter 3.4.82 for geotagging.

In case you haven't used it lately, there were some big problems with it due to changes in Google Maps last year.  If you found the procedure to get it working again, then don't worry about it.

But GeoSetter is also a very good GUI for setting various data in the file.  If you click the "Edit location and other data..." button, a window with a bunch of tabs will pop up and let you set a lot of the data I mentioned.  And GeoSetter will properly set the values in the proper places across multiple tags.  It's worth checking out.

mrXif-geoTagger

Tried the fix, didn't get it to work. You can live with the bug, but I switched the view to OpenStreetMap (it's an option) and it seems to work. Works for me. I only use it to find new locations. Everything else I use EXIFtool for (reusing already geotagged photos). Thanx though.

Alan Clifford

Quote from: StarGeek on September 20, 2019, 04:32:38 PM
Quote from: Alan Clifford on September 20, 2019, 03:21:30 PM
As far as I am aware, the exif datetimeoriginal doesn't have a timezone.  Without checking, I don't know if your command would automatically put the timezone in the OffsetTimeOriginal tag.

It doesn't.  But while typing up that long post, I just discovered that SubSecDateTimeOriginal is writable and will write to DateTimeOriginal and OffsetTimeOriginal at the same time.

I've stored that bit of information for future use.

mrXif-geoTagger

Just quickly in. Testing on images from an iPhone 8 Plus and have adjusted my script a bit according to StarGeek's tips. Current command is:
exiftool
-SubSecDateTimeOriginal="2019:07:17 15:15:15 +01:00"
-XMP:PreservedFileName"<$Filename"
-MWG:Creator="LeCreator"
-MWG:Description="theDescription"
"C:\path\file.jpg"


I'm currently very satisfied with the tips on PreservedFileName, Creator and Description. Probably not surprisingly, it's the time stamp that causes the most challenges. I've stopped pursuing the file's time stamp, as StarGeek suggested, and am now focusing on dates in tags. A challenge when starting to mess with the time stamps is, as I understand it, that they can come out of synch.

My main purpose of altering the time stamps is that I want it to reflect when the photo was taken. It's not important when the photo was enhanced in software like Photoshop or when it was digitized/scanned.

The command above modified two out of 11 dates and time stamps found when running the following command before and after the altering:
exiftool -c "%.6f" -a -u -g1 file.jpg

These were updated:
ExifIFD Date/Time Original
Composite Date/Time Original


These were not updated:
IFD0:Modify Date
ExifIFD:Create Date
GPS:GPS Date Stamp
GPS:GPS Time Stamp
XMP-xmp:Modify Date
XMP-xmp:Create Date
XMP-photoshop:Date Created
Composite:Create Date
Composite:GPS Date/Time


To be concise, and not set anything out of synch, which dates should I update and should I leave any as they are?

StarGeek

Quote from: mrXif-geoTagger on September 22, 2019, 06:49:18 PM
My main purpose of altering the time stamps is that I want it to reflect when the photo was taken. It's not important when the photo was enhanced in software like Photoshop or when it was digitized/scanned.
<...>
These were updated:

Composite Date/Time Original
<...>
Composite:Create Date
Composite:GPS Date/Time

These can be ignored.  They're not tags that are embedded in the file.  Instead, they're tags exiftool derives from other tags in the file.  They're basically shortcuts for displaying or copying useful information.  See the Composite tags page for more details.

Quote
ExifIFD Date/Time Original

This is your most important tag.  It designates the date/time that the image was created/captured, whether it was on film or digitally.  As previously mentioned, the related timezone tag is OffsetTimeOriginal.

Related is the XMP-photoshop:DateCreated, which was not changed.  My apologies for not bringing up the MWG version to use.  I was just a bit excited about learning about the SubSecDateTimeOriginal tag.

Instead of
-SubSecDateTimeOriginal="2019:07:17 15:15:15 +01:00"
try using
-MWG:DateTimeOriginal="2019:07:17 15:15:15 +01:00"
This will set all tags that indicate when the image was created.

One helpful hint, if you want to look at only the time related tags, you can use this command
exiftool -time:all -g1 -a -s <FileOrDir>

Quote
IFD0:Modify Date
XMP-xmp:Modify Date

These tags indicate the last time the image was modified, for example, as you mentioned, when it was enhanced in Photoshop.  The related time zone tag is OffsetTime.  Programs such as Photoshop/Lightroom should take care of these tags for you (though I'm not sure about the OffsetTime, maybe in newer versions).  The best tag to use if you wish to edit these would be MWG:ModifyDate.

Quote
XMP-xmp:Create Date
ExifIFD:Create Date

These tags indicate when the digital file was created.  The related time zone tag is OffsetTimeDigitized.  In the case of a digital camera, this time would be the same as DateTimeOriginal.  If the image was scanned from a film photo, then the time should technically be the time it was scanned, but many people don't do that.  If you want to edit this, then the best tag would be MWG:CreateDate.

Quote
GPS:GPS Date Stamp
GPS:GPS Time Stamp

GPS time stamps should be in UTC.  For most cases, you can just set them the same as the other tags and exiftool will do the calculations to UTC for you.  For example (notice I changed your previous time to a quarter past midnight):
-GPSTimeStamp="2019:07:17 00:15:15 +01:00" -GPSDateStamp="2019:07:17 00:15:15 +01:00"
One reason to give the time on GPSDateStamp is when the time zone might roll back/forward a day.  In this example, 2019:07:17 gets rolled back to 2019:07:16.

Unfortunately, there is a problem which you will probably encounter.  This suffers from the same problem as setting the system file dates.  Set GPSDateStamp to 1959:07:17 15:15:15+01:00 and you'll end up with 2059:07:17 14:15:15Z.  There are two ways around this.  You can just set the date without the time info.  For example
-GPSDateStamp="1959:07:17"
but then you have to watch out for times where the timezone would cause the date to change and adjust it yourself.  The second option would be to run a second command to subtract 100 years from the tag
exiftool -GPSDateStamp-="100:0:0 0" <FileOrDir>
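
For the first workaround, the UTC date and time can be worked out before exiftool is called, so the day rollover is handled by hand. The Python sketch below is a hedged illustration: the function name is hypothetical, it assumes input of the form "YYYY:MM:DD HH:MM:SS" plus an offset like +01:00, and it assumes the resulting plain date and time strings are then written to GPSDateStamp and GPSTimeStamp without any timezone attached.

```python
import re
from datetime import datetime, timezone


def gps_utc_stamps(local_stamp):
    """Convert a local timestamp with offset to (UTC date, UTC time) strings.

    Hypothetical helper: the arithmetic is done on an offset-aware datetime,
    so dates before 1970 are handled the same way as later ones.
    """
    # Remove an optional space before the offset so %z can parse it.
    compact = re.sub(r"\s*([+-]\d{2}:\d{2})$", r"\1", local_stamp.strip())
    local = datetime.strptime(compact, "%Y:%m:%d %H:%M:%S%z")
    utc = local.astimezone(timezone.utc)
    return utc.strftime("%Y:%m:%d"), utc.strftime("%H:%M:%S")


if __name__ == "__main__":
    # Quarter past midnight at +01:00 falls on the previous day in UTC.
    print(gps_utc_stamps("2019:07:17 00:15:15 +01:00"))
    # A pre-1970 date stays in its own century.
    print(gps_utc_stamps("1959:07:17 15:15:15+01:00"))
```

The first sample gives the date 2019:07:16 with time 23:15:15, matching the rollover described above; the second gives 1959:07:17 with time 14:15:15.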


mrXif-geoTagger

Thanks again, StarGeek. I'll read this carefully. Meanwhile I ended up with the following (and was satisfied with the outcome. My front end currently warns that I should geotag PRIOR to setting any dates*...):
exiftool
  -charset cp1252
  -allDates="2015:05:15 15:15:15 +01:00"
  -GPSDateStamp="2015:05:15 15:15:15 +01:00"
  -GPSTimeStamp="2015:05:15 15:15:15 +01:00"
  -XMP:PreservedFileName"<$Filename"
  -MWG:Creator="DaVinci"
  -MWG:Description="The Vitruvian MÆN"
  "iphone8s.JPG"


(*: Copying the geotag from other files also copies the GPS dates...). I'm running this command:
exiftool -tagsFromFile "C:\fileWgeotag.jpg" -gps:all "C:\fileWOgeotag.jpg"

Could I overwrite the date from fileWgeotag.jpg with another date in one go?