Request for Input on Another Google Photos "Takeout" Instance

Started by elopsentorml, November 28, 2019, 06:53:38 PM

elopsentorml

Sorry for another of these Google Photos "Takeout" posts, but I've been going over everything I can find here and was hoping someone could confirm that the ExifTool commands I'm going to use look reasonable.  As a note, right now, all my photo "Takeout" jpg/json files are under "D:\GooglePhotos" on my Windows 10 machine.

First, while doing some experimentation with these, I saw many (426) problems with tags that shouldn't be in my jpg files (all I have are jpgs). I ran a command I found here to find those bad files (sorry, I didn't keep track of the originator):

exiftool  -r  -Exif:StripOffsets -Exif:Compression -Exif:RowsPerStrip -Exif:StripByteCounts -Exif:Software -if '$EXIF:StripOffsets or $Exif:RowsPerStrip or $Exif:StripByteCounts' -ext jpg  -ext jpeg -ext tif -ext bmp -ext jpg_original -ext jpeg_original -ext tif_original -ext bmp_original "D:\GooglePhotos"

EDIT:  Corrected wrong apostrophes and added other file types, above.

As I mentioned, that gives me a lot of problem files spread all over the place.  If I run the command found here:

https://exiftool.org/faq.html#Q20

to fix "unsafe" data against the whole "D:\GooglePhotos" directory:

exiftool -r -all= -tagsfromfile @ -all:all -unsafe -icc_profile -ext jpg  -ext jpeg -ext tif -ext bmp -ext jpg_original -ext jpeg_original -ext tif_original -ext bmp_original "D:\GooglePhotos"

EDIT:  Added -r (recurse flag) and -ext stuff (to restrict to just photo files) to above command.  BTW, trying to run this command in Powershell results in an error at the "@".  Maybe that's fixable, but I just made sure to run it in a standard cmd shell.

will that strip out stuff that Google added (like people/place tags) or just tags that ExifTool knows don't belong in jpgs?  Also, does that command recurse through all the sub-directories?  Is there a better way to do this?  Perhaps some kind of piping?

Second, once the bad tags are gone, I need to merge the json data into the corresponding jpgs.  Looking at what's in one of those jsons, I get:

D:\GooglePhotos\Purchase Stuff> exiftool -s img_20191023_114419.jpg.json
ExifToolVersion                 : 11.72
FileName                        : img_20191023_114419.jpg.json
Directory                       : .
FileSize                        : 871 bytes
FileModifyDate                  : 2019:10:31 15:44:12-07:00
FileAccessDate                  : 2019:11:28 09:21:30-08:00
FileCreateDate                  : 2019:11:28 09:21:30-08:00
FilePermissions                 : rw-rw-rw-
FileType                        : JSON
FileTypeExtension               : json
MIMEType                        : application/json
CreationTimeFormatted           : Oct 25, 2019, 1:59:20 AM UTC
CreationTimeTimestamp           : 1571968760
Description                     : [redacted]
GeoDataAltitude                 : [redacted]
GeoDataLatitude                 : [redacted]
GeoDataLatitudeSpan             : 0.0
GeoDataLongitude                : [redacted]
GeoDataLongitudeSpan            : 0.0
GeoDataExifAltitude             : [redacted]
GeoDataExifLatitude             : [redacted]
GeoDataExifLatitudeSpan         : 0.0
GeoDataExifLongitude            : [redacted]
GeoDataExifLongitudeSpan        : 0.0
ImageViews                      : 0
ModificationTimeFormatted       : Oct 31, 2019, 9:44:13 PM UTC
ModificationTimeTimestamp       : 1572558253
PhotoTakenTimeFormatted         : Oct 23, 2019, 6:44:19 PM UTC
PhotoTakenTimeTimestamp         : 1571856259
Title                           : IMG_20191023_114419.jpg


And, looking here:

https://stackoverflow.com/questions/42024255/bulk-join-json-with-jpg-from-google-takeout

it looks like I'll need to merge in the Description and GPS stuff from the json files.  From other comments on this forum, I think I need to merge in the photo taken time.  Assuming I can do all of that at once, this is what I've come up with (once with a %F for json files with jpg extensions and once with a %f for those without):

exiftool -r -wm cg -tagsfromfile "%d/%F.json" "-Caption-Abstract<Description" "-ImageDescription<Description" -Description "-GPSAltitude<GeoDataAltitude" "-GPSLatitude<GeoDataLatitude" "-GPSLatitudeRef<GeoDataLatitude" "-GPSLongitude<GeoDataLongitude" "-GPSLongitudeRef<GeoDataLongitude" -ext jpg -ext jpeg -ext tif -ext bmp -ext jpg_original -ext jpeg_original -ext tif_original -ext bmp_original -overwrite_original "D:\GooglePhotos"

EDIT:  Removed "-DateTimeOriginal<PhotoTakenTimeTimestamp" from the above command, added -ext stuff for other file types (not sure how those ???_original ones will work), and added -wm cg (create new tags but don't edit existing ones).

The above seems to have done something (and that's where I noticed all the bad file tags).  But, when I ran the %f version for json files without the jpg extension, I got a boatload of "Error Opening File" messages against various json files.  Right now, I'm not too worried about those because it's possible my earlier activities messed something up.

Anyway, the final thing I need to do is re-organize these files and eliminate duplicates (since, apparently, when I added things like people/places tags in Google Photos it decided to give me duplicate photos in each tag-based folder).  I'm having trouble deciding on an organizational structure, but I guess I'll just go with putting all the photos in their Year taken folder.  From things I've found around here, that looks like:

exiftool -r -d %Y/%%f%%-c.%%e "-FileName<DateTimeOriginal" -ext jpg  -ext jpeg -ext tif -ext bmp -ext jpg_original -ext jpeg_original -ext tif_original -ext bmp_original "D:\GooglePhotos"

But, I'm not sure what that structure will look like.  A Year folder with all the files named after when they were taken with any duplicates including a suffix before the extension?  If so, hopefully each jpg will contain all people/place tags and I'll be able to manually delete all but one of any duplicates.  Unless there's a better way to do that?

Again, I'm sorry for bringing up another "Takeout" set of problems.  But, I'd really appreciate any input.  Thanks.
Windows Executable version 11.77 of ExifTool on Windows 10 Pro x64

StarGeek

Quote from: elopsentorml on November 28, 2019, 06:53:38 PM
As I mentioned, that gives me 426 problem files spread all over the place.  If I run the command found here:

https://exiftool.org/faq.html#Q20

to fix "unsafe" data against the whole "D:\GooglePhotos" directory:

exiftool -all= -tagsfromfile @ -all:all -unsafe -icc_profile "D:\GooglePhotos"

will that strip out stuff that Google added (like people/place tags) or just tags that ExifTool knows don't belong in jpgs?

It's been a while since I've tested Google takeout, but at that time Google did not alter the metadata of the files. Anything extra will only appear in the JSON file.  That command will keep all tags that exiftool has definitions for. Unless you created your own tags and added them to the files to begin with, it's most likely that nothing will be lost except possibly data that was corrupted in some way.

QuoteAlso, does that command recurse through all the sub-directories?

Add the -r (recurse) option to recurse into subdirectories.  You might also want to include the -ext (extension) option (e.g. -ext jpg) to prevent exiftool from trying to process the json files.

QuoteAnd, looking here:

https://stackoverflow.com/questions/42024255/bulk-join-json-with-jpg-from-google-takeout

Hmmm... coming up on 2 years since I posted that.  I probably should double check it.

Quoteit looks like I'll need to merge in the Description and GPS stuff from the json files.  From other comments on this forum, I think I need to merge in the photo taken time.

You will only need to merge if the data has been changed on the Google website.  Things like the time stamp of when the photo was taken probably don't need to be changed.  Also, something I failed to mention in that answer: you can't directly copy from "PhotoTakenTimeTimestamp" into an image tag such as DateTimeOriginal.  See FAQ #5 for details on how exiftool needs the time stamp to be formatted.  To translate it to something exiftool can copy, the format needs to be set with the -d (dateFormat) option (see Common Date Format Codes for the list of formatting codes).
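
To illustrate the formatting issue outside of exiftool, here is a minimal Python sketch (not a command from this thread).  The JSON stores the taken time as seconds since the Unix epoch, while EXIF date tags use the "YYYY:MM:DD HH:MM:SS" layout.  The function name to_exif_datetime is hypothetical, and the result is kept in UTC as an assumption; a real workflow may want local time instead.

```python
from datetime import datetime, timezone


def to_exif_datetime(epoch_seconds):
    """Format a Unix timestamp as an EXIF-style date/time string (UTC)."""
    moment = datetime.fromtimestamp(int(epoch_seconds), tz=timezone.utc)
    return moment.strftime("%Y:%m:%d %H:%M:%S")


# PhotoTakenTimeTimestamp from the JSON output quoted earlier in the thread
print(to_exif_datetime(1571856259))  # 2019:10:23 18:44:19
```

The printed value matches the "PhotoTakenTimeFormatted" line in that same JSON output (Oct 23, 2019, 6:44:19 PM UTC).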

Quote(once with a %F for json files with jpg extensions and once with a %f for those without):
...
But, when I ran the %f version for json files without the jpg extension, I got a boatload of "Error Opening File" messages against various json files. 

That probably isn't something to worry about.  It just means that most of your files don't have a json file named without the ".jpg".
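
The two naming patterns can be sketched in Python as follows.  This is only an illustration of the lookup, not how exiftool does it internally; find_sidecar is a hypothetical name, and the sketch assumes the json file sits in the same folder as the photo.

```python
from pathlib import Path


def find_sidecar(image_path):
    """Return the JSON sidecar for an image, or None if neither naming exists."""
    image = Path(image_path)
    candidates = [
        image.with_name(image.name + ".json"),  # like %F.json: photo.jpg.json
        image.with_name(image.stem + ".json"),  # like %f.json: photo.json
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

A photo for which this returns None under one pattern corresponds to one of the "Error Opening File" messages: the json simply isn't there under that name.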

Quoteexiftool -r -tagsfromfile "%d/%F.json" "-DateTimeOriginal<PhotoTakenTimeTimestamp" "-Caption-Abstract<Description" "-ImageDescription<Description" -Description "-GPSAltitude<GeoDataAltitude" "-GPSLatitude<GeoDataLatitude" "-GPSLatitudeRef<GeoDataLatitude" "-GPSLongitude<GeoDataLongitude" "-GPSLongitudeRef<GeoDataLongitude" -ext jpg -overwrite_original "D:\GooglePhotos"

Except for the previously mentioned time stamp problem, that probably should be ok.

QuoteFrom things I've found around here, that looks like:

exiftool -r -d %Y/%%f%%-c.%%e "-FileName<DateTimeOriginal" -ext jpg "D:\GooglePhotos"

But, I'm not sure what that structure will look like.  A Year folder with all the files named after when they were taken with any duplicates including a suffix before the extension?  If so, hopefully each jpg will contain all people/place tags and I'll be able to manually delete all but one of any duplicates.

That command will separate the files by year (the %Y, see the above link for date formatting codes).  It keeps the original file name and extension (%f and %e, with the % signs doubled because it's in a -d formatting string), but will add a counter number if it encounters a duplicate file name.  See -w (textout) option for more details on the %f, %c, and %e codes.
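
As a rough picture of the resulting layout, here is a Python sketch of the naming logic.  It is an approximation under stated assumptions: the function and parameter names are hypothetical, an in-memory set stands in for the check against files already on disk, and the counter is assumed to be omitted for the first file and to start at 1 for the first clash.

```python
from pathlib import Path


def destination(dest_root, year, original_name, taken_names):
    """Build <dest_root>/<year>/<name>[-N].<ext>, adding -N only on a name clash."""
    source = Path(original_name)
    folder = Path(dest_root) / str(year)
    candidate = folder / source.name
    counter = 0
    # Keep bumping the counter until the name is free (assumed numbering: -1, -2, ...)
    while candidate in taken_names:
        counter += 1
        candidate = folder / f"{source.stem}-{counter}{source.suffix}"
    taken_names.add(candidate)
    return candidate


taken = set()
print(destination("Result", 2019, "IMG_0001.jpg", taken))
print(destination("Result", 2019, "IMG_0001.jpg", taken))
```

The second call returns a name with a suffix before the extension, which is the kind of duplicate marker the question asks about.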

Quotehopefully each jpg will contain all people/place tags and I'll be able to manually delete all but one of any duplicates

That looks like it might be a problem.  None of that data appears in the JSON output you listed and, as I mentioned, Google didn't change any data in the original file at the time I tested.

QuoteAgain, I'm sorry for bringing up another "Takeout" set of problems.  But, I'd really appreciate any input.  Thanks.

Never a problem.  It gives me a reminder to double check things to see if anything has changed.  Plus
* Did you read FAQ #3 and use the command listed there?
* Please use the Code button for exiftool code/output.
 
* Please include your OS, Exiftool version, and type of file you're processing (MP4, JPG, etc).

elopsentorml

Thanks for the reply.  I'll make those changes you mentioned and see what happens.  Thankfully, if I break anything, I can just start over with the original data Google sent me.

elopsentorml

Digging around in these Takeout files and sampling some more of them, it seems like the json files with GPS data are associated with jpgs that have it.  I don't know what I was looking at where that didn't seem to be the case.  So, it might be that the only thing I need to merge from the json to the jpg is the Description.  Also, it looks like the jpgs that have people/place tags are old enough that I must have added them there with earlier software (and there's nothing in the jsons).  So, I don't know if I'll be able to recover those.  Well, I suppose I could somehow parse the folder names (which are people/place tags) and write them into the jpgs someplace.  This is getting to be a bigger mess every time I look at it.

Nice of Google to give us our data back all in pieces with no way (or only a very convoluted way) to get it all back together.  I don't know how Joe-Bag-of-Donuts is supposed to do this.

StarGeek

One additional suggestion would be to add -wm cg to your command (see -wm (writemode) option).  This will tell exiftool to write only new information and not to overwrite any data already in the file.

elopsentorml

Aargh.  The forum timed me out while I was responding, so I lost everything.  I'll try to recreate it:

1.  I've edited my original post with your suggestions.  Hopefully, this thread will help future Google Photos refugees.
2.  After running through those commands, it looks like they worked (at a cursory level, so far).
3.  Since most of the files have been moved out of the original Takeout locations, I now see that I've got some .jpeg files left behind (I thought they were all .jpg files).  I'll modify the commands and go back to handle those.  EDIT:  and .tif and .bmp.
4.  There are also some ".jpg_original" files left behind.  I'm guessing those were the originals of files I edited on Google Photos and the edited versions are what got moved.  I'll have to decide what to do with those.  EDIT:  I added -ext stuff for the ???_original types in all my commands, above.  I'm not sure if those will work with the merge one.  I'm pretty sure it will in the %f version.
5.  The files got moved into their Year folders and the duplicates are there with their suffixes.  I still don't know how I'm going to get the people/places tags into those (since they're only reflected in the names of the Takeout folders they came from).  It'd be a shame to lose all the work I put into tagging those photos in Google Photos.  I'll have to think about that some more.
6.  I also still have to go through and delete all the duplicates in those folders.  But, this whole process is looking good.
7.  EDIT:  Several of the old Takeout folders have photo files left in them.  I'm going to have to examine them to figure out what happened.

Again, thanks for the help.

StarGeek

Quote from: elopsentorml on November 28, 2019, 11:36:36 PM
4.  There are also some ".jpg_original" files left behind.  I'm guessing those were the originals of files I edited on Google Photos and the edited versions are what got moved.  I'll have to decide what to do with those.

Those are the backup files exiftool creates when it processes files.  From the docs:
   By default the original files are preserved with _original appended to their names -- be sure to verify that the new files are OK before erasing the originals.

Quote5.  The files got moved into their Year folders and the duplicates are there with their suffixes.  I still don't know how I'm going to get the people/places tags into those (since they're only reflected in the names of the Takeout folders they came from).  It'd be a shame to lose all the work I put into tagging those photos in Google Photos.  I'll have to think about that some more.

It might be possible to copy part of the directory names into the files.  Though it might be difficult once they've been moved. There is the Directory tag which can be altered using the Advanced formatting feature in order to extract the needed parts of the path. 

elopsentorml

I've got -overwrite_original on the "merge" command, but forgot to add it to the others.  I assume it's valid for all the commands.  I'm fine with exiftool overwriting the originals in this case because I've got all the original data from Takeout stored away someplace else.  I'll change the commands and try the whole process over.  Thanks.

The Directory tag sounds promising.  But, I think the big issue would be having to search for duplicates of each file and add the directory information as a tag to one "original" version for each of the duplicates.
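
One possible way to approach that search, sketched in Python rather than exiftool: hash every photo and collect the names of the folders each distinct content appears in.  This is only a sketch under assumptions.  It treats duplicates as byte-identical copies (which may no longer hold once metadata has been rewritten, so it would have to run on the untouched Takeout data), it takes the immediate parent folder name as the people/place tag, and folder_tags_by_content is a hypothetical name.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def folder_tags_by_content(root, extensions=(".jpg", ".jpeg", ".tif", ".bmp")):
    """Map each distinct file content (SHA-256) to the folder names it appears in."""
    tags = defaultdict(set)
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in extensions:
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            # The parent folder name stands in for the people/place tag (assumption)
            tags[digest].add(path.parent.name)
    return dict(tags)
```

Each entry with more than one folder name is a set of duplicates, and the folder names are the candidate tags to write into whichever copy is kept.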

EDIT:  Do we lose the ability to Modify posts after they're a day old?  I was keeping my OP up-to-date, but now I no longer have a "Modify" button on that one.

Phil Harvey

The maximum time for allowing edits of posts in this forum is configurable, and is currently set to 12 hours.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

elopsentorml

OK.  I'll add my current set of commands here, then.  Note that 1) everything is under "D:\GooglePhotos", 2) so far it looks like I have .jpg, .jpeg, .tif and .bmp files, and 3) I have the original data stored elsewhere so I don't care if files are overwritten:

  • Find bad tags:
exiftool -r -Exif:StripOffsets -Exif:Compression -Exif:RowsPerStrip -Exif:StripByteCounts -Exif:Software -if '$EXIF:StripOffsets or $Exif:RowsPerStrip or $Exif:StripByteCounts' -ext jpg  -ext jpeg -ext tif -ext bmp -overwrite_original "D:\GooglePhotos"
  • Fix bad tags:
exiftool -r -all= -tagsfromfile @ -all:all -unsafe -icc_profile -ext jpg  -ext jpeg -ext tif -ext bmp -overwrite_original "D:\GooglePhotos"
  • Merge .json file (with photo extension substring) information with photo files:
exiftool -r -wm cg -tagsfromfile "%d/%F.json" "-Caption-Abstract<Description" "-ImageDescription<Description" -Description "-GPSAltitude<GeoDataAltitude" "-GPSLatitude<GeoDataLatitude" "-GPSLatitudeRef<GeoDataLatitude" "-GPSLongitude<GeoDataLongitude" "-GPSLongitudeRef<GeoDataLongitude" -ext jpg -ext jpeg -ext tif -ext bmp -overwrite_original "D:\GooglePhotos"
  • Merge .json file (without photo extension substring) information with photo files:
exiftool -r -wm cg -tagsfromfile "%d/%f.json" "-Caption-Abstract<Description" "-ImageDescription<Description" -Description "-GPSAltitude<GeoDataAltitude" "-GPSLatitude<GeoDataLatitude" "-GPSLatitudeRef<GeoDataLatitude" "-GPSLongitude<GeoDataLongitude" "-GPSLongitudeRef<GeoDataLongitude" -ext jpg -ext jpeg -ext tif -ext bmp -overwrite_original "D:\GooglePhotos"
  • Move photo files to Year-based folders:
exiftool -r -d "D:\Result\%Y/%%f%%-c.%%e" "-FileName<DateTimeOriginal" -ext jpg -ext jpeg -ext tif -ext bmp -overwrite_original "D:\GooglePhotos"

elopsentorml

Well, I expected some photos to not get moved over to their "final" positions.  But, after the:

exiftool -r -d "D:\Result\%Y/%%f%%-c.%%e" "-FileName<DateTimeOriginal" -ext jpg -ext jpeg -ext tif -ext bmp -overwrite_original "D:\GooglePhotos"

to move everything under their Year folder (out of the way), I ended up with "1304 image files unchanged."  Here's one of them:

ExifTool Version Number         : 11.77
File Name                       : Cats_Black_Black_511583_1440x2560.jpg
Directory                       : D:/GooglePhotos/Wallpaper
File Size                       : 868 kB
File Modification Date/Time     : 2019:11:29 08:04:28-08:00
File Access Date/Time           : 2019:11:29 08:04:28-08:00
File Creation Date/Time         : 2019:11:29 07:39:35-08:00
File Permissions                : rw-rw-rw-
File Type                       : JPEG
File Type Extension             : jpg
MIME Type                       : image/jpeg
JFIF Version                    : 1.02
Exif Byte Order                 : Big-endian (Motorola, MM)
Image Description               :
X Resolution                    : 300
Y Resolution                    : 300
Resolution Unit                 : inches
Y Cb Cr Positioning             : Centered
GPS Version ID                  : 2.3.0.0
GPS Latitude Ref                : North
GPS Longitude Ref               : East
GPS Altitude                    : 0 m
Current IPTC Digest             : 14636aa63c1f98be7f3c11fad46b0219
Caption-Abstract                :
Application Record Version      : 4
XMP Toolkit                     : Image::ExifTool 11.77
Description                     :
Image Width                     : 1440
Image Height                    : 2560
Encoding Process                : Baseline DCT, Huffman coding
Bits Per Sample                 : 8
Color Components                : 3
Y Cb Cr Sub Sampling            : YCbCr4:4:4 (1 1)
Image Size                      : 1440x2560
Megapixels                      : 3.7
GPS Latitude                    : 0 deg 0' 0.00" N
GPS Longitude                   : 0 deg 0' 0.00" E
GPS Position                    : 0 deg 0' 0.00" N, 0 deg 0' 0.00" E


There's a File Creation Date/Time that looks correct, but no Date/Time Original.  I assume that's the problem.  I also assume I could add something like:

"-FileName<FileCreationDateTime"

in to the reorganizing command (probably before the "-FileName<DateTimeOriginal" option) and that would handle it.  I'll give it a try and see.
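
The fallback idea itself (use the taken date when present, otherwise a file system date) can be sketched in Python like this.  It only illustrates the order of preference and is not an exiftool command; pick_year is a hypothetical name, the tags are assumed to be a plain dictionary of tag name to value, and FileCreateDate is the tag name as it appears in the -s listing earlier in the thread.

```python
def pick_year(tags, preferred=("DateTimeOriginal", "FileCreateDate")):
    """Return the year from the first date tag that is present, or None."""
    for name in preferred:
        value = tags.get(name)
        if value:
            # Date values start with the four-digit year, e.g. "2019:11:29 07:39:35"
            return value[:4]
    return None
```

A file with neither tag returns None here, which corresponds to a photo that would stay unmoved.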

Phil Harvey

Quote from: elopsentorml on November 29, 2019, 12:24:12 PM
There's a File Creation Date/Time that looks correct, but no Date/Time Original.  I assume that's the problem.  I also assume I could add something like:

"-FileName<CreationDateTime"

Add the -s option when extracting to see the actual tag names.  "CreationDateTime" is not a valid tag name.

- Phil

elopsentorml

FileCreateDate seems to have caught most of them (I have to confirm).  Still have 107 left, but I'm getting there.  I'll check out the -s option, too.

EDIT:  Oops.  I've got to slow down with these edits.  Yes, the FileCreateDate seems to have worked (I've got to look more in depth, though).  But, it looks like one of my earlier commands set the create date to today.  I'll have to research some more.

Phil Harvey

If you actually modify the file with exiftool, then the filesystem date/times will get set to the time you modified the file unless you use the -P option when writing.

- Phil

elopsentorml

I'm not sure which of my activities changed those times, but they're definitely not the right field for me to use.  It looks like I'll have to go back to one of my earlier ideas and merge the Photo Taken Times in with something like "-DateTimeOriginal<PhotoTakenTimeTimestamp" -d %s.  So:

exiftool -r -wm cg -tagsfromfile "%d/%F.json" "-Caption-Abstract<Description" "-ImageDescription<Description" -Description "-DateTimeOriginal<PhotoTakenTimeTimestamp" -d %s" "-GPSAltitude<GeoDataAltitude" "-GPSLatitude<GeoDataLatitude" "-GPSLatitudeRef<GeoDataLatitude" "-GPSLongitude<GeoDataLongitude" "-GPSLongitudeRef<GeoDataLongitude" -ext jpg -ext jpeg -ext tif -ext bmp -overwrite_original "D:\GooglePhotos"

Does that look reasonable?

EDIT:  Nope.  The command I listed, above, fails with:  "The filename, directory name, or volume label syntax is incorrect."  But, if I remove the "-DateTimeOriginal..." stuff, that command runs and works.  If I run a separate command before that just to handle the DateTimeOriginal:

exiftool -r -tagsfromfile "%d/%f.json" "-DateTimeOriginal<PhotoTakenTimeTimestamp" -d %s -ext jpg -ext jpeg -ext tif -ext bmp -overwrite_original "D:\GooglePhotos"

it runs fine, but apparently does nothing.  Still trying to figure that out.