Best approach for metadata migration

Started by ea1987, April 03, 2020, 02:48:58 PM

ea1987

Hi guys,
I'm a newbie here, but pretty much a metadata enthusiast :)
I have a bunch of pictures (almost 50k) which I have tagged correctly until now using WLPG (Windows Live Photo Gallery), inserting the following information:
- people tags (MP tags)
- CountryName, ProvinceState and City (IPTC tags)

Now that I'm moving to digiKam, I would like to use the most widely used standard for people tags (XMP), and I would also like to add GPS location based on the IPTC City tag.
As I would like to avoid manual operations, I'm thinking of writing a quick application (Java or bash) that will do this for me automatically. I think these are the steps to follow:

1. Convert people tags: exiftool -config ExifTool_config_convert_regions "-regioninfo<myregion" -r <jpeg_path>
2. Export City tags to a CSV file. Q: is there a proper exiftool function for doing this?
3. Read the exported CSV and, for each unique value, find the related latitude and longitude (using an API on the internet, any suggestions? :) ). Q: do you have any suggestions for an already existing tool for doing this?
4. Merge latitude/longitude results with the existing cities tag and create the final csv file which will be used to import tags using proper exiftool csv command option.
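
Steps 2 to 4 could be sketched roughly as follows in Python. This is only a sketch under assumptions: the export is assumed to have a SourceFile column (as an `exiftool -csv` export does), the city column is called `City` here purely for illustration, and `lookup` is a hypothetical stand-in for whatever geocoding service ends up being used (here just a hand-filled dictionary with approximate coordinates).

```python
import csv
import io

def unique_cities(rows, city_column="City"):
    """Return the distinct, non-empty city names in first-seen order."""
    seen = []
    for row in rows:
        city = (row.get(city_column) or "").strip()
        if city and city not in seen:
            seen.append(city)
    return seen

def merge_gps(rows, coords, city_column="City"):
    """Attach GPS columns to each row whose city has known coordinates.

    coords maps a city name to a (latitude, longitude) pair in signed
    decimal degrees. Rows without a match are skipped.
    """
    merged = []
    for row in rows:
        city = (row.get(city_column) or "").strip()
        if city not in coords:
            continue
        lat, lon = coords[city]
        merged.append({
            "SourceFile": row["SourceFile"],
            "GPSLatitude": abs(lat),
            "GPSLatitudeRef": "N" if lat >= 0 else "S",
            "GPSLongitude": abs(lon),
            "GPSLongitudeRef": "E" if lon >= 0 else "W",
        })
    return merged

def to_csv(rows):
    """Serialise merged rows to CSV text with SourceFile as first column."""
    fields = ["SourceFile", "GPSLatitude", "GPSLatitudeRef",
              "GPSLongitude", "GPSLongitudeRef"]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

# Example with made-up file names and a hand-filled lookup table.
exported = "SourceFile,City\na.jpg,Parigi\nb.jpg,Parigi\nc.jpg,\n"
rows = list(csv.DictReader(io.StringIO(exported)))
lookup = {"Parigi": (48.8566, 2.3522)}  # hypothetical geocoding result
print(unique_cities(rows))
print(to_csv(merge_gps(rows, lookup)))
```

A file produced this way could then be passed back to exiftool with its `-csv=CSVFILE` import option; whether the column names above suit a particular workflow is something to verify on a few test images first.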

Ok, I think this will take me at least half a day. Of course, any suggestions would be much appreciated, and the final code will be shared on GitHub :)

Thank you guys!

Andrea

Phil Harvey

Hi Andrea,

Quote from: ea1987 on April 03, 2020, 02:48:58 PM
... pretty much a metadata enthusiast

Welcome to the club. ;)

Quote
1. Convert people tags: exiftool -config ExifTool_config_convert_regions "-regioninfo<myregion" -r <jpeg_path>

It sounds like you have done some research here.  However, the file included in the ExifTool distribution is called "convert_regions.config".

Quote
2. Export City tags to a CSV file. Q: is there a proper exiftool function for doing this?
3. Read the exported CSV and, for each unique value, find the related latitude and longitude (using an API on the internet, any suggestions? :) ). Q: do you have any suggestions for an already existing tool for doing this?
4. Merge latitude/longitude results with the existing cities tag and create the final csv file which will be used to import tags using proper exiftool csv command option.

Why go through a CSV file?  This could all be done using a dedicated ExifTool config file and a user-defined Composite tag to convert Country/City names to a GPS position via the Google Maps API.  ExifTool would probably have been distributed with a config file to do something just like this, except that the Google Maps API is no longer freely accessible.  Google now charges for using it beyond some limited number of accesses, which effectively means that every user must set up their own Google Maps account, because it is expensive to maintain a central account for all users of an app.
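
To give an idea of what such a lookup request looks like, here is a minimal Python sketch (not the Perl config file itself) that only builds the request URL for a City/Country pair. The endpoint and the `address` and `key` parameters are those of Google's Geocoding web service as commonly documented; `YOUR_API_KEY` is a placeholder, and the function name is made up for illustration.

```python
from urllib.parse import urlencode

GEOCODE_ENDPOINT = "https://maps.googleapis.com/maps/api/geocode/json"

def geocode_url(city, country, api_key):
    """Build a Geocoding API request URL for a city/country pair."""
    query = urlencode({"address": f"{city}, {country}", "key": api_key})
    return f"{GEOCODE_ENDPOINT}?{query}"

# Placeholder key; every user would need their own.
print(geocode_url("Parigi", "Francia", "YOUR_API_KEY"))
```

Nothing is sent over the network here; fetching the URL and decoding the JSON reply would be separate steps.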

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

ea1987

Hi Phil,
thank you for the quick answer :)

Uhm, the approach you suggest (a custom user-defined config file) sounds interesting to me.
Is there a main repository of custom config files? I looked on the main website but couldn't find one. The only thread I found similar to my topic is this one: https://exiftool.org/forum/index.php/topic,9862.0.html
So, according to what you suggested, a custom config file could basically call a specific API (e.g. Google Maps), parse the output, and insert the results into GPS tags?
I didn't think exiftool could be so powerful. Or maybe I'm misunderstanding?
Thanks!

StarGeek

Quote from: ea1987 on April 03, 2020, 05:11:43 PM
the only thread I found similar to my topic is this one https://exiftool.org/forum/index.php/topic,9862.0.html

All that config file does is grab the GPS coordinates embedded in the file and create a URL that can be saved or passed to a browser to see the position on some of the various map sites.  It doesn't actually use the API of any website.

For example,
C:\>exiftool -config GPS2MapUrl.config -g1 -s -gpsL* -*Url* y:\!temp\Test4.jpg
---- Composite ----
GPSLatitude                     : 40 deg 41' 21.12" N
GPSLongitude                    : 74 deg 2' 40.20" W
BingMapsUrl                     : https://bing.com/maps/default.aspx?cp=40.6892~-74.0445&sp=point.40.6892_-74.0445_.
GoogleMapsUrl                   : https://www.google.com/maps/search/?q=40.6892,-74.0445
MapquestMapsUrl                 : https://www.mapquest.com/?q=40.6892,-74.0445
OpenStreetMapsUrl               : https://www.openstreetmap.org/?mlat=40.6892&mlon=-74.0445
YandexMapsUrl                   : https://yandex.com/maps/?ll=-74.0445%2C40.6892&text=40.6892%2C-74.0445
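
For comparison, the same idea can be sketched outside of ExifTool in a few lines of Python. This is not the GPS2MapUrl.config code, just an illustration with made-up function names: it converts the printed coordinates to signed decimal degrees and formats three of the URLs shown above.

```python
import re

def dms_to_decimal(text):
    """Convert a string like 40 deg 41' 21.12" N to signed decimal degrees."""
    match = re.fullmatch(
        r"\s*(\d+) deg (\d+)' ([\d.]+)\" ([NSEW])\s*", text)
    if match is None:
        raise ValueError(f"unrecognised coordinate: {text!r}")
    degrees, minutes, seconds, ref = match.groups()
    value = int(degrees) + int(minutes) / 60 + float(seconds) / 3600
    # South and West are negative in signed decimal notation.
    return -value if ref in "SW" else value

def map_urls(lat, lon):
    """Build map links in the same shape as the URLs shown above."""
    pos = f"{lat:.4f},{lon:.4f}"
    return {
        "GoogleMapsUrl": f"https://www.google.com/maps/search/?q={pos}",
        "MapquestMapsUrl": f"https://www.mapquest.com/?q={pos}",
        "OpenStreetMapsUrl": ("https://www.openstreetmap.org/"
                              f"?mlat={lat:.4f}&mlon={lon:.4f}"),
    }

lat = dms_to_decimal("40 deg 41' 21.12\" N")
lon = dms_to_decimal("74 deg 2' 40.20\" W")
for name, url in map_urls(lat, lon).items():
    print(name, url)
```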


"It didn't work" isn't helpful. What was the exact command used and the output.
Read FAQ #3 and use that cmd
Please use the Code button for exiftool output

Please include your OS/Exiftool version/filetype

ea1987

Right. So: can a custom config file instruct exiftool to use a tag (e.g. City) to retrieve GPS information from an external API and write it to the proper tags? I mean, if exiftool can do this with custom external code, that would be great! Otherwise I will write some external code to add this functionality; that's not a problem.
Thanks

Phil Harvey

StarGeek has posted an example config file that accesses the Google Maps API here.  That config file does something different, but I believe that API can be used to look up lat/lon based on the City name, etc.

- Phil

ea1987

Thank you Phil.
I'm trying a different approach: a custom bash script that uses exiftool as the main tool for extracting and importing tags.
By the way, I'm stuck on exporting the Iptc4xmpExt:City tag. I checked https://exiftool.org/TagNames.pdf and some examples on the forum, but there seems to be something wrong with my command (even though it works correctly when I try it with other common tags, like WhiteBalance0 for example). I tried the following combinations, but none of them works:


./exiftool -CountryName -csv -r ...
./exiftool -XMP-iptcExt:CountryName -csv -r ...
./exiftool -IPTC:CountryName -csv -r ...
./exiftool -xmp-Iptc4xmpExt:CountryName -csv -r ...
./exiftool -iptcExt:CountryName -csv -r ...


This is the node which contains the needed data :)

<rdf:Description xmlns:Iptc4xmpExt="http://iptc.org/std/Iptc4xmpExt/2008-02-29/">
<Iptc4xmpExt:LocationCreated>
<rdf:Bag xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:li>
<rdf:Description xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<Iptc4xmpExt:CountryName xmlns:Iptc4xmpExt="http://iptc.org/std/Iptc4xmpExt/2008-02-29/">Francia</Iptc4xmpExt:CountryName>
<Iptc4xmpExt:ProvinceState xmlns:Iptc4xmpExt="http://iptc.org/std/Iptc4xmpExt/2008-02-29/">IdF</Iptc4xmpExt:ProvinceState>
<Iptc4xmpExt:City xmlns:Iptc4xmpExt="http://iptc.org/std/Iptc4xmpExt/2008-02-29/">Parigi</Iptc4xmpExt:City>
</rdf:Description>
</rdf:li>
</rdf:Bag>
</Iptc4xmpExt:LocationCreated>
</rdf:Description>


Could you please help me with this?
Thanks in advance

Andrea

StarGeek

Technically, that one uses the Google Maps Time zone api.  There's a separate API for Geocoding.

With the time zone api, I was able to use exiftool's json decoder because only four data points were returned and each ended up with a unique tag name.  Unfortunately, that doesn't work for the geocoding api because exiftool's json decoder duplicates entries.  For example, using the gps coordinates for the Statue of Liberty (40.6892, -74.0445), Google returns this json (on Pastebin since it is 453 lines long, too big to post here without being overwhelming).

The results from running exiftool on that json look like this:
---- JSON ----
ResultsAddress_componentsLong_name: Unnamed Road, Manhattan, New York, New York County, New York, United States, 10004, Liberty Island, Manhattan, New York, New York County, New Jersey, United States, 10004, Manhattan, BOWLING GREEN, New York County, New York, United States, Manhattan, New York, New York County, New York, United States, New York County, Manhattan, New York, New York, United States, New York, New York, United States, New York, United States, United States
ResultsAddress_componentsShort_name: Unnamed Road, Manhattan, New York, New York County, NY, US, 10004, Liberty Island, Manhattan, New York, New York County, NJ, US, 10004, Manhattan, BOWLING GREEN, New York County, NY, US, Manhattan, New York, New York County, NY, US, New York County, Manhattan, New York, NY, US, New York, NY, US, NY, US, US
ResultsAddress_componentsTypes  : route, political, sublocality, sublocality_level_1, locality, political, administrative_area_level_2, political, administrative_area_level_1, political, country, political, postal_code, neighborhood, political, political, sublocality, sublocality_level_1, locality, political, administrative_area_level_2, political, administrative_area_level_1, political, country, political, postal_code, political, sublocality, sublocality_level_1, locality, political, administrative_area_level_2, political, administrative_area_level_1, political, country, political, political, sublocality, sublocality_level_1, locality, political, administrative_area_level_2, political, administrative_area_level_1, political, country, political, administrative_area_level_2, political, political, sublocality, sublocality_level_1, locality, political, administrative_area_level_1, political, country, political, locality, political, administrative_area_level_1, political, country, political, administrative_area_level_1, political, country, political, country, political
ResultsFormatted_address        : Unnamed Road, New York, NY 10004, USA, Liberty Island, New York, NJ, USA, BOWLING GREEN, NY 10004, USA, Manhattan, New York, NY, USA, New York County, New York, NY, USA, New York, NY, USA, New York, USA, United States
ResultsGeometryBoundsNortheastLat: 40.6895609, 40.691185, 40.70684, 40.882214, 40.882214, 40.9175771, 45.015861, 71.5388001
ResultsGeometryBoundsNortheastLng: -74.04412719999999, -74.0435129, -74.00867629999999, -73.907, -73.907, -73.70027209999999, -71.777491, -66.885417
ResultsGeometryBoundsSouthwestLat: 40.6888993, 40.68854210000001, 40.6885304, 40.6803955, 40.6803955, 40.4773991, 40.4773991, 18.7763
ResultsGeometryBoundsSouthwestLng: -74.04499939999999, -74.0472852, -74.04723799999999, -74.047285, -74.047285, -74.25908989999999, -79.7625901, 170.5957
ResultsGeometryLocationLat      : 40.6891815, 40.6900495, 40.7038704, 40.7830603, 40.7830603, 40.7127753, 43.2994285, 37.09024
ResultsGeometryLocationLng      : -74.04413769999999, -74.0450675, -74.0138541, -73.9712488, -73.9712488, -74.0059728, -74.21793260000001, -95.712891
ResultsGeometryLocation_type    : GEOMETRIC_CENTER, APPROXIMATE, APPROXIMATE, APPROXIMATE, APPROXIMATE, APPROXIMATE, APPROXIMATE, APPROXIMATE
ResultsGeometryViewportNortheastLat: 40.69057908029149, 40.69121253029151, 40.70684, 40.882214, 40.882214, 40.9175771, 45.015861, 71.5388001
ResultsGeometryViewportNortheastLng: -74.0432143197085, -74.0435129, -74.00867629999999, -73.907, -73.907, -73.70027209999999, -71.777491, -66.885417
ResultsGeometryViewportSouthwestLat: 40.68788111970849, 40.68851456970851, 40.6885304, 40.6803955, 40.6803955, 40.4773991, 40.4773991, 18.7763
ResultsGeometryViewportSouthwestLng: -74.0459122802915, -74.0472852, -74.04723799999999, -74.047285, -74.047285, -74.25908989999999, -79.7625901, 170.5957
ResultsPlace_id                 : ChIJcetR1ohQwokRWP3w62AvqYo, ChIJRYGi0o5QwokRLsLNYgBkDgg, ChIJBxh0HZBQwokRQHMru52e-ng, ChIJYeZuBI9YwokRjMDs_IEyCwo, ChIJOwE7_GTtwokRFq0uOwLSE9g, ChIJOwg_06VPwokRYv534QaPC8g, ChIJqaUj8fBLzEwRZ5UY3sHGz90, ChIJCzYy5IS16lQRQrfeQ5K5Oxw
ResultsTypes                    : route, neighborhood, political, postal_code, political, sublocality, sublocality_level_1, administrative_area_level_2, political, locality, political, administrative_area_level_1, political, country, political
ResultsPostcode_localities      : BOWLING GREEN, New York
Status                          : OK


So to get City/State/Country, you have to parse the ResultsAddress_componentsTypes and match them with the correct ResultsAddress_componentsLong_name.  Except the number of Types that match is variable: sometimes one, sometimes two, sometimes three.
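
With a JSON decoder that keeps the nesting, the matching becomes straightforward, because each address component carries its own list of types. Here is a minimal Python sketch of that matching (Python rather than Perl, and outside ExifTool). The sample below is a hand-trimmed, illustrative fragment shaped like a geocoding reply, not the real 453-line response, and the function names are made up.

```python
import json

def pick_component(result, wanted_type):
    """Return the long_name of the first component tagged with wanted_type."""
    for component in result.get("address_components", []):
        if wanted_type in component.get("types", []):
            return component.get("long_name")
    return None

def city_state_country(response):
    """Extract City/State/Country from the first result, if any."""
    results = response.get("results", [])
    if not results:
        return None
    first = results[0]
    return {
        "City": pick_component(first, "locality"),
        "State": pick_component(first, "administrative_area_level_1"),
        "Country": pick_component(first, "country"),
    }

# Illustrative fragment only; a real reply has many more fields.
sample = json.loads("""
{
  "status": "OK",
  "results": [
    {
      "address_components": [
        {"long_name": "New York", "short_name": "New York",
         "types": ["locality", "political"]},
        {"long_name": "New York", "short_name": "NY",
         "types": ["administrative_area_level_1", "political"]},
        {"long_name": "United States", "short_name": "US",
         "types": ["country", "political"]}
      ]
    }
  ]
}
""")
print(city_state_country(sample))
```

Since a component can have one, two or three types, the membership test on the `types` list handles the variable count without any index arithmetic.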

So some other json module is needed.  With ActivePerl and Strawberry Perl for Windows, all I had to do was add use JSON; but you said you weren't able to get that to work, Phil.

Since I was able to rewrite the time zone api code to use cURL, I'll probably do the same for Google geocoding lookup.  But it will still require that Perl be installed so that it can access the json module.

That said, Google is really flexible about what you pass to it.  You can give it GPS Coordinates, City/State names, or even location names, such as just "Statue of Liberty".

StarGeek

Country -> XMP:Country
State -> XMP:State
City -> XMP:City
Location -> XMP:Location

Or for maximum compatibility, use the MWG tags, as they will fill all the appropriate tags.
Country -> MWG:Country
State -> MWG:State
City -> MWG:City
Location -> MWG:Location


Oops, never mind.  You want the LocationCreated structured tags.
LocationCreatedCountryName
LocationCreatedProvinceState
LocationCreatedCity
LocationCreatedSublocation

Go to the iptcExt entry on the XMP page and scroll down to LocationCreated.
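
As a rough illustration of how those tag names would be used for a CSV export from a script, here is a small Python sketch that only builds the exiftool argument list. The `-csv` and `-r` options and the XMP-iptcExt group are standard exiftool usage; the directory path and the function name are hypothetical.

```python
import shlex

LOCATION_TAGS = [
    "LocationCreatedCountryName",
    "LocationCreatedProvinceState",
    "LocationCreatedCity",
]

def export_command(image_dir, group="XMP-iptcExt"):
    """Build the argument list for a recursive CSV export of the tags."""
    args = ["exiftool", "-csv", "-r"]
    args += [f"-{group}:{tag}" for tag in LOCATION_TAGS]
    args.append(image_dir)
    return args

cmd = export_command("/path/to/photos")  # hypothetical directory
print(shlex.join(cmd))
# To actually run it (requires exiftool on the PATH):
# subprocess.run(cmd, capture_output=True, text=True, check=True)
```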

ea1987

Thank you, StarGeek! I was missing that last piece in my search (LocationCreated).

Phil Harvey

Quote from: StarGeek on April 06, 2020, 04:32:33 PM
Unfortunately, that doesn't work for the geocoding api because exiftool's json decoder duplicates entries.

If you extract using the -struct option then you will be able to step through the complete structured return ARRAY and HASH values from Perl.
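
The same principle, walking the nested arrays and hashes instead of the flattened lists, is sketched here in Python rather than Perl, and outside ExifTool. The sample keeps only the first two results, with coordinate values taken from the output quoted earlier in the thread; the function name is made up.

```python
def first_location(response):
    """Return (lat, lng) of the first result, or None if there is none."""
    if response.get("status") != "OK":
        return None
    for result in response.get("results", []):
        location = result.get("geometry", {}).get("location")
        if location is not None:
            return location["lat"], location["lng"]
    return None

# Trimmed to two results; a real reply contains several more fields.
sample = {
    "status": "OK",
    "results": [
        {"geometry": {
            "location": {"lat": 40.6891815, "lng": -74.04413769999999},
            "location_type": "GEOMETRIC_CENTER"}},
        {"geometry": {
            "location": {"lat": 40.6900495, "lng": -74.0450675},
            "location_type": "APPROXIMATE"}},
    ],
}
print(first_location(sample))
```

Because each result keeps its own geometry, there is no need to guess which entry of a comma-separated list belongs to which address.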

- Phil