Extract metadata into a PostgreSQL database

Started by Sylvain M., April 20, 2020, 04:57:11 AM


Sylvain M.

Morning, everyone,
I'm a modest developer, but I have a good grounding in SQL with PostgreSQL.
I'd like to export the metadata of my photos (> 130,000 at the moment) to a PostgreSQL database.
What do you think is the best way to do this?
Thank you for your advice.
Sylvain

Phil Harvey

#1
Hi Sylvain,

I have an example Perl script that converts the ExifTool -listx output into an SQLite database (attached).  This isn't quite what you want, and the resulting SQL database may be more complex than you need, but it may give you an idea of how to do this.

The -listx output is an XML formatted output of all pre-defined ExifTool tags.  For extracting information from a file, you would use the -X option, which is RDF/XML formatting.  Or maybe JSON formatting with the -j option would be easier.
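To make the JSON route concrete: "exiftool -j" emits a JSON array with one object per file and tag names as keys, which a small script can turn into parameterized INSERTs. The Python below is only a sketch working on a hard-coded sample shaped like -j output (with numeric GPS values, as the # tag suffix would give); the photos.dbphotos table name comes from later in this thread, but the to_insert helper and the column selection are illustrative.

```python
import json

# Sample shaped like "exiftool -j" output: a JSON array with one object
# per file, tag names as keys (numeric GPS values, as with the # suffix).
sample = '''[
  {"SourceFile": "M:/Photos/2020/img001.jpg",
   "Directory": "M:/Photos/2020",
   "FileName": "img001.jpg",
   "DateTimeOriginal": "2020:04:18 10:12:33",
   "Model": "Canon EOS 6D",
   "GPSLatitude": 48.8566,
   "GPSLongitude": 2.3522}
]'''

# Columns of the photos.dbphotos table used in this thread.
COLUMNS = ("Directory", "FileName", "DateTimeOriginal",
           "Model", "GPSLatitude", "GPSLongitude")

def to_insert(record):
    """Build one parameterized INSERT from a file's tag dictionary.
    Tags absent from the file become None, i.e. SQL NULL."""
    placeholders = ", ".join(["%s"] * len(COLUMNS))
    sql = (f"INSERT INTO photos.dbphotos ({', '.join(COLUMNS)}) "
           f"VALUES ({placeholders})")
    return sql, tuple(record.get(col) for col in COLUMNS)

for rec in json.loads(sample):
    sql, params = to_insert(rec)
    print(sql, params)   # with psycopg2: cursor.execute(sql, params)
```

With a real driver such as psycopg2, the %s placeholders are filled by cursor.execute(), which also spares you any quoting and encoding worries.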

- Phil

Edit:  It's been a while since I looked at this script.  It uses the API, not the command-line application, so it doesn't use the -X option of the command-line application as I mentioned.  Using the API to extract information from a file you wouldn't have to use XML at all.  Instead, you would call ImageInfo() and parse the returned tag information hash.  But writing the output to SQL could be done using a technique similar to the attached script.

Sylvain M.

#2
Thank you, Phil, for your response,
My level of English and my development skills make it a bit complicated for me to put this into practice.
I only know the command-line program, not the API yet, so I'll read up on it.
However, the fact that it's SQLite and not PostgreSQL is not a problem at all! I mainly know SQL, so whatever the engine, I'll be fine once the data is in a database.

I'll look into all this and come back here if I'm still stuck (which I probably will be).

Sylvain M.

#3
Hi everybody,
I'm getting back to you on this.
Unfortunately, I don't know Perl at all, so I won't be able to adapt the attached code to my needs.
PostgreSQL comes with the "psql" program [1], which allows you to import data into a database.
I managed to import some "flat data" (in a single table) with this sequence of commands:
exiftool -T -r -Directory -filename -DateTimeOriginal -model -GPSLatitude# -GPSLongitude# -ext jpg -charset FileName=Latin "M:\Photos" > "M:\Photos\dbphotos.txt"
psql -h localhost -U postgres -d phototheque -c "COPY photos.dbphotos FROM 'M:\Photos\dbphotos.txt' NULL '-' ENCODING 'UTF-8';"
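For anyone wondering what COPY actually receives here: the -T output is one tab-separated line per file, with "-" standing in for missing tags, which the NULL '-' clause turns into SQL NULLs. A small Python sketch of that mapping (the parse_row helper and the sample line are hypothetical):

```python
# One tab-separated line as produced by the exiftool -T command above;
# "-" marks a tag missing from the file (here: no GPS coordinates).
line = "M:/Photos/2020\timg001.jpg\t2020:04:18 10:12:33\tCanon EOS 6D\t-\t-"

def parse_row(line):
    """Split an exiftool -T line into the six columns of photos.dbphotos,
    mapping "-" to None, as COPY ... NULL '-' does on the database side."""
    fields = [None if f == "-" else f for f in line.rstrip("\n").split("\t")]
    directory, filename, dto, model, lat, lon = fields
    # The GPSLatitude#/GPSLongitude# tags are plain numbers when present.
    return (directory, filename, dto, model,
            None if lat is None else float(lat),
            None if lon is None else float(lon))

print(parse_row(line))
```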


Note that the table "photos.dbphotos" was created in advance with this command:
psql -h localhost -U postgres -d phototheque -c "CREATE TABLE photos.dbphotos (Directory text, filename text, DateTimeOriginal text, model text, GPSLatitude numeric, GPSLongitude numeric);"


But I only included a selection of the main tags (I'm a cartographer, so the GPS data is what interests me most).
I leave these commands here in case they are useful to others, and I especially welcome suggestions for improvements that would let me extract more tags and write them to different tables to optimize processing time.

Thank you in advance for your comments, and my apologies for any difficulties in understanding me (both my development skills and my level of English).

[1] https://docs.postgresql.fr/10/app-psql.html

Sylvain M.

#4
For those who are interested, thanks to Nicolas' help on a geomatics forum (link), I was able to rewrite the previous commands as a single pipeline:
(We assume that the table is already present in the database.)
exiftool -T -r -Directory -filename -DateTimeOriginal -model -GPSLatitude# -GPSLongitude# -ext jpg -charset FileName=Latin "M:\Photos" | psql -h localhost -U postgres -d phototheque -X -c "copy photos.dbphotos(Directory, filename, DateTimeOriginal, model, GPSLatitude, GPSLongitude) from stdin NULL '-' ENCODING 'UTF-8'"

But this still covers only a selection of tags.
If I wanted to store all the available tags, the JSON format would be more appropriate.
However, I'm having trouble importing the JSON into PostgreSQL, even though it handles this type of data well.
I will open a new dedicated thread for my JSON questions.
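In the meantime, here is the kind of thing I have in mind, sketched under assumptions: a hypothetical photos.dbphotos_json table with a single jsonb column named tags, into which each file's whole tag object goes, so that individual tags stay reachable in SQL with the ->> operator. Building the statement in Python:

```python
import json

# One file's tags, shaped like a single object from "exiftool -j".
record = {"SourceFile": "M:/Photos/2020/img001.jpg",
          "Model": "Canon EOS 6D",
          "GPSLatitude": 48.8566,
          "GPSLongitude": 2.3522}

def jsonb_insert(record, table="photos.dbphotos_json"):
    """Build a parameterized INSERT storing the whole tag object in a
    single jsonb column (the table and column names are assumptions)."""
    return (f"INSERT INTO {table} (tags) VALUES (%s::jsonb)",
            (json.dumps(record),))

sql, params = jsonb_insert(record)
print(sql)
print(params[0])
# Once loaded, individual tags are reachable in SQL, e.g.:
#   SELECT tags->>'Model' FROM photos.dbphotos_json;
```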