Trying to get started with ExifTool

Started by emccainaz, October 31, 2011, 05:19:53 PM


emccainaz

I'm a non-programmer trying to help a non-profit organization with their DAM system. Right now I have a text dump from Microsoft SharePoint that I've cleaned up in Excel, and I need to move that metadata into about 5K image files. So I'm trying to learn how ExifTool works by running a small test. I've set up a simple table in Excel with the following column headings: SourceFile, description, comment, keyword. I've saved it as a .csv file with the column headings plus data for 10 files. Then, on my Mac, I'm opening Terminal, going into the directory with the .csv and JPEG files in it, and entering the following:

EMcs-QC-1000:TestFiles emccainaz$ exiftool -csv="Exif_Test" .
Error opening CSV file 'Exif_Test'
No SourceFile './Test copy 1.jpg' in imported CSV database
(canonical path: '/Users/emccainaz/TestFiles/Test copy 1.jpg')
No SourceFile './Test copy 10.jpg' in imported CSV database
(canonical path: '/Users/emccainaz/TestFiles/Test copy 10.jpg')
No SourceFile './Test copy 2.jpg' in imported CSV database
(canonical path: '/Users/emccainaz/TestFiles/Test copy 2.jpg')
No SourceFile './Test copy 3.jpg' in imported CSV database
(canonical path: '/Users/emccainaz/TestFiles/Test copy 3.jpg')
No SourceFile './Test copy 4.jpg' in imported CSV database
(canonical path: '/Users/emccainaz/TestFiles/Test copy 4.jpg')
No SourceFile './Test copy 5.jpg' in imported CSV database
(canonical path: '/Users/emccainaz/TestFiles/Test copy 5.jpg')
No SourceFile './Test copy 6.jpg' in imported CSV database
(canonical path: '/Users/emccainaz/TestFiles/Test copy 6.jpg')
No SourceFile './Test copy 7.jpg' in imported CSV database
(canonical path: '/Users/emccainaz/TestFiles/Test copy 7.jpg')
No SourceFile './Test copy 8.jpg' in imported CSV database
(canonical path: '/Users/emccainaz/TestFiles/Test copy 8.jpg')
No SourceFile './Test copy 9.jpg' in imported CSV database
(canonical path: '/Users/emccainaz/TestFiles/Test copy 9.jpg')
    1 directories scanned
    0 image files read


Am I getting close? What do I need to do to make this work?

Thanks,

Edward McCain

Phil Harvey

Hi Edward,

Sounds like you are on the right track.  I can't tell if your CSV file is in the correct format, but your description sounds good.

The problem is that exiftool can't find the .csv file.  It is likely it has an extension (which is maybe hidden by the OS X GUI).  Try listing the directory contents with this command to see what the actual filename is:

ls .

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).
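A minimal sketch of the check suggested above, run in a scratch directory so it is safe to try. The file names are stand-ins, not Edward's real files. Listing with a wildcard shows the full name, including any extension the Finder hides.

```shell
# Work in a throwaway directory (hypothetical file names for illustration).
workdir=$(mktemp -d)
cd "$workdir"
touch "Exif_Test.csv" "Test copy 1.jpg"

# List anything whose name starts with Exif_Test, extension included.
csv_name=$(ls Exif_Test*)
echo "$csv_name"

# With the full name known, the import from this directory would be:
#   exiftool -csv="Exif_Test.csv" .
```

The exiftool line is left as a comment because it needs ExifTool installed and real image files in the directory.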

emccainaz

Yes, I forgot to add the .csv as a file extension, so I did get it to work, sort of! I am only seeing the description, not the comment or keyword. I'm looking at the files in Photo Mechanic. I would like to use all IPTC fields. I did find the IPTC Tags you kindly provided. Is the problem that I should have used the tag "Keywords" instead of the singular "Keyword"? But how to handle multiple keywords, which is usually the case. I think my larger data file is tab separated, so that the data can contain commas without throwing the import off.

Also, the original files are renamed, which means I'll have to do some cleanup. Is there a way to reverse this so that the originals retain their name and the updated files get the appended name?

Thanks,

Edward

Phil Harvey

Quote from: emccainaz on October 31, 2011, 07:49:26 PM
Yes, I forgot to add the .csv as a file extension, so I did get it to work, sort of! I am only seeing the description, not the comment or keyword. I'm looking at the files in Photo Mechanic.

Reading FAQ number 3 may help here.

Quote:
But how to handle multiple keywords, which is usually the case.

Reading FAQ number 17 may help here.

Quote:
Also, the original files are renamed, which means I'll have to do some cleanup. Is there a way to reverse this so that the originals retain their name and the updated files get the appended name?

You can use the -o option to output the updated files to wherever you want or with whatever name you like.

- Phil

emccainaz

I've been experimenting some more and am confused about how ExifTool handles metadata tags. For instance, if I use the following fields in the csv file:
SourceFile,   Description,   Location,   CreatorContactInfo,   Keywords
I only get my one keyword embedded in the JPEG files.

If I change Keywords to Keyword, so that the following fields are in the csv file:
SourceFile,   Description,   Location,   CreatorContactInfo,   Keyword
(just changing Keywords to Keyword) then I get my description and location metadata embedded in the JPEGs.

What am I doing wrong here? I have looked at the tags and I'm sure I don't understand that much of it, but I would appreciate knowing how to select the right tags (mostly IPTC, but standard XMP fields would be okay, too). I'm viewing the information in Photo Mechanic and will eventually import the image files into ResourceSpace, which, as I understand things, uses ExifTool to recognize metadata.

Thanks,
Edward

Phil Harvey

Hi Edward,

For importing list-type tags with CSV, you will need to use the -sep option to split them back into a list.

My guess is that PhotoMechanic is ignoring the XMP if you write IPTC to the file.  Try this header line instead:

SourceFile,XMP:Description,XMP:Location,XMP:CreatorContactInfo,XMP:Subject

Here I have been more specific about where the information is written, although I believe that XMP is the default location for all these tags if the "XMP:" group is omitted.

- Phil
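As a rough illustration of the -sep advice above: one way to lay out a test CSV is to put all keywords for an image in a single cell, joined by a separator that does not otherwise appear in the data. The semicolon separator, the file name keywords_test.csv, and the sample rows are assumptions made for this sketch, not something taken from Edward's data.

```shell
# Build a small sample CSV in a throwaway directory.
workdir=$(mktemp -d)
csv_file="$workdir/keywords_test.csv"

# All keywords for one image share a single cell, separated by semicolons,
# so no extra CSV quoting is needed.
cat > "$csv_file" <<'EOF'
SourceFile,XMP:Description,XMP:Subject
./Test copy 1.jpg,Salamander on a log,salamander;amphibian;Oregon
./Test copy 2.jpg,Tiger salamander,salamander;California
EOF

# Sanity check: count the keywords in the first data row.
first_keywords=$(sed -n '2p' "$csv_file" | cut -d, -f3)
keyword_count=$(printf '%s\n' "$first_keywords" | tr ';' '\n' | wc -l | tr -d ' ')
echo "$keyword_count keywords: $first_keywords"

# Import, run from the image directory with ExifTool installed:
#   exiftool -csv=keywords_test.csv -sep ";" .
```

If the keywords themselves can contain semicolons, a different separator string would be needed, passed the same way to -sep.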

emccainaz

Hi Phil,

So far so good. I have successfully identified and mapped 19 XMP fields and have successfully moved data from a .csv file to a set of 55 test files of various formats! Hooray! I'm amazed that I have gotten this far - many thanks to your help.

Before I start working with the whole 5,000 files (maybe I should break the files into smaller groups???) I'm wanting to understand more about how to use the -o (Outfile) option. I see that you have a FMT string (does that mean Format String?), but I'm not sure how to implement it. I just want to move the original files into another folder. That folder can be nested inside the current folder or at the same level, whichever is easier to specify, since I'm easily confused by CLI interfaces. I also don't want to rename the files, just move them.

I'm also stumped by some files that don't want to update properly. Here's the message I get:



EMcs-QC-1000:CBD_Salamanders emccainaz$ exiftool -csv="CBD_Salamanders.csv" .
No SourceFile './CaliforniaTigerSalamander_GeraldAndBuffCorsi_©CaliforniaAcademyofSciences_PAY_1.jpg' in imported CSV database
(canonical path: '/Users/emccainaz/CBD_Salamanders/CaliforniaTigerSalamander_GeraldAndBuffCorsi_©CaliforniaAcademyofSciences_PAY_1.jpg')
No SourceFile './CaliforniaTigerSalamander_GeraldAndBuffCorsi_©CaliforniaAcademyofSciences_PAY_2.jpg' in imported CSV database
(canonical path: '/Users/emccainaz/CBD_Salamanders/CaliforniaTigerSalamander_GeraldAndBuffCorsi_©CaliforniaAcademyofSciences_PAY_2.jpg')
No SourceFile './LarchMountainSalamander_©1998HankWallays_FPWC.jpg' in imported CSV database
(canonical path: '/Users/emccainaz/CBD_Salamanders/LarchMountainSalamander_©1998HankWallays_FPWC.jpg')
No SourceFile './LarchMountainSalamander_©1998HankWallays_FPWC.tif' in imported CSV database
(canonical path: '/Users/emccainaz/CBD_Salamanders/LarchMountainSalamander_©1998HankWallays_FPWC.tif')
Warning: [minor] Ignored APP1 XMP segment with non-standard header - ./SiskiyouMountainsSalmander_(c)2009RichardDBartlett_CalPhotos_NP_1.jpeg
Warning: [minor] Ignored APP1 XMP segment with non-standard header - ./SiskiyouMountainsSalmander_(c)2009RichardDBartlett_CalPhotos_NP_2.jpeg
    1 directories scanned
   51 image files updated


The first column of my csv file looks like this (edited to show just problem files):

SourceFile
CaliforniaTigerSalamander_GeraldAndBuffCorsi_©CaliforniaAcademyofSciences_PAY_1.jpg
CaliforniaTigerSalamander_GeraldAndBuffCorsi_©CaliforniaAcademyofSciences_PAY_2.jpg
LarchMountainSalamander_©1998HankWallays_FPWC.jpg
LarchMountainSalamander_©1998HankWallays_FPWC.tif
LarchMountainSalamander_BillLeonard_FPWC.tif

Thanks for any help you can provide.

Edward


Phil Harvey

Hi Edward,

Special characters in filenames are a known problem with Windows.

The -o option will create a copy of the file unless you specify -overwrite_original.  I wouldn't put them in a sub-directory of the directory you are processing, or else you run the risk of recursively processing files forever.  The FMT argument is a format string.  See the -w option documentation for details.

- Phil

emccainaz

I'm working on a Mac using Terminal. The copyright symbol seems to work with some files and not with a few others. Should I go through and rename them all somehow?

I did look at the -w option documentation, but I'm still struggling with just how to set up moving the files to a directory at the same level as the one I'm currently in. Can you give me an example of moving copies of the files without the "_original" string added to the filename, so they have the same filename?

Thanks,

Edward

Phil Harvey

Hi Edward,

OK, this should work with a Mac as long as you use UTF-8 encoding for your CSV file.

To copy the files from a directory tree rooted at "SRCDIR" to a directory tree rooted at "DSTDIR", do this:

exiftool ... -r -o DSTDIR/ SRCDIR

To move them instead, add -overwrite_original.

- Phil

Phil Harvey

In testing this, I just discovered that ExifTool won't accept a CSV file in UTF-8 encoding if it begins with a byte order mark (BOM).  So be sure that your CSV file is UTF-8 with no byte order mark.  I will fix this in the next release.

- Phil
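For anyone wanting to check a CSV file for a byte order mark from Terminal, here is a small sketch using only standard tools. A UTF-8 BOM is the three bytes EF BB BF at the very start of the file. The file names are made up for the demonstration, and the sample file is created by the script itself.

```shell
workdir=$(mktemp -d)
with_bom="$workdir/with_bom.csv"
no_bom="$workdir/no_bom.csv"

# Build a sample CSV that starts with the UTF-8 BOM (octal 357 273 277).
printf '\357\273\277SourceFile,Description\n' > "$with_bom"

# Show the first three bytes of the file as hex.
first_bytes=$(head -c 3 "$with_bom" | od -An -tx1 | tr -d ' \n')
echo "first bytes: $first_bytes"

# If they are efbbbf, copy everything from byte 4 onward to a new file.
if [ "$first_bytes" = "efbbbf" ]; then
    tail -c +4 "$with_bom" > "$no_bom"
fi
head -n 1 "$no_bom"
```

To check a real file, replace "$with_bom" in the head command with the path to the CSV and skip the printf line.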

emccainaz

Hmmm. I've been doing some research and I'm not sure how to make sure the csv file is UTF-8 with no BOM. Is there an easy way to do this? Does exiftool give me that info? I've been working in Excel to make the csv files and I don't see any documentation about saving with these options.

Also, should the order of the commands be "exiftool -csv=CBD_Salamanders.csv -r -o ORIGINALFILES/ SOURCEFILES"

Thanks,

Edward

Phil Harvey

Hi Edward,

The order of the options doesn't matter, so what you have done is fine.

It could be that you don't have any control over the encoding of special characters as written by Excel.  I realize now that the BOM isn't a problem for you because otherwise ExifTool would complain about the file format.  But it seems that Excel isn't writing UTF-8.  The easiest way around this is probably to rename the files to remove the special characters.

- Phil
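An alternative to renaming, if the encoding Excel used can be identified, is to convert the CSV to UTF-8 with iconv. The sketch below assumes the file is in a single-byte encoding where © is stored as the byte A9 (true of ISO-8859-1, and also of Windows-1252 and Mac Roman for this one character). Which encoding Excel actually wrote is not known from this thread, so the -f value is a guess to be adjusted.

```shell
workdir=$(mktemp -d)
legacy_csv="$workdir/legacy.csv"
utf8_csv="$workdir/utf8.csv"

# Sample line with the copyright sign stored as the single byte A9 (octal 251).
printf 'Larch_\2511998.jpg\n' > "$legacy_csv"

# Convert to UTF-8. ISO-8859-1 is an assumed source encoding.
iconv -f ISO-8859-1 -t UTF-8 "$legacy_csv" > "$utf8_csv"

# In UTF-8 the copyright sign becomes the two bytes C2 A9.
utf8_hex=$(od -An -tx1 "$utf8_csv" | tr -d ' \n')
echo "$utf8_hex"
```

Other characters differ between these encodings, so it is worth opening the converted file and checking a few names by eye before importing.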

emccainaz

I have tried several ways of using the -o option, but am not able to get the originals into another folder yet. Here's what I get:


EMcs-MBP-240:CBD_Salamanders emccainaz$ exiftool -csv=CBD_Salamanders2.csv -r -o OriginalFiles/ CBD_Salamanders
No SourceFile 'CBD_Salamanders' in imported CSV database
(canonical path: '/Users/emccainaz/CBD_Salamanders/CBD_Salamanders')


As far as the metadata goes, I'm getting unreliable results. Sometimes all the fields get the metadata, but some files don't get anything and some files only get portions of the metadata from a particular field. As far as UTF-8 goes, I imported the csv from Excel and then exported it from Google Docs, which is supposed to be UTF-8, but I'm not absolutely certain. At any rate, that doesn't fix the irregular outcome.

I have also renamed all the files with the copyright symbol "©" to "(c)" but now I'm wondering if the parentheses are also causing problems for ExifTool. It is exciting to get some of the desired results, but with 5000 files to go, I'm also anxious to figure this out so we can move on to the next phase of the project.

Thanks very much for your help thus far.

Edward

Phil Harvey

Hi Edward,

I'm sorry.  I didn't understand that you wanted to move the originals.  ExifTool won't do this.  You only have control over where the edited images are written.

Brackets in a filename shouldn't be a problem on the Mac.

If you provide more details I can probably figure out why you are getting inconsistent results.  I'm guessing there is a problem with the format of some lines in the CSV file.  Try posting the lines from files where you have a problem, and describe exactly what isn't getting written correctly.

- Phil
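Since ExifTool itself won't move the originals, one possible workaround is a short shell loop run after the update. By default ExifTool keeps each original as a backup with "_original" appended to its name, so the loop moves those backups to a sibling folder and strips the suffix. This is only a sketch: the folder name OriginalFiles and the sample file names are assumptions, and the demo creates empty stand-in files in a scratch directory rather than touching real images.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/CBD_Salamanders" "$workdir/OriginalFiles"
cd "$workdir/CBD_Salamanders"

# Stand-ins for what an ExifTool run leaves behind:
# the updated file plus a backup named <file>_original.
touch "a.jpg" "a.jpg_original" "b (c)2009.tif" "b (c)2009.tif_original"

# Move each backup to the sibling folder, dropping the _original suffix.
for backup in *_original; do
    [ -e "$backup" ] || continue   # nothing matched the pattern
    mv "$backup" "../OriginalFiles/${backup%_original}"
done

ls ../OriginalFiles
```

The quotes around "$backup" matter here because several of the file names contain spaces and parentheses.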