Different output to CSV

Started by DNichols, May 16, 2013, 07:29:50 PM


DNichols


Extracting meta elements from an HTML file on the command line (ExifTool version 9.28, Windows 7):

exiftool.exe -HTML:HTML-dc:All -a -G1 -s effects-edited.htm

extracts all this (which is correct):

[HTML-dc]       Relation                        : test_relation_1, test_relation_2, test_relation_3
[HTML-dc]       Identifier                      : test_id_value_1
[HTML-dc]       Identifier                      : test_id_value_2
[HTML-dc]       Subject                         : test_subject_value_1, test_subject_value_2
[HTML-dc]       Title                           : test_title_value_1
[HTML-dc]       Title                           : test_title_value_2


Adding the -csv output option:

exiftool.exe -HTML:HTML-dc:All -a -G1 -s -csv effects-edited.htm

SourceFile,HTML-dc:Relation,HTML-dc:Identifier,HTML-dc:Subject,HTML-dc:Title
effects-edited.htm,"test_relation_1, test_relation_2, test_relation_3",test_id_value_2,"test_subject_value_1, test_subject_value_2",test_title_value_2



test_id_value_1 and test_title_value_1 don't appear in the CSV output. Is that right or am I missing something?

Phil Harvey

Email me the sample HTML file and I'll track this down.  (philharvey66 at gmail.com)

- Phil

Phil Harvey

Thanks for the sample.

I'll fix ExifTool to tolerate leading white space in the HTML file as you suggested in your email.

The problem here is that the column headings in the CSV file must be unique.  Since you are using -G1, both copies of Identifier get the same column heading (HTML-dc:Identifier), and likewise for Title, so only one value per heading makes it into the output.  Try using -G4:1 instead.  Adding the family 4 group name guarantees that all of the tags produce unique headings.

- Phil

Edit:  I'll add a note to the documentation to explain this -csv feature.
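
To illustrate why identical headings lose a value, here is a minimal Python sketch. It is not ExifTool's actual code: the build_row helper, the tuple layout and the "CopyN" prefix are assumptions made up for the example. It only shows the general effect of keying one CSV row by column heading, using the Identifier and Title values from the output above.

```python
# Illustrative sketch only; this is NOT how ExifTool is implemented.
# Each tag is (family 1 group, copy number, tag name, value), mirroring
# the duplicate Identifier and Title tags listed by the -a output above.
tags = [
    ("HTML-dc", 0, "Identifier", "test_id_value_1"),
    ("HTML-dc", 1, "Identifier", "test_id_value_2"),
    ("HTML-dc", 0, "Title", "test_title_value_1"),
    ("HTML-dc", 1, "Title", "test_title_value_2"),
]


def build_row(tags, with_copy_number):
    """Collect one CSV row as a dict keyed by column heading."""
    row = {}
    for group, copy, name, value in tags:
        heading = f"{group}:{name}"
        if with_copy_number and copy > 0:
            # Hypothetical label standing in for the extra group name
            # that makes duplicate tags distinguishable.
            heading = f"Copy{copy}:{heading}"
        # A repeated heading overwrites the value stored earlier.
        row[heading] = value
    return row


g1_row = build_row(tags, with_copy_number=False)  # headings collide
g4_row = build_row(tags, with_copy_number=True)   # headings are unique

print(g1_row)
print(g4_row)
```

With colliding headings the row ends up with two columns and only the second value of each tag, which matches the CSV output in the first post. With the extra prefix all four values get their own column. The real headings produced by -G4:1 may be formatted differently from the ones in this sketch.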