Special character as separator in output file

Started by trudge, February 21, 2018, 02:34:59 PM


trudge

I'm wondering if I can use a pipe (vertical bar) as a field separator in the output file. I'm a bit of a Perl mangler and like that character as it is rarely used in normal situations. Working on a book library and pulling data out of .epub and .pdf files - Author, Title, Create Date, etc. If I can use a pipe character in the file, I can then split each line and get what I need for each book.

Thoughts? Insights?
Thank you.

StarGeek

"It didn't work" isn't helpful. What was the exact command used and the output.
Read FAQ #3 and use that cmd
Please use the Code button for exiftool output

Please include your OS/Exiftool version/filetype

trudge

Thank you StarGeek for the suggestion. I have tried -sep '|' but no joy. If I were to use -p, what would the format of the output file be, and how do I specify that? That seems a bit confusing.
--
exiftool -ver: 10.36
OS: macOS High Sierra

StarGeek

The -sep option is for changing the separator between list type tags such as Keywords so instead of "Keyword 1, Keyword 2, Keyword 3", you would get "Keyword 1|Keyword 2|Keyword 3".  It won't affect non-list tags.

I'm not entirely sure what format you're looking for in the output, which is why I suggested taking a look at the -p option since that gives you more control.

Can you give a mockup of what you want your output to look like? Where are you trying to put the separator?

Phil Harvey

It sounds like you are looking for something roughly like this:

exiftool -p '$filename|$author|$title|$createdate' DIR > out.txt

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

trudge

Yes Phil, that is similar to what I'd like to have. I'm pulling specific data from .epub & .pdf files, and this format is useful later when I go to update a MySQL database. I'm piping the output of exiftool into egrep where I only want to grab 4 or 5 tags. So for example my output file might be

Anthropology for Dummies|Anthropology for Dummies.pdf|Smith, Cameron M.; Davies, Evan T.|2008|D|N|1

Now I can walk through a file like this (containing data for several other books) and update a database easily. I should mention that this is all done in a Perl script, so I'm using something like this:
system("exiftool " . shell_quote($file) . " | egrep 'Title' >> $exifFile");
system("exiftool " . shell_quote($file). " | egrep 'Author' >> $exifFile");
system("exiftool " . shell_quote($file). " | egrep 'Create' >> $exifFile");

I can't seem to get it working if I put all 3 requests into 1 line.

Would the -p option get me close to what I need?

Phil Harvey

Now I have no idea what you want.  Your egrep output is very different from the |-delimited output I thought you wanted.   Combining your three egrep commands is easy:

exiftool -title -author -createdate -q FILE

But it is much faster if you can run exiftool once for all files instead of on each one.  To do this, you need to get the file name into the output.

- Phil

trudge

Hmm. Let me try again. What I would like is an output file with the tag data delimited by a pipe character, like this:

Anthropology for Dummies|Anthropology for Dummies.pdf|Smith, Cameron M.; Davies, Evan T.|2008

My question is: can I run exiftool to specify that pipe character as a delimiter, and if so how?

This will be run from a Perl script.

ekkidee

I just run

exiftool -T arguments |tr '\t' '|'

-T puts everything on one line separated by tabs. `tr` converts the tab to the v-bar character.
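
A minimal sketch of the `tr` step on its own, for anyone who wants to see it without exiftool installed. The `printf` line only stands in for one row of `exiftool -T` output (the sample values are the ones from this thread), and `tabs_to_pipes` is a made-up helper name:

```shell
# Convert tab-separated fields (the shape `exiftool -T` prints)
# into pipe-separated fields.
tabs_to_pipes() {
    tr '\t' '|'
}

# Stand-in for one row of tab-separated output; not real exiftool output.
printf 'Anthropology for Dummies\tCameron M. Smith; Evan T. Davies\t2008:07:11 21:17:44-04:00\n' | tabs_to_pipes
```

In real use the `printf` would be replaced by the exiftool command itself, piped into `tr` the same way.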

trudge

That looks interesting ekkidee - thank you for the idea. I did try that but I'm not getting quite what I'd like.
What are the 'arguments' mentioned?

My version of exiftool defaults to a tab & colon as field separator:
Title                           : Anthropology for Dummies
Author                          : Cameron M. Smith; Evan T. Davies
Create Date                     : 2008:07:11 21:17:44-04:00


Building on your idea this is what I've tried:

exiftool "Anthropology for Dummies.pdf" | egrep 'Author|Title|Create' | sed 's/ *: /|/' > out.txt

Result:

Title|Anthropology for Dummies
Author|Cameron M. Smith; Evan T. Davies
Create Date|2008:07:11 21:17:44-04:00

Could I use your idea to get everything on one line like this?

Anthropology for Dummies|Cameron M. Smith; Evan T. Davies|2008:07:11 21:17:44-04:00
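
One way to get from the three "Tag : value" lines to a single line, sketched in plain shell. This is only an illustration built on the egrep/sed attempt above: `join_values` is a hypothetical helper, the `printf` lines stand in for the filtered exiftool output, and the result depends on the order in which the lines arrive:

```shell
# Strip the "Tag name : " prefix from each line, then join all lines with "|".
join_values() {
    sed 's/^[^:]*: //' | paste -s -d '|' -
}

# Stand-in for the three lines that survive the egrep filter.
printf '%s\n' \
    'Title                           : Anthropology for Dummies' \
    'Author                          : Cameron M. Smith; Evan T. Davies' \
    'Create Date                     : 2008:07:11 21:17:44-04:00' | join_values
```

The `sed` pattern removes everything up to the first colon-plus-space only, so the colons inside the date value are left alone.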

Phil Harvey

Quote from: trudge on February 21, 2018, 08:45:28 PM
My question is: can I run exiftool to specify that pipe character as a delimiter, and if so how?

I'm still confused.  I thought my previous example (using the -p option) did just that.  :/

- Phil

ekkidee

Quote from: trudge on February 22, 2018, 04:17:23 AM
Could I use your idea to get everything on one line like this?

Anthropology for Dummies|Cameron M. Smith; Evan T. Davies|2008:07:11 21:17:44-04:00

Try this ....

exiftool -T -filename -title -author -createdate *.pdf |tr '\t' '|'

You need the -T to put it all on one line and separate the output with tabs. The behavior of the various grep and egrep implementations is not always consistent with regard to newlines.

For pdfs and ebooks, I do not know the precise tags, but it looks, from above, as if you have already identified those. Also, you may not need -filename.

The -p option is very powerful and flexible with formatting output as needed. I would get the basic form working first before attempting to use -p.

trudge

Hello Phil,

Yes that does put in the pipe symbol, but I'm confused as well. What are those variables referring to? You mention images?

I tried

exiftool -p 'Author|Title|Create' "Anthropology for Dummies.pdf" > out.txt


and got

Author|Title|Create


as out.txt. Almost there, but I'm looking for the data, not the field name.

Phil Harvey

I think you missed the point.  Type the command I gave exactly, except for replacing "DIR" with file and/or directory names.  The "$filename", "$author" etc are in single quotes so they aren't interpreted by the shell -- they are interpreted as tag names by exiftool in the -p option argument.
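
The quoting point can be checked in the shell without exiftool at all. A small sketch, where the variable names `single` and `double` and the value 'SHELL VALUE' are made up for illustration:

```shell
# Shell variables that happen to share names with the tags.
author='SHELL VALUE'
title=''

# Single quotes: the shell leaves $author and $title untouched, so a program
# receiving this argument sees the literal text.
single='$author|$title'

# Double quotes: the shell substitutes its own variables first
# ($title is empty here, so nothing appears after the pipe).
double="$author|$title"

printf 'single-quoted: %s\n' "$single"
printf 'double-quoted: %s\n' "$double"
```

Only the single-quoted form reaches the program with the `$name` references intact, which is what the -p format string needs.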

Read about the -p option for more details.

- Phil

trudge

OMG. Now I feel like 2 cents waiting for change. I did not 'get' that particular info from the documentation. When I do it LIKE YOU SAID I get

Cameron M. Smith; Evan T. Davies|Anthropology for Dummies|2008:07:11 21:17:44-04:00

which is exactly what I want.
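
For the later step of splitting each line back into fields, here is a rough shell sketch (in Perl the usual equivalent is `split /\|/, $line`). The helper name `parse_record` and the field names are assumptions for illustration, and the field order follows the sample line above:

```shell
# Split one pipe-delimited record into named fields and print them.
parse_record() {
    IFS='|' read -r author title created <<EOF
$1
EOF
    printf 'author=%s\ntitle=%s\ncreated=%s\n' "$author" "$title" "$created"
}

parse_record 'Cameron M. Smith; Evan T. Davies|Anthropology for Dummies|2008:07:11 21:17:44-04:00'
```

This assumes no field contains a pipe character itself; a title with a literal "|" in it would be split in the wrong place.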

My bad. Thank you so much for your patience (and persistence) - maybe I was making it too hard.
Thank you all and this thread can now be marked SOLVED.