After reading Christian Etter's post on benchmarking exiftool, I wanted to get a better understanding of batch processing. I searched through the forum, but had a difficult time finding resources on batch processing, such as working examples.
I am using exiftool to read a variety of audio file types (.mp3, .flac, .ogg, .m4a). I created an argfile with the following switches/flags:
-json
-artist
-album
-filename
-picture
-coverart
-stay_open
True
I have been doing this in Ruby, but it's pretty much the same for bash. Basically, this is how I run the command to output the wanted exif data to a json file.
find . -iname "*.mp3" -exec exiftool -@ 'argfile' {} >> 'output.json' \;
That's not exactly what I'm doing, but you get the point, I think. Currently, this takes about 2 or 3 minutes to output the data for about 1000 files. I was hoping to speed this up, because I plan to use it on a larger data set (100k+ files). Threading is one option, but Christian Etter's post made it sound like exiftool has its own way to handle batch processing. As I mentioned above, I'm not sure where to get started.
Thanks,
Culley
I figured out what I could do. With find, I was launching a new exiftool process for every file passed to it with -exec. That's the slow part. Using exiftool with the -r switch and -stay_open true made things much faster. Cut my time down from 2.3 minutes to about 15 seconds.
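To illustrate why one launch per file is the slow part, here is a minimal sketch. It uses a stand-in command (sh -c 'echo launch') instead of exiftool so it runs anywhere, and count_launches is a hypothetical helper name, not part of find or exiftool. It counts how many times find starts the command when -exec ends in \; versus +.

```shell
#!/bin/bash
# count_launches MODE DIR
# Hypothetical helper: counts how many processes find starts for the
# *.mp3 files under DIR. The stand-in command prints one line per launch.
count_launches() {
  local mode=$1 dir=$2
  if [ "$mode" = "per-file" ]; then
    # "\;" runs the command once for every matching file
    find "$dir" -iname '*.mp3' -exec sh -c 'echo launch' _ {} \; | grep -c launch
  else
    # "+" passes as many file names as possible to a single run
    find "$dir" -iname '*.mp3' -exec sh -c 'echo launch' _ {} + | grep -c launch
  fi
}
```

With exiftool in place of the stand-in, the batched form would look something like find . -iname '*.mp3' -exec exiftool -json -artist -album {} + instead. For a very long file list find may still split the work into several runs, each of which would write its own JSON array.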
Here is the bash version of my script, in case it helps someone. I am using the environment variable XDG_MUSIC_DIR as the root directory to be processed.
#!/bin/bash
ext=(MP3 M4A FLAC OGG)
for e in "${ext[@]}"
do
exiftool -json -ext "$e" -stay_open true -r "$XDG_MUSIC_DIR" >> music_files.json
done
Try this instead:
#!/bin/bash
exiftool -json -ext MP3 -ext M4A -ext FLAC -ext OGG -r "$XDG_MUSIC_DIR" > music_files.json
(Note that I removed -stay_open: it has no effect because you weren't using -@.)
- Phil
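For anyone who does want -stay_open, here is a rough sketch of how it pairs with -@, based on the exiftool documentation. The process is started once with -stay_open True -@ ARGFILE, each command is written to ARGFILE one argument per line and ended with -execute, and -stay_open False shuts it down. The helper names queue_command and finish_session are hypothetical, as are the file names in the usage comments.

```shell
#!/bin/bash
# queue_command ARGFILE ARG...
# Hypothetical helper: appends one exiftool command to ARGFILE,
# one argument per line, followed by -execute to trigger it.
queue_command() {
  local argfile=$1
  shift
  printf '%s\n' "$@" -execute >> "$argfile"
}

# finish_session ARGFILE
# Hypothetical helper: tells the waiting exiftool process to exit.
finish_session() {
  printf '%s\n' -stay_open False >> "$1"
}

# Usage (requires exiftool; paths below are placeholders):
#   argfile=$(mktemp)
#   exiftool -stay_open True -@ "$argfile" > results.txt &
#   queue_command "$argfile" -json -artist -album -ext MP3 -r "$XDG_MUSIC_DIR"
#   queue_command "$argfile" -json -artist -album some_other_file.flac
#   finish_session "$argfile"
#   wait
```

Each -execute produces its own output followed by a {ready} line on stdout, so the results are not a single JSON document. For a one-off scan of a directory tree, the single command with -r above is simpler; -stay_open mainly pays off when another program needs to send many separate requests over time.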