Searching for files with missing metadata

Started by th47thman, January 29, 2022, 02:32:13 PM

th47thman

Hello! This is a weird one, and I will apologize in advance if my question isn't in quite the right spot, but:

Is it possible to use ExifTool to search for files that are missing chunks of metadata?

Backing up a bit, I have recently discovered that a number of files in my personal music collection have, somewhere along the way, been corrupted. I'm on a Mac, and these files are now unplayable, both on my phone and in the App Formerly Known as iTunes. You can't actually tell there's anything wrong in the app itself until you try to play a bad track; it'll just skip right over said track. The info for the bad tracks doesn't display incorrectly, so I can't Smart Playlist my way to glory in terms of sniffing out which tracks have gone bad. However! There is a difference in the metadata that shows up in Finder when you look at a bad track vs. a good track. A good track shows the following pieces of data:

  • Created
  • Modified
  • Title
  • Duration
  • Authors
  • Audio Channels
  • Sample Rate
  • Album
  • Musical Genre
  • Composer
  • Year Recorded
  • Comments

Bad files only show this:

  • Created
  • Modified
  • Duration
  • Audio Channels
  • Sample Rate

I wandered over to Apple's forums to see if there was a way to use Finder to search for blank metadata fields. Someone there hipped me to ExifTool, and eventually came up with the following search string:

exiftool -s -Filepath -if '(not $Title or not $Artist or not $AudioSampleRate or not $Album or not $Genre or not $MediaCreateDate)' -r -ext mp3 -ext m4a ~/Music

The notion here is that the search will look for files where any of the various fields are blank. This brings back a whole mess of files, as not every track has every single field filled out. What would be more helpful (and what my original source couldn't quite get to) is a string that looks for fields where ALL (or maybe a majority) of these fields are blank. Can this be done? TIA.

StarGeek

Quote from: th47thman on January 29, 2022, 02:32:13 PM
a string that looks for fields where ALL (or maybe a majority) of these fields are blank. Can this be done? TIA.

To check for all of these tags, change the OR to AND.  To check for most would be much more complicated, probably requiring the creation of a User Defined tag.
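
As a rough illustration of the OR-to-AND change, here is a minimal Python sketch that assembles the -if condition from the tag list in the original command. The helper name build_condition and the TAGS list are made up for this example; only the exiftool options already shown in the thread are used.

```python
import shlex

# Tag names copied from the command in the original post.
TAGS = ["Title", "Artist", "AudioSampleRate", "Album", "Genre", "MediaCreateDate"]


def build_condition(tags, joiner="and"):
    """Build an exiftool -if expression from 'not $Tag' terms.

    joiner="and" is true only when every tag is missing;
    joiner="or" is true when at least one tag is missing.
    (Hypothetical helper, not part of exiftool.)
    """
    return f" {joiner} ".join(f"not ${tag}" for tag in tags)


# Print a command line that can be pasted into a Mac/Linux shell.
# shlex.quote wraps the condition in single quotes so the shell leaves $ alone.
condition = build_condition(TAGS)
print(
    "exiftool -s -Filepath -if "
    + shlex.quote(condition)
    + " -r -ext mp3 -ext m4a ~/Music"
)
```

Building the string from a list makes it harder to leave a stray "or" in the middle of the expression.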

th47thman

This is the part where I get to say "I thought I had tried that, but..."

Indeed, I had flipped the ORs to ANDs without much difference in the number of results it brought back. But then! I took another look at the string I had used, and realized that there was still one OR statement lurking in the mix. Flipping that and running the resulting script took my number of hits down to 11, and I think this is going to end up being the valid number.

Thanks for the sanity check!

greybeard

This is interesting - I was trying to come up with a solution for the "or maybe a majority" case.

This works - but includes awk - is there a pure exiftool solution?

exiftool -p '${Title;$_ = 1} ${Artist;$_ = 1} ${AudioSampleRate;$_ = 1} ${Album;$_ = 1} ${Genre;$_ = 1} ${MediaCreateDate;$_ = 1} ${FileName}' -r -T -ext mp3 -ext m4a ~/Music | awk '{tags = $1+$2+$3+$4+$5+$6 ; if (tags > 4) print $0  }'

Phil Harvey

Here is one way:

exiftool -if '$Title and ++$c;$Artist and ++$c;$AudioSampleRate and ++$c;$Album and ++$c;$Genre and ++$c;$MediaCreateDate and ++$c;$c >4' ...

- Phil

greybeard

Quote from: Phil Harvey on January 30, 2022, 08:13:59 AM
Here is one way:

exiftool -if '$Title and ++$c;$Artist and ++$c;$AudioSampleRate and ++$c;$Album and ++$c;$Genre and ++$c;$MediaCreateDate and ++$c;$c >4' ...

- Phil

Very elegant - I'm sure I'll forget this when I next need it
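
For anyone who would rather do the counting outside exiftool, the following is a minimal Python sketch of the same idea, assuming exiftool is on the PATH. Note that it counts tags that are missing, whereas the -if expression above counts tags that are present. The function names (missing_count, mostly_missing, scan) and the default threshold of "more than half" are assumptions made for this example, not anything from the posts above.

```python
import json
import os
import subprocess

# Tag names copied from the commands in this thread.
TAGS = ["Title", "Artist", "AudioSampleRate", "Album", "Genre", "MediaCreateDate"]


def missing_count(record, tags=TAGS):
    """Count the tags that are absent (or empty) in one exiftool JSON record."""
    return sum(1 for tag in tags if not record.get(tag))


def mostly_missing(records, tags=TAGS, threshold=None):
    """Return the SourceFile of every record missing more than `threshold` tags.

    By default the threshold is half the tag count, so a file is reported
    when a strict majority of the tags are missing.
    """
    if threshold is None:
        threshold = len(tags) // 2
    return [r["SourceFile"] for r in records if missing_count(r, tags) > threshold]


def scan(music_dir, tags=TAGS):
    """Run exiftool with JSON output (-j) and return the parsed records."""
    cmd = ["exiftool", "-j", "-r", "-ext", "mp3", "-ext", "m4a"]
    cmd += [f"-{tag}" for tag in tags]
    cmd.append(os.path.expanduser(music_dir))
    # exiftool can exit non-zero when some files produce errors, so the
    # return code is not checked here; empty output means no files matched.
    out = subprocess.run(cmd, capture_output=True, text=True).stdout
    return json.loads(out) if out.strip() else []
```

Typical use would be something like `for path in mostly_missing(scan("~/Music")): print(path)`. Tags that exiftool does not find are simply left out of the JSON record, which is why `record.get(tag)` is enough to detect them.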