Removing images from files (keep only metadata)

Started by Martin B., September 24, 2015, 09:30:41 PM

Martin B.

Hi,

Does anyone know of a way to remove the images from files and keep only the metadata?

I'm building test cases for a program I wrote that uses the ExifTool API. I have image and video files from various sources that I use, but I'd like to remove the "real contents" from these files and keep only the metadata. The first reason is to save bandwidth and disk space, and the second reason is I don't want to show these images to the whole world if I ever distribute the test cases (I don't have model disclosures for publication, etc.).

The only solution I could think of is to create a "blank" picture, copy this picture as many times as required, and copy the metadata from the "real" test cases to these new instances. I'm concerned that the metadata will not transfer well in all cases.
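For illustration, the "blank picture plus copied metadata" idea could be sketched as below. This is only a rough outline, not something posted in the thread: it assumes the exiftool command-line tool is on the PATH, and the helper names (build_copy_command, make_stub) are made up for this example. Not every tag is writable, so the copy may well be incomplete, which is exactly the concern raised above.

```python
import shutil
import subprocess
from pathlib import Path


def build_copy_command(source, target):
    """Build an exiftool command that copies all writable tags from
    `source` into `target`, keeping each tag in its original group."""
    return [
        "exiftool",
        "-overwrite_original",        # do not leave a *_original backup
        "-TagsFromFile", str(source),
        "-all:all",
        str(target),
    ]


def make_stub(blank, source, out_dir):
    """Copy the blank picture, then copy the metadata of `source` into it.

    Hypothetical helper: the stub keeps the blank file's extension,
    because the image data is still in the blank file's format.
    """
    blank = Path(blank)
    target = Path(out_dir) / (Path(source).stem + blank.suffix)
    shutil.copyfile(blank, target)
    subprocess.run(build_copy_command(source, target), check=True)
    return target
```

A call such as make_stub("blank.jpg", "real/IMG_0001.jpg", "stubs") would then produce stubs/IMG_0001.jpg (the file names here are placeholders).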

Creating all new test pictures is not an option because I don't have the means to create all the formats from scratch (I don't own a Nikon to create .nef files, etc.). Editing the images is not an option either (Photoshop cannot write .CR2 files).


Thanks,

Martin

Phil Harvey

Hi Martin,

Feel free to use the test images in t/images of the full Image-ExifTool distribution.  I've done something similar to what you are trying to do for your test files.  I have a script that does this automatically for JPEG images (and have posted all of my JPEG samples here), but did all of the other formats by hand.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

Martin B.

Hi Phil,

Thanks for sharing your test images; I tried adding the kind of metadata I need for my tests and it worked on a few image types, so this might do the trick.

I'm curious as to how you removed the images from the files, especially for file types without a public format specification. In particular, the .CR2 file I tried from your collection is unreadable by Adobe tools. Did you just truncate files after the metadata headers?

Martin

Phil Harvey

Hi Martin,

If you're talking about the images in the t/images directory of the full distribution, many of these files do not contain valid image data.  They are used for testing metadata only.  Usually they aren't truncated, but instead dummy image data is substituted.

- Phil

Martin B.

Hi Phil,

I've made some good progress using the test images in t/images (thanks!), written a function to remove video from AVI files (by truncating) and another to minimally deal with MPEG files. However, I'm discovering that JPEG files vary by manufacturer more than I had anticipated. You mentioned you have a script that removes image data from JPEG files; would you mind sharing this script with me? I'd use it to create JPEG test files with the metadata that I need to test a Perl/ExifTool script I'm writing to reorder and rename files based on date, FileNumber (internal numbering) and original file name.
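The AVI function itself is not shown in the thread. As a rough sketch of one way truncation could work, the snippet below walks the top-level RIFF chunks and cuts the file just before the LIST chunk of type "movi", which holds the audio/video stream data. The function names are invented for this example, and it makes two simplifying assumptions: the RIFF size field in the header is left untouched (so the result is not a strictly valid file), and anything stored after the "movi" list is lost.

```python
import struct


def movi_offset(data: bytes):
    """Return the byte offset of the LIST 'movi' chunk, or None if absent."""
    if data[:4] != b"RIFF" or data[8:12] != b"AVI ":
        raise ValueError("not an AVI file")
    pos = 12  # first chunk after the 'RIFF' <size> 'AVI ' header
    while pos + 8 <= len(data):
        fourcc = data[pos:pos + 4]
        (size,) = struct.unpack("<I", data[pos + 4:pos + 8])
        if fourcc == b"LIST" and data[pos + 8:pos + 12] == b"movi":
            return pos
        pos += 8 + size + (size & 1)  # chunk data is padded to an even length
    return None


def truncate_avi(data: bytes) -> bytes:
    """Drop the stream data and everything after it (sketch only)."""
    offset = movi_offset(data)
    return data if offset is None else data[:offset]
```

Whether a given metadata reader accepts such a truncated file without warnings would need to be checked case by case.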

Thanks,

Martin

Phil Harvey

Hi Martin,

The script I use to substitute the image in a JPEG file is posted here
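The linked script is not reproduced in this thread. As a rough illustration of the general idea only (not Phil's script, which is Perl), the sketch below keeps the APPn and COM segments of the original JPEG and takes everything else (quantization and Huffman tables, frame header, scan data) from a small stand-in JPEG. The function names are invented here, and the parser is deliberately simplified: it assumes no fill bytes between segments. Note that an Exif thumbnail lives inside the APP1 segment, so it would survive this swap and may need to be removed separately.

```python
import struct

SOI = b"\xff\xd8"  # start-of-image marker


def split_jpeg(data: bytes):
    """Split a JPEG into (metadata_segments, image_segments)."""
    if data[:2] != SOI:
        raise ValueError("not a JPEG (missing SOI marker)")
    meta, image = [], []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = data[pos + 1]
        if marker == 0xDA:
            # SOS: the compressed scan data runs to the end of the file
            image.append(data[pos:])
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        segment = data[pos:pos + 2 + length]
        if 0xE0 <= marker <= 0xEF or marker == 0xFE:
            meta.append(segment)   # APP0..APP15 and COM carry metadata
        else:
            image.append(segment)  # DQT, SOF, DHT, ... describe the image
        pos += 2 + length
    return meta, image


def swap_image(source: bytes, dummy: bytes) -> bytes:
    """Metadata segments from `source`, image segments from `dummy`."""
    meta, _ = split_jpeg(source)
    _, image = split_jpeg(dummy)
    return SOI + b"".join(meta) + b"".join(image)
```

Tags that describe the original image (dimensions stored in Exif, for example) will no longer match the substituted image data, which is usually acceptable for metadata-only test files.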

- Phil

Martin B.

Thanks Phil; that worked perfectly.

I added two options, to facilitate invoking your script from another one:
-s     - specify the image to be swapped in (defaults to t/images/Writer.jpg)
-q     - quiet

If these options are not provided on the command line, your script behaves just as before.
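The attached script is not included in this text. Purely to illustrate the interface described above (an optional -s with a default image, an optional -q flag, unchanged behaviour when neither is given), here is a hypothetical Python equivalent using argparse. The attribute names swap_image, quiet and files are assumptions made for this sketch.

```python
import argparse


def build_parser():
    """Command-line interface mirroring the two options described above."""
    parser = argparse.ArgumentParser(
        description="Swap a stand-in image into JPEG files (illustrative sketch)"
    )
    parser.add_argument(
        "-s", dest="swap_image", default="t/images/Writer.jpg",
        help="image to be swapped in (default: %(default)s)",
    )
    parser.add_argument(
        "-q", dest="quiet", action="store_true",
        help="quiet: suppress progress messages",
    )
    parser.add_argument("files", nargs="*", help="JPEG files to process")
    return parser
```

With no options, build_parser().parse_args(["a.jpg"]) falls back to the default image and non-quiet output, so a caller that passes nothing sees the old behaviour.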

See attached, in case you or someone else finds it useful.

Martin