Overriding tag definitions

Started by perlcat, February 16, 2011, 11:39:57 AM

perlcat

I inherited a few hundred million images, written by a large number of different devices.

I am working on an application to print some of them out (lawyers are not interested in going paperless yet). They run the gamut from ordinary G4-compressed bilevel TIFFs to PDFs assembled using various PDF and TIFF libraries. I have 99.99% of them working, but some were written using a version of Pixel Translations' writer that writes a StripByteCounts value one byte greater than what is actually stored in the image. Image::Magick and Ghostscript do not read these images correctly and throw an error, while the FileNet viewer and the Windows picture viewer display them just fine.

I can only hope that Pixel Translations fixed their 0-based indexing problem later, but as these images are legal documents, I need to find a way to deal with what I have.

What I've found is that for every image that has this issue, if I decrement StripByteCounts by one, I can then use my other tools to read, convert, rasterize, and print it. There doesn't seem to be any missing data.

I know that this tag can be made writable so that I can accomplish this, but I am unsure whether I am doing it right. I have been reading the Exif.pm and WriteExif.pl files and looking at the Config documentation. I think that I am on the right track, but was wondering what you thought. Here is the UserDefined definition I came up with.

%Image::ExifTool::UserDefined = (
    'Image::ExifTool::Exif::Main' => {
        0x117 => {
            Name => 'StripByteCounts',
            Writable => 'int32u',
            WriteGroup => 'IFD0',
        },
    },
);


Is that correct? I am greatly confused by the tag offset as opposed to the tag id, and was wondering if there was any way to get a better dump of the actual values that StripByteCounts uses, so that I can be a little more correct about this.
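One way to see the raw values without going through ExifTool is to read the first IFD directly. The following is only a rough sketch in core Perl, under several assumptions: a classic (non-BigTIFF) file, strips described in the first IFD, and SHORT or LONG field types. The names `read_strips` and `report` are made up for this example. It prints each strip's offset and byte count and flags any strip whose end lies past the end of the file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# TIFF tag IDs (hex) and the byte widths of the field types they may use.
my ($OFFSETS, $COUNTS) = (0x111, 0x117);   # StripOffsets, StripByteCounts
my %SIZE = (3 => 2, 4 => 4);               # 3 = SHORT, 4 = LONG

# Return (file size, \@offsets, \@counts) for the first IFD of a classic TIFF.
sub read_strips {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or die "Cannot open $path: $!";
    my $size = -s $fh;
    read($fh, my $hdr, 8) == 8 or die "$path: too short for a TIFF header";
    my $order = substr($hdr, 0, 2);
    my $le = $order eq 'II' ? 1 : $order eq 'MM' ? 0 : die "$path: not a TIFF";
    my ($s, $l) = $le ? ('v', 'V') : ('n', 'N');   # 16-bit and 32-bit templates
    my ($magic, $ifd) = unpack("x2 $s $l", $hdr);
    die "$path: bad TIFF magic number" unless $magic == 42;

    seek($fh, $ifd, 0) or die "seek failed: $!";
    read($fh, my $n, 2) == 2 or die "$path: truncated IFD";
    my $entries = unpack($s, $n);
    my %values;
    for my $i (0 .. $entries - 1) {
        seek($fh, $ifd + 2 + 12 * $i, 0) or die "seek failed: $!";
        read($fh, my $entry, 12) == 12 or die "$path: truncated IFD entry";
        my ($tag, $type, $count) = unpack("$s $s $l", $entry);
        next unless $tag == $OFFSETS || $tag == $COUNTS;
        my $width = $SIZE{$type} or die "unexpected field type $type";
        my $raw = substr($entry, 8, 4);            # value stored inline ...
        if ($width * $count > 4) {                 # ... or at an offset
            seek($fh, unpack($l, $raw), 0) or die "seek failed: $!";
            read($fh, $raw, $width * $count) == $width * $count
                or die "$path: truncated value array";
        }
        $values{$tag} = [ unpack(($width == 2 ? $s : $l) . $count, $raw) ];
    }
    close $fh;
    return ($size, $values{$OFFSETS} || [], $values{$COUNTS} || []);
}

# Build one line of text per strip, noting strips that overrun the file.
sub report {
    my ($path) = @_;
    my ($size, $off, $cnt) = read_strips($path);
    die "$path: offset/count arrays differ in length" unless @$off == @$cnt;
    my @lines;
    for my $i (0 .. $#$off) {
        my $end  = $off->[$i] + $cnt->[$i];
        my $over = $end - $size;
        push @lines, sprintf("strip %d: offset %d, count %d, ends at %d%s",
            $i, $off->[$i], $cnt->[$i], $end,
            $over > 0 ? " ($over byte(s) past end of file)" : '');
    }
    return @lines;
}

print "$_\n" for map { report($_) } @ARGV;
```

Run it with one or more file names as arguments. A strip reported as ending past the end of the file would be consistent with the off-by-one count described above, although it does not by itself prove that the count is the only problem.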

Phil Harvey

Quote from: perlcat on February 16, 2011, 11:39:57 AM
%Image::ExifTool::UserDefined = (
    'Image::ExifTool::Exif::Main' => {
        0x117 => {
            Name => 'StripByteCounts',
            Writable => 'int32u',
            WriteGroup => 'IFD0',
        },
    },
);


Is that correct?

For a normal tag, this would be OK.  However, this tag is special and complicated because it refers to image data (in conjunction with StripOffsets), and the type of data is different depending on the context in which this tag is used.  You can see this in the definition for StripByteCounts in the source code for the ExifTool Exif module.

Unfortunately, this complication means that you cannot use ExifTool to write this tag manually.  I just tried, and it breaks the internal logic that copies the image data.

Quote
I am greatly confused by the tag offset as opposed to the tag id

If you are talking about the value of 0x117, this is the tag ID (in hex).  But it is a moot point since you can't use ExifTool to write this tag.

Sorry.

- Phil

Phil Harvey

[from a private message]

Quote from: perlcat on February 16, 2011, 12:51:41 PM
When you say "can't", I'm going to assume that you mean really, really, really, shouldn't.

Because I just used ExifTool about 20 seconds ago to do it, and while it must be a bug, the disappearance of the config examples that do it and your reply tell me that you aren't going to want that posted, and I'm going to have to respect that and not contradict your post. I assume you have your reasons. If it is just the 'users will make you crazy with stupid user tricks' answer, I'd appreciate knowing that. I can assure you that the code would never make it out in public.

If you paste this into a file, save, and run it, it complains and does not write the new file.

However, if you do perl -l, and paste it into the command line, it complains but writes it with the amended StripByteCounts anyway.

use strict;
use warnings;
use Image::ExifTool;

my $exif    = Image::ExifTool->new;
my $srcfile = 'fcd_OMA_IS_0112014609_1.tif';
my $dstfile = 'fixed.tif';

# Read the metadata and list every tag that was found.
$exif->ExtractInfo($srcfile);
my @taglist = $exif->GetFoundTags();

foreach my $tag (@taglist) {
    if ($tag =~ /StripByteCounts/) {
        # Decrement the stored byte count by one and queue the new value.
        my $bytecount = $exif->GetValue($tag);
        print "$tag value is $bytecount \n";
        $bytecount--;
        print "setting it to $bytecount\n";
        my $rc = $exif->SetNewValue($tag, $bytecount);
    }
}

print "writing it.\n";
my $success = $exif->WriteInfo($srcfile, $dstfile);


The file I used is a medical record, so I am afraid I can't furnish that one for you to look at. If I find another, I'll send it.

Quote from: perlcat on February 16, 2011, 03:08:10 PM
When you run perl in line mode (-l), it appends a line feed to each line of output.

My file is one byte short in length according to StripByteCounts, and this side effect makes it work [facepalm]. So no bug that you have to do anything with.

Sorry if you went to any trouble on this.
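The line-mode side effect can be reproduced without ExifTool. This is a minimal sketch, assuming that `perl -l` with no argument sets the output record separator `$\` to a newline, so that every `print` call emits one extra byte. The in-memory filehandles and variable names are only for illustration.

```perl
use strict;
use warnings;

my $data = "ABCD";
my ($plain, $line_mode) = ('', '');

# Normal mode: print writes exactly the bytes it is given.
{
    open(my $fh, '>', \$plain) or die "open failed: $!";
    print $fh $data;
    close $fh;
}

# Simulate -l: with $\ set to a newline, each print appends one byte.
{
    local $\ = "\n";
    open(my $fh, '>', \$line_mode) or die "open failed: $!";
    print $fh $data;
    close $fh;
}

# prints: plain: 4 bytes, with $\ set: 5 bytes
printf "plain: %d bytes, with \$\\ set: %d bytes\n",
    length($plain), length($line_mode);
```

Any file written through `print` while `$\` is set grows in the same way, which would account for the image gaining the byte it was missing.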

I said "can't" because I edited Exif.pm to make this tag writable and it didn't work.  I didn't test to see if it worked with your config file.  If it does, then I would definitely not recommend doing this because it could easily corrupt the image, and would certainly not work in general for all types of images.  But if it works for you, then great. :)

I don't know what you are talking about with disappearing config examples.  If something disappeared from this forum please let me know because I certainly never delete anything except spam.

- Phil