Hi
I am writing a Perl script that exports image metadata to a text file.
The file will contain both the EXIF value and the printed value.
Basically I iterate over the groups and the tags, get both values for each tag, and print them (currently to the screen, not yet to the actual file):
my @valueList = $exifTool->GetValue($tag, 'Both');
my $val = $valueList[0];
my $valPrintable = $valueList[1];
print $val;
print $valPrintable;
Everything works fine, but I am running into problems with some values.
For instance, with CFAPattern, printing the value inserts some strange characters that cannot be decoded.
If I use exiftool, I see that this value is printed as "....."
The same happens, for instance, with IPTC:CodedCharacterSet. In exiftool it's printed as ".%G", but in my script the first character cannot be decoded properly.
Is there any way in the script to detect that a value cannot be printed and to substitute it with a valid printable string, in the same way as exiftool does?
best regards
You could do something like this to translate control characters to Perl-like escape sequences:
$val =~ s/([\0-\x1f])/sprintf('\\x%.2x',ord $1)/eg;
The specific problem with CFAPattern is that the value is a binary block of 4 bytes. It may make more sense if I translated this to numbers with the -n option. I'll look into this.
- Phil
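A minimal sketch of the two ideas above, using only core Perl. The helper names escape_control and bytes_to_numbers are hypothetical, the 4-byte CFAPattern value is made up for illustration, and the CodedCharacterSet value is assumed to be ESC % G. The first helper wraps the substitution shown above; the second shows one possible way to print a short binary block as numbers, which is not necessarily what the -n option will end up doing.

```perl
use strict;
use warnings;

# Hypothetical helper: replace control characters (0x00-0x1f) with
# Perl-like \xNN escapes, using the substitution from the post above.
sub escape_control {
    my ($val) = @_;
    $val =~ s/([\0-\x1f])/sprintf('\\x%.2x', ord $1)/eg;
    return $val;
}

# Hypothetical helper: show a short binary block as space-separated
# byte values instead of raw characters.
sub bytes_to_numbers {
    my ($val) = @_;
    return join ' ', unpack('C*', $val);
}

my $cfa = "\x00\x01\x01\x02";            # made-up 4-byte value
print escape_control($cfa), "\n";        # prints \x00\x01\x01\x02
print bytes_to_numbers($cfa), "\n";      # prints 0 1 1 2
print escape_control("\x1b%G"), "\n";    # prints \x1b%G
```

Note that the character class only covers 0x00-0x1f, so bytes in the range 0x7f-0xff pass through unchanged.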
With that change the non-printable characters are converted and everything works fine.
Thanks a lot for your help.
I really appreciate it.
EDIT: After some more testing, it seems I have found this problem with other fields too.
For instance, some Pentax_XXX tags inside DNG files produce the same encoding error, even with that line.
Anyway, now that you've shown me the way to do it, I will try to find a regular expression that handles more of these cases.
In fact, I am not using those binary fields. Maybe there is a way to detect whether a field is non-printable, and skip it?
best regards
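One possible way to detect and skip such values, as a rough sketch using only core Perl. The helper name is_printable is hypothetical, and the rules it applies are assumptions: a value is rejected if it is a reference (ExifTool returns some binary data as SCALAR references), if it contains control characters other than tab, LF and CR, or if it is not valid UTF-8.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: true if $val looks safe to print as text.
sub is_printable {
    my ($val) = @_;
    return 0 unless defined $val;
    return 0 if ref $val;    # e.g. a SCALAR reference to binary data
    # Reject control characters other than tab, LF and CR.
    return 0 if $val =~ /[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]/;
    # Reject byte sequences that are not valid UTF-8. Work on a copy,
    # because decode() may modify its input when a CHECK value is given.
    my $copy = $val;
    return eval { Encode::decode('UTF-8', $copy, Encode::FB_CROAK); 1 } ? 1 : 0;
}

# Only "Canon" passes the check and is printed.
for my $val ("Canon", "\x1b%G", "\xff\xd8\xff", \"binary") {
    next unless is_printable($val);
    print "$val\n";
}
```

If the values are known to use a different encoding, such as Latin-1, the UTF-8 check would need to be changed or dropped.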
Well, while using exiftool to test these encoding problems, I just found out that exiftool can write its output in JSON format. :o
I should read the man page more often :)
My script creates a JSON file with the metadata structure, so I think this problem must already be solved in the exiftool source code.
I will look into it to see if I can find out how. I also think I can play with the "Escape" option.
By the way, thanks a lot for your great tool and your help, Phil.
best regards
Yes, there are a few tricks with JSON encoding, but I think the -json option should take care of this.
- Phil
The problem is that I am using the library directly, and not the exiftool application, so I cannot use the -json flag.
But I am reviewing the exiftool code, and I am finding it very useful.
I think I can reuse some of the encoding routines you have there.
best regards.
Quote from: nestochi on June 30, 2011, 07:53:43 AM
the problem is that i am using the library directly, and not the exiftool. So, i cannot use the -json flag.
Right. I forgot what section of the forum we were in.
Quote
But i am reviewing the exiftool code, i am finding it very useful.
I think i can reuse some of the encoding routines you have there
Excellent.
- Phil
Well, as I expected after reading the exiftool manpage, everything was in there from the beginning :-)
Now I invoke the EscapeJSON($;$) subroutine before writing the value to the file, and everything works great.
The unprintable characters are gone, even with some weird unknown binary fields.
Thanks again for your time ...
best regards
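For readers who would rather not copy a routine out of the exiftool script, here is a hedged sketch of the same idea using the core JSON::PP module as a stand-in. This is not ExifTool's EscapeJSON and its output may differ; the helper name json_string is hypothetical, and the value is assumed to be defined.

```perl
use strict;
use warnings;
use JSON::PP ();

# ascii: escape characters above 0x7f as \uXXXX.
# allow_nonref: allow encoding a bare string rather than an array or hash.
my $json = JSON::PP->new->ascii->allow_nonref;

# Hypothetical helper: return $val as a quoted, escaped JSON string.
sub json_string {
    my ($val) = @_;
    return $json->encode("$val");    # "$val" forces string context
}

# Control characters and high bytes come out as JSON escape sequences.
print json_string("\x1b%G"), "\n";
print json_string("\x00\x01\x01\x02"), "\n";
```

Bytes above 0x7f are treated here as Latin-1 characters, which keeps the output valid JSON but does not interpret the data in any meaningful way, so skipping binary fields may still be the better choice when they are not needed.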