Repairing files modified by Windows Photo Viewer?

Started by InTheWorks, December 15, 2013, 11:08:39 PM

InTheWorks

I did something stupid and now I could really use some expert help.  I hit the rotate button in Windows 7 photo viewer while browsing images on my sd card.  Now the picture is permanently rotated and the camera no longer previews the image properly.  Why they would write software that allows this is beyond me.

So I am looking to find out how to "fix" this image so that it previews in the camera again.  The camera in question is a Canon Rebel Xsi.  I'm doing this on Ubuntu (10.04).  My naive approach went like this:

To undo the rotation in a lossless fashion I used jpegtran:
jpegtran -copy all -rotate 90 -perfect -outfile out.jpg in.jpg

I used the nautilus-image-converter plugin to create a thumbnail of the image correctly rotated:
convert out.jpg -resize 160x120 out-thumb.jpg

and then I used exiftool to insert the thumbnail:
exiftool "-thumbnailimage<=out-thumb.jpg" out.jpg

And this didn't fix the problem.  So I looked a little deeper and the original camera images start with:

ff d8 ff e1 (Exif II)

and the one modified by windows photo viewer starts with:

ff d8 ff e0 (JFIF)
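
The two byte sequences above differ only in the first segment marker after the JPEG start-of-image marker (ff d8): ff e1 is an APP1 segment (used for Exif) and ff e0 is an APP0 segment (used for JFIF). A minimal Python sketch for checking which one a file starts with is shown here. The function name first_segment is hypothetical and the code only inspects the first segment, so treat it as an illustration rather than a full JPEG parser.

```python
import struct

# Names for the two application segments discussed in the thread.
APP_NAMES = {0xE0: "APP0", 0xE1: "APP1"}


def first_segment(data: bytes) -> tuple:
    """Return (marker name, identifier) for the first segment after SOI."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker ff d8")
    if len(data) < 6 or data[2] != 0xFF:
        raise ValueError("no segment marker after SOI")
    marker = data[3]
    # Segment length is big-endian and counts its own two bytes.
    (length,) = struct.unpack(">H", data[4:6])
    payload = data[6:4 + length]
    # The identifier is the NUL-terminated string at the start of the payload.
    ident = payload.split(b"\x00", 1)[0]
    return APP_NAMES.get(marker, f"0x{marker:02X}"), ident
```

Called on the first bytes of a camera file this should report APP1 with the identifier Exif, and on the Windows-modified file APP0 with the identifier JFIF.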

This stuff is way above my head.  So off to google I went and the only topic I could find about this (which exactly matches my problem) doesn't have a resolution:

http://superuser.com/questions/523338/how-to-convert-jpeg-jfif-files-to-jpeg-exif-format

When I try the command from the above thread:

exiftool "-exif:all<jfif:all" "-thumbnailimage<jfif:thumbnailimage" out.jpg

The output file still starts with JFIF.

Any ideas?  Is it possible to revert this file back to the Canon format complete with all tags (that were not lost by windows)?

Thanks for any help.


Phil Harvey

To remove the JFIF information, do this:

exiftool -jfif:all= FILE
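
To illustrate what removing the JFIF information amounts to at the byte level, here is a rough Python sketch that drops a leading JFIF APP0 segment and leaves everything else alone. This is not how ExifTool is implemented; strip_jfif is a hypothetical helper, and it assumes the JFIF segment is the very first segment after the start-of-image marker.

```python
import struct


def strip_jfif(data: bytes) -> bytes:
    """Drop a leading JFIF APP0 segment, leaving the rest of the JPEG untouched."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker ff d8")
    is_app0 = data[2:4] == b"\xff\xe0"
    is_jfif = data[6:11] == b"JFIF\x00"
    if not (is_app0 and is_jfif):
        return data  # nothing to remove
    (length,) = struct.unpack(">H", data[4:6])
    # Skip the 2-byte marker plus `length` bytes (the length counts itself).
    return data[:2] + data[4 + length:]
```

For real files the exiftool command above is the safer choice, since it also rewrites the remaining metadata consistently.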

You should also probably run a controlled test to see what else may have changed.  Run this command before and after rotating with Windows, and compare the outputs:

exiftool -a -G1 FILE

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

InTheWorks

Quote from: Phil Harvey on December 16, 2013, 08:13:43 AM
To remove the JFIF information, do this:

exiftool -jfif:all= FILE


This worked, thanks.   If I compare this file to an unaltered camera file there is a difference in the exif header.

Start of the altered file after exiftool removed the JFIF segment:

00000000  ff d8 ff e1 49 b0 45 78  69 66 00 00 4d 4d 00 2a  |....I.Exif..MM.*|


unaltered:

00000000  ff d8 ff e1 48 b8 45 78  69 66 00 00 49 49 2a 00  |....H.Exif..II*.|


Can I make the output of exiftool match the unaltered file?

Quote
You should also probably run a controlled test to see what else may have changed.  Run this command before and after rotating with Windows, and compare the outputs:

exiftool -a -G1 FILE

That is a great idea, thanks!  I'll try that out when I get a chance.

Phil Harvey

Wow.  I have always strongly advised against changing the byte order of EXIF information.  Apparently Windows has done this since the EXIF in your file changed from II (little-endian) to MM (big-endian).  Bummer.  In general, there is no way to recover 100% after something has tried to change the byte order.  BTW, this goes against the MWG recommendation, which states:

        Exif metadata is formatted as a TIFF stream, even in JPEG files.  TIFF streams have
        an explicit indication of being big endian or little endian.  A Changer SHOULD
        preserve the existing byte-order.


This is ironic because Microsoft is a member of the MWG group.
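
For readers who want to check the byte order without ExifTool, here is a small Python sketch that reads it from the start of a JPEG. It assumes the APP1 Exif segment is the first segment after the start-of-image marker, as in the two hex dumps quoted earlier in this thread; the function names are hypothetical. Note that the APP1 length field is a JPEG marker length and is big-endian in both files; only the TIFF stream inside it follows the II/MM mark.

```python
import struct


def exif_byte_order(data: bytes) -> str:
    """Return 'II' or 'MM' for a JPEG whose first segment is APP1/Exif."""
    if data[:4] != b"\xff\xd8\xff\xe1" or data[6:12] != b"Exif\x00\x00":
        raise ValueError("expected SOI followed by an APP1 Exif segment")
    order = data[12:14]
    if order not in (b"II", b"MM"):
        raise ValueError("unknown TIFF byte-order mark")
    # The TIFF magic number 42 is stored in the declared byte order.
    fmt = "<H" if order == b"II" else ">H"
    (magic,) = struct.unpack(fmt, data[14:16])
    if magic != 42:
        raise ValueError("bad TIFF magic number")
    return order.decode("ascii")


def app1_length(data: bytes) -> int:
    """APP1 segment length; JPEG marker lengths are always big-endian."""
    (length,) = struct.unpack(">H", data[4:6])
    return length
```

Fed the 16 bytes from the dumps above, the altered file yields MM with a segment length of 18864 (0x49b0), and the unaltered file yields II with a length of 18616 (0x48b8).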

It is possible to use ExifTool to change the byte ordering, but as I said, I don't recommend it:

exiftool -all= -tagsfromfile @ -all:all -unsafe -exifbyteorder=II FILE

This is a variation of the command described in FAQ number 20.

- Phil

InTheWorks

Thanks again for the tip.  It appears to have changed the byte order back.

I compared a file before and after rotation/save with Windows Photo Viewer using:

exiftool -a -G1 FILE

Apparently changing the byte order is not all they do.  They even changed the Y Cb Cr sub sampling from:

[File]          Y Cb Cr Sub Sampling            : YCbCr4:2:2 (2 1)

to

[File]          Y Cb Cr Sub Sampling            : YCbCr4:4:0 (1 2)

I presume that I will have to reencode the file with the proper sub sampling for my camera to see it?  The file sizes are practically identical though.

They were nice enough to leave incriminating evidence:

[File]          Exif Byte Order                 : Big-endian (Motorola, MM)
[IFD0]          Software                        : Microsoft Windows Photo Viewer 6.1.7600.16385
[XMP-rdf]       About                           : uuid:faf5bdd5-ba3d-11da-ad31-d33d75182f1b
[XMP-xmp]       Creator Tool                    : Microsoft Windows Photo Viewer 6.1.7600.16385


Is the thumbnail offset a problem?  In the camera file it's:

[IFD1]          Thumbnail Offset                : 9660
[IFD1]          Thumbnail Length                : 7643
[Composite]     Thumbnail Image                 : (Binary data 7643 bytes, use -b option to extract)

and in the windows photo viewer rotated version it's:

[IFD0]          Padding                         : (Binary data 2060 bytes, use -b option to extract)
[IFD1]          Thumbnail Offset                : 13890
[IFD1]          Thumbnail Length                : 4702
[Composite]     Thumbnail Image                 : (Binary data 4702 bytes, use -b option to extract)

Otherwise the tags look pretty similar.

I realise I can't recover the file exactly, but I was hoping I could put it back onto the camera in a readable state just the same.  Mostly I'd like to understand/document the evil they did in its entirety.

Phil Harvey

Quote from: InTheWorks on December 18, 2013, 02:38:20 AM
I presume that I will have to reencode the file with the proper sub sampling for my camera to see it?

Yes.  Some cameras do not display images with different sub-sampling.

Quote
Is the thumbnail offset a problem?

No.  This is mentioned in FAQ 13.

Quote
Mostly I'd like to understand/document the evil they did in its entirety.

Good idea.  I hope you post this somewhere to help more people become aware of this problem.

- Phil

InTheWorks

Quote from: Phil Harvey on December 18, 2013, 07:40:15 AM
Quote from: InTheWorks on December 18, 2013, 02:38:20 AM
Is the thumbnail offset a problem?

No.  This is mentioned in FAQ 13.

Ok, I haven't tested this file in the camera yet, but, all the same, is there a way to remove the padding?

exiftool isn't changing the byte order any longer using:

exiftool -all= -tagsfromfile @ -all:all -unsafe -exifbyteorder=II out.jpg
    1 image files updated


I'm not sure why.  I've tried it before/after each step of the "repair process" and the byte order remains unchanged.  Any ideas?

PH Edit: fixed quoting

Phil Harvey

Quote from: InTheWorks on December 19, 2013, 05:21:04 AM
is there a way to remove the padding?

The command to change byte order should do this.  Alternatively, -padding= should also do it.

Quote
exiftool isn't changing the byte order any longer using:

exiftool -all= -tagsfromfile @ -all:all -unsafe -exifbyteorder=II out.jpg
    1 image files updated


I'm not sure why.  I've tried it before/after each step of the "repair process" and the byte order remains unchanged.  Any ideas?

No idea.  This should work unless you are using a really old version of ExifTool.  Send me the image and I'll take a look.

- Phil

InTheWorks

Quote
Quoteexiftool isn't changing the byte order any longer using:

exiftool -all= -tagsfromfile @ -all:all -unsafe -exifbyteorder=II out.jpg
    1 image files updated


I'm not sure why.  I've tried it before/after each step of the "repair process" and the byte order remains unchanged.  Any ideas?

No idea.  This should work unless you are using a really old version of ExifTool.  Send me the image and I'll take a look.

I was just about to send you the image when I thought maybe I should check the version.  As I mentioned I'm using Ubuntu 10.04 with whatever exiftool was in the repository.  That version happens to be 7.89.  So I downloaded 9.44 and now the byte order is working.  I should have checked this earlier, but thanks for offering to have a look nonetheless.  And thanks also for all the help and for creating exiftool which is a pretty wicked program.

So either it never worked with 7.89 and I only imagined it, or for some reason it stopped working.  But it works in 9.44.

Continuing on with this experiment...

Microsoft inserts the following extra tags:

[IFD0]          Software                        : Microsoft Windows Photo Viewer 6.1.7600.16385
[ExifIFD]       Offset Schema                   : 4210
[XMP-x]         XMP Toolkit                     : Image::ExifTool 9.44
[XMP-xmp]       Creator Tool                    : Microsoft Windows Photo Viewer 6.1.7600.16385

and can be removed with:

exiftool "-software=" "-offsetschema=" "-xmptoolkit=" "-creatortool=" out.jpg

Is the Offset Schema important?

Next, I would like to set the Modify Date equal to the "creation date"  which is easily done with:

exiftool "-modifydate<datetimeoriginal" out.jpg

though I should maybe leave it alone so there is some indication the file was tampered with.  Similarly, I'd like to correct the file's date, which I think can be done with:

exiftool "-filemodifydate<datetimeoriginal" out.jpg


However, time on the original file was 22:33, but the time after exiftool corrects it is 21:33.  1 hour off.

-rwxr-xr-x 1 me me 5.9M 2008-06-14 22:33 IMG_0076.jpg
-rwxr-xr-x 1 me me 5.9M 2008-06-14 21:33 out.jpg


Any idea why?

I also discovered that changing the byte order also changed the YUV subsampling, so the ffmpeg resampling step is not necessary.  I see no difference in the JPEG when viewed.  So I'm wondering if the Exif data was simply displayed wrong?  Using imagemagick's identify:

original (little endian)

    jpeg:sampling-factor: 2x1,1x1,1x1

windows rotated (big endian)

    jpeg:sampling-factor: 1x2,1x1,1x1


Is it feasible that the sampling factor is multibyte (and affected by byte order)?

Now I've managed to almost completely revert the image back to how it was before windows touched it (accompanied with some unavoidable loss in quality) except for the thumbnail.  I made a mistake before.  The convert command comes from imagemagick not nautilus-image-converter.

My earlier attempt at convert created a thumbnail that was 160x107.   Looking at the camera thumbnail, the image size is 160x120, but it's also letter boxed.  Reading up a bit on imagemagick's convert command, a better thumbnail can be generated with:

convert out.jpg -thumbnail '160x120>' -background transparent \
                -gravity center -extent 160x120 -quality 91 out-thumb.jpg


The resulting thumbnail is not exactly like the one the camera made (vertically offset by 1 pixel), but it's really close and I don't know how to do any better.
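
The 160x107 and 160x120 figures above follow from simple arithmetic, sketched here in Python. The source size of 4272x2848 is an assumption (a full-resolution 3:2 frame); the thread only states the resulting thumbnail sizes. The letterbox helper is hypothetical and places the odd leftover row at the bottom, which is one of two equally valid choices.

```python
def letterbox(src_w: int, src_h: int, box_w: int = 160, box_h: int = 120) -> dict:
    """Fit (src_w, src_h) inside the box; return the fitted size and padding."""
    scale = min(box_w / src_w, box_h / src_h)
    fit_w = round(src_w * scale)
    fit_h = round(src_h * scale)
    pad_x = box_w - fit_w
    pad_y = box_h - fit_h
    # Split odd padding so the extra pixel goes to the bottom/right.
    return {
        "size": (fit_w, fit_h),
        "left": pad_x // 2, "right": pad_x - pad_x // 2,
        "top": pad_y // 2, "bottom": pad_y - pad_y // 2,
    }


# Assumed 3:2 source frame; gives a 160x107 image with 13 rows of padding.
print(letterbox(4272, 2848))
```

Because 13 rows of padding cannot be split evenly, a tool has to choose 6 above and 7 below or the reverse. That may be where the 1 pixel vertical difference from the camera's thumbnail comes from, though this is a guess.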

The bad news is that the fixed file does not preview in the camera so the thumbnail aspect is broken.

Which has me wondering about the thumbnail offset.  I would expect that now that all the tags are the same, the thumbnail would start at the same place, but it doesn't.  Does that mean some extra "data" that was in the original is now missing?  I don't see any differences with

exiftool -a -G1 FILE

except for the offset now.  Original:

[IFD1]          Thumbnail Offset                : 9660
[IFD1]          Thumbnail Length                : 7643

"fixed":
[IFD1]          Thumbnail Offset                : 8882
[IFD1]          Thumbnail Length                : 7423


Any insights on why the thumbnail in the original file might start later, or why the camera can't show the thumbnail?  Does the camera expect the thumbnail to be at some specific offset?  This is a Canon Rebel Xsi.

InTheWorks

Is there any way to save the output of htmlDump into "formatted" text?  I tried saving the webpage as text, but that didn't work.

I tried to diff the output of hexdump -C (29MB each file), but without cutting out the JPEG portion of the file it's unusable.

Phil Harvey

Quote from: InTheWorks on December 19, 2013, 05:48:41 PM
Is the Offset Schema important?

No.  It is an ill-conceived tag created by Microsoft.

Quote
However, time on the original file was 22:33, but the time after exiftool corrects it is 21:33.  1 hour off.

This is a known bug in Windows.

Quote
Is it feasible that the sampling factor is multibyte (and affected by byte order)?

I don't think so.
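
To make this concrete: in the JPEG frame header (the SOF segment) each component's horizontal and vertical sampling factors are packed into a single byte, H in the high four bits and V in the low four bits, so there is no multi-byte value for a byte-order change to affect. The Python sketch below reads them; sampling_factors is a hypothetical helper that only walks the header segments before the first SOF. One possible explanation for 2x1 turning into 1x2 is the 90-degree lossless rotation itself, which would swap the two factors, but that is an inference and is not confirmed in this thread.

```python
import struct

SOF_MARKERS = {0xC0, 0xC1, 0xC2}  # baseline, extended sequential, progressive


def sampling_factors(data: bytes) -> list:
    """Return [(H, V), ...] per component from the first SOF segment."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker ff d8")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker")
        marker = data[pos + 1]
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker in SOF_MARKERS:
            # Body layout: precision(1) height(2) width(2) count(1) components(3 each)
            body = data[pos + 4:pos + 2 + length]
            count = body[5]
            factors = []
            for i in range(count):
                packed = body[6 + 3 * i + 1]  # one byte: H high nibble, V low nibble
                factors.append((packed >> 4, packed & 0x0F))
            return factors
        pos += 2 + length  # skip the marker and the whole segment
    raise ValueError("no SOF segment found")
```

A result of [(2, 1), (1, 1), (1, 1)] corresponds to the 2x1,1x1,1x1 line reported by identify, and [(1, 2), (1, 1), (1, 1)] to 1x2,1x1,1x1.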

Quote
Any insights on why the thumbnail in the original file might start later, or why the camera can't show the thumbnail?  Does the camera expect the thumbnail to be at some specific offset?  This is a Canon Rebel Xsi.

Use the -htmlDump option to see the full layout of the EXIF information.  But I can tell you right now that the thumbnail offset won't matter.

- Phil