Olympus raw image PreviewImageStart tag

Started by NoTan2, February 18, 2019, 11:02:04 PM


NoTan2

Hi Everyone,

My first post here. I'm currently writing some code as a workaround for an apparent defect either in Microsoft WIC or in Olympus codecs.
I have two Olympus cameras - an E-M5 and an E-P5. Windows Explorer (Windows 10) has always happily displayed thumbnails for the raw images (.ORF) produced by the E-M5, but those from the E-P5 showed as a generic icon. I've always ignored this as the images opened just fine in my post-processing software.

However, it has recently become an issue. I discovered as part of the investigation that if I used a hex/binary editor to change one byte of the file - the Camera Model from "P" to "M" (i.e. from "E-P5" to "E-M5") - then Windows happily displayed thumbnails. Clearly there's nothing "wrong" with the files from the E-P5. Changing the model to one of the other Olympus models which have a four-character name also resolves the issue. My best guess so far is that Olympus "forgot" about the E-P5 when creating their codec(s).

Anyway, to work around this, I'm writing a utility which will extract and display the embedded JPEG preview image.

Some background. Thirty years or so ago, I worked as an imaging programmer. I was tasked with writing programs to manage and then display images from a variety of sources - all black and white in those days (mostly text). There were no available image management libraries at the time so I had lots of fun learning how to decompress and compress CCITT G3 and G32D images which were generally packaged in TIFF. All written in C and assembler.

Back to the present. After poking around in the .ORF image files, it became evident that the EXIF tags bore a strong relationship to the TIFF tags of my previous life. So I wrote some code (C#) to traverse the tags and see what I could find. Eventually I discovered the PreviewImageStart and PreviewImageLength tags. Using the value in the PreviewImageStart tag as an offset from the start of the file, I wrote out PreviewImageLength bytes to a file and gave it a .jpeg extension. The result was not a valid JPEG.

Back to the hex editor, where I started looking for a JPEG magic number and found it about 3 KB further into the file, at offset 0xcc00. The JPEG preview seems to always begin at the same offset in the files from both cameras, but the two cameras have different PreviewImageStart values.

I've adjusted the code to start at offset PreviewImageStart and look for the magic numbers and extract from there and it seems to work fine. But I'm obviously unhappy with a "hacked" solution such as this.
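For readers following along, the scan-forward workaround described above might look roughly like this. It is only a sketch in Python rather than the C# actually used, the function names are made up for illustration, and it assumes the whole file has been read into memory and that the first FF D8 FF sequence at or after the starting offset really is the start of the preview.

```python
# JPEG streams begin with the SOI marker (FF D8), followed by the FF that
# introduces the next marker segment.
JPEG_SOI = b"\xff\xd8\xff"


def find_jpeg_start(data: bytes, start: int = 0) -> int:
    """Return the offset of the first JPEG SOI marker at or after start, or -1."""
    return data.find(JPEG_SOI, start)


def extract_preview_by_scan(data: bytes, start: int, length: int) -> bytes:
    """Scan forward from start for the JPEG magic number, then copy length bytes.

    start and length stand in for the raw PreviewImageStart and
    PreviewImageLength tag values (hypothetical parameter names).
    """
    pos = find_jpeg_start(data, start)
    if pos < 0:
        raise ValueError("no JPEG magic number found after the given offset")
    return data[pos:pos + length]
```

The weakness is the one already noted: the scan relies on a byte pattern rather than on the file's own structure, so any earlier FF D8 FF sequence would be picked up by mistake.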

At this point, a friend suggested ExifTool which I'd heard of but never investigated. What a magnificent piece of work. I downloaded and asked it to extract the previews from a few images and it worked fine. Then I discovered this forum where people are happy to dig around in the gritty bits of image files.

So, apologies that my request is not strictly about ExifTool but this is the only place I could find where somebody might know the answer.

Is my approach the only way to locate the embedded JPEG preview in an .ORF since the offset tag appears to be incorrect - or am I just misunderstanding its implementation?

Many Thanks,
Paul.

NoTan2

Further testing.
I was thinking that perhaps the PreviewImageStart value might be an offset from a known position in the file rather than from the beginning.
So I subtracted the values in each of the file types from 0xcc00 and found that it pointed in each case to the text "OLYMPUS" followed by 0x00.
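The subtraction test above can be sketched as follows. This is an illustration only: the helper names are invented, and the offsets in the usage example are made-up numbers, not values from either camera.

```python
OLYMPUS_SIGNATURE = b"OLYMPUS\x00"  # the text found at the implied base position


def implied_base(actual_jpeg_offset: int, stored_preview_start: int) -> int:
    """Position that the stored PreviewImageStart would have to be relative to."""
    return actual_jpeg_offset - stored_preview_start


def base_has_olympus_signature(data: bytes, base: int) -> bool:
    """True if the bytes at base are 'OLYMPUS' followed by 0x00."""
    return data[base:base + len(OLYMPUS_SIGNATURE)] == OLYMPUS_SIGNATURE


# Usage with a synthetic buffer and hypothetical numbers: the JPEG sits at
# offset 500 and the stored tag value is 300, so the implied base is 200.
sample = b"\x00" * 200 + OLYMPUS_SIGNATURE + b"\x00" * 292 + b"\xff\xd8\xff"
base = implied_base(500, 300)
print(base, base_has_olympus_signature(sample, base))
```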

Now I need to find what is the significance of this location.

I also noticed, after requesting a full tag dump from ExifTool, that the PreviewImageStart value is shown as 52224 (0xcc00), which is the actual start of the preview. So ExifTool is adjusting the value before dumping it.

Phil Harvey

Use the exiftool -htmldump feature to look at the structure of the file.  You will see that the PreviewImageStart is relative to the start of the makernotes, which is found in tag 0x927c of the ExifIFD.
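As a rough illustration of that relationship (a sketch in Python, not ExifTool's own code), the makernote position can be found by walking the TIFF-style IFD entries and then added to the stored PreviewImageStart. The function names are hypothetical. The sketch assumes the file starts with a TIFF-like header (byte order mark, two further bytes, then a 4-byte offset to the first IFD), that the ExifIFD is reached through the standard 0x8769 pointer tag, and that the makernote is longer than four bytes so its entry holds an offset rather than inline data.

```python
import struct

EXIF_IFD_POINTER = 0x8769  # standard TIFF/EXIF tag pointing at the ExifIFD
MAKER_NOTE = 0x927C        # MakerNote tag inside the ExifIFD


def _find_entry(data, ifd_offset, wanted_tag, endian):
    """Return the 4-byte value/offset field of wanted_tag in this IFD, or None."""
    (entry_count,) = struct.unpack_from(endian + "H", data, ifd_offset)
    for i in range(entry_count):
        # Each IFD entry is 12 bytes: tag, type, count, value-or-offset.
        entry = ifd_offset + 2 + 12 * i
        tag, _type, _count, value = struct.unpack_from(endian + "HHII", data, entry)
        if tag == wanted_tag:
            return value
    return None


def makernote_offset(data):
    """Locate the start of the makernote data, measured from the start of the file."""
    if data[:2] == b"II":
        endian = "<"   # little-endian
    elif data[:2] == b"MM":
        endian = ">"   # big-endian
    else:
        raise ValueError("not a TIFF-style header")
    (ifd0,) = struct.unpack_from(endian + "I", data, 4)
    exif_ifd = _find_entry(data, ifd0, EXIF_IFD_POINTER, endian)
    if exif_ifd is None:
        raise ValueError("no ExifIFD pointer in IFD0")
    note = _find_entry(data, exif_ifd, MAKER_NOTE, endian)
    if note is None:
        raise ValueError("no MakerNote tag in ExifIFD")
    return note


def absolute_preview_offset(data, preview_image_start):
    """Convert a makernote-relative PreviewImageStart into a file offset."""
    return makernote_offset(data) + preview_image_start
```

Reading PreviewImageStart itself out of the makernote is left out here; the sketch only covers the base-offset adjustment described above.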

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

NoTan2

Thank you very much, Phil. That makes sense now.

I'm very impressed with the quality and breadth of the work you've done. A considerable achievement.

Thank you for sharing it all.
