Understanding the htmlDump option

Started by greybeard, November 20, 2024, 09:50:02 AM

greybeard

The htmlDump option is potentially one of the most helpful ways of digging into image file metadata, because it shows the details of the EXIF and JPEG metadata structure.

Tags whose value data does not fit within the original 12-byte tag block are highlighted in blue or purple when the dump is displayed by a suitable browser. When one of these tags is selected, the dump shows the group, group sequence, tag name, tag ID (in hex), format, size, value and up to 4 offsets (in hex).
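As background for the "does not fit" case, here is a minimal Python sketch of how a standard TIFF/EXIF IFD entry is laid out: 2 bytes of tag ID, 2 bytes of format code, 4 bytes of count, and 4 bytes that hold either the value itself or an offset to it. The helper name parse_ifd_entry and the sample entry are hypothetical and only for illustration. This is not ExifTool's code, and maker notes may deviate from this layout.

```python
import struct

# Bytes per component for the TIFF 6.0 field types (type code -> size).
TYPE_SIZES = {1: 1, 2: 1, 3: 2, 4: 4, 5: 8, 6: 1,
              7: 1, 8: 2, 9: 4, 10: 8, 11: 4, 12: 8}

def parse_ifd_entry(entry: bytes, little_endian: bool = True) -> dict:
    """Decode one 12-byte IFD entry (hypothetical helper, not ExifTool code)."""
    if len(entry) != 12:
        raise ValueError("an IFD entry is exactly 12 bytes")
    order = "<" if little_endian else ">"
    tag_id, type_code, count = struct.unpack(order + "HHI", entry[:8])
    size = TYPE_SIZES[type_code] * count
    inline = size <= 4  # values of up to 4 bytes are stored in the entry itself
    # Otherwise the last 4 bytes hold an offset to the value data.
    value_offset = None if inline else struct.unpack(order + "I", entry[8:12])[0]
    return {"tag_id": tag_id, "size": size,
            "inline": inline, "value_offset": value_offset}

# Made-up example: tag 0x010F, ASCII format, 6 bytes, stored at offset 0x9E.
entry = struct.pack("<HHII", 0x010F, 2, 6, 0x9E)
info = parse_ifd_entry(entry)
```

In this sketch, info["inline"] is False and info["value_offset"] is 0x9E, which corresponds to the kind of tag that the dump highlights and reports a Value Offset for.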

Can someone explain the exact meaning of the offsets and how they relate to the value location within the image file?

The offsets are named Value Offset, Actual Offset, Offset Base and File Offset, and the number of offsets included varies depending on which type of image file is being viewed (such as jpg, raf, cr3, arw, etc.).

JPG files and Canon cr3 files have Value Offset and File Offset. The Value Offset appears to show the offset from the TIFF header and the File Offset shows the offset from the start of the file.
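For the JPEG case, the relationship described above can be checked by hand. The following Python sketch walks the JPEG segments until it finds the APP1 segment that starts with "Exif\0\0"; the TIFF header begins right after that identifier. The function name find_tiff_header and the synthetic sample bytes are hypothetical, and the walk is simplified (it assumes every segment before the scan data carries a length field).

```python
import struct

def find_tiff_header(jpeg: bytes) -> int:
    """Return the file offset of the TIFF header in the Exif APP1 segment."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG (missing SOI marker)")
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError("expected a segment marker")
        marker = jpeg[pos + 1]
        if marker == 0xDA:  # start of scan: no more metadata segments
            break
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        payload = pos + 4
        if marker == 0xE1 and jpeg[payload:payload + 6] == b"Exif\x00\x00":
            return payload + 6
        pos += 2 + length
    raise ValueError("no Exif APP1 segment found")

# Synthetic sample: SOI, a 16-byte APP0 segment, then an Exif APP1 segment.
tiff = b"II*\x00\x08\x00\x00\x00"
sample = (b"\xff\xd8"
          + b"\xff\xe0" + struct.pack(">H", 16) + b"JFIF\x00" + bytes(9)
          + b"\xff\xe1" + struct.pack(">H", 16) + b"Exif\x00\x00" + tiff)

tiff_start = find_tiff_header(sample)
file_offset = tiff_start + 0x9E  # Value Offset 0x9E is a made-up example
```

If the observation above holds, the File Offset shown in the dump should equal tiff_start plus the Value Offset for tags in the main EXIF of a JPEG.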

Sony arw and Leica dng files have a single offset (Value Offset) and this shows the offset from the TIFF header which is also the offset from the start of the file.

Nikon nef files have blue highlighted tags which match the standard used for Leica dng and Sony arw but also have purple tags with Value Offset, Actual Offset and Offset Base. The Actual Offset is the sum of the Value Offset and Offset Base fields and indicates the value location within the file.

FujiFilm RAF files have blue and purple tags and all four offsets: Value Offset, Actual Offset, Offset Base and File Offset but I can't figure out a consistent way of using them to calculate the offset of the value within the image file.

I have only looked at a single image file from each type and haven't yet looked at variations from other cameras.

Is there an algorithm for using these offsets to locate the value within the file regardless of image file format?


StarGeek

Phil will have to respond, but I believe this paragraph from "Problems with current Metadata Standards - TIFF 6.0" is part of it:
Quote: A significant problem of the 1992 TIFF 6.0 specification is that there is no way to distinguish an IFD (image file directory) offset from a simple integer value. As a result, new IFD's may not be created without risking corruption of the files by unaware software. This is not only a problem for proprietary maker notes which commonly use a TIFF IFD structure, but is also a problem for extensibility of TIFF-based RAW image formats (as demonstrated by the DNG 1.3 specification -- see below).

Phil Harvey

Quote from: greybeard on November 20, 2024, 09:50:02 AM
The offsets are named Value Offset, Actual Offset, Offset Base and File Offset, and the number of offsets included varies depending on which type of image file is being viewed (such as jpg, raf, cr3, arw, etc.).

Offsets are sometimes relative to the start of a section in the metadata (e.g. the start of the maker notes).  In these cases, ExifTool gives the stored value of the offset, the base for offset zero, and the actual EXIF-equivalent offset.  If the EXIF isn't at the start of the file then there is an additional file offset that gives the absolute offset in the file.
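One way to read this description is as simple addition. The Python sketch below assumes that the Actual Offset is the stored Value Offset plus the Offset Base, and that the File Offset is the Actual Offset plus the position of the EXIF block in the file. The function resolve_offsets, the parameter exif_start and the sample numbers are hypothetical. This is an interpretation of the explanation above, not ExifTool's implementation.

```python
def resolve_offsets(value_offset: int, offset_base: int = 0,
                    exif_start: int = 0) -> dict:
    """Combine the htmlDump offsets under the assumptions stated above.

    value_offset: the offset as stored in the tag entry
    offset_base:  where offset zero lies (e.g. the start of the maker notes)
    exif_start:   where the EXIF/TIFF header sits in the file (assumed name)
    """
    actual_offset = value_offset + offset_base   # EXIF-equivalent offset
    file_offset = actual_offset + exif_start     # absolute position in file
    return {"actual_offset": actual_offset, "file_offset": file_offset}

# TIFF header at the start of the file: actual and file offsets coincide.
nef_like = resolve_offsets(0x1A4, offset_base=0x2F0)

# EXIF embedded 12 bytes into the file, as in a typical JPEG.
jpeg_like = resolve_offsets(0x1A4, offset_base=0x2F0, exif_start=12)
```

With these made-up numbers, nef_like gives 0x494 for both offsets, while jpeg_like gives an actual offset of 0x494 and a file offset of 0x4A0.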

This works for JPEG images, but looking at it now I think there may be a problem with the reported offsets for JPEGs embedded in RAF files.  I'll have to look into this.

Quote: Nikon nef files have blue highlighted tags...

The colours are explained here.

Quote: Is there an algorithm for using these offsets to locate the value within the file regardless of image file format?

The file offset should give this, but as I said it seems to not be working properly for some types of embedded files.

- Phil

greybeard

Thanks - I had missed the colour explanations

Phil Harvey

I have fixed the problem with the incorrect file offsets in the -htmldump output for EXIF of some embedded files, and will release ExifTool 13.04 soon with this patch.

Thanks for bringing this to my attention.

- Phil

greybeard

Quote from: Phil Harvey on November 26, 2024, 08:52:24 AM
I have fixed the problem with the incorrect file offsets in the -htmldump output for EXIF of some embedded files, and will release ExifTool 13.04 soon with this patch.

Thanks for bringing this to my attention.

- Phil

Thanks again