JPEG Start of Scan (SOS) Marker

Started by xpsd300, April 22, 2019, 05:59:05 PM

xpsd300

Hi,

A publisher is asking us to provide the JPEG compression ratio for our photos.

From what I've read, this is a ratio of the maximum uncompressed size (Photoshop pixel dimensions) to the compressed file size.

I'm using the following calculation to get the maximum uncompressed size:

((document width * DPI) * (document height * DPI) * (channels * bits)) / 8

So for a 2 x 2.5 in. printed image at 300 DPI (24-bit), this computes to approx. 1.29 MB.

For the compressed size, I'm using the file size (not size on disk). For example, 200 KB.

Using these two values, I'm getting a ratio of approx. 6.6:1
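
For reference, the arithmetic above can be checked with a short Python sketch. The function names and the 1024-based units are assumptions made for illustration; the figures are the ones quoted in this post.

```python
def uncompressed_bytes(width_in, height_in, dpi, channels=3, bits=8):
    """Maximum uncompressed size in bytes for a printed image."""
    width_px = round(width_in * dpi)    # document width (in) * DPI
    height_px = round(height_in * dpi)  # document height (in) * DPI
    return width_px * height_px * channels * bits // 8


def compression_ratio(raw_bytes, compressed_bytes):
    """Uncompressed size divided by compressed size."""
    return raw_bytes / compressed_bytes


# 2 x 2.5 in. at 300 DPI, 24-bit (3 channels of 8 bits), 200 KB file
raw = uncompressed_bytes(2, 2.5, 300)
ratio = compression_ratio(raw, 200 * 1024)

print(f"{raw} bytes = {raw / 1024**2:.2f} MB (1024-based)")
print(f"{ratio:.1f}:1")
```

This gives 1350000 bytes (about 1.29 MB) and 6.6:1, matching the values above. Note that the file size here still includes all metadata.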

Our images contain a lot of metadata for version control, so I know that this is skewing the values.

Is there a way with Exiftool to extract just the compressed image data from a file, or to identify where the SOS marker begins?

Thanks.

StarGeek

I don't believe there's any direct way to get that info with exiftool.

You could use a temp file and do
exiftool -o Temp.jpg -all= Input.jpg
which would create a temp file stripped of metadata, and then get the file size of Temp.jpg.

Or, if you're on Unix or a Mac, you could use
exiftool -o - -all= Input.jpg | wc -c
and the result would be the number of bytes (wc -c counts the bytes it reads from the pipe, though it's not a Unix command I've ever used before).

Offhand I can't find a pure Windows command that would work.  There is the PowerShell Measure-Object -Character, but since PowerShell will corrupt a binary pipe by converting it to ANSI, you'd probably lose characters and get an inaccurate reading.
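
One cross-platform possibility, if Python is available, is to let Python read exiftool's binary output directly instead of going through a shell pipe. This is only a minimal sketch: strip_command and stripped_size are names made up for this example, and it assumes exiftool is on the PATH.

```python
import subprocess


def strip_command(path, exiftool="exiftool"):
    """Build the same command as above: write a metadata-stripped copy to stdout."""
    return [exiftool, "-o", "-", "-all=", path]


def stripped_size(path, exiftool="exiftool"):
    """Return the size in bytes of the stripped JPEG that exiftool writes to stdout."""
    result = subprocess.run(
        strip_command(path, exiftool),
        stdout=subprocess.PIPE,  # captured as raw bytes, no text conversion
        check=True,              # raise CalledProcessError if exiftool fails
    )
    return len(result.stdout)
```

Calling stripped_size("Input.jpg") should then return the same number that wc -c reports on Unix, without creating a temp file.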
"It didn't work" isn't helpful. What was the exact command used and the output.
Read FAQ #3 and use that cmd
Please use the Code button for exiftool output

Please include your OS/Exiftool version/filetype

StarGeek

You might take a look at JpegSnoop.  It normally runs as a gui, but you can run it on the command line with something like
jpegsnoop -i Input.jpg  -scan -o Output.txt

Phil Harvey

Using the position of the SOS marker won't give you the whole picture because there is often metadata after the EOI (MPF metadata, for example).  Also, I would include some other required JPEG segments (like SOF, DHT, etc.) in the image size.  It would probably be best just to drop the APP segments and anything after the EOI from the calculation.  I could maybe add a JPEG "MetadataSize" tag to help you here.  Let me think about this.

- Phil

Edit: No.  Adding this tag would slow down processing because ExifTool doesn't read to the EOI unless necessary, so information about the trailer size isn't generally known to ExifTool.

Edit2: ...or I could only calculate this tag if requested.  I'll try adding a JPEGImageLength tag like this in ExifTool 11.38.
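
To illustrate the idea of dropping the APP segments and anything after the EOI, here is a rough Python sketch that walks the JPEG segments itself. This is not ExifTool's code and may not match what JPEGImageLength reports; the function name jpeg_length_without_app is made up for this example, and it assumes a well-formed file.

```python
def jpeg_length_without_app(data: bytes) -> int:
    """Bytes from SOI through the first EOI, not counting APPn segments."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    app_bytes = 0
    while pos + 1 < len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        if data[pos + 1] == 0xFF:      # fill byte before the marker code
            pos += 1
            continue
        marker = data[pos + 1]
        pos += 2
        if marker == 0xD9:             # EOI: ignore any trailer after this
            return pos - app_bytes
        if marker == 0x01 or 0xD0 <= marker <= 0xD7:
            continue                   # standalone markers carry no length
        seg_len = int.from_bytes(data[pos:pos + 2], "big")
        if seg_len < 2:
            raise ValueError(f"bad segment length at offset {pos}")
        if 0xE0 <= marker <= 0xEF:     # APP0..APP15 hold the metadata
            app_bytes += 2 + seg_len   # marker bytes plus segment
        pos += seg_len
        if marker == 0xDA:             # SOS: entropy-coded data follows
            while pos + 1 < len(data):
                nxt = data[pos + 1]
                # 0xFF00 (stuffed byte) and RSTn are part of the scan data
                if data[pos] == 0xFF and nxt != 0x00 and not 0xD0 <= nxt <= 0xD7:
                    break
                pos += 1
    raise ValueError("no EOI marker found")
```

Usage would be something like jpeg_length_without_app(open("Input.jpg", "rb").read()). Segments such as DQT, SOF and DHT stay in the count, as suggested above; only the APPn segments and the trailer are left out.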
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

xpsd300

@StarGeek - Thanks for pointing out JpegSnoop. I get the following when I run a scan on one of our images:

*** Decoding SCAN Data ***
  OFFSET: 0x00007BF9
  Scan Decode Mode: No IDCT (DC only)
    NOTE: Low-resolution DC component shown. Can decode full-res with [Options->Scan Segment->Full IDCT]

  Scan Data encountered marker   0xFFD9 @ 0x0003E589.0

  Compression stats:
    Compression Ratio:  6.04:1
    Bits per pixel:     3.98:1



@Phil - From this marker, can you tell how the compression ratio is calculated? That appears to be exactly what the publisher is looking for. Is it something you'd be able to add to ExifTool?

The CSV output option of your tool has been a tremendous help to the 1000 images we're working on. Thanks.

StarGeek

Looking at the source code for JpegSnoop, I can find this bit in ImgDecode.cpp, in case that helps in deciding whether it's the data you want.  It's a bit beyond me, but it seems to me that it doesn't include the DPI calculation from your first post.

// Report Compression stats
// TODO: Should we use m_nNumSofComps?
strTmp.Format(_T("  Compression stats:"));
m_pLog->AddLine(strTmp);
float nCompressionRatio = (float)(m_nDimX*m_nDimY*m_nNumSosComps*8) / (float)((m_anScanBuffPtr_pos[0]-m_nScanBuffPtr_first)*8);
strTmp.Format(_T("    Compression Ratio: %5.2f:1"),nCompressionRatio);
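
Read as arithmetic, that line divides the uncompressed size (width x height x components x 8 bits) by the size of the scan data. Below is a rough Python sketch of the same calculation. It assumes the scan data runs from the OFFSET value to the 0xFFD9 position in the JpegSnoop output above, and it assumes a 600 x 750 px, 3-component image (the 2 x 2.5 in. at 300 DPI example from the first post; nothing in the thread confirms this is the same file). The bits-per-pixel formula is also a guess, since it is not part of the snippet.

```python
def jpegsnoop_stats(width, height, components, scan_start, scan_end):
    """Compression ratio and bits per pixel from the scan data size."""
    scan_bytes = scan_end - scan_start
    # Same shape as the C++ line: uncompressed bits / compressed bits
    ratio = (width * height * components * 8) / (scan_bytes * 8)
    # Assumed formula: compressed bits spread over all pixels
    bits_per_pixel = (scan_bytes * 8) / (width * height)
    return ratio, bits_per_pixel


# Offsets taken from the JpegSnoop output; dimensions are an assumption
ratio, bpp = jpegsnoop_stats(600, 750, 3, 0x7BF9, 0x3E589)
print(f"Compression Ratio: {ratio:.2f}:1")
print(f"Bits per pixel:    {bpp:.2f}")
```

With those assumed dimensions this prints 6.04:1 and 3.98, the same as the reported stats, which suggests the ratio is based on pixel dimensions and scan data only, with no DPI term.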

xpsd300

Thanks for posting the source code. Looking before and after that snippet, and looking at the -scan output, it appears that the m_nDimX and m_nDimY values are in pixels.

The height and width dimensions I posted before are the "document size" values in Photoshop, which give the same pixel dimensions when multiplied by DPI. For example:

Image Size Width (px) = Document Size Width (in) * DPI
Image Size Height (px) = Document Size Height (in) * DPI

Phil Harvey

With ExifTool 11.38 (just released), the attached config file will generate a Composite JPEGCompression tag for you, with a command like this:

exiftool -config jpeg_compression.config -jpegcompression FILE

- Phil

xpsd300

@Phil - Thanks so much for doing this so quickly. It works great!

I've expanded the config file to also get the Bits Per Pixel:

%Image::ExifTool::UserDefined = (
    'Image::ExifTool::Composite' => {
        JPEGCompression => {
            Require => {
                0 => 'JPEGImageLength',
                1 => 'ImageWidth',
                2 => 'ImageHeight',
                3 => 'BitsPerSample',
                4 => 'ColorComponents',
            },
            ValueConv => '(($val[1]*$val[2]*$val[3]*$val[4])/8) / $val[0]',
            PrintConv => 'sprintf("%.2f:1",$val)',
        },

        JPEGBitsPerPixel => {
            Require => {
                0 => 'JPEGImageLength',
                1 => 'ImageWidth',
                2 => 'ImageHeight',
            },
            ValueConv => '$val[0] / (($val[1]*$val[2])/8)',
            PrintConv => 'sprintf("%.2f:1",$val)',
        },
    },
);

%Image::ExifTool::UserDefined::Options = (
    RequestTags => 'JPEGImageLength'
);

1; #end

Phil Harvey

Nice.  Happy to have helped.

- Phil