ExifTool Forum

General => Metadata => Topic started by: 11august on May 26, 2025, 05:21:39 AM

Title: Tag's content
Post by: 11august on May 26, 2025, 05:21:39 AM
Hi all,

I spent the last three days extracting data from millions of photos for my project, and I came across odd contents for some tags. Can you explain what they mean? Maybe you can also help me define content limits. For instance:

- ComponentsConfiguration :

95% are the usual "Y, Cb, Cr, -" but some are with odd writing, and some came up with errors (see attached)

- CompressedBitsPerPixel :

It looks like this tag is always written as an integer or decimal number between 0 and 10. Does that sound correct?

- ColorComponents :

I only found 1, 3 or 4 as possible values. Does that sound correct to you?

- Also, is it correct to assume that the following JPEG SOF and extra tags must always be present in a JPEG file ?

-BitsPerSample -ColorComponents -EncodingProcess -ExifByteOrder -ImageHeight -ImageWidth -YCbCrSubSampling
Thanks for your help !

ComponentsConfiguration.png
Title: Re: Tag's content
Post by: StarGeek on May 26, 2025, 11:35:07 AM
Quote from: 11august on May 26, 2025, 05:21:39 AM
- ComponentsConfiguration :

95% are the usual "Y, Cb, Cr, -" but some are with odd writing, and some came up with errors (see attached)

The first thing to check would be that they are all JPEGs and not some other file type that has a .jpg extension.

But you need to remember that ComponentsConfiguration is just an EXIF tag, and it can be written with anything you want or even removed entirely from a JPEG.

C:\>exiftool -P -overwrite_original -ComponentsConfiguration#="42 21 33 99" y:\!temp\Test4.jpg
    1 image files updated

C:\>exiftool -G1 -a -s -ComponentsConfiguration y:\!temp\Test4.jpg
[ExifIFD]       ComponentsConfiguration         : Err (42), Err (21), Err (33), Err (99)

C:\>exiftool -P -overwrite_original -ComponentsConfiguration= y:\!temp\Test4.jpg
    1 image files updated

C:\>exiftool -G1 -a -s -ComponentsConfiguration y:\!temp\Test4.jpg

C:\>

Badly written software that doesn't follow the specs exists, so there's never a guarantee. Also, this is considered a Mandatory tag by the EXIF standard, so if exiftool is creating an EXIF block where one didn't exist, it will create these tags with some default values. See Phil's comments on mandatory tags (https://exiftool.org/writing.html#Mandatory).
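
The "Err (42)" entries in the output above follow from how the tag is stored: the EXIF standard defines ComponentsConfiguration as four bytes, each a code for one channel (0 = does not exist, 1 = Y, 2 = Cb, 3 = Cr, 4 = R, 5 = G, 6 = B). Below is a minimal Python sketch of that decoding, mimicking the output format shown above. The function name is made up for illustration, and this is not exiftool's actual code.

```python
# Channel codes defined by the EXIF standard for ComponentsConfiguration.
COMPONENT_NAMES = {0: "-", 1: "Y", 2: "Cb", 3: "Cr", 4: "R", 5: "G", 6: "B"}


def decode_components_configuration(raw):
    """Turn the four raw byte values into a readable string.

    Hypothetical helper: unknown codes are rendered as "Err (n)" to mirror
    the exiftool output in the transcript above.
    """
    return ", ".join(COMPONENT_NAMES.get(b, f"Err ({b})") for b in raw)


print(decode_components_configuration([1, 2, 3, 0]))      # Y, Cb, Cr, -
print(decode_components_configuration([42, 21, 33, 99]))  # Err (42), Err (21), Err (33), Err (99)
```

Any value that does not decode cleanly only tells you that some writer put unusual bytes there, not which writer or why.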


Quote
- CompressedBitsPerPixel :

It looks like this tag is always written as an integer or decimal number between 0 and 10. Does that sound correct?

Same as above:
C:\>exiftool -P -overwrite_original -CompressedBitsPerPixel=9001 y:\!temp\Test4.jpg
    1 image files updated

C:\>exiftool -G1 -a -s -CompressedBitsPerPixel y:\!temp\Test4.jpg
[ExifIFD]       CompressedBitsPerPixel          : 9001

Quote
- ColorComponents :

I only found 1, 3 or 4 as possible values. Does that sound correct to you?

A JPEG with a ColorComponents value of 1 would be a grayscale JPEG. That means not a color JPEG that merely looks gray, but one that has been saved without any color (chrominance) components at all. The jpegtran man page (https://linux.die.net/man/1/jpegtran) has a good description of it.
Quote
-grayscale
    Force grayscale output.
    This option discards the chrominance channels if the input image is YCbCr (i.e., a standard color JPEG), resulting in a grayscale JPEG file. The luminance channel is preserved exactly, so this is a better method of reducing to grayscale than decompression, conversion, and recompression. This switch is particularly handy for fixing a monochrome picture that was mistakenly encoded as a color JPEG. (In such a case, the space savings from getting rid of the near-empty chroma channels won't be large; but the decoding time for a grayscale JPEG is substantially less than that for a color JPEG.)

A value of 3 would be a standard color JPEG.

I've never seen a JPEG with a value of 4. That sounds like a JPEG with an alpha channel, which isn't something I've heard of before. This is something I'd be interested in seeing if it was an actual JPEG and not a similar file type (jpeg2000, jpxl, etc).

I don't believe this tag exists for files other than JPEG type files, though I could be wrong.

Quote
- Also, is it correct to assume that the following JPEG SOF and extra tags must always be present in a JPEG file ?

-BitsPerSample -ColorComponents -EncodingProcess -ExifByteOrder -ImageHeight -ImageWidth -YCbCrSubSampling

ExifByteOrder only exists in files that have an EXIF block.
C:\>exiftool -G1 -a -s -EXIF:All -ExifByteOrder y:\!temp\Test4.jpg
[ExifIFD]       DateTimeOriginal                : 2025:05:26 12:00:00
[File]          ExifByteOrder                   : Big-endian (Motorola, MM)

C:\>exiftool -P -overwrite_original -DateTimeOriginal= y:\!temp\Test4.jpg
    1 image files updated

C:\>exiftool -G1 -a -s -EXIF:All -ExifByteOrder y:\!temp\Test4.jpg

C:\>

But the others are properties of a JPEG file, so they should exist in JPEGs.
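
Those tags should always be there because exiftool reads them from the JPEG start-of-frame (SOF) segment, which a decoder needs in order to decode the image at all. Below is a rough Python sketch of reading that segment directly, assuming a well-formed file. The function name and the returned keys are chosen here for illustration, and the parser skips edge cases that exiftool handles.

```python
import struct

# Start-of-frame markers: 0xC0-0xCF, except DHT (0xC4), JPG (0xC8), DAC (0xCC).
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}


def read_sof(data):
    """Return the SOF fields of a JPEG given as bytes, or None if not found."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker, not a JPEG")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:          # fill byte, the marker code follows
            pos += 1
            continue
        if marker in (0xD9, 0xDA):  # EOI or start of scan: no SOF was found
            return None
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker in SOF_MARKERS:
            precision, height, width, ncomp = struct.unpack(
                ">BHHB", data[pos + 4:pos + 10])
            # Each component entry is 3 bytes: id, sampling (H high nibble,
            # V low nibble), quantization table number.
            comps = data[pos + 10:pos + 10 + 3 * ncomp]
            sampling = [(comps[i + 1] >> 4, comps[i + 1] & 0x0F)
                        for i in range(0, 3 * ncomp, 3)]
            return {
                "SOFMarker": marker,        # 0xC0 baseline, 0xC2 progressive
                "BitsPerSample": precision,
                "ImageHeight": height,
                "ImageWidth": width,
                "ColorComponents": ncomp,
                "Sampling": sampling,
            }
        pos += 2 + length           # skip the marker plus the whole segment
    return None
```

Calling read_sof(open(path, "rb").read()) on a file returns None only when no SOF segment precedes the image data, which is a sign the file is not a decodable JPEG.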
Title: Re: Tag's content
Post by: 11august on May 26, 2025, 12:57:18 PM
Thanks StarGeek again for this very complete and useful reply.

Sorry, I think I wasn't clear about my purpose, which is authentication: I'm looking for tags that should always be present (per the EXIF specification) as long as no third-party software has removed them. From what I understand, it looks very easy to remove any EXIF tag using ExifTool without leaving any trace. Am I right?

In that sense, and if I understand you correctly, is "Y, Cb, Cr, -" the only native way the ComponentsConfiguration tag is written? I mean, do any other forms ("Y", "-, -", etc.) come from "badly written software that doesn't follow the specs"? Same question for CompressedBitsPerPixel.


Quote
I've never seen a JPEG with a value of 4. That sounds like a JPEG with an alpha channel, which isn't something I've heard of before. This is something I'd be interested in seeing if it was an actual JPEG and not a similar file type (jpeg2000, jpxl, etc).

I do indeed work solely on JPEG files. That's why, before any ExifTool extraction, I use the BadPeggy software, which can detect any form of corrupted JPEG file and remove them in batch.

After this, I ended up with only a very few files, out of more than 1 million extracted, that have a ColorComponents value of 4, which is indeed very rare. I also noticed that all these files are marked as PS in the Software tag and some are watermarked, which could be good indications of what has been done to them.

I can send you such a photo, but I'm not sure about the best way to do it here without removing the metadata.

Also, is the presence of "R03 - DCF option file (Adobe RGB)" in the InteropIndex tag always a sign of Adobe post-processing? Shouldn't all unmodified JPEG files be marked as "R98 - DCF basic file (sRGB)"?
Title: Re: Tag's content
Post by: StarGeek on May 26, 2025, 07:09:10 PM
Quote from: 11august on May 26, 2025, 12:57:18 PM
From what I understand, it looks very easy to remove any EXIF tag using ExifTool without leaving any trace. Am I right?

There's the simple version and the "Well, technically" version. Mostly, it's quite easy to remove metadata with numerous programs/websites, and any changes will remain undetectable to casual scrutiny. But different code will write the metadata in different ways. The way exiftool writes metadata will be slightly different from the way Adobe writes it, which will be different from the way a Nikon camera does, which is different again from a Sony camera. The padding bytes might be different, or the order of the data might be different where that is an option. For example, a camera may write metadata with a lot of extra padding so that it can copy the data into the fields and dump it all at once rather than build up the data piece by piece. I'm explaining this poorly, but FAQ #13, Why is my file smaller after I use ExifTool to write information? (https://exiftool.org/faq.html#Q13) is more detailed.

This is more in the realm of digital forensics.

Quote
In that sense, and if I understand you correctly, is "Y, Cb, Cr, -" the only native way the ComponentsConfiguration tag is written? I mean, do any other forms ("Y", "-, -", etc.) come from "badly written software that doesn't follow the specs"? Same question for CompressedBitsPerPixel.

I honestly don't know. I don't know anything about the various color spaces and what they mean or how they're used.

You might want to take a look at the EXIF standard (https://www.cipa.jp/std/documents/download_e.html?CIPA_DC-008-2024-E) for more details, though it can be vague and Phil has called it poorly written.

Quote
I do indeed work solely on JPEG files. That's why, before any ExifTool extraction, I use the BadPeggy software, which can detect any form of corrupted JPEG file and remove them in batch.

I never heard of that, and wish I had found it years ago when I was collecting more images off the web and trying to find corrupt ones.

Quote
After this, I ended up with only a very few files, out of more than 1 million extracted, that have a ColorComponents value of 4, which is indeed very rare.
...
I can send you such a photo, but I'm not sure about the best way to do it here without removing the metadata.

If privacy is a concern, then don't worry about it. But one way would be to save it to a site like Dropbox/GoogleDrive/Mega.nz and send me a link in the forum's DMs.

Quote
Also, is the presence of "R03 - DCF option file (Adobe RGB)" in the InteropIndex tag always a sign of Adobe post-processing? Shouldn't all unmodified JPEG files be marked as "R98 - DCF basic file (sRGB)"?

No idea. I don't use any Adobe products and haven't tested anything like that.
Title: Re: Tag's content
Post by: 11august on May 29, 2025, 04:58:07 AM
Hi StarGeek,

Thanks again for your help. I sent you a PM with a link to an image with the ColorComponents tag written as 4.

About the ComponentsConfiguration "Y, Cb, Cr, -" tag, ChatGPT answered me this way:

Quote
Q1: Can the "Y, Cb, Cr, -" value be considered the only viable one?
No, it's not the only viable one, but it's the standard value for JPEG images in YCbCr—the most common encoding format for digital cameras and smartphones.

Other sequences are technically possible, including:

- For RGB images: [4, 5, 6, 0] → "R, G, B, -"

- Some poorly configured devices or software may produce atypical or invalid sequences.

Q2: Are all other values indicative of corruption?
No, not necessarily.

They may indicate:

- An exotic hardware configuration

- A poorly executed conversion (e.g., export from image processing software)

- A software encoding error (sometimes benign)

- A manual modification or a forgery

However, cases like "Cb, Y, Cr, Y" or "-, -, -, Y" are highly suspicious:

- They do not follow the standard order or color interpretation conventions.

- This type of anomaly may suggest corruption, tampering, or a third-party tool that has mismanaged the EXIF.
Title: Re: Tag's content
Post by: StarGeek on May 29, 2025, 12:16:05 PM
Quote from: 11august on May 29, 2025, 04:58:07 AM
Thanks again for your help. I sent you a PM with a link to an image with the ColorComponents tag written as 4.

Got it. Thanks.

This is actually a CMYK JPEG, not a more standard RGB JPEG.

CMYK shows up in ICC-header:ColorSpaceData, but if you strip away the ICC_Profile, then it doesn't look like exiftool can see the difference between CMYK and RGB except for the 4 in ColorComponents.
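
So a component count of 4 points to a four-channel color model rather than an alpha channel. Below is a small Python sketch of that reasoning, assuming the tag values have already been extracted into a dict (for example from exiftool's JSON output). The function name and the label strings are illustrative, and the fallback is only a guess based on the component count.

```python
def guess_color_model(tags):
    """Guess the color model from already-extracted tag values.

    Hypothetical helper. Prefers the ICC profile's ColorSpaceData when it
    is present, otherwise falls back on the SOF ColorComponents count.
    """
    icc = str(tags.get("ColorSpaceData", "")).strip()
    if icc:
        return icc
    # Without an ICC profile, only the channel count is left to go on.
    fallback = {
        1: "grayscale",
        3: "three-channel (usually YCbCr)",
        4: "four-channel (CMYK or similar)",
    }
    return fallback.get(tags.get("ColorComponents"), "unknown")


print(guess_color_model({"ColorSpaceData": "CMYK", "ColorComponents": 4}))
print(guess_color_model({"ColorComponents": 4}))
```

The second call shows the stripped-profile case described above, where the component count is the only remaining hint.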