Strange metadata issue in JPG EXIF comment

Started by mschiff, January 09, 2017, 04:39:31 PM


mschiff

I am storing information from an image management application that I wrote (date, category, people in the photo, etc.) in the EXIF comment of JPG images. When those images are viewed in the EditPlus text editor, some of them show the data formatted the way it is stored (i.e. with linefeeds, so that the data looks formatted), while others hold the same data but do not show it formatted. If I view the file in Word (not as an image, but as characters), all of the files format correctly. I am including Dropbox links to two JPG images that demonstrate this. The one ending in 43 does not format, and the one ending in 44 does.

This is the file that does not format:
https://dl.dropboxusercontent.com/u/70781678/2015-03-31_0043.JPG

And this one does:
https://dl.dropboxusercontent.com/u/70781678/2015-03-31_0044.JPG

Does anyone have an idea what might cause this?

TIA

-- Martin

StarGeek

It would have to do with the way EditPlus is reading the file. When I compare the EXIF:UserComment of the two files, the format is the same. Tags, new lines, and spaces are all the same. The only real difference I see is that the first file also has a JPEG Comment (LEAD Technologies Inc. V1.01) in it. You could try removing that, as it doesn't add anything of use to the file data.

C:\>exiftool -usercomment -json X:\!temp\aa
[{
  "SourceFile": "X:/!temp/aa/2015-03-31_0043.JPG",
  "UserComment": "\n<Photodex Data>\n<DATE> 03-31-2015\n<CITY> Petaluma\n<STATE> California\n<COUNTRY> United States\n<MEDIUM> digital photo\n\n<PEOPLE>\n| Joel Shock\n\n<CATEGORIES>\n| parrots\n| Stadler Lane\n\n<COMMENT> \nparr\n\n<Photodex ID Codes>\n870372YW0T 02250ZLRPM 0000000002 USA        22548LV3AS 26534QBXEU 1000000088 70305ZIXWH \n\n</Photodex Data>\n"
},
{
  "SourceFile": "X:/!temp/aa/2015-03-31_0044.JPG",
  "UserComment": "\n<Photodex Data>\n<DATE> 03-31-2015\n<CITY> Petaluma\n<STATE> California\n<COUNTRY> United States\n<MEDIUM> digital photo\n\n<PEOPLE>\n| Joel Shock\n\n<CATEGORIES>\n| parrots\n| Stadler Lane\n\n<COMMENT> \nparr\n\n<Photodex ID Codes>\n876462YWHQ 02250ZLRPM 0000000002 USA        22548LV3AS 26534QBXEU 1000000088 70305ZIXWH \n\n</Photodex Data>\n"
}]
    1 directories scanned
    2 image files read


Targeted removal:
exiftool -file:comment-="LEAD Technologies Inc. V1.01" FileOrDir
or just remove all the jpeg comments
exiftool -file:comment= FileOrDir

JPEG comments rarely hold useful data (though I have seen it done), and they are not a safe place to store data, as many programs will overwrite them without warning.

Your best bet, though, is to try an image viewer or manager that will properly read and display metadata.
"It didn't work" isn't helpful. What was the exact command used and the output.
Read FAQ #3 and use that cmd
Please use the Code button for exiftool output

Please include your OS/Exiftool version/filetype

mschiff

Thanks for your reply. I did go ahead and remove the Lead Technologies comment, but it did not make any difference in the way the file displays.

Yes, I deliberately put the same information in both files, so that would eliminate any difference in my data (except for the key for the file itself which is slightly different).

The issue seems to have some relationship to the byte string at offset F2350 (hex), ending in E3 0F 0D. If I copy the data from the file that does not format into the file that does ending with the 0D, it causes the good file also to not format the way I want it to. Of course, it also trashes the file, but it is interesting that if I copy just a byte less, the file still formats correctly.
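
One way to check which part of a JPEG a given offset falls in is to walk the segment headers until the start-of-scan marker. The following is a minimal sketch, not a full parser: the function names are hypothetical, and it assumes a baseline file with no fill bytes between segments and no standalone markers before the scan.

```python
import struct

SOS = 0xDA  # start-of-scan marker; entropy-coded image data follows its header


def jpeg_segments(data):
    """Yield (offset, marker, length) for each segment up to and including SOS."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos:#x}")
        marker = data[pos + 1]
        # The 2-byte big-endian length counts itself but not the marker bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        yield pos, marker, length
        if marker == SOS:
            return
        pos += 2 + length


def offset_in_image_data(data, offset):
    """Return True if offset lies at or after the first byte of scan data."""
    for pos, marker, length in jpeg_segments(data):
        if marker == SOS:
            return offset >= pos + 2 + length
    return False
```

Called as offset_in_image_data(open(path, "rb").read(), 0xF2350), this would report whether that offset is past the metadata segments (where the EXIF comment lives) and inside the compressed image data.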

The person that I work for wants to be able to load a JPG into a text editor and see the formatted data like this:



<Photodex Data>
<DATE> 03-31-2015
<CITY> Petaluma
<STATE> California
<COUNTRY> United States
<MEDIUM> digital photo

<PEOPLE>
| Joel Shock

<CATEGORIES>
| parrots
| Stadler Lane

<COMMENT>
Parrots are cool.
Second line of comment.
Third line of comment.

<Photodex ID Codes>
876462YWHQ 02250ZLRPM 0000000002 USA        22548LV3AS 26534QBXEU 1000000088 70305ZIXWH

</Photodex Data>



-- Martin

PS I modified the <comment> from the files that I uploaded as an example.

Phil Harvey

Hi Martin,

Offset 0xf2350 is in the JPEG image data.  The fact that the 0x0d data value seems significant may indicate that EditPlus is using some algorithm to try to determine the linefeed type of the file, but is incorrectly parsing binary data for this purpose.  Is EditPlus a text editor?  If you are opening a binary file as a text file with a text editor I would expect something like this to happen.
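
To illustrate the kind of failure described here, the sketch below shows a naive line-ending detector that counts CR, LF, and CRLF sequences across a whole file. This is purely an assumption for illustration: how EditPlus actually decides is not known from this thread, and the function name is hypothetical.

```python
def guess_line_ending(data):
    """Guess a file's line-ending style by majority vote over its raw bytes."""
    crlf = data.count(b"\r\n")
    cr = data.count(b"\r") - crlf  # lone CR bytes (0x0D)
    lf = data.count(b"\n") - crlf  # lone LF bytes (0x0A)
    counts = {"CRLF": crlf, "CR": cr, "LF": lf}
    best = max(counts, key=counts.get)
    return best if counts[best] else "none"


# The comment text itself uses bare LF line endings.
text = b"<DATE> 03-31-2015\n<CITY> Petaluma\n"
print(guess_line_ending(text))  # LF

# Stray 0x0D bytes in binary data can outvote the real line endings.
noisy = text + b"\xe3\x0f\x0d" * 3
print(guess_line_ending(noisy))  # CR
```

A detector like this has no way to tell compressed image bytes from text, so two files with identical comments can be classified differently.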

- Phil

Edit:  Yes.  Having seen StarGeek's response and now having read your posts more thoroughly I see that this is the problem.

StarGeek

Quote from: mschiff on January 10, 2017, 10:13:14 AM
The person that I work for wants to be able to load a JPG into a text editor and see the formatted data like this:

I doubt you're going to be able to make this happen consistently. A text editor, simply put, is the wrong tool for this process.

I would suggest this: drag a copy of exiftool onto their desktop and rename it to exiftool(-usercomment -b -k).exe. They can now drag and drop files onto exiftool and they will see the data without all the other junk.
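
The same extraction could also be scripted. Below is a minimal sketch that runs exiftool with the -UserComment and -b options from Python; it assumes exiftool is on the PATH, and the function names are hypothetical.

```python
import subprocess


def user_comment_command(image_path, exiftool="exiftool"):
    """Build the argument list for extracting the raw UserComment value."""
    return [exiftool, "-UserComment", "-b", image_path]


def read_user_comment(image_path, exiftool="exiftool"):
    """Run exiftool and return the UserComment text (assumes exiftool is installed)."""
    result = subprocess.run(
        user_comment_command(image_path, exiftool),
        capture_output=True,
        check=True,  # raise CalledProcessError if exiftool reports a failure
    )
    return result.stdout.decode("utf-8", errors="replace")
```

For example, print(read_user_comment("2015-03-31_0044.JPG")) should print the Photodex block with its line breaks intact, provided the file is in the current directory.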

mschiff

StarGeek, I agree that a text editor is the wrong tool for this process, but nevertheless it is what he wants. Yes, Phil, EditPlus is a text editor that I use for programming.

His viewing of the data is not at issue. He can and does view it in the application that I wrote, where the metadata is stored in a database. He can display the pictures, and search, view or modify the metadata at will, and then click a button to store it in the image.

His reasoning is that in the future, someone will have the JPG images, but not the program that he uses to catalog and search them, and he therefore wants the metadata to be stored in the image itself so that it is obvious to anyone that finds the images.

He wanted me to find experts in the JPG file format, and the EXIF, and this was the place I figured that I would find the most knowledge.

Thanks.

-- Martin

StarGeek

Quote from: mschiff on January 10, 2017, 11:49:52 AM
His reasoning is that in the future, someone will have the JPG images, but not the program that he uses to catalog and search them, and he therefore wants the metadata to be stored in the image itself so that it is obvious to anyone that finds the images.

His reasoning about the metadata is good, but his idea that someone is going to load up an image file in a text editor is... extremely unsound, to put it politely. 

I would put the number of people who would do this in the low single-digit percentage, and that's being generous. It requires them to use a tool for a purpose other than the one it's intended for, and to sift through a lot of static for something they don't know is there. If they were looking for that data, they'd use a tool that could actually find it.

One more point: your boss assumes that they're going to open it in his preferred editor. If they open it in Notepad (the Windows default text editor), nobody will see the data at all. It will be off the screen to the right, as Notepad doesn't display the new lines. Notepad++, on the other hand, will show the info as he wants it in both images. And you mention that Word shows the raw data fine. He's expecting you to work around a quirk of a single program.

I understand you're in a difficult position, but I don't believe there's anything I can do to help. The data shows up fine under Windows properties and in Lightroom. Your boss is focusing on something inconsequential that a single program does when it is used incorrectly.

mschiff

StarGeek,

Thanks for the idea about using the properties in Windows Explorer. The comment is displayed perfectly there, and he is satisfied with that.

-- Martin

StarGeek