JPEG: Covering all common metadata tags for information I track

Started by gponym, May 11, 2018, 08:32:28 AM


gponym

-   I am looking for help here to put together a good list of the common JPEG metadata tags that correspond to the small set of information which interests me (see below).  For this post, confine the scope to a) new digital camera images and b) scans of non-digital photographs and slides.

    Not knowing a better shorthand term, I use "exif tags" to refer to the various current standard tags and tag families used in JPEG files, including but not limited to XMP tags.

-   I've really no idea which metadata tags are rarely or commonly used in the wild by cameras, other equipment and software out there.  I hope you can help steer me to these tags in common use for the information I wish to track.

-   Note:  I'm in non-US timezone, and my schedule is not mine to set.  Please understand if replies are sometimes delayed.

Thanks.  Gordon

-   Should exif tags be added or removed below to come up with a list of in-the-wild commonly used exif tags?

    (Where I see candidate tags, they are listed alphabetically, using Exiftool syntax if I know it.  I've tried to put together a decent starting list but I'm new to exif tags so please point out mistakes and omissions.)


    -   Album
        -   XMP:Album
        -   XMP-xmpDM:Album

    -   Caption
        -   (superfluous?  duplicates function of Short Title or Title?)
        -   IPTC:Caption-Abstract
        -   IPTC:Headline
        -   XMP-xmp:Label

    -   Comment
        -   XMP-exif:UserComment

    -   Creator
        -   XMP-dc:Creator
        -   XMP-iptcExt:CreatorName

    -   Date, Time original resource was created
        -   XMP-dc:Date
        -   XMP-exif:DateTimeOriginal
        -   XMP-getty:OriginalCreateDateTime
        -   XMP-xmp:CreateDate

    -   Date, Time original resource was digitized (primarily used for scans)
        -   XMP-exif:DateTimeDigitized

    -   Date, Time resource was updated
        -   XMP-xmp:ModifyDate

    -   Date, Time I changed metadata
        -   (Cannot rely on XMP-xmp:MetadataDate; anyone might write it)
        -   custom tag?

    -   Description
        -   XMP-dc:Description
        -   XMP-xmp:Description

    -   File names of various sorts
        -   XMP-getty:CameraFileName
        -   XMP-getty:OriginalFileName

    -   Identifier of this image (unique to collection, not necessarily GUID)
        -   XMP-dc:Identifier
        -   XMP-xmp:Identifier
        -   (XMP-exif:ImageUniqueId cannot be relied upon, though maybe I
            could commandeer it for my own GUID)

    -   Identifier of parent image (refers to collection, not necessarily GUID)
        -   XMP-dc:Source
        -   custom tag?
        -   does anyone actually try to manage this?

    -   Keywords
        -   XMP-xmp:Keywords

    -   Place/Locality information
        -   important subtopic, left till a separate post

    -   Short Title
        -   IPTC:Headline
        -   XMP-xmp:Label
        -   XMP-xmp:Nickname
        -   meant to be optionally included in file name
        -   custom tag?

    -   Title
        -   XMP-dc:Title
        -   XMP-xmp:Title

[END OF LIST]

-   Sources for my originally suggested tags
    -   <https://exiftool.org/TagNames/XMP.html>
    -   <https://exiftool.org/forum/index.php/topic,6959.msg42911.html#msg42911>
    -   <http://www.exiv2.org/metadata.html>

BACKGROUND:

-   Ambiguity:  Which metadata tags are used for which piece of information does not seem always to be clearcut.  Is that right?
    -   This post illustrates the kind of ambiguity I mean:
        <https://exiftool.org/forum/index.php/topic,8521.0.html>

-   Some standards seem "more standard" and I wish to honor that:  I see at the Exiftool XMP Tags page:  "...using standard schemas such as dc, xmp, iptcCore and iptcExt is recommended if possible".
    -   I aim to do so but expect that some needs will lead outside that group.

-   Why am I asking?

    -   I want to semi-automate setting tags in JPEG image files for personal use.
    -   This seems akin to designing database tables in that care taken upfront will repay the effort later on.
    -   I prefer to modify files from the outset in a way that is trustworthy, consistent, and provides good coverage of the info I'm tracking.
    -   One goal is to be able to handle the vast majority of files just once and know that what is there is consistent w/prior, future files.
    -   A primary benefit is that I can use consistently tagged files to auto-construct file and folder names (from exif tags).

-   If I knew of GUI software without unwanted side effects that could automate batch writing of all these tags of interest, I might use it.  I could also write my own simple Perl or shell scripts utilizing Exiftool if no existing software fills the bill.

StarGeek

A good start would be to take a look at the standards.  The Metadata Working Group gives a good overview of various types and best ways to sync them.  You can find the PDF on their site.  The IPTC Photo Metadata Standard lists the most common metadata and the purposes for which they were designed.  The newest standard (January 2017) is web only and can be found here.  You can find a PDF of the slightly older version (October 2016) of the standard here, which IMO, is good enough for most uses.

The second thing I would recommend would be to figure out what tags your software uses.  Is it older and only reads IPTC tags?  Is it more modern and uses XMP?  Or is it like the Windows Properties and randomly displays one or the other group, not really following any standards?  Yeah, I have a bit of a bias regarding what Windows displays when you look at the property details of a file.

Just to provide an example, here's how I do things.  EXIF data, which is mostly technical details that a camera writes, stays as is.  I usually don't sync these with the corresponding XMP (which is usually grouped under XMP-exif).  The exceptions are when the XMP data is more complete, which is the case in GPS data and Timestamps.  In XMP, GPS data includes the hemisphere reference (E/W, N/S) and Timestamps can include a timezone.  In EXIF data, directional references and timezones are separate tags and have to be combined to get the full data.  I also try to keep IPTC data and XMP data synced, only because the image viewer I use (Irfanview) doesn't display XMP data, only EXIF and IPTC.  The moment it will do so (not holding my breath) or I take the time to find a different viewer I like that does support XMP, I will ditch IPTC completely.  This is an example of what I mean by figuring out what tags your software uses.
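To illustrate the "have to be combined" part for GPS, here is a minimal Python sketch.  It assumes you have already extracted the unsigned decimal coordinate and its reference letter from the EXIF data; the function name is made up for this example.

```python
def signed_coordinate(value: float, ref: str) -> float:
    """Combine an unsigned EXIF coordinate with its N/S/E/W reference.

    South and West become negative; North and East stay positive.
    """
    # Accept "S", "s", "South", etc. by looking only at the first letter.
    letter = ref.strip().upper()[:1]
    if letter not in ("N", "S", "E", "W"):
        raise ValueError(f"unexpected GPS reference: {ref!r}")
    return -abs(value) if letter in ("S", "W") else abs(value)
```

For example, `signed_coordinate(117.15, "W")` gives a negative longitude, which is the single combined value that the separate EXIF tags only provide in two pieces.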

Quote from: gponym on May 11, 2018, 08:32:28 AM
    Not knowing a better shorthand term, I use "exif tags" to refer to the various current standard tags and tag families used in JPEG files, including but not limited to XMP tags.

The more correct term would be Metadata.  All EXIF is Metadata, but not all Metadata is EXIF.  EXIF is commonly used to describe all Metadata, but when dealing with exact groups, it's better to be clear on the specifics.

Quote:
    I've really no idea which metadata tags are rarely or commonly used in the wild by cameras, other equipment and software out there.

Cameras rarely write anything other than EXIF data, though a few do write some XMP.  I don't think any write IPTC.  IPTC/XMP is usually added later by other programs, such as Digital Asset Management (DAM) programs.

Quote:
    -   Album
        -   XMP:Album
        -   XMP-xmpDM:Album

I'm guessing you're using Elodie?  That's the only place I can find where Album is defined as only XMP:Album and even then Elodie now uses XMP-xmpDM:Album, only keeping XMP:Album defined for backwards compatibility (see here).

As a side note, in most cases with exiftool, you can just shorten any XMP based tags to just XMP and exiftool will write the most common tag automatically.  In the case of Album, writing to XMP:Album will automatically write XMP-xmpDM:Album.  You would need the config file that Elodie uses to write XMP:Album as a separate tag.

Quote:
    -   Caption
        -   (superfluous?  duplicates function of Short Title or Title?)
        -   IPTC:Caption-Abstract
        -   IPTC:Headline
        -   XMP-xmp:Label

These are all different tags used for different purposes.  And this is where you need to check your software.  By the IPTC specifications I linked to above, Headline is supposed to be "A brief synopsis of the caption."  But Adobe software has decided to use Title for that purpose, whereas Title is supposed to be "A shorthand reference for the digital image. Title provides a short human readable name which can be a text and/or numeric reference."  This is often the name of the file if it's not a reference number.

Windows, on the other hand, treats Title and Caption-Abstract/Description the same, when they're definitely not the same.  See this post for a list of the definitions that Windows uses.

The best tag for your long caption would be XMP:Description.  Technically, XMP-dc:Description, but XMP:Description is good enough unless you really need XMP-xmp:Description for some reason.  XMP-dc:Description is what is used by the standard I linked above and by most software.  The older counterpart is IPTC:Caption-Abstract.

I'm not sure exactly what Label would be used for.  You can set it in Adobe products and I don't think it has a counterpart in other metadata groups.  I'm not sure what other programs would read this so I would put this tag in the Uncommon category.

Quote:
    -   Comment
        -   XMP-exif:UserComment

I would suggest EXIF:UserComment instead, as that's where it would be copied from.  I'm not sure any programs read this except Windows Properties and would label it as an Uncommon tag.

Quote:
    -   Creator
        -   XMP-dc:Creator
        -   XMP-iptcExt:CreatorName

This one is tricky.  XMP-dc:Creator (or just XMP:Creator) would be the most common and you would use that if it's the only info you're inserting.  XMP-iptcExt:CreatorName, on the other hand, is part of a group of data, which is defined as a Structure.  If you look it up on the XMP page, it's part of the ContactInfo Struct, which includes CreatorCity, CreatorCountry, CreatorAddress, CreatorPostalCode, CreatorRegion, CreatorWorkEmail, CreatorWorkTelephone, and CreatorWorkURL.  Additionally, you can have more than one of these structures if you have multiple creators.  You can think of this as if each structure was a single entry in a database or a single row on a spreadsheet.  XMP-iptcExt:CreatorName is just exiftool's way of accessing all the CreatorName entries. 

How you want to handle this would depend upon the software you're using and how much info you are entering.  As structures are complicated, most software doesn't handle them but Adobe products do if that's what you're using.  If you just want the creator's name and are using older software, I'd suggest just sticking with XMP:Creator and consider CreatorName as uncommon.

Quote:
    -   Date, Time original resource was created
        -   XMP-dc:Date
        -   XMP-exif:DateTimeOriginal
        -   XMP-getty:OriginalCreateDateTime
        -   XMP-xmp:CreateDate

Anything Getty related I would consider uncommon, unless you are making your images available as stock images on GettyImages and in that case I would assume that Getty has their own rules for handling metadata.

I would consider XMP-dc:Date as uncommon.

Quote:
    -   Date, Time original resource was digitized (primarily used for scans)
        -   XMP-exif:DateTimeDigitized

    -   Date, Time resource was updated
        -   XMP-xmp:ModifyDate

This is the part where I always get mixed up, so I'm double checking.  According to the EXIF specs, the time the original resource was created would be EXIF:DateTimeOriginal, which would be matched to XMP:DateCreated (technically XMP-photoshop:DateCreated).  XMP-exif:DateTimeOriginal is redundant.  I'm not sure how common it is; the only thing that I've found that uses it is Flickr, which strangely gives it priority over EXIF:DateTimeOriginal and ignores XMP-photoshop:DateCreated.

XMP-exif:DateTimeDigitized is probably uncommon.  It maps to the EXIF tag of DateTimeDigitized, which exiftool calls EXIF:CreateDate.  But the more common XMP version would be XMP-xmp:CreateDate.  This would be the timestamp when the original resource was digitized.

Finally, XMP-xmp:ModifyDate is the best for updated resource in XMP, which maps to EXIF:ModifyDate.

In all these cases, I believe that the EXIF tags would be the most common, followed by the XMP tags I mentioned that they map to (but not the XMP-exif group).

If you need timezones, then they can easily be added to the XMP tags.  The EXIF tags don't allow for timezones, but there was an update to the EXIF specs that added EXIF:OffsetTime (time zone for EXIF:ModifyDate), EXIF:OffsetTimeOriginal (time zone for EXIF:DateTimeOriginal), and EXIF:OffsetTimeDigitized (time zone for EXIF:CreateDate).  As these are still relatively new, I would classify them as uncommon.
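As a rough sketch of how an EXIF timestamp and its separate offset tag fit together, the Python below joins the two strings into one timezone-aware value.  It assumes the usual EXIF layout of "YYYY:MM:DD HH:MM:SS" for the timestamp and "+HH:MM" or "-HH:MM" for the offset; the function name is only for illustration.

```python
from datetime import datetime, timedelta, timezone


def combine_exif_time(stamp: str, offset: str) -> datetime:
    """Join an EXIF 'YYYY:MM:DD HH:MM:SS' stamp with a '+HH:MM' offset."""
    naive = datetime.strptime(stamp, "%Y:%m:%d %H:%M:%S")
    # Work out the sign first, then parse hours and minutes separately.
    sign = -1 if offset.startswith("-") else 1
    hours, minutes = offset.lstrip("+-").split(":")
    delta = sign * timedelta(hours=int(hours), minutes=int(minutes))
    return naive.replace(tzinfo=timezone(delta))


# e.g. EXIF:DateTimeOriginal plus EXIF:OffsetTimeOriginal
print(combine_exif_time("2018:05:11 08:32:28", "+02:00").isoformat())
# prints 2018-05-11T08:32:28+02:00
```

The combined value is the kind of single timestamp with timezone that the XMP date tags can hold directly.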

Quote:
    -   Date, Time I changed metadata
        -   (Cannot rely on XMP-xmp:MetadataDate; anyone might write it)
        -   custom tag?

If you want only the time you changed the metadata, then you can either repurpose some uncommon timestamp tag or create your own.  If you use Lightroom, XMP:MetadataDate is changed every time the metadata is changed, so as you say, it can't be relied upon.

Quote:
    -   Description
        -   XMP-dc:Description
        -   XMP-xmp:Description

This would normally be your long description.  I'm not sure how you differentiate this from caption.  This maps to EXIF:ImageDescription and IPTC:Caption-Abstract.  ImageDescription is a case where most programs rarely use the EXIF tag over the others and personally, I tend to delete it as I feel it is redundant.

Quote:
    -   File names of various sorts
        -   XMP-getty:CameraFileName
        -   XMP-getty:OriginalFileName

Another place where things are messy.  The getty tags are uncommon.  The IPTC spec would suggest XMP:Title/IPTC:ObjectName but as I mentioned, Adobe has decided to use these tags as the short description tag.  So basically anything you use here would be an uncommon tag and probably not used by most software.  An additional tag to look at would be XMP-xmpMM:PreservedFileName.

Quote:
    -   Identifier of this image (unique to collection, not necessarily GUID)
        -   XMP-dc:Identifier
        -   XMP-xmp:Identifier
        -   (XMP-exif:ImageUniqueId cannot be relied upon, though maybe I
            could commandeer it for my own GUID)

    -   Identifier of parent image (refers to collection, not necessarily GUID)
        -   XMP-dc:Source
        -   custom tag?
        -   does anyone actually try to manage this?

I would usually consider any XMP-dc to be more common than XMP-xmp, but these would be more uncommon in general.  You can find Source under Workflow in Lightroom, but it uses XMP-photoshop:Source.  Also in Lightroom, there's an entry for Workflow->Job Identifier, but it maps to XMP:TransmissionReference.

Quote:
    -   Keywords
        -   XMP-xmp:Keywords

The XMP-xmp:Keywords is a simple string tag which exiftool avoids writing to.  The most common place to write keywords to would be XMP:Subject, which is what most software uses.

Quote:
    -   Place/Locality information
        -   important subtopic, left till a separate post

The most common would be XMP:Country, XMP:State, XMP:City, and XMP:Location.  And maybe XMP:CountryCode if you're so inclined.  These tags are considered Legacy by the most recent IPTC standard, but are well supported by most software.  But because the definitions of these tags don't directly state whether they're supposed to be the location the picture was taken or the location of the image content, there are new Structured tags to replace them.  I don't think these are well supported outside of Adobe products.  I won't go into details but you can find more info under LocationCreated and LocationShown on the IPTC Extension section of the XMP tagnames page.  Each of those has 12 sub tags for them and you can have multiple LocationShown structures in an image.  Well, technically you can have multiple LocationCreated structures, but unless you're straddling a border of some sort, you don't need them.  They probably wouldn't be read by any software.  I tested multiple LocationCreated with Adobe LR and it ignored the extras.

Quote:
    -   Short Title
        -   IPTC:Headline
        -   XMP-xmp:Label
        -   XMP-xmp:Nickname
        -   meant to be optionally included in file name
        -   custom tag?

Technically, XMP:Headline/IPTC:Headline is where this should be, but as I mentioned, Adobe has decided that XMP:Title is what they're going to use.  So it's up to you and your software as to what you want to use.

Quote:
    -   Title
        -   XMP-dc:Title
        -   XMP-xmp:Title

See previous points on Headline and Filename.

Quote:
-   Ambiguity:  Which metadata tags are used for which piece of information does not seem always to be clearcut.  Is that right?
    -   This post illustrates the kind of ambiguity I mean:
        <https://exiftool.org/forum/index.php/topic,8521.0.html>
-   Some standards seem "more standard" and I wish to honor that:  I see at the Exiftool XMP Tags page:  "...using standard schemas such as dc, xmp, iptcCore and iptcExt is recommended if possible".
    -   I aim to do so but expect that some needs will lead outside that group.

Yeah, PDF tags are especially messy.  Avoid them if you can ;).   But Phil's advice on the XMP tags page is the best, though I would probably rank XMP-xmp below the others at this point in time, if not skip them altogether.

Quote:
-   If I knew of GUI software without unwanted side effects that could automate batch writing of all these tags of interest, I might use it.  I could also write my own simple Perl or shell scripts utilizing Exiftool if no existing software fills the bill.

For the most part, Adobe products are probably the software with the fewest side effects.  It's been a while since I've checked so I'm not up to date, but the only problem I really found with them was that Lightroom 4.4 would write Description and Caption-Abstract slightly differently, using Line Feeds on one and Carriage Returns on the other.  The only other problem I have with them is that they write a lot of extra data to the file and I like keeping my metadata as simple as possible.  But that's just my obsession.  Adobe Bridge is free and is pretty good for writing a lot of detailed metadata, with more options than you would probably ever use.
* Did you read FAQ #3 and use the command listed there?
* Please use the Code button for exiftool code/output.
 
* Please include your OS, Exiftool version, and type of file you're processing (MP4, JPG, etc).

Phil Harvey

...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

StarGeek

Heh, I was bored, I like the topic, and it's something I've done a lot of thinking on.  It really needs a tl;dr, though ;D

gponym

Thanks for the pertinent and detailed response, StarGeek!  It's better than I could have hoped for.

To clarify a few points left muddy in the OP: 

a) I expect to write multiple tags in many cases on the theory that it will make my files more potentially decipherable to present and future software.  I want to stick to commonly used tags. 

However, if a tag is not commonly used but is so clearly "suited" for a purpose like the xmpMM:PreservedFileName tag you brought up, I plan to write it, too.  Either the world will see it that way, also, and change practice, or not. 

And if I find a tag to use for a datum that is seldom used by other software, it is more likely that my own info will remain readable (by me, if no one else).

I will leave myself open to the charge of "impurity" by routinely writing several tags for one datum, and using a few tags in ways that some agree with while others do not.  This is done in the interest of building redundancy so that I can preserve the information that is important to me, which leads to the next point....

b) The other reason to write my info into multiple tags is so that I can recover my information (from some tags) even after the file has been handled by software which uses some tags differently and overwrites my info there.  The redundancy is regrettable but I don't see a better-for-me alternative given that writing to the JPEG metadata "database" is pretty much a free-for-all for all comers. 

c) I prefer to keep my simple info needs met by confining metadata to the insides of the resource file; not willing to undertake keeping separate resource and metadata files in sync.  Though, come to think of it, keeping a separate file copy of my own metadata for reference sounds easy and worthwhile:  redundancy, quicker to riffle through--plus I can use grep, the tool that (almost) never disappoints!

d) If one is to keep the information usable for years to come, it must be visible to the important (present day and future) softwares that will interpret it.  Eventually I may need to make a pass through all files to bring metadata up to evolving practice in how certain tags are used. 

I expect this to happen rarely, and be fairly untroubling if my metadata is kept under control.  Like you, I like to stay focussed on the least possible set of data, no more but also no less than what is needful.

e) I don't run much software, and none of the major commercial/retail packages (like Adobe).  I don't really plan to run them, either, until and unless I give over a lot more time to making photographs, or one becomes compelling for the archiving I'm doing.  Just now, my focus is to label our many thousands of images so they are easily intelligible to strangers, easy to search and find, and easy to store for years to come.

I plan to respond in detail to your post by making a first pass at two things of highest value to my project:

1.  For each type of information ("type") I need, list commonly used metadata tags in priority order.

2.  For each type, write out an initial pass at logic that will encapsulate how to handle the possible tags associated with the type. 

This will include tags to favor, tags to avoid.  The forum response(s) here will be my primary guide about in-the-wild software behavior.  I will also consult the metadata references you provided.  I foresee a few tough calls.   :(
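A minimal Python sketch of what such per-type logic might look like: each type of information gets a list of candidate tags in priority order, and the first tag that actually holds a value wins.  The table contents and function name here are placeholders built from tags mentioned earlier in this thread, not a recommendation, and the `tags` dictionary is assumed to have been read from a file beforehand.

```python
# Illustrative priority table only; the real ordering is still to be decided.
PRIORITY = {
    "description": [
        "XMP-dc:Description",
        "IPTC:Caption-Abstract",
        "EXIF:ImageDescription",
    ],
    "created": [
        "EXIF:DateTimeOriginal",
        "XMP-photoshop:DateCreated",
        "XMP-exif:DateTimeOriginal",
    ],
}


def first_present(info_type, tags, priority=PRIORITY):
    """Return (tag_name, value) for the highest-priority tag that is set.

    `tags` maps tag names to values already read from one file.
    Returns None when no candidate tag holds a usable value.
    """
    for name in priority[info_type]:
        value = tags.get(name)
        if value not in (None, "", []):
            return name, value
    return None
```

Keeping the priority table as plain data means tags to favor or avoid can be changed later without touching the lookup logic.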

That said, Saturday is utterly full--I've overstayed here as it is--so I'll begin to work Sunday pulling stuff together.

craigl

Thanks for the detail on this topic.
I am scanning family photos and documents which is the main impulse for working with metadata. I am also looking to take care of a core set of meta-data since the tools I'm using make it difficult to set the values or retrieve them. Like gponym, I use IrfanView, but also Picasa (for tagging people), Luminar 4 for tweaking photos, XnViewMP (just experimenting with what metadata can be written and read there, but it looks pretty impressive), exiftool and GooglePhotos. I also use OneDrive as my home for everything and I read lately that in their web interface you can manipulate some metadata, but have not experimented with it. The final destination for most of the tiffs, jpegs and pdfs is FamilySearch. It supports face detection and tagging. I have not experimented with what people data I can get if I download images from there but I am almost certain that for privacy issues there will be none.

My big requirements are:


  • inserting a long description to hold file identifier, short caption, lengthy description, event date, event place, and contributor
  • leaving contact information in case someone else doing family history can get in touch with me
  • keeping track of at least the names of the people in the photo
  • keeping a few keywords so I can keep track of what state things are in as they progress through my workflow
  • a set of categories/keywords/albums/collections (I am not sure what to call it, and it does not appear the various standards share data definitions, anyway.) to group my items by albums, family, event and location.


Please confirm if I am on the right track or you have a better solution.

For contributor, use XMP:Creator.

For the long description, use XMP:description.

For contact information, use ExifTool XMP-iptcCore family 1 group.

For keywords, use XMP:Keywords.

For collections/groups, use XMP:Keywords or XMP:Album, but XMP:Keywords is preferred.

For tracking people, use XMP:RegionName and XMP:RegionType.

StarGeek

I'd suggest using a Digital Assets Management program such as Digikam, DarkTable (both free), or Lightroom (paid).  Or at the very least use a more complete image browser such as Adobe Bridge (free).

Metadata is very complex and any of the above programs will make it much easier to deal with, taking care of a lot of things behind the scenes.  Writing your data with the above programs and then checking with exiftool (see FAQ #3) is a good way to learn if you want to figure out the internals.

The best place to see what data is supposed to go where is the IPTC Photo Metadata Standard.  It's a dry read as expected from a standard, but the main things you would read would be the "Definitions" and "Help Text", maybe the "User notes".  Then you could look at the "XMP Specs" line to get the tag name. 

But it's your data and your images.  If you decide you want to put something in a different place, then go for it.  Unless you are selling your images on GettyImages, it doesn't matter.  My number one piece of advice is to get the data into the file somewhere.  You can always move it to a more appropriate place with exiftool later.

Quote:
    For contributor, use XMP:Creator.

Technically XMP:Creator is for the "name of the photographer".  There is the Dublin Core XMP:Contributor tag (yes, yet another standard) which might be what you're looking for.

Quote:
    For the long description, use XMP:description.

Yes.

Quote:
    For contact information, use ExifTool XMP-iptcCore family 1 group.

Yes, as long as you're using the flattened tags, the ones with an underscore in the "Writable" column (see XMP iptcCore Tags).  This is actually a single Structured tag that is very complex, but exiftool gives you an easy way to access the individual parts with the flattened tags.

Quote:
    For keywords, use XMP:Keywords.

No.  You want to use XMP:Subject for images (see IPTC Standard - Keywords).  If you look on the XMP tags page, you'll see a couple of options for XMP:Keywords, but the default is to write to XMP-pdf:Keywords, which is a string and not a list type tag and is obviously PDF related.  There's also XMP-xmp:Keywords, which is non-standard, and XMP-acdsee:Keywords, used by ACDSee DAM.  The last two are marked as "Avoid" in the writable column and will not be created unless explicitly declared in a command.
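Since XMP:Subject is a list-type tag, each keyword is normally written as its own item rather than as one long string.  The Python sketch below builds one `-XMP-dc:Subject=...` argument per keyword, which is the form exiftool accepts for list tags.  The helper name is made up, and actually running exiftool with these arguments (for example through `subprocess`) is left out and would be up to you.

```python
def subject_args(keywords):
    """Build one exiftool assignment argument per unique, non-empty keyword."""
    seen = set()
    args = []
    for keyword in keywords:
        keyword = keyword.strip()
        # Skip blanks and repeats so each keyword is written only once.
        if keyword and keyword not in seen:
            seen.add(keyword)
            args.append(f"-XMP-dc:Subject={keyword}")
    return args
```

For instance, `subject_args(["family", "scan", "family"])` yields two arguments, one for each distinct keyword, preserving the original order.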

Quote:
    a set of categories/keywords/albums/collections (I am not sure what to call it, and it does not appear the various standards share data definitions, anyway.) to group my items by albums, family, event and location.
    ...
    For collections/groups, use XMP:Keywords or XMP:Album, but XMP:Keywords is preferred.

See previous.  For XMP:Album, exiftool will write to the XMP-xmpDM:Album.  That appears to be more music/sound related (see Adobe's Partner's Guide To XMP For Dynamic Media PDF if you really want details).

A DAM that deals with hierarchical keywords is good for this.  There is the XMP:HierarchicalSubject tag which Lightroom uses.  Each level of the hierarchy is separated by a pipe character |.  While this image doesn't technically use XMP:HierarchicalSubject, it's using other XMP location data, the result would be the same.

Putting this info into XMP:HierarchicalSubject, you would have something like
California|San Diego|Balboa Park
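A small Python sketch of building and splitting such a pipe-separated value.  The function names are made up for this example, and rejecting a pipe character inside a level name is my own assumption to keep the round trip unambiguous.

```python
def to_hierarchical(levels):
    """Join hierarchy levels, outermost first, with the pipe separator."""
    cleaned = [level.strip() for level in levels]
    if not cleaned or any(not level or "|" in level for level in cleaned):
        raise ValueError("each level must be non-empty and contain no '|'")
    return "|".join(cleaned)


def from_hierarchical(value):
    """Split a pipe-separated hierarchical keyword back into its levels."""
    return value.split("|")


print(to_hierarchical(["California", "San Diego", "Balboa Park"]))
# prints California|San Diego|Balboa Park
```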

Quote:
    For tracking people, use XMP:RegionName and XMP:RegionType.

For just people's names, you would want to use XMP:PersonInImage.  Picasa writes to the MWG region tags, which is also what is used by Lightroom, so it has much wider support than the other types of region tags.  The region structure is very complicated and best left to a program with facial recognition to deal with.  But one thing to watch with Picasa is that in some cases, it will delete the camera MakerNotes tags. It definitely does for Nikon MakerNotes.  It's up to you if you want to keep those.