Questions to DarwinCore tags

Started by herb, April 20, 2018, 07:28:57 AM


herb

Hello Phil,

while working with XMP DarwinCore tags I looked (of course) at your web page https://exiftool.org/TagNames/DarwinCore.html
But I also looked at the corresponding page of Exiv2 - http://www.exiv2.org/tags-xmp-dwc.html - and at
the TDWG web page http://rs.tdwg.org/dwc/terms/index.htm

I hoped/expected to find identical "structures" and also more or less identical or similar tagnames.
But I was surprised to find some differences.

Exiftool has defined 9 "structures" (e.g. "DarwinCore DCTermsLocation Struct" or "DarwinCore Event Struct") and
TDWG has 16 "main structures" or "Record-level Terms" (see headlines in chapter 1. TermIndex).

Common structures in Exiftool and also TDWG are:
- Record (with 11 tags)
- Occurrence (21 / 23)
- Event (16 / 18)
- dctermsLocation (44)
- GeologicalContext (18)
- Identification (8)
- Taxon (33)
- ResourceRelationship (7)
- MeasurementOrFact (9)

TDWG also lists the following structures, which are unknown to Exiftool:
Organism, MaterialSample, LivingSpecimen, PreservedSpecimen, FossilSpecimen, HumanObservation, MachineObservation
(They are also mentioned by Exiv2.)

The structures common to Exiftool and TDWG have an identical number of tags, except Occurrence and Event.
For structures with an identical number of tags I did not check the names of all listed tags.

Occurrence:
Occurrence has 19 tags in common, but I also found 2 + 4 tags known only by one of them:
TDWG:   OrganismQuantity, OrganismQuantityType
Exiftool: AssociatedOccurrences, IndividualID, OccurrenceDetails, PreviousIdentifications

The Exiftool tags - AssociatedOccurrences and PreviousIdentifications - are listed in TDWG structure Organism (which has in total 7 tags)

Events:
Event has 14 tags in common, but I also found 4 + 2 tags known only by one of them:
TDWG: ParentEventID, EventDate, SampleSizeValue, SampleSizeUnit
Exiftool: EarliestDate, LatestDate

Here I can imagine that Exiftool uses "EarliestDate" and "LatestDate" instead of "EventDate" because EventDate may be specified as a period.
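
To illustrate that guess: a period written as an ISO 8601 interval ("start/end") could be split into an earliest and a latest date. This is only a minimal Python sketch under that assumption, not how Exiftool actually handles these tags. The function name is hypothetical, and abbreviated end dates (such as "2007-11-13/15") are not expanded.

```python
from typing import Tuple


def split_event_date(event_date: str) -> Tuple[str, str]:
    """Split an eventDate string into (earliest, latest).

    Assumption: a period is written as "start/end"; a single date
    is returned as both the earliest and the latest value.
    """
    start, sep, end = event_date.strip().partition("/")
    return (start, end if sep else start)


earliest, latest = split_event_date("2007-03-01/2008-05-11")
```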

Looking into TDWG history: http://rs.tdwg.org/dwc/terms/history/decisions/index.htm I have seen that many of the TDWG "structures" and also tags have been introduced after Oct. 9th 2013; the latest change to DarwinCore tags in Exiftool was with version 9.48 on Jan. 25th 2014.

I hope that I read the documents carefully enough.
I guess that Exiftool should adopt the latest version of TDWG.

Thanks for any comments in advance

Best regards
Herb

Phil Harvey

Thanks Herb,

I'll look into updating the DarwinCore tags when I get a chance.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

herb

Hello Phil,

thank you very much for looking into this.

I hope not to annoy you, but I have seen some additional differences to Exiv2.

Exiv2 has marked the following tags as "XmpBag", which to me is equivalent to Exiftool list-type tags:
dynamicProperties
recordedBy
preparations
otherCatalogNumbers
previousIdentifications (in Organism and Occurrence)
associatedMedia
associatedReferences
associatedOccurrences (in Organism and Occurrence)
associatedSequences
associatedTaxa
associatedOrganism
higherGeography
georeferencedBy
georeferenceSources
identifiedBy
identificationReferences
typeStatus
Taxon
higherClassification

As Taxon - the tag that defines a complete structure - is marked as a list, I have an additional question.
Is it possible to have a structure as list element?

As far as I know Exiftool this is not possible when flattened tags are used.
Because I have no experience with the -struct option, my question is: is it possible there?

Thanks in advance.

Best regards
Herb

Phil Harvey

Quote from: herb on April 23, 2018, 10:13:17 AM
Is it possible to have a structure as list element?

Yes.

Quote
As far as I know Exiftool this is not possible when flattened tags are used.

Yes it is, but you are not guaranteed to get the proper associations unless you use structured tags.  It will fail in the case that not all structures have the same elements.
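
A toy model of why the associations can be lost is sketched below. It is plain Python, not ExifTool's actual implementation, and the function names and sample values are hypothetical. Each structure element is flattened into its own list, and the structures are then rebuilt by pairing up the i-th entries of each list.

```python
# Two structures; the first one has no "What" element.
structs = [
    {"Where": "w2"},
    {"What": "w8", "Where": "w9"},
]


def flatten(items):
    """Collect the values of each element name into one flat list."""
    flat = {}
    for item in items:
        for key, value in item.items():
            flat.setdefault(key, []).append(value)
    return flat


def rebuild(flat):
    """Naively pair the i-th entry of every flattened list."""
    length = max(len(values) for values in flat.values())
    return [
        {key: values[i] for key, values in flat.items() if i < len(values)}
        for i in range(length)
    ]


flat = flatten(structs)   # "Where" has two entries, "What" only one
rebuilt = rebuild(flat)   # "w8" ends up attached to the first structure
```

Because the flattened "What" list has only one entry, it is paired with the first structure on rebuild, although it belonged to the second. When every structure has the same elements, the lists have equal length and the pairing works.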

- Phil

herb

Hello Phil,

thanks for your answers. I am astonished again at everything that can be done with Exiftool.

Based on your answers I was looking for such tags and structures to test.
I found "NewXMPxxxStruct" in your Example_config file, which I used for all my tests described here.

For listtype tag "Things" I could use -struct option but also flattened tagnames in order to create "structured" list elements.

But I also created a test.xmp file
exiftool.exe -NewXMPxxxStruct=[{Things=[{What=w1,Where=w2}]},{Things=[{What=w8,Where=w9}]}] test.xmp
and I was not able to create this structure of 2 "Things" list elements using flattened tagnames.
Is it really also possible?
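
For reference, a nested argument like the one above can also be generated from nested data. The following is a minimal Python sketch, not part of Exiftool. The helper names are hypothetical, and the escaping rule (a leading "|" before "|", ",", "}" and "]", and before a "{" or "[" at the start of a value) is my understanding of the Exiftool documentation on structured information, so it should be verified before relying on it.

```python
def escape(value: str) -> str:
    """Escape characters that would otherwise be read as struct syntax (assumed rule)."""
    out = []
    for i, ch in enumerate(value):
        if ch in "|,}]" or (i == 0 and ch in "{["):
            out.append("|")
        out.append(ch)
    return "".join(out)


def to_struct(obj) -> str:
    """Serialize nested dicts and lists into {field=value,...} / [item,...] form."""
    if isinstance(obj, dict):
        return "{" + ",".join(f"{k}={to_struct(v)}" for k, v in obj.items()) + "}"
    if isinstance(obj, (list, tuple)):
        return "[" + ",".join(to_struct(v) for v in obj) + "]"
    return escape(str(obj))


data = [
    {"Things": [{"What": "w1", "Where": "w2"}]},
    {"Things": [{"What": "w8", "Where": "w9"}]},
]
arg = "-NewXMPxxxStruct=" + to_struct(data)
```

Depending on the shell, the resulting argument may need to be quoted when it is passed on the command line.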

Best regards
Herb

Phil Harvey

Hello Herb,

Thanks again for your suggestion.

Here are my comments:

1. I will add the MaterialSample and Organism structures.

2. The LivingSpecimen, PreservedSpecimen, FossilSpecimen, HumanObservation and MachineObservation structures have no elements defined in the DarwinCore specification, so I don't know how I can add these.

3. I will add eventDate, parentEventID, sampleSizeValue and sampleSizeUnit to the Event structure.

4. I will add organismQuantity and organismQuantityType to the Occurrence structure.

5. I can find nothing in the DarwinCore documentation stating that the "XmpBag" tags you mention are stored as an XMP Bag.  They are lists, certainly, but I only see mention that they are concatenated lists.  If you can find mention (or an authoritative example) of them being stored as an XMP bag, please let me know.

And to answer your question:  It is possible to have a list of structures, and you can write this using either flattened or structured tags, but writing using flattened tags has limitations.

- Phil

Phil Harvey

Regarding the XmpBag tags.  Looking at the DwC-SampleImage example posted in this thread, none of these tags are XMP Bag type.

- Phil

herb

Hello Phil,

great that you will enhance the Exiftool DarwinCore tags.

Quote
2. The LivingSpecimen, PreservedSpecimen, FossilSpecimen, HumanObservation and MachineObservation structures have no elements defined in the DarwinCore specification, so I don't know how I can add these.
My interpretation of the DarwinCore page http://rs.tdwg.org/dwc/terms/ is:
MaterialSample | LivingSpecimen | PreservedSpecimen | FossilSpecimen are 4 structures that each have 1 tag with the identical tagname "materialSampleID".
The same is valid for: Event | HumanObservation | MachineObservation.

Quote
5. I can find nothing in the DarwinCore documentation stating that the "XmpBag" tags you mention are stored as an XMP Bag.  They are lists, certainly, but I only see mention that they are concatenated lists.  If you can find mention (or an authoritative example) of them being stored as an XMP bag, please let me know.
On one hand you are right:
These tags are defined as lists in the "DarwinCore XMP Schema Configuration.doc" document attached to the thread you mentioned.
All examples show a string and only a human can decide that it is a list.
But why has Exiv2 defined them as "XmpBag"?
I thought that IDImager was unable to write bags at the time the DarwinCore tags were introduced to Exiftool.
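
To make the difference concrete: a concatenated list is stored as one string, and the reader has to know the separator to get the items back, whereas a bag stores each item separately. A minimal Python sketch of the splitting step follows. The separator "|" (written with surrounding spaces) is an assumption, and the function name and sample value are hypothetical.

```python
def split_concatenated(value: str, sep: str = "|") -> list:
    """Split a concatenated list string into its items, dropping empty ones."""
    return [item.strip() for item in value.split(sep) if item.strip()]


# Hypothetical sample value for a tag such as recordedBy
items = split_concatenated("A. Smith | B. Jones")
```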

Thanks again and
Best regards
Herb

Phil Harvey

Hi Herb,

Quote from: herb on May 04, 2018, 06:51:31 AM
Quote
2. The LivingSpecimen, PreservedSpecimen, FossilSpecimen, HumanObservation and MachineObservation structures have no elements defined in the DarwinCore specification, so I don't know how I can add these.
My interpretation of the DarwinCore page http://rs.tdwg.org/dwc/terms/ is:
MaterialSample | LivingSpecimen | PreservedSpecimen | FossilSpecimen are 4 structures that each have 1 tag with the identical tagname "materialSampleID".
The same is valid for: Event | HumanObservation | MachineObservation.

Ah, yes.  Thanks.  I'll add these.

Quote
All examples show a string and only a human can decide that it is a list.
But why has Exiv2 defined them as "XmpBag"?

I don't know.  They were working from the same XMP sample that I was (DwC-SampleImage.jpg), which didn't use the XMP Bag.

- Phil