-json output and multi-language data

Started by Mac2, April 11, 2022, 11:23:33 AM


Mac2

I'm parsing JSON data produced via:

exiftool -G -xmp:all -JSON -D -t -q -q <Filename>

Works great, except for multi-language XMP data. For example, I get:

"XMP:Description-de": {
  "id": "description-de",
  "table": "XMP::dc",
  "val": "Deutscher Text"
},
"XMP:Description-en": {
  "id": "description-en",
  "table": "XMP::dc",
  "val": "English Text"
}


Before, I used XML for the data transfer. The XML output included an xml:lang attribute for multi-language data.
I used that to detect such values and then normalized the tag id (I need to lookup the tag in a database) by removing the language appendix:

xml:langID="en": tagId="description-en" => tagId="description"
xml:langID="fr": tagId="description-fr" => tagId="description"

For the JSON output, am I correct in assuming that when my code finds a tag id containing a - (minus),

a) the part after the - is the language id and
b) that I can find the actual tag id by stripping the - and what follows?

But I also see tag ids like ...-offset or ...-2 in the output. So, how to tell what is a language-specific value and what is not?

Phil Harvey

Good question.  The hyphen is a valid character in a normal tag name.  However, I don't think you'll ever see "-XX" at the end of a normal tag name, where X is a-z.

- Phil

Edit: I grepped the -list output, and there is a FlashStatusBuilt-in tag in the MakerNotes, but no such tags in XMP, and I don't think that any MakerNotes tags have alternate languages.
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

Mac2

I'll implement something that checks for -cc where the length is 3 and both characters are in the [a-z] range. A regexp should do.
This rules out -offset (length > 3) and -2 (not a character).
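
A check like the one described above might look roughly like this in Python. This is only a sketch: the function name split_lang_suffix is made up for illustration, and the pattern covers only the two-letter lowercase case discussed so far.

```python
import re

# Trailing "-xx" where both x are lowercase ASCII letters (assumption from this thread).
_LANG_SUFFIX = re.compile(r"^(?P<base>.+)-(?P<lang>[a-z]{2})$")

def split_lang_suffix(tag_id):
    """Return (base_id, lang) if tag_id ends in "-xx", else (tag_id, None)."""
    m = _LANG_SUFFIX.match(tag_id)
    if m:
        return m.group("base"), m.group("lang")
    return tag_id, None

print(split_lang_suffix("description-en"))
print(split_lang_suffix("something-offset"))
print(split_lang_suffix("something-2"))
```

Note that a name such as FlashStatusBuilt-in would also match this pattern, so it seems safer to apply the check only to XMP tags.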

Maybe, in a future ExifTool version, add a langId="" element to the JSON output (for multi-lang tags). Maybe linked to the -l?
This would be more reliable, in-sync with the XMP output and should not break existing code which parses the JSON.

Phil Harvey

Good suggestion.  I'll add a "lang" element in ExifTool 12.42, and link this to -D, -H and -t as it is for the -X option.

> exiftool a.jpg -xmp:description-en -j -G -t
[{
  "SourceFile": "a.jpg",
  "XMP:Description-en": {
    "id": "description-en",
    "lang": "en",
    "table": "XMP::dc",
    "val": "English Text"
  }
}]
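
On the consuming side, the "lang" element could be used along these lines. This is a minimal Python sketch that assumes the JSON shape shown above; normalized_id is a hypothetical helper, not part of ExifTool.

```python
import json

# Sample copied from the output above.
sample = '''[{
  "SourceFile": "a.jpg",
  "XMP:Description-en": {
    "id": "description-en",
    "lang": "en",
    "table": "XMP::dc",
    "val": "English Text"
  }
}]'''

def normalized_id(entry):
    """Return (tag_id_without_language, lang); lang is None when no "lang" element applies."""
    tag_id = entry["id"]
    lang = entry.get("lang")
    if lang and isinstance(tag_id, str) and tag_id.lower().endswith("-" + lang.lower()):
        # Drop the hyphen plus the language code from the end of the id.
        return tag_id[: -(len(lang) + 1)], lang
    return tag_id, None

for record in json.loads(sample):
    for key, entry in record.items():
        # "SourceFile" is a plain string, so only dict entries carrying an id are handled.
        if isinstance(entry, dict) and "id" in entry:
            print(key, normalized_id(entry))
```

Relying on "lang" rather than on the shape of the tag id avoids guessing which hyphenated suffixes are language codes.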


- Phil

Mac2

Excellent. I'll then update my implementation (which now works correctly) to use the lang.
Thanks for your awesome support!  :)

Moonbase59

I don't know if there are any standards for this, but it looks like making all that crazy tagging even one level more complicated ... sigh.

I recently ran into an iPhone 4 QuickTime .mov file that had these crazy settings:

exiftool -G1 -a -s 20110429-212714.000.mov > iphone4.txt

[UserData]      Model-deu                       : iPhone 4
[UserData]      SoftwareVersion-deu             : 4.3.1
[UserData]      ContentCreateDate-deu           : 2011:04:29 20:24:08+02:00
[UserData]      GPSCoordinates-deu              : 52 deg 46' 55.56" N, 7 deg 6' 40.32" E, 26.88 m Above Sea Level
[UserData]      Make-deu                        : Apple
[UserData]      Model                           : iPhone 4
[UserData]      SoftwareVersion                 : 4.3.1
[UserData]      ContentCreateDate               : 2011:04:29 20:24:08+02:00
[UserData]      GPSCoordinates                  : 52 deg 46' 55.56" N, 7 deg 6' 40.32" E, 26.88 m Above Sea Level
[UserData]      Make                            : Apple
[Keys]          Make-deu-DE                     : Apple
[Keys]          CreationDate-deu-DE             : 2011:04:29 20:24:08+02:00
[Keys]          GPSCoordinates-deu-DE           : 52 deg 46' 55.56" N, 7 deg 6' 40.32" E, 26.88 m Above Sea Level
[Keys]          Software-deu-DE                 : 4.3.1
[Keys]          Model-deu-DE                    : iPhone 4
[Keys]          CreationDate                    : 2011:04:29 20:24:08+02:00
[Keys]          Software                        : 4.3.1


Don't know if these were added by using some odd piece of software on it, but the friend I got it from swears "no". I wonder. CreateDate, Make, Model and GPS in "deu" and even "deu-DE"?! Stupid.

Mac2

Another case of what I refer to as metadata mess.

German GPS coordinates. Sigh.
And some in "deu" and some even, more precise, in "deu-DE".
Probably not an ISO format, at least none I know of. "de" or "de-DE" would be ISO.
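
For code that only knows about two-letter suffixes, names like Model-deu and Make-deu-DE need a wider pattern. The Python sketch below is an assumption based purely on the examples in this thread (two or three lowercase letters, optionally followed by a two-letter uppercase region), not a rule documented by ExifTool; split_lang_region is a hypothetical name.

```python
import re

# Assumed shape: "-" + 2-3 lowercase letters, optionally "-" + 2 uppercase letters.
_SUFFIX = re.compile(
    r"^(?P<base>.+?)-(?P<lang>[a-z]{2,3})(?:-(?P<region>[A-Z]{2}))?$"
)

def split_lang_region(name):
    """Return (base, lang, region); lang and region are None when absent."""
    m = _SUFFIX.match(name)
    if not m:
        return name, None, None
    return m.group("base"), m.group("lang"), m.group("region")

for name in ("Model-deu", "Make-deu-DE", "GPSCoordinates", "Description-en"):
    print(name, split_lang_region(name))
```

Widening the pattern this way also widens the risk of false positives on ordinary tag names that happen to end in a short lowercase suffix.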

This is probably one of those cases where you only recognize how random and non-standard-compliant the metadata in your files is when you leave the walled garden.
Or Apple drops support for crappy metadata produced by their devices and tools years ago...

From my experience, metadata in video files is even messier than metadata in image files. And that's a tall order.

Mammut

Hello Phil,

do you plan to add lang element (like in Description) to structs, too?

For example:

    "ArtworkOrObject": {
      "id": "ArtworkOrObject",
      "table": "XMP::iptcExt",
      "val": [{
        "AOCircaDateCreated": "AO Circa Date: between 1550 and 1600 (ref2021.1)",
        "AOContentDescription": "AO Content Description 1 (ref2021.1)",
        "AOContentDescription-en": "AO Content Description 2 (ref2021.1)",
        "AOContentDescription-fr": "AO Content Description 3 (ref2021.1)",



I don't have a problem with this, it's not that hard to search partial keys, but it would be good to know if there will be any changes later. :)

StarGeek

Quote from: Mammut on July 09, 2022, 07:39:29 AM
do you plan to add lang element (like in Description) to structs, too?

It's already there.  As for your example, look at the XMP iptcExt Tags listing.  Anything that has lang-alt in the Writable column is a lang-alt tag.
C:\ >exiftool -echo "Flattened tag output" -G1 -a -s -Art* y:\!temp\Test4.jpg
Flattened tag output
[XMP-iptcExt]   ArtworkCircaDateCreated         : AO Circa Date: between 1550 and 1600 (ref2021.1)
[XMP-iptcExt]   ArtworkContentDescription       : AO Content Description 1 (ref2021.1)
[XMP-iptcExt]   ArtworkContentDescription-en    : AO Content Description 2 (ref2021.1)
[XMP-iptcExt]   ArtworkContentDescription-frl   : AO Content Description 3 (ref2021.1)

C:\ >exiftool -echo "Structured tag output" -G1 -a -s -struct -Art* y:\!temp\Test4.jpg
Structured tag output
[XMP-iptcExt]   ArtworkOrObject                 : [{AOCircaDateCreated=AO Circa Date: between 1550 and 1600 (ref2021.1),AOContentDescription=AO Content Description 1 (ref2021.1),AOContentDescription-en=AO Content Description 2 (ref2021.1),AOContentDescription-frl=AO Content Description 3 (ref2021.1)}]

C:\ >exiftool -echo "Structured json tag output" -G1 -a -s -j -struct -Art* y:\!temp\Test4.jpg
Structured json tag output
[{
  "SourceFile": "y:/!temp/Test4.jpg",
  "XMP-iptcExt:ArtworkOrObject": [{
    "AOCircaDateCreated": "AO Circa Date: between 1550 and 1600 (ref2021.1)",
    "AOContentDescription": "AO Content Description 1 (ref2021.1)",
    "AOContentDescription-en": "AO Content Description 2 (ref2021.1)",
    "AOContentDescription-frl": "AO Content Description 3 (ref2021.1)"
  }]
}]
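
Since the language only appears as part of the key inside a struct, a consumer has to group the keys itself. A rough Python sketch, assuming the JSON shape shown above and assuming that a language suffix is a hyphen followed by two or three lowercase letters (group_by_language is a made-up helper):

```python
import json
import re
from collections import defaultdict

# Sample copied from the structured JSON output above.
sample = r'''[{
  "SourceFile": "y:/!temp/Test4.jpg",
  "XMP-iptcExt:ArtworkOrObject": [{
    "AOCircaDateCreated": "AO Circa Date: between 1550 and 1600 (ref2021.1)",
    "AOContentDescription": "AO Content Description 1 (ref2021.1)",
    "AOContentDescription-en": "AO Content Description 2 (ref2021.1)",
    "AOContentDescription-frl": "AO Content Description 3 (ref2021.1)"
  }]
}]'''

# Assumption: language suffix = "-" + 2-3 lowercase letters at the end of the key.
_SUFFIX = re.compile(r"^(?P<base>.+)-(?P<lang>[a-z]{2,3})$")

def group_by_language(struct):
    """Map each base field name to {lang_or_None: value}."""
    grouped = defaultdict(dict)
    for key, value in struct.items():
        m = _SUFFIX.match(key)
        if m:
            grouped[m.group("base")][m.group("lang")] = value
        else:
            grouped[key][None] = value  # value without a language suffix
    return dict(grouped)

record = json.loads(sample)[0]
for item in record["XMP-iptcExt:ArtworkOrObject"]:
    for base, by_lang in group_by_language(item).items():
        print(base, by_lang)
```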
"It didn't work" isn't helpful. What was the exact command used and the output.
Read FAQ #3 and use that cmd
Please use the Code button for exiftool output

Please include your OS/Exiftool version/filetype

Mammut

Sorry, I meant that exiftool doesn't create this sub-hierarchy in the JSON for the inner structs the way it does for, e.g., Description; it only appends the language (-en) to the key.

 
"XMP:Description-en": {
    "id": "description-en",
    "lang": "en",
    "table": "XMP::dc",
    "val": "English Text"


vs.


        "AOContentDescription-en": "AO Content Description 2 (ref2021.1)",
        "AOContentDescription-fr": "AO Content Description 3 (ref2021.1)",


But it's not a problem, it's totally understandable this way. It would just be good to know if there is any plan to change this behavior (because then I would need to check whether it's a text value or a dictionary value with a "val" key).
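
A defensive reader could handle both shapes with a small helper. This is a sketch only; unwrap is a hypothetical name, and the two shapes are just the ones quoted in this thread.

```python
def unwrap(value):
    """Return (val, lang) for either JSON shape seen in this thread.

    - top-level tag written with -D/-H/-t: a dict carrying "val" (and maybe "lang")
    - struct member: the bare value, with any language only present in the key
    """
    if isinstance(value, dict) and "val" in value:
        return value["val"], value.get("lang")
    return value, None

print(unwrap({"id": "description-en", "lang": "en", "table": "XMP::dc", "val": "English Text"}))
print(unwrap("AO Content Description 2 (ref2021.1)"))
```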

Phil Harvey

There is no plan to change this behaviour.

- Phil

Mammut