Parsing JSON List Tags

Started by Christian Etter, April 25, 2010, 09:08:13 AM


Christian Etter

The Json output option works great :-)

Yet there is one thing I stumbled upon when dealing with list type tags: If I define a Json contract to parse input data, certain attributes such as Keywords may contain multiple values (list) and should therefore be defined as an array type. Parsing works fine unless there is only a single value stored - in that case ExifTool will return it formatted as a simple string value instead of an array (with one element). That is enough to make my deserializer give up parsing with an error.  Not sure if this is a shortcoming of the deserializer or a violation of the Json spec though.

Right now my workaround is to use the -sep option and then manually parse the resulting string into an array. It's not a 100% clean solution, since the separator character may also occur within a keyword. My idea was to use a line break as the separator, via -sep $/ or -E -sep #xa;, yet both are copied literally to the output.

Any thoughts on how this can be improved?

Christian

Phil Harvey

Hi Christian,

There is very little difference between a one-element list and a single-valued tag in most information formats.  XMP is the exception, and one-element lists are extracted in XMP if the -struct option is used.

This can't be a violation of the JSON spec because JSON doesn't define the variable types.  It should be possible to design a parser that accepts either a list or a single value (or a structure if the -struct option is used).
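A minimal sketch of such a tolerant parser, in Python rather than .NET for brevity. The helper name as_list and the sample data are made up for illustration; the sample only mimics the shape of the JSON output (an array with one object per file).

```python
import json

def as_list(value):
    """Return a tag value as a list, whether it arrived as a list or a single value."""
    if value is None:
        return []          # tag missing from the record
    if isinstance(value, list):
        return value       # already a list, keep as-is
    return [value]         # single value: wrap it in a one-element list

# Made-up sample: one file with a single keyword, one with two
sample = (
    '[{"SourceFile": "a.jpg", "Keywords": "sunset"},'
    ' {"SourceFile": "b.jpg", "Keywords": ["sunset", "beach"]}]'
)

records = json.loads(sample)
keywords = [as_list(record.get("Keywords")) for record in records]
```

With this normalization step, both records end up with a list of keywords, so the rest of the code never has to care which form was emitted.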

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

Christian Etter

Hi Phil,
Background: I am working with .NET at the moment which provides an out-of-the-box JSON deserializer called DataContractJsonSerializer. Unfortunately it cannot be customized with regard to parsing single elements as a string OR (optionally) as an array of strings. It is however possible to add a post-deserialization step which I use to split a string into an array of strings. It works well as long as the separator string does not occur in the list itself.
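The post-deserialization step can be sketched roughly like this (in Python here, not the actual .NET code; split_keywords and the ", " separator are placeholders for illustration). The second call shows the collision case, where a keyword that itself contains the separator is split in two.

```python
def split_keywords(value, sep):
    """Split a separator-joined keyword string into a list of keywords."""
    if not value:
        return []              # empty or missing value gives an empty list
    return value.split(sep)

# Two keywords joined with ", ": splits cleanly
clean = split_keywords("sunset, beach", ", ")

# Intended keywords were "Paris, France" and "travel", but the separator
# also occurs inside the first keyword, so three items come back
collided = split_keywords("Paris, France, travel", ", ")
```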

If I remember correctly, the newline \n character serves as a list separator internally, so it would be good to use it as a collision-free separator. Is there a way of getting -sep $/ or -E -sep #xa; to work?

Thanks
Christian

Phil Harvey

Hi Christian,

The -sep option may be used for any separator which can be generated on the command line.  If you are calling from .NET it is possible you can insert a newline directly.  All shells I know except for the Windows cmd shell allow newlines to be inserted in the command line.
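As a rough sketch of inserting the newline directly (shown in Python; the same idea should apply when starting the process from .NET), the arguments can be passed as a list so that no shell is involved and the separator arrives as a real newline character. The function names are hypothetical, exiftool is assumed to be on the PATH, and this has not been run against real files.

```python
import json
import subprocess

def build_args(path, sep="\n"):
    """Build the argument list; sep is passed as a literal newline, not the text '$/'."""
    return ["exiftool", "-j", "-sep", sep, "-Keywords", path]

def read_keywords(path):
    """Run exiftool on one file and split the joined Keywords string on newlines."""
    result = subprocess.run(
        build_args(path), capture_output=True, text=True, check=True
    )
    record = json.loads(result.stdout)[0]   # one object per input file
    value = record.get("Keywords", "")
    # str() guards against a lone keyword that was emitted as a non-string value
    return str(value).split("\n") if value != "" else []
```

Because the arguments never pass through a shell, there is nothing to interpret or escape; the newline in the argument list is exactly what the program receives.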

- Phil