ExifTool Forum

ExifTool => The "exiftool" Application => Topic started by: simonmcnair on June 26, 2023, 09:49:17 AM

Title: leading spaces
Post by: simonmcnair on June 26, 2023, 09:49:17 AM
I'm trying to automate tag addition using ML for Digikam, but I'm having a nightmare.  No matter what I do I end up with a leading space or two.
I've tried, in python,

ret = subprocess.check_output(['exiftool','-P','-overwrite_original', '-api', '"Filter=s/^ +//"','-TagsFromFile','@','-subject','-XMP:subject','-IPTC:Keywords','-XMP:CatalogSets','-XMP:TagsList',img_path])
but I still end up with some tags having ' this is a bed' vs 'this is a bed'

I would appreciate any help.  I tried using -sep ", " and -sep "," but tbh it doesn't seem to do what I thought it would, where it would alter the separator in the output.  #soconfused.
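
One thing worth checking in the call above: when `subprocess` is given a list, no shell is involved, so the double quotes inside `'"Filter=s/^ +//"'` are not stripped and would reach exiftool as part of the argument. Below is a minimal sketch of the same call without the inner quotes. It assumes exiftool is on the PATH. The helper names `build_strip_args` and `strip_leading_spaces` are hypothetical, and the tag list is copied from the post.

```python
import subprocess


def build_strip_args(img_path):
    """Build the exiftool argument list; each list item is passed verbatim."""
    return [
        "exiftool", "-P", "-overwrite_original",
        # No inner quotes here: without a shell they would be passed literally.
        "-api", "Filter=s/^ +//",
        "-TagsFromFile", "@",
        "-subject", "-XMP:subject", "-IPTC:Keywords",
        "-XMP:CatalogSets", "-XMP:TagsList",
        img_path,
    ]


def strip_leading_spaces(img_path):
    """Run exiftool (assumed to be on PATH) and return its decoded output."""
    return subprocess.check_output(build_strip_args(img_path)).decode()
```

Quoting such as `"Filter=s/^ +//"` is only needed when the command is typed into a shell, where the quotes keep the space inside the regex from splitting the argument.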
Title: Re: leading spaces
Post by: Phil Harvey on June 26, 2023, 01:04:58 PM
Can you provide a console log that allows us to reproduce the problem?  I get this:

> exiftool a.jpg -subject=" test1" -subject=" test2"
    1 image files updated
> exiftool b.jpg -tagsfromfile a.jpg -subject -api "Filter=s/^ +//"
    1 image files updated
> exiftool a.jpg b.jpg -subject
======== a.jpg
Subject                         :  test1,  test2
======== b.jpg
Subject                         : test1, test2
    2 image files read

(I'm copying to a different file above so I can run multiple tests.)

- Phil
Title: Re: leading spaces
Post by: StarGeek on June 26, 2023, 01:58:38 PM
What are you using to view the tags?  By default, exiftool separates list items with a comma and a space (", ") when displaying them on the command line, even though they are stored as completely separate items.  Other programs often do something similar.

If you use exiftool to look at the raw XMP with
exiftool -b -xmp /path/to/files/

You can see what the exact values are
  <dc:subject>
  <rdf:Bag>
    <rdf:li>tag 1</rdf:li>
    <rdf:li>tag 2</rdf:li>
  </rdf:Bag>
  </dc:subject>
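
To make any leading spaces in those `rdf:li` values visible from Python, the raw XMP can be parsed and each item printed with `repr()`. This is only a sketch: `SAMPLE_XMP` is a made-up, simplified packet (with one leading space added for illustration), `subject_items` is a hypothetical helper name, and real output from `-b -xmp` carries extra wrapper lines that this has not been tested against.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for RDF and Dublin Core.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def subject_items(xmp_text):
    """Return the dc:subject list items exactly as stored, spaces included."""
    root = ET.fromstring(xmp_text)
    items = root.findall(".//dc:subject/rdf:Bag/rdf:li", NS)
    return [li.text or "" for li in items]


# Hypothetical sample packet, not taken from a real file.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">
   <dc:subject>
    <rdf:Bag>
     <rdf:li> tag 1</rdf:li>
     <rdf:li>tag 2</rdf:li>
    </rdf:Bag>
   </dc:subject>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""

for item in subject_items(SAMPLE_XMP):
    print(repr(item))  # repr() shows the quotes, so stray spaces stand out
```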
Title: Re: leading spaces
Post by: simonmcnair on June 26, 2023, 02:36:30 PM
Thanks for the replies.  I will try and diagnose further.  Cheers
Title: Re: leading spaces
Post by: simonmcnair on June 27, 2023, 12:06:17 PM
I don't know how best to explain this.  I'm using digiKam to diagnose the issue, and I tried using exiftool to get rid of the leading spaces, but despite running it multiple times, the leading space is not getting removed.

This is made twice as hard by the fact that I cannot get exiftool to output the comma-separated values without it adding spaces.
I run the command
existing_tags = subprocess.check_output(['exiftool', '-XMP:Subject', '-IPTC:Keywords', '-XMP:CatalogSets', '-XMP:TagsList', img_path]).decode().strip()
and get the output

Subject                        : tag1,  tag2,  tag3,  tag4, 
Keywords                        : tag1,  tag2,  tag3,  tag4, 
Catalog Sets                    : tag1,  tag2,  tag3,  tag4, 
Tags List                      : tag1,  tag2,  tag3,  tag4, 


but the output is incorrect as I don't know how many spaces are before 'tag', and I can't massage the output without refactoring the code.
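
One way to avoid guessing at separators is exiftool's `-j` option, which prints JSON: a list-type tag with several items comes out as an array, so each item's own leading spaces can be told apart from the display separator. The sketch below assumes exiftool is on the PATH. The names `normalize`, `parse_exiftool_json` and `read_tags` are hypothetical, and `SAMPLE` is an invented string for illustration, not real output. A list holding a single item may come back as a plain string, and numeric-looking values as numbers, so values are normalized to lists of strings.

```python
import json
import subprocess

TAGS = ["-XMP:Subject", "-IPTC:Keywords", "-XMP:CatalogSets", "-XMP:TagsList"]


def normalize(value):
    """Return a tag value as a list of strings."""
    if value is None:
        return []
    items = value if isinstance(value, list) else [value]
    return [str(item) for item in items]


def parse_exiftool_json(text):
    """Map each tag name in the first JSON record to a list of strings."""
    record = json.loads(text)[0]
    return {k: normalize(v) for k, v in record.items() if k != "SourceFile"}


def read_tags(img_path):
    """Run exiftool with -j (assumed to be on PATH) and parse the result."""
    out = subprocess.check_output(["exiftool", "-j", *TAGS, img_path])
    return parse_exiftool_json(out.decode())


# Invented sample in the shape -j produces for one file.
SAMPLE = ('[{"SourceFile": "a.jpg", '
          '"Subject": ["tag1", " tag2", "  tag3"], "Keywords": "tag1"}]')
tags = parse_exiftool_json(SAMPLE)

# Strip leading spaces per item, without touching the original values.
cleaned = {name: [item.lstrip(" ") for item in items]
           for name, items in tags.items()}
```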

I tried working around it by just executing the remove spaces regex but the leading space(s) are not getting removed.

The somewhat messy and uncommented and poorly written python is here if you want to see

https://github.com/simonmcnair/SDTagging/blob/main/tag.py
Title: Re: leading spaces
Post by: StarGeek on June 27, 2023, 03:39:47 PM
Is the output the same when you do it on the command line?

Can you share an example image?

Your output places a comma at the end of each line, which exiftool wouldn't do unless the comma was part of the tag or there is an empty tag at the end.

Your code seems to indicate that you will be running exiftool once per image (see Common Mistake #3 (https://exiftool.org/mistakes.html#M3)).  You might instead look to using PyExifTool (https://github.com/sylikc/pyexiftool), which is a wrapper that keeps exiftool running in the background using the -stay_open option (https://exiftool.org/exiftool_pod.html#stay_open-FLAG) and will improve processing time.
Title: Re: leading spaces
Post by: simonmcnair on June 27, 2023, 04:46:39 PM
I'll have a go at refactoring it, I did try pyexiftool in previous iterations of it, and I can't honestly remember why I went back to doing it the old way.

Hopefully that will show me the tags more clearly, and I can work out the issue.  Cheers
Title: Re: leading spaces
Post by: simonmcnair on July 04, 2023, 09:17:54 AM
So the filter does work beautifully, it's just parsing the data in Python that caused me heartache.  I had faith in exiftool, but it was hard to work with spaces when I couldn't get the data out of exiftool clearly.

I've been refactoring my code to use pyexiftool.  My only issue appears to be that I can't incrementally build a command line like I did before unless I use the ExifTool class rather than ExifToolHelper.

By the time I use ExifTool directly in Python it is virtually the same as running it from subprocess, perhaps, maybe, possibly.
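
On building the command incrementally: in PyExifTool, `ExifToolHelper` is a subclass of `ExifTool`, so `execute()` should be available on the helper as well, and it takes the parameters as separate arguments. A list built up step by step can therefore be unpacked into it. The sketch below assumes PyExifTool and exiftool are both installed. `build_params` and `run` are hypothetical names, and the option set simply mirrors the command from the first post.

```python
def build_params(tags, strip_spaces=True, overwrite=True):
    """Build the exiftool parameter list one piece at a time."""
    params = ["-P"]
    if overwrite:
        params.append("-overwrite_original")
    if strip_spaces:
        # Option and value are separate items, with no extra quoting.
        params += ["-api", "Filter=s/^ +//"]
    params += ["-TagsFromFile", "@"]
    params += [f"-{tag}" for tag in tags]
    return params


def run(img_path, params):
    """Send the parameters to a running exiftool process via PyExifTool."""
    import exiftool  # PyExifTool; imported here so build_params works without it

    with exiftool.ExifToolHelper() as et:
        return et.execute(*params, img_path)
```

Usage would look like `run("a.jpg", build_params(["XMP:Subject", "IPTC:Keywords"]))`. For many images, keeping a single `ExifToolHelper` open across all files is what gives the speed benefit of `-stay_open`.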