clean up IPTC Description/Caption field

Started by frereroy, May 18, 2018, 09:14:33 AM

frereroy

We have a number of images that were annotated using Adobe Bridge and the person was a little careless in adding line feeds and/or carriage returns to the end of the Description/Caption field - usually after copy/pasting from a Word document.

Is there any way to clean up this field?

TIA

Phil Harvey

This command will remove trailing white space from the value for a TAG:

exiftool "-TAG<${TAG;s/\s+$// or $_=undef}" DIR

Here I have set the value to "undef" in the expression if there was no white space to remove.  This will give a warning, but will prevent a file from being rewritten if the tag wasn't changed.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).
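For readers less familiar with Perl, here is a rough Python sketch of what the expression `s/\s+$// or $_=undef` does to a single tag value. It is only an illustration of the logic, not part of exiftool; the function name `strip_trailing_whitespace` is made up for this example, and `None` stands in for Perl's `undef` (meaning "nothing changed, so do not write the tag").

```python
import re
from typing import Optional


def strip_trailing_whitespace(value: str) -> Optional[str]:
    """Return value without trailing white space, or None if nothing was removed.

    Hypothetical helper mirroring `s/\\s+$// or $_=undef`:
    a substitution count of zero plays the role of the `or $_=undef` branch.
    """
    new_value, count = re.subn(r"\s+$", "", value)
    return new_value if count else None


# Trailing CR/LF pairs pasted from a word processor are removed.
print(repr(strip_trailing_whitespace("A caption.\r\n\r\n")))  # 'A caption.'
# A clean value yields None, so there would be nothing to rewrite.
print(repr(strip_trailing_whitespace("A caption.")))  # None
```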

frereroy

Thanks Phil, but it is not trailing white space that I am trying to remove, but rather trailing line-break characters.

StarGeek

White space characters are Line Feeds and Carriage Returns, as well as Spaces and Tabs.

Do you need to remove only the Line Feeds and Carriage Returns and keep trailing spaces?  Otherwise Phil's command above will work.
* Did you read FAQ #3 and use the command listed there?
* Please use the Code button for exiftool code/output.
 
* Please include your OS, Exiftool version, and type of file you're processing (MP4, JPG, etc).
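To make the distinction concrete, here is a small Python sketch comparing the two patterns on the same value. It only illustrates the regular expressions outside of exiftool; both function names are invented for this example. `\s` covers spaces, tabs, carriage returns and line feeds, while the character class `[\r\n]` covers only the two line-break characters.

```python
import re


def strip_trailing_whitespace(value: str) -> str:
    """Remove any trailing white space (spaces, tabs, CR, LF)."""
    return re.sub(r"\s+$", "", value)


def strip_trailing_line_breaks(value: str) -> str:
    """Remove only trailing CR/LF characters, keeping spaces before them."""
    return re.sub(r"[\r\n]+$", "", value)


sample = "A caption.  \r\n"
print(repr(strip_trailing_whitespace(sample)))   # 'A caption.'
print(repr(strip_trailing_line_breaks(sample)))  # 'A caption.  '
```

If trailing spaces do not need to be preserved, the `\s+$` form is the simpler choice.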

frereroy

Thanks. Yes, I am using the following command on a Mac:

exiftool '-Description<${Description;s/\s+$// or $_=undef}' /Users/roy/Desktop/phil

Yes, the trailing spaces and linefeeds are indeed removed from the XMP but I would like them also removed from the embedded IPTC.


StarGeek

Add this to the command:
'-Caption-Abstract<${Caption-Abstract;s/\s+$// or $_=undef}'

frereroy

Please can you confirm that the following single command will update both IPTC and XMP fields.

It seems to work, but I would just like to double-check before launching it on 8,000 photos.

exiftool -use MWG '-Description<${Description;s/\s+$// or $_=undef}' -overwrite_original -r /Users/roy/Desktop/phil

TIA

Phil Harvey

?

Did you mean this command?:

exiftool -use MWG '-Description<${Description;s/\s+$// or $_=undef}' '-Caption-Abstract<${Caption-Abstract;s/\s+$// or $_=undef}' -overwrite_original -r /Users/roy/Desktop/phil

If so, then the answer is yes.  Otherwise, you were only writing the XMP:Description tag.

- Phil

frereroy

Thanks, works just fine. I was afraid that I may be overlooking something.

BTW, Information on the MWG tags is here:

https://exiftool.org/TagNames/MWG.html

(I have come across broken links in other threads).

Phil Harvey

Quote from: frereroy on May 19, 2018, 09:28:31 AM
(I have come across broken links in other threads).

If you tell me the threads I will fix the links.

Note that this is the official URL for this page:

https://exiftool.org/TagNames/MWG.html

- Phil

frereroy

The error is in the first message of this thread (misspelt .htmlt)
https://exiftool.org/forum/index.php/topic,9205.0.html

BTW, to only treat jpg files can I use:

exiftool -use MWG '-Description<${Description;s/\s+$// or $_=undef}' '-Caption-Abstract<${Caption-Abstract;s/\s+$// or $_=undef}' *.jpg -overwrite_original -r /Users/roy/Desktop/phil

TIA

StarGeek


frereroy

OK, I got it. (-ext JPG)

For Windows:

exiftool -ext JPG -use MWG "-Description<${Description;s/\s+$// or $_=undef}" "-Caption-Abstract<${Caption-Abstract;s/\s+$// or $_=undef}" -overwrite_original -r /Users/roy/Desktop/Photothèque

frereroy

Is there a switch that will allow parsing accented directory names in Windows 7 (Western European)?

TIA

Phil Harvey

If you are passing special characters in file names on the command line, you must set the -charset filename=XXX option to set the character set to whatever you are using.  See here for more information.

- Phil

frereroy

Thanks. I am using the -L switch (for Windows Latin1).

As the job progresses I see "funny" characters in the paths that contain accents but the filenames themselves, which do not contain accented characters, are treated correctly.

Phil Harvey

-L is for tag values.  The -charset filename=XXX option is for file/directory names.  There is some overlap because the FileName and Directory tags use both options.

Is the job working aside from seeing funny characters?  And where are you seeing them?

- Phil

frereroy

I am now using the following command

exiftool -ext JPG -charset filename=Latin -use MWG "-Description<${Description;s/\s+$// or $_=undef}" "-Caption-Abstract<${Caption-Abstract;s/\s+$// or $_=undef}" -overwrite_original -r /Phototheque

The job is working well, but the accented directory names show as gobbledygook as the lines scroll through the command-line window.

Phil Harvey

If you are talking about the file name after the "====" when processing multiple files, then yes.  There is currently no character translation for the informational messages.  I should maybe look into doing this.

- Phil

frereroy

I do not see the "===="

The funny thing is that if I write ".... -r /Phototheque", then all nested directories with accents in their names are processed, albeit showing the odd characters on the screen while processing. But if the top-level directory contains an accent, like ".... -r /Photothèque", then I get the error "File not found" and the underlying directories are not parsed.


Phil Harvey

That's not funny at all.  Specifying file names with special characters is problematic as I said.  It can be done, but you must get your character sets correct.

- Phil

frereroy

exiftool -ext JPG -use MWG "-Description<${Description;s/\s+$// or $_=undef}" "-Caption-Abstract<${Caption-Abstract;s/\s+$// or $_=undef}" -overwrite_original -common_args -charset filename=cp1252 -r /Photothèque


This now works, i.e. the top-level directory is read correctly with its accent, and parsing the directories within works in spite of showing "odd" characters on the screen instead of the accented ones.

I was rather hoping that the -common_args option that I added would correct this.
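The "odd" characters are consistent with bytes written in one code page being displayed in another. The Python sketch below illustrates that effect only; it is not a description of what exiftool does internally. It assumes, purely for illustration, that the name is emitted as cp1252 bytes and that the console window renders them as cp850 (a common Western European console code page). The function name `as_seen_in_console` is made up for this example.

```python
def as_seen_in_console(name: str,
                       written_as: str = "cp1252",
                       shown_as: str = "cp850") -> str:
    """Encode a name in one code page and decode it in another.

    Both code pages are assumptions chosen for illustration; the result
    shows how an accented character can turn into an unrelated one.
    """
    return name.encode(written_as).decode(shown_as)


# Plain ASCII survives because both code pages agree on those bytes.
print(as_seen_in_console("Phototheque"))
# The accented letter comes out as a different character.
print(as_seen_in_console("Photothèque"))
```

Under these assumptions the file names on disk are untouched; only their on-screen rendering differs, which matches the job completing correctly despite the garbled display.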