ExifTool Forum

ExifTool => Newbies => Topic started by: mazeckenrode on June 06, 2020, 10:45:37 PM

Title: Search & replace in tags, bump time for multiple files sequentially
Post by: mazeckenrode on June 06, 2020, 10:45:37 PM
I deal with a fair amount of scanned multi-page documents and publications, or portions thereof. Unless and until I come up with a better way of creating metadata for them that is relevant and useful for my purposes, or someone suggests one to me, I've developed my own standard (though apologies if anyone else started using this same workflow before me):

Let's say I'm adding metadata to individual page images of a 10-page document (most of these documents have had fewer pages, though a handful have had 20+), whose preparation was completed by 5 PM on May 31st. The image files will have names such as "2020-05-31 17;00;00 - Document - 01.png", "2020-05-31 17;00;00 - Document - 02.png", etc.

1. I set appropriate EXIF/IPTC/XMP metadata fields for any and all Comment, Description, and Subject fields in all files to a suitable, common ("master") text string that includes the substring "p 1/10". I then (up until now) manually modify those fields for pages 2–10 to reflect what page they actually are ("p 2/10", "p 3/10", etc.).

2. Along with other appropriate entries, I set any and all Keywords fields to a master semi-colon-separated string which includes the substring "Page1", then manually modify that substring for pages 2–10.

3. I set EXIF/XMP:DateTimeOriginal and IPTC:DateCreated in all files to "2020:05:31 17:00:00", then manually bump the seconds down successively in pages 1–9, so that their times are 16:59:51 for page 1, through 16:59:59 for page 9.

I'm sure there must be a way for me to do some fancy scripting, probably using regular expressions, have EXIFTool extract the correct page number from each filename, find/replace each "p 1/10" with "p #/10" (without leading 0) in the Comment/Description/Subject fields of pages 2–10, find/replace each "Page1" with "Page#" (also without leading 0) in the Keywords fields, and programmatically bump the seconds down as necessary for pages 1–9 using some kind of loop construct.

I'm currently using the stand-alone Windows executable of EXIFTool v11.95. I have some familiarity with regular expressions, mostly as it's used in Notepad++ and Directory Opus, a little bit with Python (via the Python Script plugin for Notepad++), but none with EXIFTool beyond having visited a few forum threads which deal with it to various degrees. I haven't managed to extract enough useful information from the threads to put me on solid footing for getting started. Any suggestions or pointers to better and/or more specific resources?
Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: mazeckenrode on June 08, 2020, 12:11:29 PM
Ok, so I'm trying to make sense of various examples of ExifTool usage with regular expressions, but much of what I'm seeing doesn't quite correlate with what I'm used to (from Notepad++, Directory Opus, etc.), and I can't find anything in the documentation to explain the things that are eluding me, or even about ExifTool's regex syntax and usage in general. For example:

https://exiftool.org/forum/index.php?topic=10200.msg57361#msg57361

In this post by StarGeek (on 30 Jan 2020), the following code is suggested, which uses regex to extract a substring from a filename and write it to a Comment tag:

-comment<${filename;m/\[([^]]+)\]/;$_=$1}

An informative breakdown is given, but it doesn't explain EVERYTHING, to me, anyway. In particular, I don't know what exactly ;m/ and /;$_=$1 do, and don't see anything like them in the documentation, though maybe I just don't know where to look. Searching the documentation for "regular expressions" or "regex" was fruitless.

Another example:

https://exiftool.org/forum/index.php?topic=10599.msg56159#msg56159

Posted by Phil on 15 Nov 2019, the following code is suggested, which extracts a date string from a filename and uses it to set DateTimeOriginal:

exiftool "-datetimeoriginal<${filename;s/WA.*//}000000" DIR

User xiftheflour then asks what ;s/ and .*// do, to which StarGeek replies that "s/WA.*// is a Regular Expression (RegEx) substitution. It substitutes what is matched between the first and second slashes with what is between the second and third slashes."

So the ;s/ is part of ExifTool's regex syntax for initiating such a substitution? Where can I find documentation which explains that, and other fine points of ExifTool's regex implementation?

StarGeek ends his post: "RegEx is a complex subject all by itself and there are plenty of tutorials on the web if you wish to learn more." But which tutorial(s) are specifically relevant to using it with ExifTool on the command line?

Looking at both (and other) examples, I'm thinking that the first and last / mark the boundaries of the regex matching expression code, right?

In Notepad++, I would have used this code to match the same thing:

(.*)WA.*

Anybody?
Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: StarGeek on June 08, 2020, 01:51:58 PM
Quote from: mazeckenrode on June 08, 2020, 12:11:29 PM
Ok, so I'm trying to make sense of various examples of ExifTool usage with regular expressions, but much of what I'm seeing doesn't quite correlate with what I'm used to (from Notepad++, Directory Opus, etc.)

The regex you want to search on would be Perl Regex.  For the most part, it should be the same as with Notepad++.  I can't really think of any differences offhand except Perl's lower/uppercase modifiers.  I often use the same regex in both exiftool and Notepad++.  You can also use sites like RegEx101.com (https://regex101.com/) to test things out.

QuoteIn this post by StarGeek (on 30 Jan 2020), the following code is suggested, which uses regex to extract a substring from a filename and write it to a Comment tag:

-comment<${filename;m/\[([^]]+)\]/;$_=$1}

An informative breakdown is given, but it doesn't explain EVERYTHING, to me, anyway. In particular, I don't know what exactly ;m/ and /;$_=$1 do, and don't see anything like them in the documentation, though maybe I just don't know where to look. Searching the documentation for "regular expressions" or "regex" was fruitless.

This is Perl code.  The m/<regex>/ is a Perl RegEx match expression.  It just checks for a match, possibly with capture groups.  In that example it captures the text between two brackets.  This gets saved to the $1 variable.  The semicolon is what Perl uses to indicate the end of a command.  The next command assigns the capture variable to Perl's default variable (https://perlmaven.com/the-default-variable-of-perl).  When used inside the braces of a tag name in exiftool, this default variable will start with the value of the tag.  So, altogether, this command takes the value of the Filename tag, does a RegEx match for the text between two brackets, captures that text, and then assigns it to the default variable, so the captured text replaces the Filename value in that expression and is what gets written to Comment.
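
For anyone more at home in Python, the same match-then-assign sequence can be sketched roughly as below. This is only an illustration, not ExifTool code: the function name extract_bracketed and the sample filename are made up, re.search stands in for Perl's m/.../, and the return value stands in for $_=$1. Returning None when nothing matches is a choice made for this sketch.

```python
import re


def extract_bracketed(filename):
    """Return the text between the first pair of square brackets, or None."""
    # Same pattern as in the post: a literal "[", then one or more
    # characters that are not "]" (captured), then a literal "]".
    match = re.search(r"\[([^]]+)\]", filename)
    # group(1) corresponds to Perl's $1.
    return match.group(1) if match else None


# Hypothetical filename, used only for illustration.
print(extract_bracketed("holiday [Paris 2019].jpg"))
```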

QuotePosted by Phil on 15 Nov 2019, the following code is suggested, which extracts a date string from a filename and uses it to set DateTimeOriginal:

exiftool "-datetimeoriginal<${filename;s/WA.*//}000000" DIR

User xiftheflour then asks what ;s/ and .*// do, to which StarGeek replies that "s/WA.*// is a Regular Expression (RegEx) substitution. It substitutes what is matched between the first and second slashes with what is between the second and third slashes."

So the ;s/ is part of ExifTool's regex syntax for initiating such a substitution? Where can I find documentation which explains that, and other fine points of ExifTool's regex implementation?

Again, this is Perl.  In this case a substitution instead of a simple match.

QuoteStarGeek ends his post: "RegEx is a complex subject all by itself and there are plenty of tutorials on the web if you wish to learn more." But which tutorial(s) are specifically relevant to using it with ExifTool on the command line?

Any Perl tutorial would be relevant.  I learned the basics of RegEx through Regular-Expressions.info (https://www.regular-expressions.info/) which isn't specifically about Perl, but does mention it frequently.  perldoc.perl.org is the documentation for Perl, but I find it is more technical and less of a tutorial.  I learned the most from PerlMaven.com and lots of copy/pasting from various StackExchange questions.

QuoteLooking at both (and other) examples, I'm thinking that the first and last / mark the boundaries of the regex matching expression code, right?
Yes.  If you look at RegEx testing sites like RegEx101 and RegExr.com, they even have the slashes as part of the input fields.  See this tutorial (https://riptutorial.com/regex/example/15849/-delimiters-) (only just found it).

QuoteIn Notepad++, I would have used this code to match the same thing:
(.*)WA.*

You could use that, but then you have to add $1 in the Replace With box.  The original WA.* could be directly placed in the Notepad++ Find What box and the Replace With box would be left blank.
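
To make that equivalence concrete, here is a small Python sketch. The filename is an assumed example, not taken from the linked thread. The first call deletes the match outright, like s/WA.*//, and the second captures what precedes "WA" and puts only that back, like (.*)WA.* with the capture group in the Replace With box.

```python
import re

# Hypothetical filename, used only for illustration.
name = "IMG-20191115-WA0001.jpg"

# Like Perl's s/WA.*// : remove "WA" and everything after it.
direct = re.sub(r"WA.*", "", name)

# Like (.*)WA.* with the first capture group as the replacement.
captured = re.sub(r"(.*)WA.*", r"\1", name)

print(direct)
print(captured)
print(direct == captured)
```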
Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: mazeckenrode on June 08, 2020, 03:33:20 PM
Quote from: StarGeek on June 08, 2020, 01:51:58 PMThe regex you want to search on would be Perl Regex. For the most part, it should be the same as with Notepad++.

Thanks much for all the detailed info. I guess my use of regex is limited enough that once I learned one way of doing things in it, I skipped past any alternate methods. All the regex I've done has been with \1 etc. for capture variables, and I've never had reason to know about or use ;m/ or ;s/. I think I now have enough to actually make some headway — though don't be surprised if I need to check in again for something!
Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: mazeckenrode on June 10, 2020, 02:39:33 PM
Some progress, and not, with this project so far...

Successfully used this to set several date/time tags from part of filename:

exiftool "-XMP:DateTimeOriginal<${filename;s/ - Document - *//}" "-EXIF:DateTimeOriginal<${filename;s/ - Document - *//}" "-IPTC:DateCreated<${filename;s/ - Document - *//}" .

Will probably make a shortcut for all three tags in config soon.

Based on examples seen at https://exiftool.org/forum/index.php?topic=11183.0 , tried using the following code to replace 1, in substring p 1/10, with 2:

exiftool -tagsfromfile @ -XMP:UserComment -EXIF:XPComment -api "filter=s/p \d+\/10\..*/2/g" .

...but getting warning: [minor] Fixed incorrect list type for XMP-exif:UserComment.

Note that I am using Directory Opus to initially set any metadata in these files, then intend to use ExifTool to selectively modify them. The reason I do it that way is because I want the metadata to be visible in Directory Opus, but DOpus expects it to be in other than the officially specified location in PNGs, as outlined in https://exiftool.org/forum/index.php?topic=9262.0 , and ExifTool will reportedly update it in the same location. When I place content in DOpus' metadata field named Comment, then use ExifTool to see what went where, it tells me that the same text went to both XMP:UserComment and EXIF:XPComment.

ExifTool's page for XMP tags ( https://exiftool.org/TagNames/XMP.html ) says that XMP:UserComment is a lang-alt field. EXIF tags page ( https://exiftool.org/TagNames/EXIF.html ) states EXIF:XPComment is an int8u field, but don't see an explanation of it there. Unclear how either of those figure into the warning above.

Anyway, I'm trying to cobble together a string of code that will replace the page number in either/all Comment tags (within substring p #/#) with the page number extracted from the filename (using regex). I've found examples of setting a tag's value to something extracted from filename or other tag via regex, and of setting tag's value via -tagsfromfile, but I'm not finding any examples similar enough to what I want to do, and am not looking to get any tag data from a file other than the one being operated on. Not possible?
Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: StarGeek on June 10, 2020, 05:41:41 PM
Quote from: mazeckenrode on June 10, 2020, 02:39:33 PM
exiftool -tagsfromfile @ -XMP:UserComment -EXIF:XPComment -api "filter=s/p \d+\/10\..*/2/g" .

...but getting warning: [minor] Fixed incorrect list type for XMP-exif:UserComment.

Minor warning which you can ignore.  That tag was written incorrectly.  Exiftool fixed.

QuoteExifTool's page for XMP tags ( https://exiftool.org/TagNames/XMP.html ) says that XMP:UserComment is a lang-alt field. EXIF tags page ( https://exiftool.org/TagNames/EXIF.html ) states EXIF:XPComment is an int8u field, but don't see an explanation of it there. Unclear how either of those figure into the warning above.

It should be pointed out that XMP:UserComment is the XMP version of the EXIF:UserComment.  XPComment is the Microsoft version of EXIF:UserComment.  It's not common for software to read XPComment, though there is some.

QuoteAnyway, I'm trying to cobble together a string of code that will replace the page number in either/all Comment tags (within substring p #/#) with the page number extracted from the filename (using regex). I've found examples of setting a tag's value to something extracted from filename or other tag via regex, and of setting tag's value via -tagsfromfile, but I'm not finding any examples similar enough to what I want to do, and am not looking to get any tag data from a file other than the one being operated on. Not possible?

I'm not quite following.  Can you give an example?
Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: mazeckenrode on June 10, 2020, 08:49:14 PM
Quote from: StarGeek on June 10, 2020, 05:41:41 PMMinor warning which you can ignore. That tag was written incorrectly. Exiftool fixed.

Actually, it did not update the tags, that I recall (not at that computer just now, but I'll verify later), according to new JSONs I vaguely remember exporting following the operation. At least, the previous values were still being displayed by Directory Opus.

QuoteIt should be pointed out that XMP:UserComment is the XMP version of the EXIF:UserComment. XPComment is the Microsoft version of EXIF:UserComment.

This is a rhetorical question unless you happen to know the answer, but why have a Microsoft version that is also an EXIF tag?

Also, can you tell me what int8u means? I'm guessing it's something to do with Unicode. Any idea why I got the warning for exiftool -tagsfromfile @ -XMP:UserComment -EXIF:XPComment -api "filter=s/p \d+\/10\..*/2/g" .?

QuoteI'm not quite following. Can you give an example?

In my first post in this thread, I gave an example of 10 document page scans named "2020-05-31 17;00;00 - Document - 01.png", "2020-05-31 17;00;00 - Document - 02.png", etc. I would first set the relevant Comment, Description, and Subject tags to something like Document XYZ, prepared by John Q Public, 31 May 2020, 17:00:00, p 1/10, using Directory Opus. I would like to then be able to run an ExifTool command line/batch/script which replaces p 1/10 in pages 2–10 with p 2/10...p 10/10.

I'd also like to have my procedure shift the seconds for EXIF/XMP:DateTimeOriginal and IPTC:DateCreated in pages 1–9 so that each page is tagged as one second later than the previous page, with the final result as below:

Page 1: 16:59:01
Page 2: 16:59:02
...
Page 10: 17:00:00

The correct number of seconds to shift down for each page could be calculated from <numberofpages> - <pagenumber>, and I assume I'd need some kind of script to accomplish it in a loop. I just started playing with Python, but all my effort with it so far has been limited to regex in Notepad++. Any reason I couldn't do what I need with Python? I'm aware that ExifTool itself is Perl (I'm using the Windows executable version). I'm open to recommendations, but wish to minimize how many different script languages I need to familiarize myself with. Ultimately, I want to be able to launch it from within Directory Opus.
Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: Phil Harvey on June 10, 2020, 09:09:57 PM
Quote from: mazeckenrode on June 10, 2020, 08:49:14 PM
Also, can you tell me what int8u means?

Unsigned 8-bit integer.

QuoteAny idea why I got the warning for exiftool -tagsfromfile @ -XMP:UserComment -EXIF:XPComment -api "filter=s/p \d+\/10\..*/2/g" .?

The UserComment tag should be a lang-alt list in XMP.  If ExifTool warns about an incorrect list type, then it wasn't a lang-alt list as it should have been in the original file.

- Phil
Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: StarGeek on June 11, 2020, 11:45:31 AM
Quote from: mazeckenrode on June 10, 2020, 08:49:14 PM
This is a rhetorical question unless you happen to know the answer, but why have a Microsoft version that is also an EXIF tag?

Back in the XP days (I think), Microsoft made a few tags, XPAuthor, XPComment, XPKeywords, XPSubject, and XPTitle, even though there were already comparable tags in other official specs. I don't know why and your question got me curious.  But I couldn't find any information in my google searches.  In fact, most search results on these tags ended up mentioning exiftool in some way.

QuoteI would like to then be able to run an ExifTool command line/batch/script which replaces p 1/10 in pages 2–10 with p 2/10...p 10/10.

Ah, ok, a bit long and messy because you have to split each tag and insert the results of Filename in between, but something like
"-UserComment<${Usercomment;s/\d+\/\d+$//}${Filename;m/0*(\d+)\./;$_=$1}/${UserComment;m/(\d+$)/;$_=$1}"
or
"-UserComment<${Usercomment;my $temp=$1 if $self->GetValue('FileName')=~m/0*(\d+)\./;s/\d+(\/\d+$)/$temp$1/}"
Repeat for each tag, replacing UserComment with the other tag names.  Test carefully, any variation from the format you listed (extra spaces for example) may end up with unexpected results.
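
The logic of the second command can be sketched in Python to make it easier to experiment with before touching real files. This is an illustration under assumptions, not a replacement for the ExifTool command: the function names are hypothetical, and the comment is assumed to end with "p N/M" exactly as in the example.

```python
import re


def page_from_filename(filename):
    """Page number just before the extension, without leading zeros."""
    # Same pattern as in the command: optional zeros, digits, literal dot.
    match = re.search(r"0*(\d+)\.", filename)
    return match.group(1) if match else None


def renumber_comment(comment, filename):
    """Replace N in a trailing 'N/M' with the page number from the filename."""
    page = page_from_filename(filename)
    if page is None:
        return comment  # sketch's choice: leave the comment untouched
    # \d+ is the old page number; (/\d+$) keeps "/total" at the very end.
    # A function is used as the replacement so digits in `page` cannot be
    # misread as part of a group reference.
    return re.sub(r"\d+(/\d+$)", lambda m: page + m.group(1), comment, count=1)


comment = "Document XYZ, prepared by John Q Public, 31 May 2020, 17:00:00, p 1/10"
print(renumber_comment(comment, "2020-05-31 17;00;00 - Document - 03.png"))
```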

QuoteI'd also like to have my procedure shift the seconds for EXIF/XMP:DateTimeOriginal and IPTC:DateCreated in pages 1–9 so that each page is tagged as one second later than the previous page, with the final result as below:

IPTC:DateCreated doesn't have a time component, so I assume you mean IPTC:TimeCreated, unless you mean XMP:DateCreated, which is part of the IPTC Core schema and does have a time component.  That could be done with
"-DateTimeOriginal+<+<0:0:${Filename;m/0*(\d+)\./;$_=$1}"

QuotePage 1: 16:59:01
Page 2: 16:59:02
...
Page 10: 17:00:00

Don't you mean "Page 10: 16:59:10" ?  Because otherwise I don't understand your math here.  Plus one second except for the last page where it's +1 minute?  What if there are more than 60 pages?

Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: mazeckenrode on June 11, 2020, 09:29:08 PM
Quote from: Phil Harvey on June 10, 2020, 09:09:57 PMUnsigned 8-bit integer.

EXIF:XPComment is supposed to be an unsigned 8-bit integer, but is the Microsoft version of EXIF:UserComment, which is undefined (presumably a string)? What would an integer placed in EXIF:XPComment represent?

QuoteThe UserComment tag should be a lang-alt list in XMP.

It seems that some tags' names are hardly descriptive of their intended purposes. Anyway, if Directory Opus is inserting the same value into both XMP:UserComment and EXIF:XPComment, but leaving EXIF:UserComment alone, is that an example of DOpus' metadata library not keeping up with current specs, or abusing its privileges?

I don't suppose it's possible to force ExifTool to write XMP:UserComment as a string instead of a lang-alt list? Reason being, I want to keep DOpus' mouse-hover displaying of metadata consistent, and I'm not sure having XMP:UserComment different from EXIF:XPComment won't affect it. I'll test that. Stay tuned.
Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: mazeckenrode on June 11, 2020, 09:31:41 PM
Quote from: StarGeek on June 11, 2020, 11:45:31 AM
a bit long and messy because you have to split each tag and insert the results of Filename in between, but something like

Ok, thanks much. These both worked for me, with EXIF:XPComment substituted for UserComment, as far as updating that one tag. I'm going to need to tweak the code for some cases where I have additional text after p #/#, as it did not work for that.

I did not try using similar code to update XMP:UserComment due to it being intended as a lang-alt list, and thankfully found that Directory Opus successfully displayed the new value for EXIF:XPComment. So far, so good.

Curiously, I found that both command lines resulted in some additional EXIF and IPTC tags being created in all files:


EXIF:XResolution
EXIF:YResolution
EXIF:ResolutionUnit
EXIF:YCbCrPositioning
IPTC:Caption-Abstract


Is this supposed to happen? If so, Why?

Another curious thing had happened when I was creating the PNG files to be experimented upon with these ExifTool instructions. I started by using DOpus to create metadata, in the form of <dopus field name>, in all fields that I would typically use for these scans. For example, in the field that DOpus calls Author, I put <author> (including the angle brackets as shown). I then used ExifTool to export all tags to JSON, so I could see which actual tag names were used for what data. The JSON files showed a handful of the EXIF tag values preceded by a question mark:


"EXIF:Artist": "?<author>",
"EXIF:Copyright": "?<copyright>",
"EXIF:Artist": "?<author>",
"EXIF:Copyright": "?<copyright>",
"EXIF:ImageDescription": "?<description>",
"EXIF:Make": "?<camera make>",
"EXIF:Model": "?<camera model>",
"EXIF:Software": "?<creation software>",


The corresponding XMP tags, also created by DOpus, did NOT have a question mark.

I then used DOpus again, to set the tags values I wanted to be operated on by my ExifTool code experiments (Document Test, prepared by MAZE, 31 May 2020, 17:00:00, p 1/10; Digitization by MAZE, 11 Jun 2020, later truncated to Document Test, prepared by MAZE, 31 May 2020, 17:00:00, p 1/10, in applicable comment tags). No question marks appeared in any tags exported from these updated PNGs, not even in the tags I hadn't changed since the earlier JSON exports. Anybody have an explanation?

QuoteIPTC:DateCreated doesn't have a time component, so I assume you mean IPTC:TimeCreated

I did see, in the page for IPTC tags ( https://exiftool.org/TagNames/IPTC.html ), that there are separate tags for time and date, but DOpus apparently sets both (as well as EXIF:DateTimeOriginal and XMP:DateTimeOriginal) with the same stroke, and ExifTool reports IPTC:DateCreated with both date and time values in the exported JSONs. If I can't set them both together using that one tag name, why are they exported together like that? Aren't I supposed to be able to import all writeable tags from these JSON files?

QuoteThat could be done with "-DateTimeOriginal+<+<0:0:${Filename;m/0*(\d+)\./;$_=$1}"

I tried:

exiftool "-EXIF:DateTimeOriginal+<+<0:0:${Filename;m/0*(\d+)\./;$_=$1}" "-XMP:DateTimeOriginal+<+<0:0:${Filename;m/0*(\d+)\./;$_=$1}" "-IPTC:TimeCreated+<+<0:0:${Filename;m/0*(\d+)\./;$_=$1}" .

...but that gave me:

Warning: No writable tags set from...

QuoteDon't you mean "Page 10: 16:59:10"?

Brain fart, my bad. :-)  Should have been:

Page 1: 16:59:51
Page 2: 16:59:52
...
Page 10: 17:00:00
Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: Phil Harvey on June 11, 2020, 09:57:44 PM
Quote from: mazeckenrode on June 11, 2020, 09:29:08 PM
EXIF:XPComment is supposed to be an unsigned 8-bit integer, but is the Microsoft version of EXIF:UserComment, which is undefined (presumably a string)? What would an integer placed in EXIF:XPComment represent?

The format for this tag is completely meaningless.  It is actually stored as UCS2-LE, which has 16-bit characters.  But of course, everything can be represented as a string of bytes.  Don't ask me why Microsoft uses an int8u format code.  In my opinion their programmers must have been on drugs when they made this decision.
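
A short Python sketch of the "string of bytes" point: text encoded with 16-bit little-endian characters can be viewed as a list of unsigned 8-bit values. One assumption here is that Python's utf-16-le codec is used as a stand-in for UCS2-LE, which gives the same bytes for plain characters like these. The sample text is just the page marker used earlier in the thread.

```python
text = "p 1/10"

# Each of these characters takes two bytes in 16-bit little-endian form.
raw = text.encode("utf-16-le")

# The same data viewed as unsigned 8-bit integers (what int8u describes).
as_int8u = list(raw)

print(len(text), len(raw))
print(as_int8u)
print(raw.decode("utf-16-le") == text)
```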

QuoteI don't suppose it's possible to force ExifTool to write XMP:UserComment as a string instead of a lang-alt list?

You could do this by overriding the tag with a user-defined tag.  But then this would violate the XMP specification, which could cause problems down the line.

- Phil
Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: StarGeek on June 12, 2020, 12:04:55 AM
Quote from: mazeckenrode on June 10, 2020, 08:49:14 PM
Quote from: StarGeek on June 10, 2020, 05:41:41 PMMinor warning which you can ignore. That tag was written incorrectly. Exiftool fixed.

Actually, it did not update the tags, that I recall (not at that computer just now, but I'll verify later), according to new JSONs I vaguely remember exporting following the operation. At least, the previous values were still being displayed by Directory Opus.

It wouldn't be something you would see unless you viewed the raw XMP.  Whatever originally wrote the tag did so incorrectly and Exiftool fixed it.  As an example there is this thread (https://exiftool.org/forum/index.php?topic=7324.0) about Microsoft XMP.  At some point they decided to change the URI.  The difference is xmlns:MicrosoftPhoto="http://ns.microsoft.com/photo/1.0/" vs. xmlns:MicrosoftPhoto="http://ns.microsoft.com/photo/1.0".  Just a single trailing slash.

Quote
QuoteThe UserComment tag should be a lang-alt list in XMP.

It seems that some tags' names are hardly descriptive of their intended purposes. Anyway, if Directory Opus is inserting the same value into both XMP:UserComment and EXIF:XPComment, but leaving EXIF:UserComment alone, is that an example of DOpus' metadata library not keeping up with current specs, or abusing its privileges?

They're not abusing anything, it's just what they decided to program.  Understand, UserComment isn't really part of any modern standard to begin with.  It's sort of a weird outlier tag, IMO.  The entirety of its description in the EXIF spec is just "Comments from user".  XMP:Description/IPTC:Caption-Abstract are the more commonly used tags and part of the IPTC Photo Metadata Spec (https://www.iptc.org/std/photometadata/specification/IPTC-PhotoMetadata).

QuoteI don't suppose it's possible to force ExifTool to write XMP:UserComment as a string instead of a lang-alt list? Reason being, I want to keep DOpus' mouse-hover displaying of metadata consistent, and I'm not sure having XMP:UserComment different from EXIF:XPComment won't affect it. I'll test that. Stay tuned.
...
QuoteI did not try using similar code to update XMP:UserComment due to it being intended as a lang-alt list,

It's really not something you need to worry about.  A lot of the XMP tags are "lang-alt".  But unless you are actually adding alternate language text, it's basically just a string that gets returned.

As an example, if you set Description to say "Picture of a cow", then that is what most software will return.  But if you knew that someone in Italy would need to read it in Italian, you could add (runs to google translate) -Description-it="Immagine di una mucca". "Picture of a cow" is the default and is what will be returned unless the Italian version was asked for.

Quote from: mazeckenrode on June 11, 2020, 09:31:41 PMCuriously, I found that both command lines resulted in some additional EXIF and IPTC tags being created in all files:
EXIF:XResolution
EXIF:YResolution
EXIF:ResolutionUnit
EXIF:YCbCrPositioning
IPTC:Caption-Abstract

Is this supposed to happen? If so, Why?

The EXIF tags are mandatory according to the EXIF spec (see the EXIF tag page (https://exiftool.org/TagNames/EXIF.html)).  Caption-Abstract should not be written unless it is specifically called for or MWG tags are used.

QuoteThe JSON files showed a handful of the EXIF tag values preceded by a question mark:
...
No question marks appeared in any tags exported from these updated PNGs, not even in the tags I hadn't changed since the earlier JSON exports. Anybody have an explanation?

No idea without actually seeing a file and figuring out what the actual character was (might not have been an ASCII question mark), but it sounds like Opus is writing something weird.

QuoteI did see, in the page for IPTC tags ( https://exiftool.org/TagNames/IPTC.html ), that there are separate tags for time and date, but DOpus apparently sets both (as well as EXIF:DateTimeOriginal and XMP:DateTimeOriginal) with the same stroke, and ExifTool reports IPTC:DateCreated with both date and time values in the exported JSONs. If I can't set them both together using that one tag name, why are they exported together like that? Aren't I supposed to be able to import all writeable tags from these JSON files?

Again, I'd have to see a file to correctly figure it out, but off hand it sounds like Opus is incorrectly writing the data. 

QuoteThat could be done with "-DateTimeOriginal+<+<0:0:${Filename;m/0*(\d+)\./;$_=$1}"

Oops, my mistake.  I doubled up on the +< due to a copy/paste error.  This should work:
"-DateTimeOriginal+<0:0:${Filename;m/0*(\d+)\./;$_=$1}"
Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: mazeckenrode on June 12, 2020, 09:17:20 PM
Quote from: Phil Harvey on June 11, 2020, 09:57:44 PM
Quote from: mazeckenrode on June 11, 2020, 09:29:08 PM
I don't suppose it's possible to force ExifTool to write XMP:UserComment as a string instead of a lang-alt list?

You could do this by overriding the tag with a user-defined tag. But then this would violate the XMP specification, which could cause problems down the line.

Noted, thanks. I'll give it a whirl some time, at least to see if I can make it work.
Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: mazeckenrode on June 12, 2020, 09:25:04 PM
Quote from: StarGeek on June 12, 2020, 12:04:55 AM
The EXIF tags are mandatory according to the EXIF spec (see the EXIF tag page (https://exiftool.org/TagNames/EXIF.html)). Caption-Abstract should not be written unless it is specifically called for or MWG tags are used.

I don't really care that much that they got added, but remain curious as to why IPTC:Caption-Abstract appeared when my command line code didn't call for it.

Quote
QuoteThe JSON files showed a handful of the EXIF tag values preceded by a question mark:

No idea without actually seeing a file and figuring out what the actual character was (might not have been an ASCII question mark), but it sounds like Opus is writing something weird.

Well, I haven't been able to reproduce it. One of life's mysteries, I guess.

Quote
QuoteThat could be done with "-DateTimeOriginal+<+<0:0:${Filename;m/0*(\d+)\./;$_=$1}"

Oops, my mistake.  doubled up on the +< due to copy/paste error. This should work
"-DateTimeOriginal+<0:0:${Filename;m/0*(\d+)\./;$_=$1}"

Didn't notice until I tried it, but that adds seconds instead of subtracting the total of <numberofpages>-<pagenumber>. Tag-operations (https://exiftool.org/exiftool_pod.html#Tag-operations) says:

"When shifting a value, the shift is applied to the original value of the tag, overriding any other values previously assigned to the tag on the same command line."

...so it sounds like I can't do something like
exiftool "-EXIF:DateTimeOriginal+<0:0:${Filename;m/0*(\d+)\./;$_=$1}" "-EXIF:DateTimeOriginal-=0:0:10"
and expect both halves of that to have an effect. Is there a way to perform the necessary calculation within the single command line code, or am I going to have to issue two command lines for each tag I want to shift? (Until I manage to get a script together which performs the calculation, then feeds the correct number for subtraction to ExifTool.) I'm guessing I can't do the former, and haven't found any examples of that sort of thing so far, anyway.

Also, trying but failing to adequately reverse engineer your previously suggested code, modified by me as follows:

exiftool "-EXIF:XPComment<${EXIF:XPComment;my $temp=$1 if $self->GetValue('FileName')=~m/0*(\d+)\./;s/\d+(\/\d+$)/$temp$1/}" .

...in order to make it match even when there is additional text after p #/#. I don't understand everything going on in your code, such as the \.. In the regex I know, that's an escaped period after the capture group, otherwise known as a literal period. But there is no period in my example comment field Document Test, prepared by MAZE, 31 May 2020, 17:00:00, p 1/10; Digitization by MAZE, 11 Jun 2020, so what does that do, and how to match with or without additional text? Sorry to be so dense, I even tried consulting the regex web pages you suggested in a previous post, but probably not seeing the forest for the trees.
Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: StarGeek on June 14, 2020, 11:15:27 AM
Quote from: mazeckenrode on June 12, 2020, 09:25:04 PM
Didn't notice until I tried it, but that adds seconds instead of subtracting the total of <numberofpages>-<pagenumber>.

Ahh, I misunderstood you. I was still going off your earlier post which looked like you were just adding seconds (Page 1: 16:59:01 Page 2: 16:59:02). That is more complicated.

Just to get things right, the sequence is this?
1. Grab the page number from the file name
2. Grab the total number of pages from XPComment
3. Subtract page number from total number of pages
4. Subtract that result from the time stamp.

QuoteAlso, trying but failing to adequately reverse engineer your previously suggested code, modified by me as follows:
exiftool "-EXIF:XPComment<${EXIF:XPComment;my $temp=$1 if $self->GetValue('FileName')=~m/0*(\d+)\./;s/\d+(\/\d+$)/$temp$1/}" .
...in order to make it match even when there is additional text after p #/#

Change
s/\d+(\/\d+$)/$temp$1/
into
s/\d+(\/\d+)/$temp$1/

The dollar sign anchored the expression to the end of the string.  Removing it allows the ##/## part to be anywhere.  But you then need to be careful that the ##/## pattern doesn't appear anywhere else in the string.
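
As an illustration of the anchoring point above, here is a minimal Python sketch (not ExifTool or Perl itself) of the same two substitutions. The `page` value is a hypothetical stand-in for the number taken from the filename. `count=1` is used because a Perl `s///` without the `/g` modifier replaces only the first match, whereas Python's `re.sub` replaces every match by default.

```python
import re

# Example comment from the thread, with extra text after "p #/#".
comment = ("Document Test, prepared by MAZE, 31 May 2020, 17:00:00, "
           "p 1/10; Digitization by MAZE, 11 Jun 2020")
page = "2"  # hypothetical page number taken from the filename

# Anchored with $: "#/#" must sit at the very end of the string,
# so this comment is left unchanged.
anchored = re.sub(r"\d+(/\d+$)", lambda m: page + m.group(1), comment, count=1)

# Unanchored: the first "#/#" anywhere in the string is rewritten.
unanchored = re.sub(r"\d+(/\d+)", lambda m: page + m.group(1), comment, count=1)

# A second "#/#" later in the string is untouched because only the
# first match is replaced.
extra = re.sub(r"\d+(/\d+)", lambda m: page + m.group(1),
               "p 1/10 junk junk 5/50 junk", count=1)

print(anchored)
print(unanchored)
print(extra)
```

Only a `#/#` pattern that appears before the page counter would cause trouble in this sketch.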

QuoteI don't understand everything going on in your code, such as the \.. In the regex I know, that's an escaped period after the capture group, otherwise known as a literal period.

It's matching the dot in the filename.  Semicolons are how Perl indicates end of commands so take the whole section between the semicolons.
my $temp=$1 if $self->GetValue('FileName')=~m/0*(\d+)\./
my $temp creates a temporary variable.  It will be set to the value of $1, the capture group from the following regex if there is a match.  $self->GetValue('FileName') gets the value of the FileName while doing processing inside the XPComment tag.  The regex then matches the last digits before the DotExtension of the filename.
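
The filename match described above can be tried outside ExifTool. This is a rough Python sketch of the same regex; `page_from_filename` is a made-up helper name for illustration, not anything ExifTool provides.

```python
import re

def page_from_filename(filename):
    """Return the digits just before the first dot, without leading zeros.

    Mirrors m/0*(\\d+)\\./ : 0* swallows leading zeros, (\\d+) captures
    the remaining digits, and \\. requires a literal dot right after them.
    """
    m = re.search(r"0*(\d+)\.", filename)
    return m.group(1) if m else None

print(page_from_filename("2020-05-31 17;00;00 - Document - 01.png"))
print(page_from_filename("2020-05-31 17;00;00 - Document - 10.png"))
```

The earlier digit runs in the name (the date and time) are skipped because none of them is followed directly by a dot.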

Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: mazeckenrode on June 14, 2020, 09:00:46 PM
Quote from: StarGeek on June 14, 2020, 11:15:27 AM
I was still going off your earlier post which looked like you were just adding seconds (Page 1: 16:59:01 Page 2: 16:59:02).

Well, not really. The 'base' time, from which the others were to be calculated, was 17:00:00.

QuoteThat is more complicated.

Figured that.  :-/  Though not impossible via ExifTool alone? I'm surprised, but hopeful, if that's the case.

Quote
Just to get things right, the sequence is this?
1. Grab the page number from the file name
2. Grab the total number of pages from XPComment
3. Subtract page number from total number of pages
4. Subtract that result from the time stamp.

Yes, that's correct.

QuoteThe dollar sign anchored the expression to the end of the string.

Doh! I should've been able to figure that one out myself. All these dollar signs keep throwing me off, since they don't all mean the same thing.

QuoteBut you then need to be careful that the ##/## pattern doesn't appear anywhere else in the string

Surely there's a way to make it match the first one only? (Though I don't anticipate needing to have that pattern more than once in a filename.) Pretty sure I did that in Notepad++ regex at some point before. I don't think it's something I have in a script, though, which are all fairly recent.

QuoteIt's matching the dot in the filename.

Double-doh! I REALLY should have seen that one! I'm really embarrassing myself here, aren't I?
Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: StarGeek on June 18, 2020, 01:22:48 PM
Sorry for the delay in getting back to you, life gets in the way sometimes.

Try this, replacing TAG with the tag that has the page number data.  You can repeat the first part for each tag you need to edit
exiftool "-TAG<${TAG;my $temp=$1 if $self->GetValue('FileName')=~m/0*(\d+)\./;s/\d+(\/\d+)/$temp$1/}" "-DateTimeOriginal-<0:0:${TAG;my $temp=$1 if $self->GetValue('FileName')=~m/0*(\d+)\./;m/\d+\/(\d+)/;$_=$1-$temp}" /path/to/files/

Example output.  I used Caption-Abstract as a tag that had extra stuff at the end, including another #/# pattern.  Description has the original format.
C:\>exiftool -s -Caption-Abstract -Description -DateTimeOriginal Y:\!temp\bbbb\2020*
======== Y:/!temp/bbbb/2020-05-31 17;00;00 - Document - 01.jpg
Caption-Abstract                : Document XYZ, prepared by John Q Public, 31 May 2020, 17:00:00, p 1/10 junk junk 5/50 junk
Description                     : Document XYZ, prepared by John Q Public, 31 May 2020, 17:00:00, p 1/10
DateTimeOriginal                : 2020:05:31 17:00:00
======== Y:/!temp/bbbb/2020-05-31 17;00;00 - Document - 02.jpg
Caption-Abstract                : Document XYZ, prepared by John Q Public, 31 May 2020, 17:00:00, p 1/10 junk junk 5/50 junk
Description                     : Document XYZ, prepared by John Q Public, 31 May 2020, 17:00:00, p 1/10
DateTimeOriginal                : 2020:05:31 17:00:00
    2 image files read

C:\>exiftool -P -overwrite_original Y:\!temp\bbbb\2020* "-Description<${Description;my $temp=$1 if $self->GetValue('FileName')=~m/0*(\d+)\./;s/\d+(\/\d+)/$temp$1/}" "-Caption-Abstract<${Caption-Abstract;my $temp=$1 if $self->GetValue('FileName')=~m/0*(\d+)\./;s/\d+(\/\d+)/$temp$1/}" "-DateTimeOriginal-<0:0:${Description;my $temp=$1 if $self->GetValue('FileName')=~m/0*(\d+)\./;m/\d+\/(\d+)/;$_=$1-$temp}"
    2 image files updated

C:\>exiftool -s -Caption-Abstract -Description -DateTimeOriginal Y:\!temp\bbbb\2020*
======== Y:/!temp/bbbb/2020-05-31 17;00;00 - Document - 01.jpg
Caption-Abstract                : Document XYZ, prepared by John Q Public, 31 May 2020, 17:00:00, p 1/10 junk junk 5/50 junk
Description                     : Document XYZ, prepared by John Q Public, 31 May 2020, 17:00:00, p 1/10
DateTimeOriginal                : 2020:05:31 16:59:51
======== Y:/!temp/bbbb/2020-05-31 17;00;00 - Document - 02.jpg
Caption-Abstract                : Document XYZ, prepared by John Q Public, 31 May 2020, 17:00:00, p 2/10 junk junk 5/50 junk
Description                     : Document XYZ, prepared by John Q Public, 31 May 2020, 17:00:00, p 2/10
DateTimeOriginal                : 2020:05:31 16:59:52
    2 image files read
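
The arithmetic behind the DateTimeOriginal shift in the example above can be sketched in Python as follows. This is only an illustration of the four-step sequence (page from filename, total from the tag, difference, subtract from the timestamp); `shifted_time` is a hypothetical helper name, and ExifTool does the actual shifting itself via `-DateTimeOriginal-<0:0:...`.

```python
import re
from datetime import datetime, timedelta

def shifted_time(filename, description, base):
    """Subtract (total pages - page number) seconds from the base time."""
    # Step 1: page number from the filename, leading zeros dropped.
    page = int(re.search(r"0*(\d+)\.", filename).group(1))
    # Step 2: total page count from the first "#/#" in the tag value.
    total = int(re.search(r"\d+/(\d+)", description).group(1))
    # Steps 3 and 4: the difference is the number of seconds to subtract.
    return base - timedelta(seconds=total - page)

base = datetime(2020, 5, 31, 17, 0, 0)
desc = "Document XYZ, prepared by John Q Public, 31 May 2020, 17:00:00, p 1/10"

for name in ("2020-05-31 17;00;00 - Document - 01.jpg",
             "2020-05-31 17;00;00 - Document - 02.jpg"):
    print(name, shifted_time(name, desc, base).strftime("%Y:%m:%d %H:%M:%S"))
```

With a 10-page document, page 1 comes out 9 seconds before the base time and page 10 keeps the base time unchanged.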
Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: mazeckenrode on June 19, 2020, 10:33:17 PM
Quote from: StarGeek on June 18, 2020, 01:22:48 PM
Sorry for the delay in getting back to you, life gets in the way sometimes.

Hey, no apology necessary. I really appreciate your help.

Quote
Try this, replacing TAG with the tag that has the page number data.

That worked like a charm in my test. Thank you! With some time and work on my part, I hope to eventually understand what all the parts of it (the ones I don't already understand) do, so I can apply them on my own.

If I haven't already worn out my welcome, I'd like to bug you (or somebody more knowledgeable than me) for something else related to this thread. I previously learned here that Directory Opus was writing EXIF/XMP:DateTimeOriginal and IPTC:DateCreated in the same operation when applied from date/time values in the DOpus Set Metadata panel's Date taken field, but that IPTC:DateCreated is officially intended for date only, and IPTC:TimeCreated should be used for any time component.

In the interest of maintaining maximum consistency and compatibility between what I have DOpus and ExifTool doing to the same sets of files, and also recognizing that I will sometimes need to amend the date and time I assign to some files, I'd like to be able to parse and capture both components from my filenames (which I would amend first) and use them to update/write all four metadata fields. It seems to be easy enough to do for EXIF/XMP:DateTimeOriginal, but for IPTC:DateCreated and IPTC:TimeCreated, I'm going to have to separate the date and time. On top of that, I need to factor in the fact that I will sometimes be dealing with files (generally those provided by others) using variations of my chosen naming convention ("YYYY-MM-DD HH;MM;SS - <descriptive name> - <page>"). I hope to encompass date/time expressions utilizing any of [-_;., ] (and maybe some I haven't thought of yet), or none of them, in any combination as separators. In Notepad++, I would use this to match and capture:

(\d{4})[-_;\., ]*(\d{2})[-_;\., ]*(\d{2})[-_;\., ]*(\d{2})[-_;\., ]*(\d{2})[-_;\., ]*(\d{2})

If there's a smarter way to do this in ExifTool, I'm open, but what I'm mostly concerned with is how to apply the multiple capture groups from that. I tried searching the forums here for examples of multiple capture groups being used/applied, but have come up empty handed so far (I tried searching for $2, but that only returned a pageful of topic threads with the number 2 in their respective titles). I still don't have my head wrapped around the various uses of $, having now seen $1, $temp=$1, $self, $temp$1, $_, etc.
Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: StarGeek on June 22, 2020, 04:07:48 PM
Quote from: mazeckenrode on June 19, 2020, 10:33:17 PM
I previously learned here that Directory Opus was writing EXIF/XMP:DateTimeOriginal and IPTC:DateCreated in the same operation when applied from date/time values in the DOpus Set Metadata panel's Date taken field, but that IPTC:DateCreated is officially intended for date only, and IPTC:TimeCreated should be used for any time component.

Can you make an example file available?  I'd like to look it over.

QuoteIn the interest of maintaining maximum consistency and compatibility between what I have DOpus and ExifTool doing to the same sets of files, and also recognizing that I will sometimes need to amend the date and time I assign to some files, I'd like to be able to parse and capture both components from my filenames (which I would amend first) and use them to update/write all four metadata fields. It seems to be easy enough to do for EXIF/XMP:DateTimeOriginal, but for IPTC:DateCreated and IPTC:TimeCreated, I'm going to have to separate the date and time.

Actually, you don't.  Exiftool will figure it out.  As long as you have the 14 digits in the correct order, the actual format doesn't matter because exiftool is very flexible when it comes to time stamps.  See the 3rd paragraph under FAQ #5 (https://exiftool.org/faq.html#Q5)
     Having said this, ExifTool is very flexible about the actual format of input date/time values when writing, and will attempt to reformat any values into the standard format unless the -n option is used. Any separators may be used (or in fact, none at all). The first 4 consecutive digits found in the value are interpreted as the year, then next 2 digits are the month, and so on.

So if either the filename or another tag such as DateTimeOriginal is already correct, you can just copy it straight across.  So you could just use
"-IPTC:DateCreated<DateTimeOriginal" "-IPTC:TimeCreated<DateTimeOriginal"

Or if the tags are being set from something else, for example a variable in a batch file
"-IPTC:DateCreated=2020:06:22 12:00:00" "-IPTC:TimeCreated=2020:06:22 12:00:00"

There is one caveat, though.  And that is IPTC:TimeCreated needs a time zone value.  If one is not provided, then exiftool will use the local time zone.

QuoteI still don't have my head wrapped around the various uses of $, having now seen $1, $temp=$1, $self, $temp$1, $_, etc.

For the regex capture groups, each capture group gets assigned to a $Number.  It's the same as in Notepad++, though that also allows the use of a \Number

The $temp was just a variable created in the command with the perl my function (https://perldoc.perl.org/functions/my.html).  Basically using straight perl at that point.  $temp$1 is just concatenating the temp variable and the regex captured variable.

The $_ is the Perl default variable (https://perlmaven.com/the-default-variable-of-perl).  When used inside the braces of a tag, it represents the value of the tag.  If you assign a value to it, then the contents of the tag becomes that value.

Don't bother with the $self variable except for the examples shown.  That's used to directly access the Image::ExifTool Perl Library Module (https://exiftool.org/ExifTool.html).  In most cases you don't need to bother with it.
Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: mazeckenrode on June 22, 2020, 07:40:49 PM
Quote from: StarGeek on June 22, 2020, 04:07:48 PM
Quote from: mazeckenrode on June 19, 2020, 10:33:17 PM
I previously learned here that Directory Opus was writing EXIF/XMP:DateTimeOriginal and IPTC:DateCreated in the same operation when applied from date/time values in the DOpus Set Metadata panel's Date taken field, but that IPTC:DateCreated is officially intended for date only, and IPTC:TimeCreated should be used for any time component.

Can you make an example file available? I'd like to look it over.

Certainly, see attached.

FYI, when I brought up about DOpus writing EXIF, XMP, and IPTC tags under certain circumstances, one of their developers replied: "We don't add IPTC fields unless some are already there." (DOpus forum thread here (https://resource.dopus.com/t/set-metadata-inconsistent-results/35872).) But ExifTool said no metadata prior to my operation, and all three groups after.

Quote
QuoteI'd like to be able to parse and capture both components from my filenames (which I would amend first) and use them to update/write all four metadata fields. It seems to be easy enough to do for EXIF/XMP:DateTimeOriginal, but for IPTC:DateCreated and IPTC:TimeCreated, I'm going to have to separate the date and time.

Actually, you don't. Exiftool will figure it out.

My experience here is that I'm not getting reliable results. Used this command line with a variety of possible filenames (most of which I wouldn't ordinarily use, but may encounter from other sources):

exiftool "-iptc:datecreated<filename" "-iptc:timecreated<filename" "-exif:datetimeoriginal" "-xmp:datetimeoriginal" .

Filename: "2020-06-22 16;36;58.png"

Command line output: Warning: Invalid time format (use HH:MM:SS[+/-HH:MM]) in IPTC:TimeCreated (ValueConvInv)
Warning: [minor] Creating non-standard IPTC in PNG


From exported JSON: "IPTC:DateCreated": "2020:06:22" (this was the only tag written for this file)

Filename: "2020-06-22 16;36;58Publication-Page19000101.png"

Command line output: Warning: Invalid time format (use HH:MM:SS[+/-HH:MM]) in IPTC:TimeCreated (ValueConvInv)
Warning: [minor] Creating non-standard IPTC in PNG


JSON:
"IPTC:DateCreated": "1900:01:01",
"IPTC:ApplicationRecordVersion": 4


Filename: "20200622-163658.png"

Command line output: Warning: [minor] Creating non-standard IPTC in PNG

JSON:
"IPTC:DateCreated": "0622:16:36",
"IPTC:TimeCreated": "20:20:06-16:36",
"IPTC:ApplicationRecordVersion": 4


Filenames:
"20200622_163658.png"
"20200622163658.png"

Command line output: Warning: [minor] Creating non-standard IPTC in PNG

JSON:
"IPTC:DateCreated": "2020:06:22",
"IPTC:TimeCreated": "20:20:06-04:00",
"IPTC:ApplicationRecordVersion": 4


Filename: "2020062216365812340506123456.png"

Command line output: Warning: [minor] Creating non-standard IPTC in PNG

JSON:
"IPTC:DateCreated": "0612:34:56",
"IPTC:TimeCreated": "20:20:06-04:00",
"IPTC:ApplicationRecordVersion": 4


Thanks, too, for the explanation of various $variables. I may have another question or two on that subject at some point, but will have to do some digging for other threads I'd looked at first.

Attached: "2020-10-11 12;13;14 - Document - 01.7z" (1,529)

Contents: "2020-10-11 12;13;14 - Document - 01.png" (4,186) [1 x 1 x 1]
Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: mazeckenrode on June 23, 2020, 08:31:01 AM
Quote from: mazeckenrode on June 22, 2020, 07:40:49 PM
Used this command line with a variety of possible filenames:

exiftool "-iptc:datecreated<filename" "-iptc:timecreated<filename" "-exif:datetimeoriginal" "-xmp:datetimeoriginal" .

Just noticed that I'd failed to properly complete the EXIF and XMP instructions in the command line above. My bad. Should have been:

exiftool "-iptc:datecreated<filename" "-iptc:timecreated<filename" "-exif:datetimeoriginal<filename" "-xmp:datetimeoriginal<filename" .

Anyway, looks to me like the EXIF and XMP tags are getting the correct date and time, but the IPTC tags are not in any of my test cases, and are not even all consistent in which numbers they end up inheriting from the respective filenames.
Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: StarGeek on June 23, 2020, 10:20:54 AM
Quote from: mazeckenrode on June 22, 2020, 07:40:49 PM
Quote from: StarGeek on June 22, 2020, 04:07:48 PM
Can you make an example file available? I'd like to look it over.

Certainly, see attached.

Thanks.  I'll have to look it over in more detail when I get a chance but it is being written incorrectly.  If I recall correctly, DO uses exiv2 to write image metadata (yep, it's in the Acknowledgements (https://www.gpsoft.com.au/help/opus12/Documents/Acknowledgements.htm)) so I think I'll see if exiv2 allows it to be written like that.
Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: StarGeek on June 23, 2020, 10:32:24 AM
Quote from: mazeckenrode on June 22, 2020, 07:40:49 PM
My experience here is that I'm not getting reliable results. Used this command line with a variety of possible filenames (most of which I wouldn't ordinarily use, but may encounter from other sources):

Yeah, you're right, the IPTC:TimeCreated doesn't seem to work well when copying from the filename.  But it does need the time zone.  I've noticed another weirdness regarding it that I'm going to bring up in another thread. edit: which I now can't reproduce, ARRRGG

My suggestion, if you really want to use these IPTC tags, would be to copy the data from one of the EXIF tags like DateTimeOriginal or if you have the EXIF:OffsetTimeOriginal filled out (time zone in the EXIF group), copy from SubSecDateTimeOriginal.  I should also point out that IPTC data in a PNG file is non-standard and about the only things that will read it are exiftool and exiv2 (which, as I mentioned, is what DO uses).

Myself, I tend not to use these tags.  The EXIF and XMP tags are much better tags, though the IPTC ones do get filled out automatically in my workflow if I have the time zone.
Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: mazeckenrode on June 23, 2020, 12:30:41 PM
Quote from: StarGeek on June 23, 2020, 10:32:24 AM
Yeah, you're right, the IPTC:TimeCreated doesn't seem to work well when copying from the filename.

Trying to use these, based on other examples here, to take care of the IPTC tags, but still getting inconsistent and incorrect results:

exiftool "-iptc:datecreated<${filename;m/^(\d{4})[-_;\., ]*(\d{2})[-_;\., ]*(\d{2})/;$_=$1$2$3}" .

"-iptc:timecreated<${filename;m/^\d{4}[-_;\., ]*\d{2}[-_;\., ]*\d{2}[-_;\., ]*(\d{2})[-_;\., ]*(\d{2})[-_;\., ]*(\d{2})/;$_=$1$2$3}"

In particular, some of my files with more than 14 digits in the filenames are getting date and time values from digits that aren't among the first 14. Is my usage of ^ to anchor at the start of the filename not correct?

Quote
I tend not to use these tags. The EXIF and XMP tags are much better tags

I wish to maintain their presence and accuracy primarily because DOpus puts them there. I will be using DOpus, at least on occasion, to modify some tags, and don't want to have to worry about whether there are IPTC tags whose values contradict the corresponding EXIF and XMP tags.
Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: mazeckenrode on June 26, 2020, 10:24:45 AM
Whether I use:

exiftool "-iptc:datecreated<${filename;m/^(\d{4})[-_;\., ]*(\d{2})[-_;\., ]*(\d{2})/;$_=$1$2$3}" "-iptc:timecreated<${filename;m/^\d{4}[-_;\., ]*\d{2}[-_;\., ]*\d{2}[-_;\., ]*(\d{2})[-_;\., ]*(\d{2})[-_;\., ]*(\d{2})/;$_=$1$2$3}" .

...or...

exiftool "-iptc:datecreated<${filename;m/\A(\d{4})[-_;\., ]*(\d{2})[-_;\., ]*(\d{2})/;$_=$1$2$3}" "-iptc:timecreated<${filename;m/\A\d{4}[-_;\., ]*\d{2}[-_;\., ]*\d{2}[-_;\., ]*(\d{2})[-_;\., ]*(\d{2})[-_;\., ]*(\d{2})/;$_=$1$2$3}" .

...(using ^ as start-of-string anchor in first command line, \A in second) I get the following errors:


Warning: syntax error for 'filename' - ./2020-06-22 16;36;58.png
Warning: [minor] Creating non-standard IPTC in PNG - ./2020-06-22 16;36;58.png
Warning: syntax error for 'filename' - ./2020-06-22 16;36;58Publication-Page19000101.png
Warning: [minor] Creating non-standard IPTC in PNG - ./2020-06-22 16;36;58Publication-Page19000101.png
Warning: syntax error for 'filename' - ./20200622-163658.png
Warning: [minor] Creating non-standard IPTC in PNG - ./20200622-163658.png
Warning: syntax error for 'filename' - ./20200622163658.png
Warning: [minor] Creating non-standard IPTC in PNG - ./20200622163658.png
Warning: syntax error for 'filename' - ./2020062216365812340506123456.png
Warning: [minor] Creating non-standard IPTC in PNG - ./2020062216365812340506123456.png
Warning: syntax error for 'filename' - ./20200622_163658.png
Warning: [minor] Creating non-standard IPTC in PNG - ./20200622_163658.png


Actual tagging results for the six files:

For IPTC:DateCreated, (2) correct, (4) incorrect

For IPTC:TimeCreated, (0) correct, (4) incorrect, (2) failed to set

If I use what I'd consider a basically equivalent regex operation in Notepad++:

Find: ^(\d{4})[-_;\., ]*(\d{2})[-_;\., ]*(\d{2})[-_;\., ]*\d{2}[-_;\., ]*\d{2}[-_;\., ]*\d{2}.*

Replace with: Date: \1:\2:\3

Find: ^\d{4}[-_;\., ]*\d{2}[-_;\., ]*\d{2}[-_;\., ]*(\d{2})[-_;\., ]*(\d{2})[-_;\., ]*(\d{2}).*

Replace with: Time: \1:\2:\3

...they capture and replace exactly what I intended. What's wrong with my ExifTool code?
Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: StarGeek on June 26, 2020, 11:13:14 AM
Took me too long to figure it out :(

You need the concatenation operator to combine the strings.  Or they need to be in double quotes to concatenate, though that doesn't work so well on Windows.
$_=$1.$2.$3
C:\>exiftool -P -overwrite_original  "-iptc:datecreated<${filename;m/^(\d{4})[-_;\., ]*(\d{2})[-_;\., ]*(\d{2})/;$_=$1.$2.$3}" "-iptc:timecreated<${filename;m/^\d{4}[-_;\., ]*\d{2}[-_;\., ]*\d{2}[-_;\., ]*(\d{2})[-_;\., ]*(\d{2})[-_;\., ]*(\d{2})/;$_=$1.$2.$3}"  Y:\!temp\bbbb
    1 directories scanned
    6 image files updated

C:\>exiftool -time:all --system:all -g1 -a -s Y:\!temp\bbbb
======== Y:/!temp/bbbb/2020-06-22 16;36;58.jpg
---- IPTC ----
DateCreated                     : 2020:06:22
TimeCreated                     : 16:36:58-07:00
---- Composite ----
DateTimeCreated                 : 2020:06:22 16:36:58-07:00
DateTimeOriginal                : 2020:06:22 16:36:58-07:00
======== Y:/!temp/bbbb/2020-06-22 16;36;58Publication-Page19000101.jpg
---- IPTC ----
DateCreated                     : 2020:06:22
TimeCreated                     : 16:36:58-07:00
---- Composite ----
DateTimeCreated                 : 2020:06:22 16:36:58-07:00
DateTimeOriginal                : 2020:06:22 16:36:58-07:00
======== Y:/!temp/bbbb/20200622-163658.png
---- IPTC ----
DateCreated                     : 2020:06:22
TimeCreated                     : 16:36:58-07:00
---- Composite ----
DateTimeCreated                 : 2020:06:22 16:36:58-07:00
DateTimeOriginal                : 2020:06:22 16:36:58-07:00
======== Y:/!temp/bbbb/20200622163658.jpg
---- IPTC ----
DateCreated                     : 2020:06:22
TimeCreated                     : 16:36:58-07:00
---- Composite ----
DateTimeCreated                 : 2020:06:22 16:36:58-07:00
DateTimeOriginal                : 2020:06:22 16:36:58-07:00
======== Y:/!temp/bbbb/2020062216365812340506123456.jpg
---- IPTC ----
DateCreated                     : 2020:06:22
TimeCreated                     : 16:36:58-07:00
---- Composite ----
DateTimeCreated                 : 2020:06:22 16:36:58-07:00
DateTimeOriginal                : 2020:06:22 16:36:58-07:00
======== Y:/!temp/bbbb/20200622_163658.jpg
---- IPTC ----
DateCreated                     : 2020:06:22
TimeCreated                     : 16:36:58-07:00
---- Composite ----
DateTimeCreated                 : 2020:06:22 16:36:58-07:00
DateTimeOriginal                : 2020:06:22 16:36:58-07:00
    1 directories scanned
    6 image files read


In Perl code $_="$1$2$3" would also work, but on the command line it's better to use the above, imo.
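
For anyone wanting to test the separator-tolerant pattern away from ExifTool, here is a rough Python sketch. It is a variation, not the exact commands above: it uses one combined regex with six capture groups instead of two separate ones, and `split_date_time` is a hypothetical helper name. The `+` between the captured strings plays the role of the Perl `.` concatenation operator.

```python
import re

# Any run (possibly empty) of the separators used in the thread.
SEP = r"[-_;., ]*"

# ^ anchors at the start of the filename, so only the first 14 digits count.
PATTERN = re.compile(
    r"^(\d{4})" + SEP + r"(\d{2})" + SEP + r"(\d{2})" + SEP +
    r"(\d{2})" + SEP + r"(\d{2})" + SEP + r"(\d{2})"
)

def split_date_time(filename):
    """Return (YYYYMMDD, HHMMSS) from the start of a filename, or None."""
    m = PATTERN.match(filename)
    if m is None:
        return None
    year, month, day, hour, minute, second = m.groups()
    # Explicit concatenation, like $1.$2.$3 in the Perl expression.
    return year + month + day, hour + minute + second

for name in ("2020-06-22 16;36;58.png",
             "2020-06-22 16;36;58Publication-Page19000101.png",
             "2020062216365812340506123456.png"):
    print(name, split_date_time(name))
```

All of the test filenames from this thread give the same date and time strings in this sketch, regardless of separators or trailing digits.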
Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: mazeckenrode on June 26, 2020, 12:37:32 PM
Quote from: StarGeek on June 26, 2020, 11:13:14 AM
You need the concatenation operator to combine the strings. Or they need to be in double quotes, to concat, though that doesn't work so well on Windows
$_=$1.$2.$3
$_="$1$2$3" would also work, but on the command line it's better to use the above, imo.

Bingo, that works! Thanks for all the help, StarGeek (and Phil).

Pretty sure I saw some command line examples that didn't use any of the above solutions for concatenation in the forum here, which is what I was trying to base my code on, but perhaps those were intended for non-Windows systems.
Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: StarGeek on June 26, 2020, 12:40:48 PM
They were probably in a regex substitution like
s/^(\d{4})[-_;\., ]*(\d{2})[-_;\., ]*(\d{2})/$1$2$3/
Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: mazeckenrode on June 26, 2020, 12:53:38 PM
Quote from: StarGeek on June 26, 2020, 12:40:48 PM
They were probably in a regex substitution like
s/^(\d{4})[-_;\., ]*(\d{2})[-_;\., ]*(\d{2})/$1$2$3/

Ah, yes, didn't think of distinguishing those two types of regex usage. So I'm guessing the concatenation-operator-or-quotes rule only applies if it follows the regex matching code and /;?
Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: mazeckenrode on June 26, 2020, 03:45:34 PM
Well, almost done, I think. My entire command line for setting EXIF/XMP:DateTimeOriginal and IPTC:DateCreated/TimeCreated, after having already used Directory Opus to set all of the above except IPTC:TimeCreated to something, is:

exiftool "-exif:datetimeoriginal<filename" "-xmp:datetimeoriginal<filename" "-iptc:datecreated<${filename;m/^(\d{4})[-_;\., ]*(\d{2})[-_;\., ]*(\d{2})/;$_=$1.$2.$3}" "-iptc:timecreated<${filename;m/^\d{4}[-_;\., ]*\d{2}[-_;\., ]*\d{2}[-_;\., ]*(\d{2})[-_;\., ]*(\d{2})[-_;\., ]*(\d{2})/;$_=$1.$2.$3}" .

Using this to set the EXIF tags results in:


Error: [minor] IFD0 pointer references previous IFD0 directory


I presume that's due to Directory Opus writing EXIF to a non-spec location, as outlined here (https://exiftool.org/forum/index.php?topic=9262.0), but am interested to learn if there's another reason for it.

I can use -m if necessary to plow through, and have successfully done so, but would rather not need to, just because I'd like to know about any other minor errors that might come up in the future, if any. I'm guessing not, other than to stop using Directory Opus to set these tags, but is there anything else I can do about it?
Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: StarGeek on June 26, 2020, 04:03:52 PM
Quote from: mazeckenrode on June 26, 2020, 03:45:34 PM
Using this to set the EXIF tags results in:


Error: [minor] IFD0 pointer references previous IFD0 directory


I presume that's due to Directory Opus writing EXIF to a non-spec location, as outlined here (https://exiftool.org/forum/index.php?topic=9262.0), but am interested to learn if there's another reason for it.

Something is writing the EXIF block incorrectly.  See FAQ #20 (https://exiftool.org/faq.html#Q20) for how to fix it (don't use that command on RAW files).  You'll have to experiment to see if DO is the problem or not.
Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: mazeckenrode on June 29, 2020, 01:37:11 PM
Quote from: StarGeek on June 12, 2020, 12:04:55 AM
Quote
The JSON files showed a handful of the EXIF tag values preceded by a question mark:
...
No question marks appeared in any tags exported from these updated PNGs, not even in the tags I hadn't changed since the earlier JSON exports. Anybody have an explanation?

No idea without actually seeing a file and figuring out what the actual character was (might not have been an ASCII question mark), but it sounds like Opus is writing something weird.

FYI, I found at least part of the reason for the mystery question marks, and it's basically me. They resulted from me copying and pasting from an ExifTool-exported UTF-8 JSON to a Windows-1252 editing tab in Notepad++ while composing my forum reply at the time. Still, curious as to why only certain EXIF tags (looks like all the unformatted text string tags) contain whatever the actual Unicode character(s) is/are. Again, though, these were tags as set by Directory Opus, not by ExifTool.
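
The general effect described above can be reproduced in a small Python sketch. The character that was actually in those tags is not identified anywhere in this thread, so U+FEFF below is purely a hypothetical stand-in for "some character that Windows-1252 cannot represent", and the assumption that Notepad++ substitutes a question mark the same way Python's `errors="replace"` does is also just that, an assumption.

```python
# Hypothetical tag value starting with a character that has no
# Windows-1252 equivalent (U+FEFF chosen only for illustration).
text = "\ufeffDocument Test"

# Forcing the text into Windows-1252 replaces the unencodable
# character with an ASCII question mark.
pasted = text.encode("cp1252", errors="replace").decode("cp1252")

print(pasted)
```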
Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: mazeckenrode on July 21, 2020, 01:46:08 PM
Quote from: StarGeek on June 18, 2020, 01:22:48 PM
Try this, replacing TAG with the tag that has the page number data. You can repeat the first part for each tag you need to edit
exiftool "-TAG<${TAG;my $temp=$1 if $self->GetValue('FileName')=~m/0*(\d+)\./;s/\d+(\/\d+)/$temp$1/}" "-DateTimeOriginal-<0:0:${TAG;my $temp=$1 if $self->GetValue('FileName')=~m/0*(\d+)\./;m/\d+\/(\d+)/;$_=$1-$temp}" /path/to/files/

StarGeek, I've successfully used this to update values for various comment/description/subject/title metadata fields — thanks much again for that — but am now attempting to adapt it for use with several keyword metadata fields, in which I include Page1 as a keyword in scans of all pages of whatever document or publication. Based on your code above, I've come up with the following code to attempt replacing Page1 with Page# (# being the page number taken from the filename):


exiftool -m "-IPTC:Keywords<${IPTC:Keywords;my $temp=$1 if $self->GetValue('FileName')=~m/- 0*(\d+).*\./;s/(Page)\d+/$temp$1/}" "-XMP:Subject<${XMP:Subject;my $temp=$1 if $self->GetValue('FileName')=~m/- 0*(\d+).*\./;s/(Page)\d+/$temp$1/}" .


But that's resulting in 2Page, etc., instead of Page2, even though I put (Page) before \d+ in the code. What am I doing wrong?
Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: StarGeek on July 21, 2020, 05:22:42 PM
The key part is here
s/(Page)\d+/$temp$1/}

The capture group here is the word Page.  That is placed into variable $1.  $temp is the page number captured from the filename, so the result of $temp$1 is #Page.

Try changing it to $1$temp.
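
The ordering issue can be seen in a short Python sketch of the same substitution. The keyword string and `page` value are hypothetical examples; `count=1` mirrors a Perl `s///` without `/g`.

```python
import re

keywords = "Scan; Document; Page1"  # hypothetical keyword string
page = "2"                          # hypothetical page number from the filename

# Equivalent of $temp$1: page number first, then the captured "Page".
wrong = re.sub(r"(Page)\d+", lambda m: page + m.group(1), keywords, count=1)

# Equivalent of $1$temp: captured "Page" first, then the page number.
right = re.sub(r"(Page)\d+", lambda m: m.group(1) + page, keywords, count=1)

print(wrong)
print(right)
```

Where the parentheses sit in the pattern only decides what gets captured. The order of the pieces in the replacement decides how the result reads.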
Title: Re: Search & replace in tags, bump time for multiple files sequentially
Post by: mazeckenrode on July 21, 2020, 06:07:04 PM
Quote from: StarGeek on July 21, 2020, 05:22:42 PM
Try changing it to $1$temp.

Yep, that makes more sense now, and working beautifully with the change. Thanks again.