Convert text metadata to a number directly on the command line?

Started by TushyBhutt, March 30, 2023, 12:08:39 PM


TushyBhutt

Hi,

I am still working on my Stable Diffusion PNG Info to Lightroom converter and have some questions on numerical values. 

Right now, I can pull a string like this "Score: 6.71" from "PNG:Parameters" and pop it into "XMP-lr:HierarchicalSubject|Scoring|6.71".  The thing is, Lightroom seems to "sort" those based on text length, so 6.53 would rank higher than 6.7.

I saw this thread about calculating tags values and it refers to a config file:

Calculate with a tag

So, questions:

1. My guess is the tags are stored as text by default, so "6.53" or "6.71" would need to convert to a numerical value?

2. Can this be done via the command line like :

-sep "^^" "-XMP-lr:HierarchicalSubject<${PNG:parameters; s/^(?!Upscaled )Score: (\d).(\d{1,})/Scoring|Aesthetic Scorer|$1|$1.$2/gm; -p "$val = $val =~ /Scoring|Aesthetic Scorer|(\d)|(\d+(?:\.\d+)?)/ ? $1 + 0 : ''; $_ = $val"}"

3. Or does it have to be a config file? Can a command line call the config file and an args file?

4. Speaking of "sep", if that's used in an ARGS file, "sep" can't use "##" right, because a "#" is seen as a comment?

StarGeek

Quote from: TushyBhutt on March 30, 2023, 12:08:39 PM
1. My guess is the tags are stored as text by default, so "6.53" or "6.71" would need to convert to a numerical value?

Perl variables are treated as strings or numbers depending on the context in which they are used.  For example, the following Perl script sets $x as a string but $y as a number.  The first print statement prints them as such, dropping the unneeded trailing 0s in $y.  But once you add 1 to both, $x is also printed as a number, without its trailing 0s.
C:\Programs\My_Stuff>type temp.pl
my $x="3.5000";
my $y=4.50000;
print "$x $y\n";
$x+=1;
$y+=1;
print "$x $y\n";
C:\Programs\My_Stuff>temp.pl
3.5000 4.5
4.5 5.5

Quote: 2. Can this be done via the command line like :

I would suggest picking a length and then padding/truncating to that length.  But that might be complicated if it is possible to have more leading digits.  For example, how does 10.5 sort compared to 9.5?  Looking at your example command, it looks like you're only capturing a single digit before the decimal?

A config would be cleaner, but it's doable on the command line.

Since you're already grabbing the numbers before and after the decimal separately, you might try something like this.  I'm using a match instead of a substitution for ease of processing the variables.
${TAG;m/Score: (\d).(\d+)/;$_='Scoring|Aesthetic Scorer|'.$1.'|'.$1.'.'.$2.0 x(3-length $2)}

Quick example, though you'll have to edit it to fit your needs:
C:\>exiftool -G1 -a -s -Description y:\!temp\Test3.jpg y:\!temp\Test4.jpg
======== y:/!temp/Test3.jpg
[XMP-dc]        Description                     : Score: 6.53
======== y:/!temp/Test4.jpg
[XMP-dc]        Description                     : Score: 6.7
    2 image files read

C:\>exiftool -p "${Description;m/Score: (\d).(\d+)/;$_='Scoring|Aesthetic Scorer|'.$1.'|'.$1.'.'.$2.0 x(3-length $2)}" y:\!temp\Test3.jpg y:\!temp\Test4.jpg
Scoring|Aesthetic Scorer|6|6.530
Scoring|Aesthetic Scorer|6|6.700
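
For scores that can reach 10, a fixed width with a leading zero is one possible way to keep text order and numeric order in agreement.  The sketch below is only an illustration: it swaps in Perl's sprintf for the x operator used above, pad_score is a hypothetical helper name, and the width of 6 assumes scores never go above 99.999.  Note that sprintf rounds rather than truncates.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): pad a score to a fixed width
# so that sorting it as text gives the same order as sorting it as a number.
sub pad_score {
    my ($score) = @_;
    # %06.3f = total width 6, zero-padded, 3 digits after the decimal point.
    return sprintf('%06.3f', $score);
}

# Default sort compares as strings, which is the situation described for Lightroom.
my @padded = sort map { pad_score($_) } ('6.53', '6.7', '10', '9.5');
print "$_\n" for @padded;
# 06.530
# 06.700
# 09.500
# 10.000
```

Without the leading zero, "10.000" would sort before "9.500" when compared as text.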

Quote: Can a command line call the config file and an args file?

The -Config option must be the first thing in the command, but otherwise, yes, you can use a config file and an args file (or multiple args files).

Quote: 4. Speaking of "sep", if that's used in an ARGS file, "sep" can't use "##" right, because a "#" is seen as a comment?

Yes, that's correct, though I believe you can work around it with #[CSTR] (see -@ (Argfile) option).  You can use anything you want as the separator as long as it's not part of the data, i.e. -sep supercalifragilisticexpialidocious would separate on "supercalifragilisticexpialidocious".  I use ## as it is an unlikely combo.
"It didn't work" isn't helpful. What was the exact command used and the output.
Read FAQ #3 and use that cmd
Please use the Code button for exiftool output

Please include your OS/Exiftool version/filetype

TushyBhutt

Thank you for the detailed response!  See below for some comments/follow up:

1.  If Perl goes by context for variables, what is the default context when ExifTool encounters something like "Score: 7.5" and that's split into "Score|7.5"? Is it all still text until some operation is done on it? In your example, $y is numeric because the value doesn't have quotes around it, correct?

In your regex example, are $1 and $2 converted to numeric values because x(3-length $2) is performing a math function on the length of the string?

2. The max a score could be is 10.00 (or 10.0, depending on which aesthetic scorer was used).  I'm capturing the digit ahead of the decimal to act as a parent tag in LR; otherwise you could have 1000 values to scroll through.

3. Thanks for the note on Config and the SEP characters.  I'll stick with the double caret because that's not used anywhere in Stable Diffusion or ExifTool that I've run into yet.

StarGeek

Quote from: TushyBhutt on March 30, 2023, 05:25:47 PM
1.  If Perl goes by context for variables, what is the default context when ExifTool encounters something like "Score: 7.5" and that's split into "Score|7.5"? Is it all still text until some operation is done on it?

I'm not sure what you mean by split into "Score|7.5".  It's text until you perform a numeric operation.  I had previously thought for some reason that Perl would drop all the non-numeric characters, but a quick test shows that adding 1 to the above string produces a value of 1, because a string that doesn't start with a number is treated as 0.
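
A minimal sketch of that quick test, for anyone who wants to reproduce it.  The helper name plus_one is made up for this example, and the "7.5 stars" string is an extra case that isn't from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: add 1 to whatever text is passed in.
sub plus_one {
    my ($text) = @_;
    # Perl warns ("isn't numeric") when a string is only partly a number;
    # the warning is silenced here so only the results are printed.
    no warnings 'numeric';
    return $text + 1;
}

print plus_one("Score: 7.5"), "\n";  # 1   (starts with text, so it counts as 0)
print plus_one("7.5"), "\n";         # 8.5 (the whole string is a number)
print plus_one("7.5 stars"), "\n";   # 8.5 (only the leading number is used)
```

So the score has to be extracted from the surrounding text first, e.g. with a regex capture, before any arithmetic on it is meaningful.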

Quote: In your example, $y is numeric because the value doesn't have quotes around it, correct?

Quote: In your regex example, are $1 and $2 converted to numeric values because x(3-length $2) is performing a math function on the length of the string?

x(3-length $2) isn't a math function, it's the Perl repetition operator.  Math is used to work out how many characters short of 3 the $2 capture is, and the 0 is repeated that many times.  That part might be clearer as
$2 . "0"x(3-length($2) )
Breaking it down:
First, capture group 2: $2
Then the concatenation operator, the dot: .
Then, for the rest, take the length of $2, length($2), subtract that from 3, 3-length($2), and repeat (x) the preceding string "0" that many times.
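
The breakdown above can be tried as a standalone script.  This is only a sketch: pad_fraction is a hypothetical helper name, and the regex escapes the decimal point (\.) so that it matches a literal dot, which is a small change from the command shown earlier.

```perl
use strict;
use warnings;

# Hypothetical helper: pad the digits after the decimal point to 3 places.
sub pad_fraction {
    my ($text) = @_;
    return unless $text =~ /Score: (\d)\.(\d+)/;
    # Repetition operator: the string "0" repeated (3 - length of $2) times.
    my $zeros = '0' x (3 - length($2));
    # Concatenation operator: join the pieces back together as text.
    return $1 . '.' . $2 . $zeros;
}

print pad_fraction('Score: 6.53'), "\n";  # 6.530
print pad_fraction('Score: 6.7'), "\n";   # 6.700
```

Nothing here turns the value into a number; the result stays text throughout, it just has a fixed length.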

Quote: 3. Thanks for the note on Config and the SEP characters.  I'll stick with the double caret because that's not used anywhere in Stable Diffusion or ExifTool that I've run into yet.

The caret is an escape character in Windows CMD (PowerShell uses the backtick instead), but with quotes around it, it will be treated literally.

TushyBhutt

Quote from: StarGeek on March 30, 2023, 07:12:47 PM
Quote from: TushyBhutt on March 30, 2023, 05:25:47 PM
1.  If Perl goes by context for variables, what is the default context when ExifTool encounters something like "Score: 7.5" and that's split into "Score|7.5"? Is it all still text until some operation is done on it?

Quote: I'm not sure what you mean by split into "Score|7.5".  It's text until you perform a numeric operation.  I had previously thought for some reason that Perl would drop all the non-numeric, but a quick test shows that adding 1 to the above string produces a value of 1.

By 'split', I mean that Lightroom will create a hierarchical tag set from "Score:7.5", making "Score" the parent tag and "7.5" the value.  As of now, that value is stored as text, so sorting is based on tag length.

So, just trying to figure out if this would work at keeping "Score" and its value as two tags because Lightroom treats pipes as hierarchical separators (forgive me for mangling the format, I'm using Windows):

-sep "^^"
"-XMP-lr:HierarchicalSubject<${PNG:parameters;
m/Score: (\d).(\d+)/gm;
$_='Score|'.$1.'|'.$1.'.'.$2.0 x(3-length $2)}"

Also, what is the purpose of the '.$1.' ?  My regex code simply uses $1 ... is that incorrect?

Quote: x(3-length $2) isn't a math function, it's the Perl Repetition Operator.  Math is used to figure out how long the $2 capture is and that is used to repeat the 0s that many times.

Seems as long as math is used somewhere in the search/replace, then a value becomes numeric :)  I'll tootle around on Google for some examples.

Quote: The caret is an escape character in Windows CMD (PowerShell uses the backtick instead), but with quotes around it, it will be treated literally.

OK, good to know it won't interfere as long as the quotes are there.


Phil Harvey

In Perl you can treat variables as either numbers or text.  There are different operators if you want to treat them differently (e.g. "*" for numerical multiplication, "." for string concatenation and "x" for string/array repetition).
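
A small sketch of how the operator, not the variable, decides whether a value is used as a number or as text.  The variable name $v is just for illustration.

```perl
use strict;
use warnings;

my $v = "3";           # stored as text

print $v * 2, "\n";    # 6   numeric multiplication: "3" is used as the number 3
print $v . 2, "\n";    # 32  concatenation: 3 and 2 are joined as text
print $v x 2, "\n";    # 33  repetition: the text "3" is repeated twice
```

This is also why something like '|'.$1.'|' appears in the expressions above: the dots glue the captured text and the literal pipes into one string.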

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

TushyBhutt

I figure just converting the text to numeric will be a big victory; this whole project has taken on a life of its own

Phil Harvey

There is usually no need to convert text to a number in Perl.  But if you really want to, you can do this:

$var = $var + 0;

After this, $var will be primarily numeric unless it couldn't be converted to a number.
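
A hedged sketch of that conversion with a check added for text that can't be converted.  The helper name to_number is made up for this example; looks_like_number comes from the core Scalar::Util module.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical helper: return the numeric value, or nothing if the text
# is not a number.
sub to_number {
    my ($var) = @_;
    return unless looks_like_number($var);
    return $var + 0;
}

print to_number("6.700"), "\n";   # 6.7   (trailing zeros are dropped)
print to_number("8.532"), "\n";   # 8.532

my $bad = to_number("Score: 7.5");
print defined $bad ? "number" : "not a number", "\n";  # not a number
```

Note the side effect in the first line of output: once the value is numeric, "6.700" prints as 6.7, so any zero padding added for sorting is lost.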

- Phil

Edit: example was text to number.

TushyBhutt

Quote from: Phil Harvey on April 02, 2023, 07:35:17 AM
There is usually no need to convert a number to text in Perl.  But if you really want to, you can do this:

$var = $var + 0;

After this, $var will be primarily numeric unless it couldn't be converted to a number.

- Phil

I'm doing the opposite :)

Otherwise Lightroom seems to think 8.7 is lower in value than 8.532 because it has a smaller length.