CR/LF in Tags (x0D / x0A)

Started by asjones, October 01, 2015, 10:29:29 AM

asjones

I have a set of images where some of the metadata appears to contain CR/LF, or just CR, and this is causing issues in my CSV output.

It is confusing because the "Description" and "Caption-Abstract" tags appear to contain the same description, but "Description" has a CR/LF where "Caption-Abstract" has only a CR (no LF).  I am only seeing this issue in the CSV output, not in the standard text output.

If I just run  ExifTool <image>
the text output looks fine and just wraps to the command line screen, but no wrapping in the middle of the field (that I can tell).

If I do this
exiftool -csv bo201111-049.jpg > test1.txt
then the text output is wrapped, which breaks any CSV import into another system.

When I review the CSV export in hex mode and search for x0D x0A, I find it in "Description", but if I search for just x0D, I find it at the same place in "Caption-Abstract".  (In general one would not want either a CR/LF or a bare CR.)

In some last-minute testing, it looks like the CR/LF and CR issue is also present in the -XMLFORMAT output.

Is this a bug or is there something I am missing? 

I can email the image if needed (what address is best?).

any help would be appreciated.

thanks

Alan






Phil Harvey

Hi Alan,

In the normal output, ExifTool will convert special characters to dots (periods).  But in CSV output, all special characters are preserved, and the string is quoted if necessary as per the CSV standard.  Linefeeds and carriage returns are valid in CSV strings when they are properly quoted.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).
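
To illustrate the point about quoting, here is a minimal Python sketch (not part of ExifTool) showing that a standards-aware CSV parser reads a quoted field containing CR/LF or a bare CR as a single value. The sample text is made up to resemble the output described in this thread; the actual contents of Alan's file are an assumption.

```python
import csv
import io

# Hypothetical CSV text resembling ExifTool -csv output: "Description"
# contains CR/LF and "Caption-Abstract" contains a bare CR.
raw = (
    'SourceFile,Description,Caption-Abstract\r\n'
    'bo201111-049.jpg,"First line\r\nSecond line","First line\rSecond line"\r\n'
)

# newline="" keeps Python from translating line endings before the csv
# module sees them, so embedded CR/LF characters survive inside fields.
rows = list(csv.reader(io.StringIO(raw, newline="")))

header, record = rows
for name, value in zip(header, record):
    print(name, repr(value))
```

The file still has only two records (header plus one image) even though it spans more physical lines. An importer that splits on line endings before handling quotes will see it as broken, which matches the import trouble described above.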

asjones

Phil,

Thanks for the fast reply.  I was afraid you were going to say something like that.  Unfortunately CRs or CR/LFs are causing me input troubles.  Not sure how to fix that cleanly.  I am surprised nobody else has had the issue.  I wish ExifTool could strip those out on demand. :)

Well I guess back to the drawing board :(

thanks

Alan


Phil Harvey

Hi Alan,

If you are outputting a fixed set of tags, you can use -p to get the output you want, and the advanced formatting feature to do any necessary filtering.

- Phil

asjones

Following up on my own post in case it helps others: I have a workaround for now.  I realized that an old tool I have used before, CSVfix, might fix the issue.

CSVfix is a great tool for manipulating CSV files (it understands commas, quote matching, etc.):
http://neilb.bitbucket.org/csvfix/

Using CSVfix with the rmnew command fixed the issue. http://neilb.bitbucket.org/csvfix/manual/csvfix16/csvfix.html?rmnew.html

Example
"Joe Public", "101 Somwhere St
Anytown
USA"

csvfix rmnew -s ',' addresses.csv
gives this
"Joe Public", "101 Somwhere St, Anytown, USA"
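
For anyone without CSVfix, roughly the same cleanup can be sketched in Python. This is only an approximation of what rmnew does, not its actual implementation; the function name `flatten_newlines` and the default separator are my own choices.

```python
import csv
import io
import re

def flatten_newlines(csv_text, sep=", "):
    """Replace each line break inside a CSV field with `sep`.

    Hypothetical helper: parses the text with the csv module so that
    quoting is respected, then rewrites every field on a single line.
    """
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    out = io.StringIO(newline="")
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        # \r\n is listed first so a CR/LF pair counts as one line break.
        writer.writerow([re.sub(r"\r\n|\r|\n", sep, field) for field in row])
    return out.getvalue()

sample = 'Joe Public,"101 Somwhere St\nAnytown\nUSA"\n'
cleaned = flatten_newlines(sample)
print(cleaned, end="")
```

The writer only quotes fields that need it, so the name comes out unquoted while the address keeps its quotes because it now contains commas.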

Hope this helps someone else, or maybe other parts of CSVfix will help someone.  It is a nice tool.

asjones

I hate to ask, but I would love for you to consider a -csv-clean type of option that removes CR and/or LF from individual tags.  It would make things cleaner, and I wonder if others would benefit as well.

My workaround works, but it is a pain :)

(I mention CR and/or LF because I know some systems use one or the other and not always both.)

thanks

Alan



Phil Harvey

Hi Alan,

I don't like the idea of a csv-specific filter.  But maybe if I added a general filter to allow you to filter globally with an arbitrary expression...  Let me think about this.

- Phil

Phil Harvey

Hi Alan,

The next version of ExifTool (10.05) will have a new API Filter option that will allow you to convert newlines by adding this to your command:  -api "Filter=s/[\n\r]+/, /g"

- Phil
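
For readers less familiar with the Perl-style substitution in that Filter expression, here is a rough Python analogue. It only illustrates what the regular expression does to a string; it is not how ExifTool applies the filter internally, and the function name `filter_newlines` is made up for this sketch.

```python
import re

def filter_newlines(value):
    """Python analogue of the Perl expression s/[\\n\\r]+/, /g.

    [\\n\\r]  matches a single LF or CR character
    +         extends the match over a run of them (so CR/LF is one match)
    ", "      is the replacement text
    /g        means "replace every match", which re.sub does by default
    """
    return re.sub(r"[\n\r]+", ", ", value)

print(filter_newlines("First line\r\nSecond line\rThird line"))
print(filter_newlines("blank lines\n\n\ncollapse too"))
```

Because of the `+`, a CR/LF pair or several consecutive blank lines collapse into a single ", " rather than one separator per character.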

asjones

Wow, thanks for the update.  The filter option sounds like it could be very powerful for all sorts of things.  I still get confused reading regular expressions and the layout you have, so it will be nice to see the documentation for the option.

thanks!!!

Alan

StarGeek

I learned a lot about regular expressions at Regular-Expressions.info.  I use RegEx101.com to test expressions I'm not sure about.
"It didn't work" isn't helpful. What was the exact command used and the output.
Read FAQ #3 and use that cmd
Please use the Code button for exiftool output

Please include your OS/Exiftool version/filetype

asjones

StarGeek,

Great sites, especially the RegEx101.  Not seen it before.

thanks!

Alan

Phil Harvey

StarGeek:  RegEx101.com is really cool.   I haven't seen anything like that before.

- Phil

Phil Harvey

ExifTool 10.05 with the new API Filter option is now available.

- Phil