Malformed UTF-8 character(s)

Started by newoski, May 30, 2020, 01:51:23 PM


newoski

99% there on my batch tagging project, thanks to this helpful forum. The final issue is a warning about "Malformed UTF-8 character(s)". Below are a few example clumps of text. Any guidance on what it means or how to solve it? I'm not even sure whether it's a real problem.

scanpix/Images/Geordie Shore/URA2Aw9QRjw.jpg   URA2Aw9QRjw.jpg   11.99   URA2Aw9QRjw.jpg   scanpix/Images/Geordie Shore   107 kB   2020:05:29 06:37:19-04:00   2020:05:30 06:19:38-04:00   2020:05:30 06:19:38-04:00   rw-rw-rw-   JPEG   jpg   image/jpeg   634   1024   Baseline DCT, Huffman coding   8   3   YCbCr4:2:0 (2 2)   634x1024   0.649   Backgrid UK   Ebum      London   United Kingdom   , Chloe Ferry, Nicole Bass, Makeup, Pink Lipstick, Eye Makeup, Black Dress, Sleeveless, Spaghetti Straps, Two Toned Dress, Black, Pink, Printed Dress, Floral Print, Off Shoulder, Ruched, Pattern, Mini Dress, Brown Handbag, Louis Vuitton, Clear Shoes, Open Toe, Ankle Strap, Sandal, Black Handbag, Lbd, Little Black Dress, Black Shoe, Stiletto Heels, Smile, Funny, Funny Face, Gesture   BGUK_1667027 - London, UNITED KINGDOM  - Celebrities spotted outside Libertine Night Club while attending In The Style's Summer party  Pictured: Chloe Ferry, Nicole Bass  BACKGRID UK 25 JULY 2019   BYLINE MUST READ: NIGHTVISION / BACKGRID  UK: +44 208 344 2007 / uksales@backgrid.com  USA: +1 310 798 9111 / usasales@backgrid.com  *UK Clients - Pictures Containing Children Please Pixelate Face Prior To Publication*



scanpix/Images/Geordie Shore/_POhLz9_6Xs.jpg   _POhLz9_6Xs.jpg   11.99   _POhLz9_6Xs.jpg   scanpix/Images/Geordie Shore   145 kB   2020:05:29 06:09:07-04:00   2020:05:30 06:19:43-04:00   2020:05:30 06:19:43-04:00   rw-rw-rw-   JPEG   jpg   image/jpeg   1024   1018   Baseline DCT, Huffman coding   8   3   YCbCr4:2:0 (2 2)   1024x1018   1   Backgrid UK   Joml      London   United Kingdom   , Marnie Simpson, Makeup, Pink Lipstick, Short Necklace, Pendant, Multicoloured Shirt, Open Collar, Pocket Detail, Coudray Fabric, Pleated, Pattern, Black Jeans, Denim Black Boots, Suede Boots, Ankle Length Boots, Smile, Funny, Full Length, Marnie Simpson, Makeup, Pink Lipstick, Short Necklace, Pendant   BGUK_1888994 - London, UNITED KINGDOM  - English television personality Marnie Simpson pictured during the campaign for Department of Education's Hungry Little Mind to encourage parents to interact with their children more in London.  Pictured: Marnie Simpson  BACKGRID UK 3 MARCH 2020   BYLINE MUST READ: CONNECTED PHOTOGRAPHY / BACKGRID  UK: +44 208 344 2007 / uksales@backgrid.com  USA: +1 310 798 9111 / usasales@backgrid.com  *UK Clients - Pictures Containing Children Please Pixelate Face Prior To Publication*

StarGeek

I'm assuming Windows?

You will probably find that the data is either truncated or corrupted at the part that says "In The Style's Summer party". The cause is the fancy (curly) quote in "Style's". It comes down to character encoding on the Windows command line (see FAQ #18).
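To make the failure mode concrete, here is a minimal Python sketch of how a fancy quote can turn into a malformed UTF-8 sequence. It is an illustration only, not what exiftool does internally. It assumes the fancy quote is U+2019 (right single quotation mark) and that the text was written as Windows Latin1 (cp1252) bytes; the sample string and variable names are made up for this example.

```python
# Illustration only: sample text and names are hypothetical.
# Assumption: the "fancy quote" is U+2019 (right single quotation mark).
text = "In The Style\u2019s Summer party"

utf8_bytes = text.encode("utf-8")      # the quote becomes three bytes: E2 80 99
cp1252_bytes = text.encode("cp1252")   # the quote becomes one byte: 92

# Reading the cp1252 bytes as UTF-8 fails, because a lone 0x92 byte
# is not a valid UTF-8 sequence. This is the "malformed" case.
try:
    cp1252_bytes.decode("utf-8")
    malformed = False
except UnicodeDecodeError:
    malformed = True

# The opposite mix-up (UTF-8 bytes read as cp1252) does not fail,
# but the quote shows up as three garbage characters instead.
mojibake = utf8_bytes.decode("cp1252")

print(malformed)
print(mojibake)
```

Either mismatch points at the same fix: make the encoding used to read the text agree with the encoding used to write it.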

Take a look at the file with
exiftool -g1 -a -s /path/to/file/
to see if there's a problem.

You can try changing the code page with chcp 65001 and changing the font as mentioned in that FAQ. Another option, the one I tend to use, is to add the -L (Latin) option. That fixes most of my character encoding problems, though it won't work if you have to deal with Cyrillic, Chinese, or similar characters.
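The limitation of the Latin approach can be checked with a short Python sketch. It assumes that -L corresponds to the Windows Latin1 (cp1252) character set, as the exiftool documentation describes it; the helper name fits_latin and the sample strings are hypothetical and only serve the illustration.

```python
def fits_latin(s: str) -> bool:
    """Return True if every character of s exists in cp1252.

    Assumption: cp1252 stands in for the character set selected by -L.
    """
    try:
        s.encode("cp1252")
        return True
    except UnicodeEncodeError:
        return False


# A curly apostrophe is part of cp1252, so Latin encoding can carry it.
print(fits_latin("Style\u2019s"))

# Cyrillic and Chinese characters are not, so they cannot be represented.
print(fits_latin("\u041c\u043e\u0441\u043a\u0432\u0430"))  # "Moskva" in Cyrillic
print(fits_latin("\u5317\u4eac"))                          # "Beijing" in Chinese
```

If a string fails this check, switching the console to UTF-8 (code page 65001) is the option that can still represent it.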


* Did you read FAQ #3 and use the command listed there?
* Please use the Code button for exiftool code/output.
 
* Please include your OS, Exiftool version, and type of file you're processing (MP4, JPG, etc).