Character encoding problem: WLPG-Drupal

Started by TheoRichel, February 05, 2016, 09:17:03 AM


TheoRichel

I have imported a photo collection that was tagged in Windows Live Photo Gallery into the Drupal CMS. It works, but closer inspection shows that the tags are written like this: <code>N�o�o�r� �R�i�c�h�e�l</code>. Removing the question mark characters makes the tag unusable; apparently these question marks (or what they stand for) are part of the original, so I cannot edit them. How do I change them back to a more readable and editable form? I suppose I should run a conversion with ExifTool over the photo collection, but if so, what should ExifTool search for, and what should it change the data to? And is this a known problem with Windows Live Photo Gallery?

Many thanks

Phil Harvey

Can you email me a sample so I can take a look? (philharvey66 at gmail.com)

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

TheoRichel

Separately I have mailed you the login for the photosite. Try for instance: http://theorichel.nl/node/1123 . The tag here should be Aaf, but it is interpreted as <code>A�a�f</code>. I will also send you the jpg itself (my local, not uploaded version).

Thanks

Phil Harvey

I'm a bit at a loss.  The XMP in the original image you sent contains a Subject of "Aaf".  I don't see any funny characters when I extract tags with ExifTool, or when I look at the XMP directly.

- Phil

TheoRichel

XMP? This is supposed to be an EXIF tag. In Drupal I get a message that the XMP library is not loaded and that therefore no XMP tags are extracted. They may well be there nevertheless, of course, but the file never went through any Adobe product. Also, this "Aaf" is extracted through 'Keywords' and not through 'Subject'.

Phil Harvey

There is also an EXIF XPKeywords tag which stores the value "Aaf".  This is stored with a 2-byte Unicode encoding (little-endian UTF-16), so in binary it looks like zero bytes between ASCII characters.  I wonder if this is the problem.  If so, whatever you are using to read this tag isn't decoding it properly.  With exiftool -v3 you should see this:

  | 10) XPKeywords = Aaf
  |     - Tag 0x9c9e (8 bytes, int8u[8] read as undef[8]):
  |         11a6: 41 00 61 00 66 00 00 00                         [A.a.f...]
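
To illustrate what a reader of this tag has to do, here is a minimal Python sketch that decodes the eight bytes from the dump above. It is not the code of the Drupal module (which I haven't seen); the helper name decode_xp_tag is made up for this example, and the Latin-1 decode is only an assumption about how a naive reader might end up with stray characters between the letters.

```python
# Raw XPKeywords value, copied from the exiftool -v3 dump above.
raw = bytes.fromhex("41 00 61 00 66 00 00 00")


def decode_xp_tag(data: bytes) -> str:
    """Decode an EXIF XP* value (hypothetical helper for illustration).

    The value is little-endian UTF-16 with a trailing null terminator,
    so decode it as such and strip the terminator.
    """
    return data.decode("utf-16-le").rstrip("\x00")


# Assumed failure mode: treating each byte as one character keeps the
# zero bytes, which many viewers then show as unknown-character marks.
naive = raw.decode("latin-1")

proper = decode_xp_tag(raw)

print(repr(naive))
print(repr(proper))
```

The naive result has a NUL character after every letter, which matches the pattern of one unknown character after each letter in the Drupal tags, while the UTF-16 decode gives the plain keyword.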


- Phil

Edit: It takes -v3, not just -v

TheoRichel

I'll get back to the person who coded the module to import this into Drupal.