Office Doc Binary Metadata bug?

Started by secshoggoth, February 20, 2018, 07:55:02 PM


secshoggoth

I think I found a bug in the way exiftool extracts binary metadata from Office documents.

I was analyzing a malicious composite-format Word document whose title metadata contains a mix of ASCII and binary data that the malware decodes to download its second stage. The analysis can be found here.

I noticed that exiftool was converting characters when extracting the title metadata. I tried multiple options, including the -binary option, and found that it converts certain byte values from one byte to two bytes. You can see this in the image below, where I show one instance, but it happened multiple times in the extraction.



I tried both the version of exiftool that came with REMnux (9.x) and the latest version from the website, with the same results.

The document can be downloaded from https ://drive.google.com/ open?id=1gLgXDVRqdK-VifZ5iE8rHdkN5tizq6Mt.

NOTE THIS IS MALWARE! BE CAREFUL!!!

Let me know if this is a bug, or if I am doing something wrong in running exiftool. Thanks!

PH Edit: split malware URL to avoid accidental downloading

Phil Harvey

ExifTool is attempting to convert the output text to UTF-8.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).
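
To illustrate why a text conversion to UTF-8 can turn one byte into two, here is a minimal Python sketch. It assumes the title bytes are interpreted as Latin-1 before being re-encoded; the code page ExifTool actually applies to this document may differ, and the helper name reencode_as_utf8 is made up for this example.

```python
def reencode_as_utf8(raw: bytes, source_charset: str = "latin-1") -> bytes:
    """Decode raw bytes as text in source_charset, then re-encode as UTF-8.

    Assumption: Latin-1 stands in for whatever code page the tool applies.
    """
    return raw.decode(source_charset).encode("utf-8")


def as_hex(data: bytes) -> str:
    """Format bytes as space-separated hex pairs."""
    return " ".join(f"{b:02x}" for b in data)


# Two ASCII bytes followed by three bytes at or above 0x80.
sample = bytes([0x41, 0x7F, 0x80, 0xE9, 0xFF])
converted = reencode_as_utf8(sample)

print(as_hex(sample))     # 41 7f 80 e9 ff
print(as_hex(converted))  # 41 7f c2 80 c3 a9 c3 bf
```

Bytes below 0x80 pass through unchanged, while each byte at or above 0x80 becomes a two-byte UTF-8 sequence, which matches the one-byte-to-two-bytes growth described in the first post.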

secshoggoth

Is there a way to prevent that? I couldn't find an option that gives me the raw output.

Phil Harvey

The only way short of removing all of the calls to Decode() in lib/Image/ExifTool/FlashPix.pm would be to use the -v4 output and convert the output hex characters back to binary.

- Phil
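
The hex-to-binary step suggested above could be sketched in Python as follows. This is only a rough sketch: it assumes each dump line in the -v4 output has the form "offset: hex pairs [ascii]", possibly preceded by "|" indentation markers. Check that layout against your own output before relying on it. The function name hexdump_to_bytes and the sample text are hypothetical, and the sketch does not detect a dump that was truncated.

```python
import re

# Assumed line shape: optional "|" indentation, a hex offset, a colon,
# space-separated hex byte pairs, then the ASCII column in square brackets.
HEX_LINE = re.compile(r"^[\s|]*[0-9a-fA-F]+:\s+((?:[0-9a-fA-F]{2} ?)+)\s*\[")


def hexdump_to_bytes(dump_text: str) -> bytes:
    """Concatenate the hex byte columns of all lines that look like dump lines."""
    out = bytearray()
    for line in dump_text.splitlines():
        match = HEX_LINE.match(line)
        if match:
            # bytes.fromhex ignores the spaces between the pairs.
            out += bytes.fromhex(match.group(1))
    return bytes(out)


# Hypothetical sample text, not real ExifTool output.
sample_dump = (
    "  | |     0000: 41 c3 a9 80 [A...]\n"
    "  | | Title = (not a dump line, so it is skipped)\n"
    "  | |     0004: ff 00       [..]\n"
)
raw = hexdump_to_bytes(sample_dump)
print(raw.hex())  # 41c3a980ff00
```

In practice you would first isolate the dump lines belonging to the Title value, since verbose output contains hex dumps for other tags as well.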