ExifTool Forum

ExifTool => The "exiftool" Application => Topic started by: TopSolid on April 27, 2011, 03:17:07 AM

Title: .docx thumbnail extraction
Post by: TopSolid on April 27, 2011, 03:17:07 AM
Hi,

For the ResourceSpace application, I am trying to use exiftool to extract the embedded thumbnail of Office documents (.docx, .doc, etc.).
On Windows, this command line works for a JPG image:

exiftool.exe -b -previewimage 1.jpg > image_exif.jpg

But it doesn't with .docx:
exiftool.exe -b -previewimage 1.docx > image_exif.jpg

However, ExifToolGUI shows the thumbnail of the .docx, so it should be possible.
I know the zip feature is involved in this case, but I don't know how to use it.

Does anyone have an idea?
I have attached a sample Word file for testing.

Thanks in advance!



Title: Re: .docx thumbnail extraction
Post by: TopSolid on April 27, 2011, 12:31:35 PM
I have installed the full Perl version. From reading the source code, exiftool does not handle the WMF format, which is what the docx thumbnail uses once the docx is unzipped. It handles only JPEG and PNG bitmaps... Any ideas, guys?
Title: Re: .docx thumbnail extraction
Post by: Phil Harvey on April 27, 2011, 08:16:05 PM
ExifToolGUI uses windows routines to display the preview.  ExifTool only extracts embedded JPEG previews, and I don't think DOCX files have one.

- Phil
Title: Re: .docx thumbnail extraction
Post by: TopSolid on April 28, 2011, 08:48:58 AM
Hi Phil,

Thanks for your answer.
In an unzipped docx, the preview is saved in WMF format as docProps/thumbnail.wmf. Is there any plan to support this format in the next version of exiftool?
It would be fantastic.

Rgds

David
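
Since a .docx is an ordinary ZIP archive, the thumbnail at the path mentioned above can also be read directly, without ExifTool. The following is only a minimal Python sketch under stated assumptions: the function name find_thumbnail is made up for illustration, and the thumbnail is assumed to sit in the docProps folder under a name starting with "thumbnail" (this thread reports .wmf for Word 2007 documents and JPEG for other files). The folder name is matched case-insensitively so the sketch does not depend on its exact casing.

```python
import io
import zipfile


def find_thumbnail(docx):
    """Return (member_name, raw_bytes) for the embedded thumbnail, or None.

    `docx` may be a file path or a binary file-like object.
    Hypothetical helper: the member-name pattern is an assumption.
    """
    with zipfile.ZipFile(docx) as archive:
        for name in archive.namelist():
            # Case-insensitive match on "docProps/thumbnail.<ext>"
            if name.lower().startswith("docprops/thumbnail."):
                return name, archive.read(name)
    return None


if __name__ == "__main__":
    # Build a tiny stand-in archive in memory so the sketch runs without Word.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", "<w:document/>")
        archive.writestr("docProps/thumbnail.wmf", b"placeholder bytes")
    buffer.seek(0)
    print(find_thumbnail(buffer))
```

The bytes come back exactly as stored, so a WMF thumbnail stays a WMF; converting it to JPEG would still need a separate graphics tool.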

Title: Re: .docx thumbnail extraction
Post by: Phil Harvey on April 28, 2011, 08:50:16 PM
Hi David,

Thanks for the suggestion, I'll look into this when I get back after my vacation (in 2 weeks or so).

- Phil
Title: Re: .docx thumbnail extraction
Post by: TopSolid on May 06, 2011, 06:04:23 PM
Thanks! :)

Keep me posted and have a nice vacation.

Rgds

David
Title: Re: .docx thumbnail extraction
Post by: TopSolid on June 06, 2011, 06:05:48 AM
Hello Phil,

Any news about WMF support in exiftool?

Thanks

David
Title: Re: .docx thumbnail extraction
Post by: Phil Harvey on June 06, 2011, 07:54:28 AM
I put that on my list but I haven't done anything about it yet.  Sometimes I need a little prompting.  Thanks.

Title: Re: .docx thumbnail extraction
Post by: Phil Harvey on June 06, 2011, 11:43:00 AM
Adding the ability to extract the WMF thumbnail looks easy but all of my DOCX samples use a JPEG thumbnail.  Could you email me one with a WMF thumbnail for testing?  My mail is philharvey66 at gmail.com

- Phil
Title: Re: .docx thumbnail extraction
Post by: TopSolid on June 08, 2011, 11:01:12 AM
Hi Phil,

I just sent you some docx files. Which Office version did you use (2003/2007/2010)? I have Office 2007 on Windows 7 x64. The thumbnail is always in WMF format for Word, and JPEG for pptx and xlsx.

Thanks!
Title: Re: .docx thumbnail extraction
Post by: TopSolid on June 08, 2011, 11:03:38 AM
There is also a docx attached to my first post...

Keep me posted!

Thanks
Title: Re: .docx thumbnail extraction
Post by: Phil Harvey on June 08, 2011, 12:35:25 PM
Great, thanks.  The more samples the better.  There are various flavours of WMF files, and I'd like to see if I can extract some information from them too.

I was using MS Office for the Mac.

- Phil
Title: Re: .docx thumbnail extraction
Post by: TopSolid on June 12, 2011, 11:45:46 AM
Hi Phil,

I have seen that the latest release, 8.59, now supports the WMF docx thumbnail, great news!  :)
So I tried to test it, but the resulting jpg file is always empty.

I have tried this:
exiftool.exe -b -previewimage 1.docx > 1.jpg

and this:
exiftool.exe -b -ThumbnailImage 1.docx > 1.jpg

I have attached the docx file.

Any idea?

David
Title: Re: .docx thumbnail extraction
Post by: Phil Harvey on June 12, 2011, 08:44:40 PM
Hi David

Try -previewwmf
Title: Re: .docx thumbnail extraction
Post by: TopSolid on June 13, 2011, 03:34:08 AM
Phil,

Thanks, it works with this command line:
exiftool -b -previewwmf 1.docx > 1.wmf
The result is always the original WMF file, not a JPG file.
But I was expecting exiftool to convert the WMF file to JPG.

Your announcement says "Recognize WMF images"; is the conversion something exiftool could support without any graphics toolbox?

Thanks!
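
For anyone scripting the working command line above, here is a rough Python sketch that runs it and writes the binary output to a file, which avoids relying on shell redirection. The helper names build_command and extract_preview are hypothetical, and an exiftool executable is assumed to be on the PATH. Empty output is treated as "tag not present", which matches the empty files seen earlier in this thread when the wrong tag was requested.

```python
import subprocess


def build_command(source, tag="PreviewWMF", exiftool="exiftool"):
    """Return the argument list equivalent to `exiftool -b -TAG source`."""
    return [exiftool, "-b", f"-{tag}", source]


def extract_preview(source, destination, tag="PreviewWMF"):
    """Write the binary value of `tag` to `destination`.

    Returns the number of bytes written, or 0 (and writes nothing)
    when exiftool produced no output for that tag.
    Assumes an `exiftool` executable can be found on the PATH.
    """
    result = subprocess.run(build_command(source, tag), capture_output=True)
    if not result.stdout:
        return 0
    with open(destination, "wb") as handle:
        handle.write(result.stdout)
    return len(result.stdout)
```

A call such as extract_preview("1.docx", "1.wmf") mirrors the command above; the output is still the raw WMF data, so the destination should keep a .wmf extension.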
Title: Re: .docx thumbnail extraction
Post by: Phil Harvey on June 13, 2011, 07:14:55 AM
ExifTool processes metadata only.  It doesn't do image manipulations.

There isn't any metadata in WMF images, so all it can do is recognize the file type if you throw one of these at ExifTool.

- Phil
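
Because the extracted thumbnail may be WMF for one document and JPEG for another, a script can check the first bytes of the data before choosing a file extension. This is only an illustrative Python sketch, not how ExifTool itself identifies files; the function name sniff_image_type is made up, and the WMF check covers only the "placeable" header variant (key 0x9AC6CDD7), so a WMF without that header would return None here.

```python
def sniff_image_type(data):
    """Guess the image type from leading magic bytes, or return None.

    Hypothetical helper: recognizes JPEG, PNG and placeable WMF only.
    """
    if data.startswith(b"\xff\xd8\xff"):
        return "JPEG"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "PNG"
    # Placeable WMF key 0x9AC6CDD7, stored little-endian
    if data.startswith(b"\xd7\xcd\xc6\x9a"):
        return "WMF"
    return None


if __name__ == "__main__":
    print(sniff_image_type(b"\xd7\xcd\xc6\x9a" + b"\x00" * 18))
```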